IaaS Provider iland chooses Cassandra over MongoDB
2020-11-09 13:10:42 Editor: 小采
“iland chose Apache Cassandra over MongoDB because Cassandra provides constant time writes no matter how big the data set grows and for its distributed nature as well as its “massive” scalability, reliability, performance, availability, consistency and simplicity.”

- Julien Anguenot, Director of Software Engineering at iland

iland
iland internet solutions, founded in 1995, is a pure IaaS player providing enterprise cloud infrastructure and services with several datacenters in North America, Europe and Asia.

iland platform with Cassandra

Cassandra is the sole database used by the iland platform. It is distributed across iland’s datacenters and is the foundation of the customer-facing iland ECS2 portal.

The platform stores time series: real-time data (20-second samples coming from vSphere) and historical rollups (1 minute, 1 hour, 1 day, 1 week and 1 month) for dozens of performance counters per virtual machine, along with the corresponding resource pools and networks.
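As a rough illustration of what such a rollup involves, the sketch below folds 20-second samples into 1-minute buckets. The article does not say which aggregate iland computes, so averaging is an assumption here, and the `rollup` function and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Averaging is an assumption; a real rollup might also keep min/max/sum.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each sample to the start of the bucket it falls into.
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Six hypothetical 20-second samples (seconds since start, counter value).
samples = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0), (80, 50.0), (100, 60.0)]
one_minute = rollup(samples, 60)
print(one_minute)
```

The same function can produce the coarser rollups by passing a larger bucket width, for example 3600 for the hourly series.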

Cassandra also stores usage data and the corresponding real-time and historical billing information, as well as infrastructure configuration, user information and so on. In addition, the platform provides predictive analytics that help companies monitor performance, achieve consistency and anticipate growth requirements.

The iland portal is essentially an easy to use and understand front end (web and mobile) for the iland platform solutions – it covers a wealth of functionality including offering visibility into resource consumption, billing, performance, the impact of change and other key areas. It also provides usage and billing based alerts as well as cloud management features.

Evaluating MongoDB and Cassandra

iland chose Apache Cassandra over MongoDB because Cassandra provides constant time writes no matter how big the data set grows and for its distributed nature as well as its “massive” scalability, reliability, performance, availability, consistency and simplicity.

Constant-time writes, no matter how big the data set is, are a must for our real-time performance counter collection, since the number of virtual machines to collect from will increase to tens of thousands, along with the number of workers concurrently performing operations at the application level.

Multi-datacenter deployment

We use a Cassandra 2.0.x cluster distributed across 5 datacenters (Los Angeles, CA – Reston, VA – London, UK – Manchester, UK – Singapore).

iland uses the 20x series (Cassandra 2.0.x) Debian packages hosted by the Apache Software Foundation, on Ubuntu 12.04 LTS.

We use CQL3 over Thrift at the moment, using Astyanax, but we are planning to switch to the DataStax CQL Java driver once Astyanax 2.0 is released.

Each datacenter has at least one rack of 3 nodes and all data is replicated across all nodes in the cluster.
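In Cassandra, a multi-datacenter layout like this is normally declared per keyspace with `NetworkTopologyStrategy`. The sketch below only builds the CQL statement as a string. The keyspace name and the datacenter names are hypothetical (real names must match what the cluster's snitch reports), and a replication factor of 3 in every datacenter is an assumption based on the RF the article reports.

```python
def create_keyspace_cql(keyspace, datacenters, rf):
    """Build a CREATE KEYSPACE statement with the same RF in every datacenter."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    # One replication entry per datacenter name known to the snitch.
    parts += ["'%s': %d" % (dc, rf) for dc in datacenters]
    options = ", ".join(parts)
    return "CREATE KEYSPACE " + keyspace + " WITH replication = {" + options + "};"


# Hypothetical datacenter names standing in for the five sites.
datacenters = ["LAX", "RES", "LON", "MAN", "SIN"]
statement = create_keyspace_cql("platform_demo", datacenters, 3)
print(statement)
```

Keeping the replication settings in the keyspace definition means each datacenter holds its own full set of replicas, which is what lets LOCAL_QUORUM requests be served without crossing datacenter boundaries.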

To date, the cluster has 18 nodes in total and we are getting close to 1TB of data. (The application was deployed empty in September 2013 with the brand new iland ECS2 offering, so there was no legacy data to migrate over to Cassandra.)

(RF = replication factor)

RF = 3, W – LOCAL_QUORUM (2 nodes), R – LOCAL_QUORUM (2 nodes)

This configuration allows a single node to fail while still serving both reads and writes. It comes at the cost of a larger data footprint in terms of storage size, but it lets us use Cassandra’s tunable consistency to our advantage: all reads are consistent, while availability stays as high as possible when running on 3 nodes.
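The arithmetic behind this trade-off is short enough to check directly. The sketch below uses the standard quorum rule (a majority of the replicas); the helper name `quorum` is illustrative only.

```python
def quorum(replication_factor):
    """A quorum is a majority of the replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


rf = 3
write_nodes = quorum(rf)   # replicas that must acknowledge a write
read_nodes = quorum(rf)    # replicas that must answer a read

# Replicas that can be down while a quorum is still reachable.
tolerated_failures = rf - quorum(rf)

# Reads see the latest write when the read and write sets must overlap.
reads_are_consistent = read_nodes + write_nodes > rf

print(write_nodes, read_nodes, tolerated_failures, reads_are_consistent)
```

With RF = 3, both quorums are 2 nodes, one replica can fail, and since 2 + 2 > 3 every read quorum shares at least one replica with the most recent write quorum.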

Cassandra nodes run on Ubuntu-powered virtual machines (in vSphere). Each node has 16GB of RAM and 8 vCPUs.

Getting started

My advice would be to start with a single-node instance, to avoid clustering-related concerns initially, and to use CQL3 (rather than Thrift) from the start.

The documentation is great, the issue tracker and mailing list are great sources of information, upgrading and maintaining Cassandra is painless, and drivers such as the DataStax CQL drivers for Python and Java, as well as Netflix’s Astyanax, have been working just great for us.
