Cassandra is a distributed datastore that combines ideas from the Dynamo [G. DeCandia et al., “Dynamo: Amazon’s Highly Available Key-value Store,” in Proceedings of the Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, 2007] and the Bigtable [F. Chang et al., “Bigtable: A Distributed Storage System for Structured Data,” in Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006] papers.

Note: Besides Dynamo, there is a separate distributed system called DynamoDB. It is commercially available, but details of its internal architecture have not been shared publicly yet. However, it has a lot in common with Cassandra, such as the data model and tunable consistency.

Cassandra [A. Lakshman and P. Malik, “Cassandra - A Decentralized Structured Storage System,” Operating Systems Review, 2010] was originally developed by Facebook, but it was then open-sourced and became an Apache project. Since then, it has evolved significantly from its original implementation.

Note: The information in this chapter refers to the state of this project at the time of writing this course.

Design goals of Cassandra

The main design goals of Cassandra are:

  • Extremely high availability
  • Performance (high throughput/low latency with emphasis on write-heavy workloads) with unbounded, incremental scalability

Note: To achieve these goals, Cassandra trades off some other properties, such as strong consistency.
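
To make the trade-off between availability and strong consistency more concrete, here is a minimal sketch in Python. It is not Cassandra code: the function names are hypothetical, and it assumes a simplified model in which a read is guaranteed to observe the latest acknowledged write only when the replicas contacted by the read and by the write must overlap (R + W > N, where N is the replication factor). The level names ONE, QUORUM, and ALL mirror Cassandra's consistency levels.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def read_overlaps_write(write_level: str, read_level: str,
                        replication_factor: int) -> bool:
    """True when any read replica set must intersect any write replica set,
    i.e. R + W > N in this simplified model."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


def tolerated_failures(level: str, replication_factor: int) -> int:
    """Replicas that can be down while the operation can still succeed."""
    return replication_factor - replicas_required(level, replication_factor)


if __name__ == "__main__":
    n = 3  # illustrative replication factor, an assumption
    for write_level, read_level in [("ONE", "ONE"),
                                    ("QUORUM", "QUORUM"),
                                    ("ALL", "ONE")]:
        print(write_level, read_level,
              "overlap guaranteed:", read_overlaps_write(write_level, read_level, n),
              "| write tolerates", tolerated_failures(write_level, n), "failed replicas")
```

With a replication factor of 3, writing and reading at ONE keeps operations available even when two replicas are down, but a read may miss the latest write. Writing at ALL guarantees the overlap, but a single failed replica blocks writes. This is the kind of property Cassandra lets the client tune per operation.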
