A minimal version of our message schema captures the essentials. Every ID we use is a Snowflake, making it chronologically sortable. We partition our messages by the channel they're sent in, along with a bucket, which is a static time window. This partitioning means that, in Cassandra, all messages for a given channel and bucket are stored together and replicated across three nodes (or whatever you've set the replication factor to).

Within this partitioning lies a potential performance pitfall: a server with just a small group of friends tends to send orders of magnitude fewer messages than a server with hundreds of thousands of people.

In Cassandra, reads are more expensive than writes. Writes are appended to a commit log and written to an in-memory structure called a memtable that is eventually flushed to disk. Reads, however, need to query the memtable and potentially multiple SSTables (on-disk files), a more expensive operation. Lots of concurrent reads as users interact with servers can hotspot a partition, which we refer to imaginatively as a "hot partition". The size of our dataset, combined with these access patterns, led to struggles for our cluster.

When we encountered a hot partition, it frequently affected latency across our entire database cluster. One channel and bucket pair would receive a large amount of traffic, and latency in the node would increase as the node tried harder and harder to serve traffic and fell further and further behind. Other queries to this node were affected as the node couldn't keep up. Since we perform reads and writes with quorum consistency level, all queries to the nodes that serve the hot partition suffer latency increases, resulting in broader end-user impact.

Cluster maintenance tasks also frequently caused trouble. We were prone to falling behind on compactions, where Cassandra compacts SSTables on disk for more performant reads. Not only were our reads then more expensive, but we'd also see cascading latency as a node tried to compact. We frequently performed an operation we called the "gossip dance", where we'd take a node out of rotation to let it compact without taking traffic, bring it back in to pick up hints from Cassandra's hinted handoff, and then repeat until the compaction backlog was empty. We also spent a large amount of time tuning the JVM's garbage collector and heap settings, because GC pauses would cause significant latency spikes.

Our messages cluster wasn't our only Cassandra database. We had several other clusters, and each exhibited similar (though perhaps not as severe) faults. In our previous iteration of this post, we mentioned being intrigued by ScyllaDB, a Cassandra-compatible database written in C++. Its promise of better performance, faster repairs, stronger workload isolation via its shard-per-core architecture, and a garbage collection-free life sounded quite appealing.

Although ScyllaDB is most definitely not void of issues, it is void of a garbage collector, since it's written in C++ rather than Java. Historically, our team has had many issues with the garbage collector on Cassandra, from GC pauses affecting latency all the way to super long consecutive GC pauses that got so bad that an operator would have to manually reboot the node in question and babysit it back to health. These issues were a huge source of on-call toil, and the root of many stability issues within our messages cluster.

After experimenting with ScyllaDB and observing improvements in testing, we made the decision to migrate all of our databases. While this decision could be a blog post in itself, the short version is that by 2020, we had migrated every database but one to ScyllaDB. The last one? Our friend, cassandra-messages.

Why hadn't we migrated it yet? To start with, it's a big cluster. With trillions of messages and nearly 200 nodes, any migration was going to be an involved effort. Additionally, we wanted to make sure our new database could be the best it could be as we worked to tune its performance. We also wanted to gain more experience with ScyllaDB in production, using it in anger and learning its pitfalls. We also worked to improve ScyllaDB performance for our use cases.
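The channel-plus-bucket partitioning described in this section can be sketched in code. This is an illustrative reconstruction, not the production schema: the exact column list, the 10-day bucket width, and the epoch constant are assumptions (the epoch and timestamp bit layout follow Discord's public Snowflake documentation).

```python
# Sketch of the partitioning scheme described above. Column names, the
# bucket width, and the epoch are assumptions for illustration.

DISCORD_EPOCH_MS = 1420070400000          # 2015-01-01T00:00:00Z (public Snowflake docs)
BUCKET_MS = 10 * 24 * 60 * 60 * 1000      # assumed 10-day static time window

# A minimal schema along these lines: messages are partitioned by the
# (channel_id, bucket) pair and clustered by message_id, so all messages
# for one channel and bucket live together on the same replicas.
MESSAGE_SCHEMA_CQL = """
CREATE TABLE messages (
    channel_id bigint,
    bucket int,
    message_id bigint,
    author_id bigint,
    content text,
    PRIMARY KEY ((channel_id, bucket), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);
"""

def snowflake_to_ms(snowflake: int) -> int:
    """Snowflakes embed a millisecond timestamp in their high bits,
    which is what makes them chronologically sortable."""
    return (snowflake >> 22) + DISCORD_EPOCH_MS

def bucket_for(snowflake: int) -> int:
    """Map a message ID to its static time-window bucket."""
    return (snowflake_to_ms(snowflake) - DISCORD_EPOCH_MS) // BUCKET_MS

# Two IDs minted inside the same window land in the same partition:
a = 1 << 22   # 1 ms after the epoch
b = 2 << 22   # 2 ms after the epoch
assert bucket_for(a) == bucket_for(b) == 0
```

Because the partition key includes the bucket, a single busy channel is split across time windows rather than growing one partition forever; the trade-off, as the section notes, is that a very active channel-and-bucket pair can still become a hot partition.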
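The read/write asymmetry described in this section (cheap appends to a memtable, reads that may touch the memtable plus several SSTables) can be shown with a toy LSM sketch. All names here are illustrative; this is a teaching model, not Cassandra's implementation.

```python
# Toy sketch of the read path discussed above: writes touch only an
# in-memory memtable (plus a commit log in real systems), while reads may
# have to consult the memtable and then scan on-disk SSTables from newest
# to oldest -- the source of read amplification and compaction pressure.

class ToyLSM:
    def __init__(self):
        self.memtable = {}   # in-memory, absorbs writes cheaply
        self.sstables = []   # immutable "on-disk" segments, newest first

    def write(self, key, value):
        # A write is a single cheap in-memory insert.
        self.memtable[key] = value

    def flush(self):
        # The memtable is frozen and written out as a new SSTable.
        self.sstables.insert(0, dict(self.memtable))
        self.memtable = {}

    def read(self, key):
        # A read may need to check several structures; newer data
        # shadows older data.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in self.sstables:
            if key in sstable:
                return sstable[key]
        return None

db = ToyLSM()
db.write("msg:1", "hello")
db.flush()                    # "hello" now lives in an SSTable
db.write("msg:1", "edited")   # newer value shadows the flushed one
assert db.read("msg:1") == "edited"
```

Compaction, in this model, is what merges the growing list of SSTables back down so reads stop paying for every historical segment — which is why falling behind on compactions made reads progressively more expensive.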