Recently I added the ability for KCache to be configured with different types of backing caches. KCache, by providing an ordered key-value abstraction for a compacted topic in Kafka, can be used as a foundation for a highly-available service for storing metadata, similar to ZooKeeper, etcd, and Consul.
For such a metadata store, I wanted the following features:
- Ordered key-value model
- APIs available through gRPC
- Transactional support using multiversion concurrency control (MVCC)
- High availability through leader election and failover
- Ability to be notified of changes to key-value tuples
- Ability to expire key-value tuples
It turns out that etcd already has these features, so I decided to use the same gRPC APIs and data model as in etcd v3. In addition, a comprehensive set of Jepsen tests has already been built for etcd, so by using the same APIs, I could make use of the same Jepsen tests.
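The etcd v3 data model pairs an ordered key space with MVCC revisions: every write bumps a global revision, and each key remembers the revision that last modified it. As a rough illustration, here is a toy Python sketch of that idea (the `MvccStore` class and its methods are invented for illustration; this is not Keta's or etcd's actual implementation):

```python
import bisect

class MvccStore:
    """Toy ordered key-value store with etcd-style revisions.

    Illustrative only: real etcd assigns a global revision to every
    mutation and records each key's mod_revision, which is what makes
    MVCC-based transactions possible.
    """

    def __init__(self):
        self._keys = []    # sorted keys, for ordered range reads
        self._data = {}    # key -> (value, mod_revision)
        self.revision = 0  # global revision counter

    def put(self, key, value):
        self.revision += 1
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = (value, self.revision)
        return self.revision

    def get(self, key):
        return self._data.get(key)  # (value, mod_revision) or None

    def range(self, start, end):
        """Return (key, value) pairs with start <= key < end, in order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k][0]) for k in self._keys[lo:hi]]

store = MvccStore()
store.put("a", "1")
store.put("b", "2")
store.put("a", "3")           # bumps global revision; a's mod_revision = 3
print(store.get("a"))         # ('3', 3)
print(store.range("a", "c"))  # [('a', '3'), ('b', '2')]
```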
To get started with Keta, download a release, unpack it, and then modify config/keta.properties to point to an existing Kafka broker. Then run the following:

$ bin/keta-start config/keta.properties

Next, download etcd as described here. At a separate terminal, start the etcdctl command-line client that ships with etcd:

$ etcdctl put mykey "this is awesome"
$ etcdctl get mykey
The etcd APIs have a concise way of expressing transactions.

$ etcdctl put user1 bad
$ etcdctl txn --interactive

compares:
value("user1") = "bad"

success requests (get, put, delete):
del user1

failure requests (get, put, delete):
put user1 good
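The compare/success/failure structure above can be modeled in a few lines. This is a hypothetical Python sketch of the transaction semantics only (the `txn` helper and dict-backed store are invented for illustration; a real client would issue a single atomic gRPC Txn request):

```python
def txn(store, compares, success_ops, failure_ops):
    """Toy etcd-style transaction: evaluate all compares, then apply
    either the success ops or the failure ops as one unit.

    store: dict of key -> value
    compares: list of (key, expected_value)
    ops: list of ('put', key, value) or ('del', key)
    """
    ok = all(store.get(k) == v for k, v in compares)
    for op in (success_ops if ok else failure_ops):
        if op[0] == "put":
            store[op[1]] = op[2]
        elif op[0] == "del":
            store.pop(op[1], None)
    return ok

# Mirrors the etcdctl example above:
store = {"user1": "bad"}
succeeded = txn(store,
                compares=[("user1", "bad")],
                success_ops=[("del", "user1")],
                failure_ops=[("put", "user1", "good")])
print(succeeded, store)  # True {}
```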
To expire key-value tuples, use a lease.
$ etcdctl lease grant 300
# lease 2be7547fbc6a5afa granted with TTL(300s)
$ etcdctl put sample value --lease=2be7547fbc6a5afa
$ etcdctl get sample
$ etcdctl lease keep-alive 2be7547fbc6a5afa
$ etcdctl lease revoke 2be7547fbc6a5afa
# or after 300 seconds
$ etcdctl get sample
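Conceptually, a lease is a deadline plus the set of keys attached to it: keep-alives push the deadline out, and revocation or expiry deletes the attached keys. Here is a minimal Python sketch of that behavior under an injected clock (the `LeaseTable` class is made up for illustration; real etcd tracks TTLs server-side):

```python
class LeaseTable:
    """Toy model of etcd leases: keys attached to a lease disappear
    when the lease expires or is revoked."""

    def __init__(self):
        self.store = {}   # key -> value
        self.leases = {}  # lease_id -> [deadline, set of attached keys]

    def grant(self, lease_id, ttl, now):
        self.leases[lease_id] = [now + ttl, set()]

    def put(self, key, value, lease_id):
        self.store[key] = value
        self.leases[lease_id][1].add(key)

    def keep_alive(self, lease_id, ttl, now):
        self.leases[lease_id][0] = now + ttl  # push the deadline out

    def revoke(self, lease_id):
        _, keys = self.leases.pop(lease_id)
        for k in keys:
            self.store.pop(k, None)

    def tick(self, now):
        """Expire any leases whose deadline has passed."""
        for lid in [l for l, (d, _) in self.leases.items() if d <= now]:
            self.revoke(lid)

t = LeaseTable()
t.grant("2be7547fbc6a5afa", ttl=300, now=0)
t.put("sample", "value", "2be7547fbc6a5afa")
t.tick(now=200)             # lease still live
print("sample" in t.store)  # True
t.tick(now=301)             # TTL elapsed, lease expires
print("sample" in t.store)  # False
```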
To receive change notifications, use a watch.
$ etcdctl watch stock --prefix
Then at a separate terminal, enter the following:
$ etcdctl put stock1 10
$ etcdctl put stock2 20
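A prefix watch is essentially a set of callbacks that fire on every matching mutation. The following toy Python sketch mimics the example above (the `WatchableStore` class is invented for illustration; real etcd streams watch events over gRPC):

```python
class WatchableStore:
    """Toy prefix watch: callbacks registered for a key prefix fire on
    every matching put, similar in spirit to `etcdctl watch --prefix`."""

    def __init__(self):
        self.store = {}
        self.watchers = []  # (prefix, callback)

    def watch_prefix(self, prefix, callback):
        self.watchers.append((prefix, callback))

    def put(self, key, value):
        self.store[key] = value
        for prefix, cb in self.watchers:
            if key.startswith(prefix):
                cb(key, value)  # notify the watcher of the change

events = []
s = WatchableStore()
s.watch_prefix("stock", lambda k, v: events.append((k, v)))
s.put("stock1", "10")
s.put("stock2", "20")
s.put("other", "x")  # does not match the prefix; no notification
print(events)        # [('stock1', '10'), ('stock2', '20')]
```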
If you prefer a GUI, you can use etcdmanager when working with Keta.
Leader Election using the Magical Rebalance Protocol
Keta achieves high availability by allowing any number of Keta instances to be run as a cluster. One instance is chosen as the leader, and all other instances act as followers. The followers will forward both reads and writes to the leader. If the leader dies, another leader is chosen.
Leader election is accomplished by using the rebalance protocol of Kafka, which is the same protocol that is used to assign topic-partitions to consumers in a consumer group.
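At a high level, the pattern is: all live instances join a group, one of them is designated leader, and when the leader leaves the group a rebalance picks a new one while followers route requests to whoever currently leads. The following is a loose Python sketch of that failover behavior only (the `Cluster` class and the `keta-1`/`keta-2`/`keta-3` names are made up; the real rebalance protocol involves a Kafka group coordinator and generations):

```python
class Cluster:
    """Toy sketch of leader election by group membership: all live
    members agree on one leader, and a new leader is chosen when the
    current one leaves, as in a consumer-group rebalance."""

    def __init__(self, members):
        self.members = list(members)  # live members, in join order
        self.store = {}

    @property
    def leader(self):
        return self.members[0]  # first live member leads

    def fail(self, member):
        self.members.remove(member)  # departure triggers a "rebalance"

    def put(self, via, key, value):
        # Followers forward writes to the leader; only the leader
        # mutates state. Returns which member handled the request.
        handler = self.leader
        self.store[key] = value
        return handler

c = Cluster(["keta-1", "keta-2", "keta-3"])
print(c.put("keta-3", "k", "v"))  # handled by keta-1, the leader
c.fail("keta-1")                  # leader dies; failover
print(c.leader)                   # keta-2
```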
As mentioned, one nice aspect of using the etcd APIs is that a set of Jepsen tests are already available. This allowed me to use Jepsen-Driven Development (JDD) when developing Keta, which is like Test-Driven Development (TDD), but on steroids.
Jepsen is an awesome framework written in Clojure for testing (or more like breaking) distributed systems. It comes with an in-depth tutorial for writing new Jepsen tests.
I was able to modify the existing Jepsen tests for etcd by having the tests install and start Keta instead of etcd. The client code in the test, which uses the native Java client for etcd, remained untouched. The modified tests can be found here.
I was able to successfully run three types of etcd tests:
- A set test, which uses a compare-and-set transaction to concurrently read a set of integers from a single key and append a value to that set. This test is designed to measure stale reads.
- An append test, which uses transactions to concurrently read and append to lists of unique integers. This test is designed to verify strict serializability.
- A register test, which concurrently performs randomized reads, writes, and compare-and-set operations over single keys. This test is designed to verify linearizability.
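To give a flavor of what the set test exercises, here is a self-contained Python sketch of its core pattern: many clients concurrently read a set stored under one key, append a value, and retry on a compare-and-set conflict. The lock-guarded `compare_and_set` helper is a local stand-in for an etcd transaction, invented for illustration:

```python
import json
import threading

# Shared "key" whose value is a JSON-encoded list of integers.
lock = threading.Lock()
cell = {"value": json.dumps([])}

def compare_and_set(expected, new):
    """Atomic CAS on the cell; a real client would do this with an
    etcd txn comparing the key's current value (or mod revision)."""
    with lock:
        if cell["value"] == expected:
            cell["value"] = new
            return True
        return False

def add(n):
    # Read-modify-write loop: retry until our CAS wins.
    while True:
        cur = cell["value"]
        items = json.loads(cur)
        if compare_and_set(cur, json.dumps(items + [n])):
            return

threads = [threading.Thread(target=add, args=(i,)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(json.loads(cell["value"])))  # [0, 1, ..., 19]: no lost updates
```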
Jepsen has a built-in component called nemesis that can inject faults into the system during test runs. For the etcd tests, nemesis was used to kill the leader and to create network partitions.
One challenge of running Keta with Jepsen is that leader election using the rebalance protocol can take several seconds, whereas leader election in etcd, which uses Raft, only takes a second or less. This means that the number of unsuccessful requests is higher in the tests when using Keta than when using etcd, but this is to be expected.
In any case, Keta passes all of the above tests.
# Run the Jepsen set test
$ lein run test --concurrency 2n --workload set --nemesis kill,partition
...
Everything looks good! ヽ(‘ー`)ノ

# or sometimes
...
Errors occurred during analysis, but no anomalies found. ಠ~ಠ
What’s Not in the Box
Keta only provides a subset of the functionality of etcd and so is not a substitute for it. In particular, it is missing:
- A lock service for clients
- An election service for clients
- An immutable version history for key-value tuples
- Membership reconfiguration
For example, Keta only keeps the latest value for a specific key, and not its prior history as is available with etcd. However, if you’re interested in a highly-available transactional metadata store backed by Apache Kafka that provides most of the features of etcd, please give Keta a try.