The following post originally appeared in the Yammer Engineering blog on September 10, 2014.
The Yammer architecture has evolved from a monolithic Rails application to a micro-service based architecture. Our services share many commonalities, including the use of Dropwizard and Metrics. However, as with many micro-service based architectures, each micro-service may have very different needs in terms of persistence. This has led to the adoption of polyglot persistence here at Yammer.
As such, engineers at Yammer are very interested in understanding and evaluating the differences between various databases. One particularly enlightening exposition has been the Call Me Maybe series by Aphyr. This series demonstrates how various databases behave in the presence of network partitions caused by an automated test framework called Jepsen(*). Some of the databases it covers are Postgres, Redis, MongoDB, Riak, Cassandra, and FoundationDB. I have often wondered how HBase would behave under Jepsen. Let’s try augmenting Jepsen to find out.
Jepsen actually consists of two parts. The first part is a provisioning framework written using salticid. This provisions a five-node Ubuntu cluster, where the nodes are referred to as n1, n2, n3, n4, and n5. It can then install and setup the desired database. The second part is a set of runtime tests written using Clojure. (Aphyr also has a great tutorial on Clojure called Clojure from the ground up.)
As I don’t have much experience with HBase, I decided to forego the salticid customizations and simply use Cloudera Manager to set up a five-node HBase cluster on EC2. I used CDH 5.1.2, which bundles hbase-0.98.1+cdh5.1.2+70. I then modified the /etc/hosts file on each of the nodes to add entries for n1 through n5 (with n1 being the ZooKeeper master). With that step done, it was time to write the actual tests, which are available here.
The first test I wrote, hbase-app, is based on one of the Postgres tests. It simply adds a single row for each number. While it is running, Jepsen uses iptables to simulate a network partition within the cluster. Let’s see how it behaves.
lein run hbase -n 2000 ... 0 unrecoverable timeouts Collecting results. Writes completed in 200.047 seconds 2000 total 2000 acknowledged 2000 survivors All 2000 writes succeeded. :-D
All 2000 writes succeeded and there was no data loss. So far, so good.
The second test I wrote, hbase-append-app, is based on one of the FoundationDB tests. It repeatedly writes to the same cell by attempting to append to a list stored as a blob while a network partition occurs. The test uses the checkAndPut call, which allows for atomic read-modify-write operations.
lein run hbase-append -n 2000 ... 0 unrecoverable timeouts Collecting results. Writes completed in 200.05 seconds 2000 total 282 acknowledged 282 survivors all 282 acked writes out of 2000 succeeded. :-)
This time not all writes succeeded, since all clients are attempting to write to the same cell, and the chance is high that another client will write to the cell between the read and write of a given client’s read-modify-write operation. However, no data loss occurred, as all 282 successful writes are apparent in the final result.
The third test I wrote, hbase-isolation-app, is based on one of the Cassandra tests. It modifies two cells in the same row while a network partition occurs, to test if row updates are atomic.
lein run hbase-isolation -n 2000 ... 0 unrecoverable timeouts Collecting results. () Writes completed in 200.043 seconds 2000 total 2000 acknowledged 2000 survivors All 2000 writes succeeded. :-D
Nice. Row updates are atomic as HBase claims. However, the above test modifies two cells in the same column family. Let’s try modifying two cells in different column families, but in the same row. From what I understand of HBase, cells in different column families are stored in different HBase “stores”. I ran a fourth test, hbase-isolation-multiple-cf-app, to see if it made a difference.
lein run hbase-isolation-multiple-cf -n 2000 ... 0 unrecoverable timeouts Collecting results. () Writes completed in 200.036 seconds 2000 total 2000 acknowledged 2000 survivors All 2000 writes succeeded. :-D
Move along, nothing to see here.
Finally, HBase claims to have atomic counters. In Dynamo-based systems such as Cassandra and Riak, a counter needs to be a CRDT to behave properly. Let’s see how HBase counters behave. I used a fifth test, hbase-counter-app, that is also based on one of the Cassandra tests.
lein run hbase-counter -n 2000 ... 0 unrecoverable timeouts Collecting results. Writes completed in 200.045 seconds 2000 total 2000 acknowledged 2000 survivors All 2000 writes succeeded. :-D
No counter increments were lost.
HBase performed well in all five of the above tests. The claims in its documentation concerning atomic row updates and counter operations held up. I’m looking forward to learning more about HBase, which so far appears to be a very solid database.
Update: see the addendum here.
(*) Aphyr is now working on Jepsen 2. This blog refers to the original Jepsen.