September 2020 – Robert Yokota

KCache is a library that provides an ordered key-value store (OKVS) abstraction for a compacted topic in Kafka. As an OKVS, it can be used to treat Kafka as a multi-model database, allowing Kafka to represent graphs, documents, and relational data.
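To make the ordered key-value abstraction concrete, here is a minimal sketch in which a plain Java `TreeMap` stands in for the store. The key layout (`edge/<from>/<to>`) and the class and method names are illustrative assumptions, not part of KCache. The point is that, because keys are kept in sorted order, a graph's adjacency list can be read with a single range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: a TreeMap stands in for an ordered key-value store.
class OrderedStoreSketch {
    // Keys sort lexicographically, so all edges leaving one vertex are adjacent.
    private final NavigableMap<String, String> store = new TreeMap<>();

    // Hypothetical key layout: "edge/<from>/<to>" -> edge label.
    void addEdge(String from, String to, String label) {
        store.put("edge/" + from + "/" + to, label);
    }

    // Range scan over every key that starts with "edge/<from>/".
    List<String> neighbors(String from) {
        String prefix = "edge/" + from + "/";
        // '0' is the character immediately after '/', so this exclusive
        // upper bound covers exactly the keys sharing the prefix.
        String upper = "edge/" + from + "0";
        List<String> result = new ArrayList<>();
        for (String key : store.subMap(prefix, true, upper, false).keySet()) {
            result.add(key.substring(prefix.length()));
        }
        return result;
    }

    public static void main(String[] args) {
        OrderedStoreSketch graph = new OrderedStoreSketch();
        graph.addEdge("alice", "bob", "follows");
        graph.addEdge("alice", "carol", "follows");
        graph.addEdge("bob", "alice", "blocks");
        System.out.println(graph.neighbors("alice")); // prints [bob, carol]
    }
}
```

Documents and relational rows can be laid out the same way, by encoding the collection or table name and the primary key into the sorted key.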

Initially, KCache stored data from the compacted topic in an in-memory cache. In newer releases, KCache can be configured to use a persistent cache that stores data on disk. This allows KCache to handle larger data sets, and also improves startup times. The persistent cache is itself an embedded OKVS, and can be configured to be one of the following implementations:

Berkeley DB JE
LMDB
MapDB
RocksDB
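
As a rough sketch of what selecting one of these implementations might look like, the snippet below builds the configuration properties using only `java.util.Properties`. The property names (`kafkacache.bootstrap.servers`, `kafkacache.topic`, `kafkacache.backing.cache`, `kafkacache.data.dir`), the backing cache value `rocksdb`, and the topic and directory values are assumptions based on the KCache README; check them against the version you are using.

```java
import java.util.Properties;

// Sketch only: property names and values are assumptions, not verified
// against a specific KCache release.
class PersistentCacheConfigSketch {
    static Properties persistentCacheProps(String backingCache, String dataDir) {
        Properties props = new Properties();
        props.setProperty("kafkacache.bootstrap.servers", "localhost:9092");
        props.setProperty("kafkacache.topic", "_kcache_demo"); // hypothetical topic name
        // Assumed to accept one value per implementation listed above,
        // for example "rocksdb".
        props.setProperty("kafkacache.backing.cache", backingCache);
        // Assumed root directory under which the embedded store keeps its files.
        props.setProperty("kafkacache.data.dir", dataDir);
        return props;
    }

    public static void main(String[] args) {
        Properties props = persistentCacheProps("rocksdb", "/tmp/kcache-demo");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would then presumably be passed to KCache's configuration class when the cache is constructed.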

Here is a quick comparison of the different embedded OKVS libraries before I go into more detail.

Embedded OKVS   Data Structure   Language   Transactions   Secondary Indexes   License
BDB JE          B+ tree          Java       Yes            Yes                 Apache
LMDB            B+ tree          C          Yes            No                  BSD-like
MapDB           B+ tree          Java       Yes            No                  Apache
RocksDB         LSM tree         C++        Yes            No                  Apache

Below are additional details on the various libraries. I also add some historical notes that I find interesting.

Berkeley DB JE

Berkeley DB JE is the Java Edition of the Berkeley DB library. It is similar to, but not compatible with, the C edition that predates it. The C edition is simply referred to as Berkeley DB.

Berkeley DB grew out of efforts at the University of California, Berkeley, as part of BSD, to replace the popular dbm library from AT&T Unix because of patent issues. It was first released in 1991.

Berkeley DB JE is the core library in Oracle NoSQL, which extends the capabilities of Berkeley DB JE to a sharded, clustered environment. Berkeley DB JE supports transactions, and is unique among the libraries covered here in that it also supports secondary indexes. It has additional advanced features such as replication and hot backups. Internally it uses a B+ tree to store data.
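
To show what a secondary index buys you, here is a minimal sketch of maintaining one by hand, which is roughly what an application has to do on a store that lacks built-in support. It does not use the Berkeley DB JE API; two `TreeMap`s stand in for the primary store and the index, and the class name, the user/city example, and the `'\0'` separator are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: a hand-maintained secondary index over an ordered map.
class SecondaryIndexSketch {
    // Primary store: userId -> city.
    private final NavigableMap<String, String> primary = new TreeMap<>();
    // Secondary index: city + '\0' + userId -> userId.
    // Assumes city names never contain the '\0' separator.
    private final NavigableMap<String, String> byCity = new TreeMap<>();

    void put(String userId, String city) {
        String oldCity = primary.put(userId, city);
        if (oldCity != null) {
            // Drop the stale index entry so the index never points at old data.
            byCity.remove(oldCity + '\0' + userId);
        }
        byCity.put(city + '\0' + userId, userId);
    }

    // Look up by the secondary attribute with a range scan on the index.
    List<String> usersIn(String city) {
        // '\u0001' is the smallest character after the '\0' separator.
        return new ArrayList<>(
            byCity.subMap(city + '\0', true, city + '\u0001', false).values());
    }

    public static void main(String[] args) {
        SecondaryIndexSketch db = new SecondaryIndexSketch();
        db.put("u1", "Tokyo");
        db.put("u2", "Osaka");
        db.put("u3", "Tokyo");
        System.out.println(db.usersIn("Tokyo")); // prints [u1, u3]
    }
}
```

In a real store the two writes in `put` would need to happen in one transaction, so that a crash cannot leave the index out of step with the primary data.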

LMDB

LMDB, short for Lightning Memory-Mapped Database, is another OKVS that uses the B+ tree data structure. It was initially designed in 2009 to replace Berkeley DB in the OpenLDAP project.

LMDB supports transactions but not secondary indexes. LMDB uses copy-on-write semantics, which allows it to operate without a transaction log.
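
The copy-on-write idea can be sketched in a few lines. This is a deliberate simplification and not how LMDB is implemented: LMDB copies only the B+ tree pages that change, whereas the sketch below copies the whole map on every write for brevity. The class and method names are illustrative. What it does show is the essential property: a writer never modifies data that a reader may be looking at, and a new version becomes visible through a single atomic swap of the root.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

// Illustration only: whole-map copy-on-write with an atomically swapped root.
class CopyOnWriteSketch {
    // The current committed version; never mutated once published.
    private final AtomicReference<SortedMap<String, String>> root =
        new AtomicReference<>(
            Collections.unmodifiableSortedMap(new TreeMap<String, String>()));

    // Readers get a stable view that later writes cannot change.
    SortedMap<String, String> snapshot() {
        return root.get();
    }

    // A single writer at a time: copy, modify the copy, then publish it.
    synchronized void put(String key, String value) {
        TreeMap<String, String> copy = new TreeMap<>(root.get());
        copy.put(key, value);
        // The swap is the commit point. Until it happens the old version is
        // fully intact, so there is nothing to undo or replay after a failure.
        root.set(Collections.unmodifiableSortedMap(copy));
    }

    public static void main(String[] args) {
        CopyOnWriteSketch db = new CopyOnWriteSketch();
        SortedMap<String, String> before = db.snapshot();
        db.put("k", "v");
        System.out.println(before.containsKey("k"));        // prints false
        System.out.println(db.snapshot().containsKey("k")); // prints true
    }
}
```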

MapDB

MapDB is a pure Java implementation of an OKVS. It evolved from a project started in 2001 called JDBM, which was meant to be a pure Java implementation of the dbm library in AT&T Unix. MapDB provides several collection APIs, including maps, sets, lists, and queues.

MapDB uses a B+ tree data structure and supports transactions, but not secondary indexes. MapDB also supports snapshots and incremental backups.

RocksDB

RocksDB was created by Facebook in 2012 as a fork of LevelDB. LevelDB is a library created by Google in 2011 based on ideas from Bigtable, the inspiration for HBase. Both Bigtable and HBase can be viewed as distributed OKVSs.

Unlike the OKVSs mentioned above, RocksDB uses an LSM tree to store data. It supports different compaction styles for merging SST files. It adds many features that do not exist in LevelDB, including column families, transactions, backups, and checkpoints. RocksDB is written in C++.
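
For readers unfamiliar with LSM trees, the following is a toy sketch of the basic mechanics, under heavy simplifying assumptions: everything lives in memory, sorted `TreeMap`s stand in for SST files, there are no deletes, and compaction simply merges all runs into one. It is not RocksDB's implementation or API, and the class name and `memtableLimit` parameter are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.TreeMap;

// Illustration only: a toy in-memory LSM tree.
class LsmSketch {
    private final int memtableLimit;
    // Writes go to a small sorted buffer first.
    private TreeMap<String, String> memtable = new TreeMap<>();
    // Immutable sorted runs (stand-ins for SST files), newest first.
    private final Deque<TreeMap<String, String>> runs = new ArrayDeque<>();

    LsmSketch(int memtableLimit) {
        this.memtableLimit = memtableLimit;
    }

    void put(String key, String value) {
        memtable.put(key, value);
        if (memtable.size() >= memtableLimit) {
            flush();
        }
    }

    // Freeze the memtable as the newest run and start a fresh one.
    void flush() {
        if (memtable.isEmpty()) return;
        runs.addFirst(memtable);
        memtable = new TreeMap<>();
    }

    // Reads check the memtable, then each run from newest to oldest.
    String get(String key) {
        String value = memtable.get(key);
        if (value != null) return value;
        for (TreeMap<String, String> run : runs) {
            value = run.get(key);
            if (value != null) return value;
        }
        return null;
    }

    // Merge all runs into one, applying older runs first so newer values win.
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        Iterator<TreeMap<String, String>> oldestFirst = runs.descendingIterator();
        while (oldestFirst.hasNext()) {
            merged.putAll(oldestFirst.next());
        }
        runs.clear();
        if (!merged.isEmpty()) runs.addFirst(merged);
    }

    int runCount() {
        return runs.size();
    }
}
```

The sketch hints at the trade-off: a write touches only the memtable, while a read may have to consult several runs until compaction merges them.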

Selecting a Persistent Cache

When selecting a persistent cache for KCache, the first consideration is whether your application is read-heavy or write-heavy. In general, an OKVS based on a B+ tree is faster for reads, while one based on an LSM tree is faster for writes. There’s a good discussion of the pros and cons of B+ trees and LSM trees in Chapter 3 of Designing Data-Intensive Applications, by Martin Kleppmann.

For further performance comparisons, the LMDB project has some good benchmarks here, although they don’t include Berkeley DB JE. I’ve ported the LMDB benchmarks for KCache and included Berkeley DB JE, so that you can try the benchmarks for yourself on your platform of choice.


Using KCache with a Persistent Cache
