Understanding JSON Schema Compatibility

Confluent Schema Registry provides a centralized repository for an organization’s schemas and a version history of the schemas as they evolve over time.  The first format supported by Schema Registry was Avro. Avro was developed with schema evolution in mind, and its specification clearly states the rules for backward compatibility, where a schema used to read an Avro record may be different from the schema used to write it.

In addition to Avro, today Schema Registry supports both Protobuf and JSON Schema. JSON Schema does not explicitly define compatibility rules, so in this article I will explain some nuances of how compatibility works for JSON Schema.

Grammar-Based and Rule-Based Schema Languages

In general, schema languages can be either grammar-based or rule-based.1 Grammar-based languages are used to specify the structure of a document instance. Both Avro and Protobuf are grammar-based schema languages. Rule-based languages typically specify a set of boolean constraints that the document must satisfy.

JSON Schema combines aspects of both a grammar-based language and a rule-based one. Its rule-based nature can be seen by its use of conjunction (allOf), disjunction (oneOf), negation (not), and conditional logic (if/then/else). The elements of these boolean operators tend to be grammar-based constraints, such as constraining the type of a property.

Open and Closed Content Models

A JSON Schema can be represented as a JSON object or a boolean. In the case of a boolean, the value true will match any valid JSON document, whereas the value false will match no documents. The value true is a synonym for {}, which is a JSON schema (represented as an empty JSON object) with no constraints. Likewise, the value false is a synonym for { "not": {} }.

By default, a JSON schema provides an open content model. For example, the following JSON schema constrains the properties “foo” and “bar” to be of type “string”, but allows any additional properties of arbitrary type.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  }
}
Schema 1

The above schema would accept the following JSON document, containing a property named “zap” that does not appear in the schema:

{ 
  "foo": "hello",
  "bar": "world",
  "zap": 123
}

In order to specify a closed content model, in which additional properties such as “zap” would not be accepted, the schema can be specified with “additionalProperties” as false.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "additionalProperties": false
}
Schema 2

Backward, Forward, and Full Compatibility

In terms of schema evolution, there are three types of compatibility2:

  1. Backward compatibility – all documents that conform to the previous version of the schema are also valid according to the new version
  2. Forward compatibility – all documents that conform to the new version are also valid according to the previous version of the schema
  3. Full compatibility – the previous version of the schema and the new version are both backward compatible and forward compatible

For the schemas above, Schema 1 is backward compatible with Schema 2, which implies that Schema 2 is forward compatible with Schema 1. This is because any document that conforms to Schema 2 will also conform to Schema 1. Since the default value of “additionalProperties” is true, Schema 1 is equivalent to

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "additionalProperties": true
}
Schema 3

Note that the schema true is backward compatible with false. In fact

  1. The schema true (or {}) is backward compatible with all schemas.
  2. The only schema backward compatible with true is true.
  3. The schema false (or { "not": {} }) is forward compatible with all schemas.
  4. The only schema forward compatible with false is false.
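
To build intuition for these definitions, it can help to validate the same documents against the schemas above. The sketch below uses the everit json-schema library for Java (an assumed choice of validator; Schema Registry’s own compatibility checker works differently, by comparing the schemas themselves) to show that every document accepted by the closed Schema 2 is also accepted by the open Schema 1, which is exactly what makes Schema 1 backward compatible with Schema 2.

import org.everit.json.schema.Schema;
import org.everit.json.schema.ValidationException;
import org.everit.json.schema.loader.SchemaLoader;
import org.json.JSONObject;

public class CompatibilityIllustration {

  // Returns true if the document validates against the given schema.
  static boolean isValid(Schema schema, JSONObject doc) {
    try {
      schema.validate(doc);
      return true;
    } catch (ValidationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Schema 1: open content model ("additionalProperties" defaults to true).
    Schema schema1 = SchemaLoader.load(new JSONObject(
        "{\"type\":\"object\",\"properties\":{"
            + "\"foo\":{\"type\":\"string\"},\"bar\":{\"type\":\"string\"}}}"));
    // Schema 2: closed content model.
    Schema schema2 = SchemaLoader.load(new JSONObject(
        "{\"type\":\"object\",\"properties\":{"
            + "\"foo\":{\"type\":\"string\"},\"bar\":{\"type\":\"string\"}},"
            + "\"additionalProperties\":false}"));

    JSONObject closedDoc = new JSONObject("{\"foo\":\"hello\",\"bar\":\"world\"}");
    JSONObject openDoc = new JSONObject(
        "{\"foo\":\"hello\",\"bar\":\"world\",\"zap\":123}");

    System.out.println(isValid(schema2, closedDoc)); // true
    System.out.println(isValid(schema1, closedDoc)); // true: Schema 1 accepts it as well
    System.out.println(isValid(schema1, openDoc));   // true: the extra "zap" is allowed
    System.out.println(isValid(schema2, openDoc));   // false: "zap" is rejected
  }
}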

Partially Open Content Models

You may want to allow additional unspecified properties, but only of a specific type. In these scenarios, you can use a partially open content model. One way to specify a partially open content model is to specify a schema other than true or false for “additionalProperties”.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "additionalProperties": { "type": "string" }
}
Schema 4

The above schema would accept a document containing a string value for “zap”:

{ 
  "foo": "hello",
  "bar": "world",
  "zap": "champ"
}

but not a document containing an integer value for “zap”:

{ 
  "foo": "hello",
  "bar": "world",
  "zap": 123
}

Later one could explicitly specify “zap” as a property with type “string”:

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "zap": { "type": "string" }
  },
  "additionalProperties": { "type": "string" }
}
Schema 5

Schema 5 is backward compatible with Schema 4.

One could even accept other types for “zap”, using a oneOf for example.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "zap": { 
      "oneOf": [ { "type": "string" }, { "type": "integer" } ] 
    }
  },
  "additionalProperties": { "type": "string" }
}
Schema 6

Schema 6 is also backward compatible with Schema 4.

Another type of partially open content model is one that constrains the additional properties with a regular expression for matching the property name, using a patternProperties construct.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "patternProperties": {
    "^s_": { "type": "string" }
  },
  "additionalProperties": false
}
Schema 7

The above schema allows properties other than “foo” and “bar” to appear, as long as the property name starts with “s_” and the value is of type “string”.

Understanding Full Compatibility

When evolving a schema in a backward compatible manner, it’s easy to add properties to a closed content model, or to remove properties from an open content model. In general, there are two rules to follow to evolve a schema in a backward compatible manner:

  1. When adding a property in a backward compatible manner, the schema of the property being added must be backward compatible with the schema of “additionalProperties” in the previous version of the schema.
  2. When removing a property in a backward compatible manner, the schema of “additionalProperties” in the new version of the schema must be backward compatible with the schema of the property being removed.

The rules for forward compatibility are similar.

  1. When adding a property in a forward compatible manner, the schema of the property being added must be forward compatible with the schema of “additionalProperties” in the previous version of the schema.
  2. When removing a property in a forward compatible manner, the schema of “additionalProperties” in the new version of the schema must be forward compatible with the schema of the property being removed.

For example, to add a property to an open content model, such as Schema 3, in a backward compatible manner, one can add it with type true, since true is the only schema that is backward compatible with true, as previously mentioned.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "zap": true
  },
  "additionalProperties": true
}
Schema 8

The property “zap” has been added, but it’s been specified with type true, which means that it can match any valid JSON. Schema 3 and Schema 8 are also fully compatible, since they both accept the same set of documents.

This leads to a way to evolve a closed content model, such as Schema 2, in a fully compatible manner, by adding a property of type false.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "zap": false
  },
  "additionalProperties": false
}
Schema 9

Admittedly, Schema 9 is not very interesting, because in this case the property “zap” matches nothing.

The rules for full compatibility can now be stated as follows.

  1. When adding a property in a fully compatible manner, the schema of the property being added must be fully compatible with the schema of “additionalProperties” in the previous version of the schema.
  2. When removing a property in a fully compatible manner, the schema of “additionalProperties” in the new version of the schema must be fully compatible with the schema of the property being removed.

Using Partially Open Content Models for Full Compatibility

The previous examples of full compatibility are of limited use, since they only allow new properties to match anything using true, in the case of an open content model, or to match nothing using false, in the case of a closed content model. To achieve full compatibility in a meaningful manner, one can use a partially open content model, such as Schema 4, which I repeat below.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "additionalProperties": { "type": "string" }
}
Schema 4

Schema 4 allows one to add and remove properties of type “string” in a fully compatible manner. What if you want to add properties of either type “string” or “integer”? You could specify additionalProperties with a oneOf, as in the following schema:

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "additionalProperties": { 
    "oneOf": [ { "type": "string" }, { "type": "integer" } ] 
  }
}
Schema 10

But with the above schema, every fully compatible schema that adds a new property would have to specify the type of the property as a oneOf as well:

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "zap": { 
      "oneOf": [ { "type": "string" }, { "type": "integer" } ] 
    }
  },
  "additionalProperties": { 
    "oneOf": [ { "type": "string" }, { "type": "integer" } ] 
  }
}
Schema 11

An alternative would be to use patternProperties. The rules in the previous section regarding adding and removing properties do not apply when using patternProperties.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "patternProperties": {
    "^s_": { "type": "string" },
    "^i_": { "type": "integer" }
  },
  "additionalProperties": false
}
Schema 12

With Schema 12, one can add properties of type “string” that start with “s_”, or properties of type “integer” that start with “i_”, in a fully compatible manner, as shown below with “s_zap” and “i_zap”.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" },
    "s_zap": { "type": "string" },
    "i_zap": { "type": "integer" }
  },
  "patternProperties": {
    "^s_": { "type": "string" },
    "^i_": { "type": "integer" }
  },
  "additionalProperties": false
}
Schema 13

Achieving full compatibility in a meaningful way is possible, but requires some up-front planning, possibly with the use of patternProperties.
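
In practice, you can also ask Schema Registry to perform these checks for you before registering a new version. The following Java sketch tests a candidate schema against the latest version registered under a subject, using the Confluent Schema Registry client; the subject name is hypothetical, and the exact method signatures may vary slightly between releases.

import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import io.confluent.kafka.schemaregistry.json.JsonSchema;

public class TestJsonSchemaCompatibility {

  public static void main(String[] args) throws Exception {
    SchemaRegistryClient client =
        new CachedSchemaRegistryClient("http://localhost:8081", 10);

    // A candidate new version that adds an explicit "s_zap" property to the
    // partially open content model of Schema 12.
    JsonSchema candidate = new JsonSchema(
        "{\"type\":\"object\","
            + "\"properties\":{\"foo\":{\"type\":\"string\"},"
            + "\"bar\":{\"type\":\"string\"},\"s_zap\":{\"type\":\"string\"}},"
            + "\"patternProperties\":{\"^s_\":{\"type\":\"string\"},"
            + "\"^i_\":{\"type\":\"integer\"}},"
            + "\"additionalProperties\":false}");

    // Assumes the subject "orders-value" already holds Schema 12 and is
    // configured with a compatibility level such as FULL.
    boolean compatible = client.testCompatibility("orders-value", candidate);
    System.out.println("compatible: " + compatible);
  }
}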

Summary

JSON Schema is unique when compared to other schema languages like Avro and Protobuf in that it has aspects of both a rule-based language and a grammar-based language. A better understanding of open, closed, and partially-open content models can help you when evolving schemas in a backward, forward, or fully compatible manner.


My 10 Favorite Computer Science Books

I’ve seen other lists of favorite books over the last few months, so I thought I’d jump into the fray.  If I were stuck on an island (or in my room during a pandemic), here are the ten computer science books that I would want by my side.  These ten books also provide a great overview of the entire field of computer science.

Structure and Interpretation of Computer Programs

I was fortunate enough to take the course for which this book was developed when I was an undergraduate at MIT.   Since then this book has become a classic.  If you can invest the time, the text has the potential to expand your mind and make you a better programmer.

Artificial Intelligence: A Modern Approach

When I was at MIT, the textbook for the Artificial Intelligence class was written by Professor Winston.  While that was a great book, since then the above textbook by Russell and Norvig has become the bible for AI, and for good reason.  Not only is the text comprehensive, but it is extremely clear and easy to read.

The Language of Machines

I met Professor Floyd while working toward a Master’s degree at Stanford.  He taught an advanced class on automata and computability that later led to the above textbook.  The text is unique in that it introduces a unified model of computation that ties together seemingly disparate concepts.  The class was also memorable in that Professor Floyd was one of the kindest, gentlest teachers I have ever known.  Unfortunately he passed away in 2001.

Abstraction and Specification in Program Development

Professor Liskov is famous for the Liskov Substitution Principle, as well as being a Turing Award winner.  Professor Guttag was my undergraduate advisor, and later became head of the Electrical Engineering and Computer Science department at MIT.  While this is probably the least known book on my list, its influence is greater than is recognized. Although the book uses the CLU programming language to convey its ideas, the ideas were carried over to a subsequent textbook, Program Development in Java.   The above textbook also influenced Introduction to Computation and Programming in Python, which is the foundational text for today’s MIT undergraduate program in computer science.

Compilers: Principles, Techniques, and Tools

The above book is often simply referred to as the Dragon Book.  It remains the bible for compiler theory.  Professor Lam taught an advanced compiler course that I took at Stanford.

The Design and Implementation of the FreeBSD OS

The above textbook is not only comprehensive, but easy to read.  It gives an in-depth view of an influential operating system that continues to be relevant today.

Introduction to Algorithms

When I was at MIT, the authors were just starting to develop this text.  From the chapter notes that were provided during class, I could already tell that a great textbook was taking shape.  Of course, this book is now considered the bible of its field.

TCP/IP Illustrated, Volume 1

This is probably the best book from which to learn about modern day networking.  The level of depth is unmatched and the text continues to reward those who return to it again and again.  W. Richard Stevens passed away in 1999, but fortunately his books continue to be revised for today’s readers.

Transaction Processing: Concepts and Techniques

The authors introduced the ACID (atomicity, consistency, isolation, durability) properties of database transactions, and this book is a comprehensive summary of their work.  The database community lost a true giant when Jim Gray was lost at sea in 2007, while on a short trip to scatter his mother’s ashes near San Francisco.

Concrete Mathematics: A Foundation for Computer Science

This text started as an expanded treatment of the “Mathematical Preliminaries” chapter of The Art of Computer Programming.  While I have not yet read The Art of Computer Programming, I was fortunate enough to take the Stanford course that used the above book, taught by Professor Knuth.  The book provides a more leisurely introduction to the mathematical analysis of algorithms, in a manner that is both challenging and fun.

 


Keta: A Metadata Store Backed By Apache Kafka

Recently I added the ability for KCache to be configured with different types of backing caches. KCache, by providing an ordered key-value abstraction for a compacted topic in Kafka, can be used as a foundation for a highly-available service for storing metadata, similar to ZooKeeper, etcd, and Consul.

For such a metadata store, I wanted the following features:

  • Ordered key-value model
  • APIs available through gRPC
  • Transactional support using multiversion concurrency control (MVCC)
  • High availability through leader election and failover
  • Ability to be notified of changes to key-value tuples
  • Ability to expire key-value tuples

It turns out that etcd already has these features, so I decided to use the same gRPC APIs and data model as in etcd v3.  In addition, a comprehensive set of Jepsen tests have been built for etcd, so by using the same APIs, I could make use of the same Jepsen tests.

The resulting system is called Keta1.

Hello, Keta

By adopting the etcd v3 APIs, Keta can be used by any client that supports these APIs. Etcd clients are available in Go, Java, Python, JavaScript, Ruby, C++, Erlang, and .NET.  In addition, Keta can be used with the etcdctl command line client that ships with etcd.

To get started with Keta, download a release, unpack it, and then modify config/keta.properties to point to an existing Kafka broker.  Then run the following:

$ bin/keta-start config/keta.properties
 

Next download etcd as described here. At a separate terminal, use etcdctl to put and get a key:

$ etcdctl put mykey "this is awesome"
$ etcdctl get mykey
 

The etcd APIs have a concise way of expressing transactions.

$ etcdctl put user1 bad
$ etcdctl txn --interactive

compares:
value("user1") = "bad"      

success requests (get, put, delete):
del user1  

failure requests (get, put, delete):
put user1 good
 

To expire key-value tuples, use a lease.

$ etcdctl lease grant 300
# lease 2be7547fbc6a5afa granted with TTL(300s)

$ etcdctl put sample value --lease=2be7547fbc6a5afa
$ etcdctl get sample

$ etcdctl lease keep-alive 2be7547fbc6a5afa
$ etcdctl lease revoke 2be7547fbc6a5afa
# or after 300 seconds
$ etcdctl get sample
 

To receive change notifications, use a watch.

$ etcdctl watch stock --prefix

Then at a separate terminal, enter the following:

$ etcdctl put stock1 10
$ etcdctl put stock2 20

If you prefer a GUI, you can use etcdmanager when working with Keta.
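
If you prefer to work from code, the same operations can be performed with jetcd, the etcd Java client (the same client used by the Jepsen tests described below). A minimal sketch follows; the endpoint assumes Keta is listening on etcd’s default port, so adjust it to match your Keta configuration.

import java.nio.charset.StandardCharsets;
import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.KV;
import io.etcd.jetcd.kv.GetResponse;

public class KetaExample {

  public static void main(String[] args) throws Exception {
    // Assumes Keta is serving the etcd v3 gRPC APIs on etcd's default port.
    try (Client client = Client.builder()
        .endpoints("http://localhost:2379")
        .build()) {
      KV kv = client.getKVClient();

      ByteSequence key = ByteSequence.from("mykey", StandardCharsets.UTF_8);
      ByteSequence value = ByteSequence.from("this is awesome", StandardCharsets.UTF_8);

      kv.put(key, value).get();

      GetResponse response = kv.get(key).get();
      response.getKvs().forEach(entry ->
          System.out.println(entry.getValue().toString(StandardCharsets.UTF_8)));
    }
  }
}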

Leader Election using the Magical Rebalance Protocol

Keta achieves high availability by allowing any number of Keta instances to be run as a cluster. One instance is chosen as the leader, and all other instances act as followers. The followers will forward both reads and writes to the leader. If the leader dies, another leader is chosen.

Leader election is accomplished by using the rebalance protocol of Kafka, which is the same protocol that is used to assign topic-partitions to consumers in a consumer group.

Jepsen-Driven Development

As mentioned, one nice aspect of using the etcd APIs is that a set of Jepsen tests are already available. This allowed me to use Jepsen-Driven Development (JDD) when developing Keta, which is like Test-Driven Development (TDD), but on steroids.

Jepsen is an awesome framework written in Clojure for testing (or more like breaking) distributed systems. It comes with an in-depth tutorial for writing new Jepsen tests.

I was able to modify the existing Jepsen tests for etcd by having the tests install and start Keta instead of etcd. The client code in the test, which uses the native Java client for etcd, remained untouched. The modified tests can be found here.

I was able to successfully run three types of etcd tests:

  1. A set test, which uses a compare-and-set transaction to concurrently read a set of integers from a single key and append a value to that set.   This test is designed to measure stale reads.
  2. An append test, which uses transactions to concurrently read and append to lists of unique integers.  This test is designed to verify strict serializability.
  3. A register test, which concurrently performs randomized reads, writes, and compare-and-set operations over single keys.  This test is designed to verify linearizability.

Jepsen has a built-in component called nemesis that can inject faults into the system during test runs. For the etcd tests, nemesis was used to kill the leader and to create network partitions.

One challenge of running Keta with Jepsen is that leader election using the rebalance protocol can take several seconds, whereas leader election in etcd, which uses Raft, only takes a second or less. This means that the number of unsuccessful requests is higher in the tests when using Keta than when using etcd, but this is to be expected.

In any case, Keta passes all of the above tests.2

# Run the Jepsen set test
$ lein run test --concurrency 2n --workload set --nemesis kill,partition
...
Everything looks good! ヽ(‘ー`)ノ

# or sometimes
...
Errors occurred during analysis, but no anomalies found. ಠ~ಠ
 

What’s Not in the Box

Keta only provides a subset of the functionality of etcd and so is not a substitute for it. In particular, it is missing

  • A lock service for clients
  • An election service for clients
  • An immutable version history for key-value tuples
  • Membership reconfiguration

For example, Keta only keeps the latest value for a specific key, and not its prior history as is available with etcd.  However, if you’re interested in a highly-available transactional metadata store backed by Apache Kafka that provides most of the features of etcd, please give Keta a try.


Using KCache with a Persistent Cache

KCache is a library that provides an ordered key-value store (OKVS) abstraction for a compacted topic in Kafka. As an OKVS, it can be used to treat Kafka as a multi-model database, allowing Kafka to represent graphs, documents, and relational data.

Initially KCache stored data from the compacted topic in an in-memory cache. In newer releases, KCache can be configured to use a persistent cache that stores data to disk. This allows KCache to handle larger data sets, and also improves startup times. The persistent cache is itself an embedded OKVS, and can be configured to be one of the following implementations:

  • Berkeley DB JE
  • LMDB
  • MapDB
  • RocksDB

Here is a quick comparison of the different embedded OKVS libraries before I go into more detail.

Embedded OKVS   Data Structure   Language   Transactions   Secondary Indexes   License
BDB JE          B+ tree          Java       Yes            Yes                 Apache
LMDB            B+ tree          C          Yes            No                  BSD-like
MapDB           B+ tree          Java       Yes            No                  Apache
RocksDB         LSM tree         C++        Yes            No                  Apache

Below are additional details on the various libraries.  I also add some historical notes that I find interesting.

Berkeley DB JE

Berkeley DB JE is the Java Edition of the Berkeley DB library.  It is similar to, but not compatible with, the C edition that predates it.  The C edition is simply referred to as Berkeley DB.

Berkeley DB grew out of efforts at the University of California, Berkeley as part of BSD to replace the popular dbm library that existed in AT&T Unix, due to patent issues.  It was first released in 1991.

Berkeley DB JE is the core library in Oracle NoSQL, which extends the capabilities of Berkeley DB JE to a sharded, clustered environment.  Berkeley DB JE supports transactions, and is unique in that it also supports secondary indexes.  It has additional advanced features such as replication and hot backups.  Internally it uses a B+ tree to store data.

LMDB

LMDB, short for Lightning Memory-Mapped Database, is another OKVS that uses the B+ tree data structure.  It was initially designed in 2009 to replace Berkeley DB in the OpenLDAP project.

LMDB supports transactions but not secondary indexes.  LMDB uses copy-on-write semantics, which allows it to avoid the need for a transaction log.

MapDB

MapDB is a pure Java implementation of an OKVS.  It evolved from a project started in 2001 called JDBM, which was meant to be a pure Java implementation of the dbm library in AT&T Unix. MapDB provides several collection APIs, including maps, sets, lists, and queues.

MapDB uses a B+ tree data structure and supports transactions, but not secondary indexes.  MapDB also supports snapshots and incremental backups.

RocksDB

RocksDB was created by Facebook in 2012 as a fork of LevelDB.  LevelDB is a library created by Google in 2011 based on ideas from BigTable, the inspiration for HBase.  Both BigTable and HBase can be viewed as distributed OKVSs.

Unlike the OKVSs mentioned above, RocksDB uses an LSM tree to store data.  It supports different compaction styles for merging SST files.  It adds many features that do not exist in LevelDB, including column families, transactions, backups, and checkpoints.  RocksDB is written in C++.

Selecting a Persistent Cache

When selecting a persistent cache for KCache, the first consideration is whether your application is read-heavy vs write-heavy.  In general, an OKVS based on a B+ tree is faster for reads, while one based on an LSM tree is faster for writes.  There’s a good discussion of the pros and cons of B+ trees and LSM trees in Chapter 3 of Designing Data-Intensive Applications, by Martin Kleppmann.

For further performance comparisons, the LMDB project has some good benchmarks here, although they don’t include Berkeley DB JE.  I’ve ported the LMDB benchmarks for KCache and included Berkeley DB JE, so that you can try the benchmarks for yourself on your platform of choice.
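
Once you’ve chosen an implementation, switching KCache over to it is just a configuration change. The sketch below shows the general shape using KCache’s Map-based API; the kafkacache.bootstrap.servers setting is the standard one, but the names of the backing-cache and data-directory keys are assumptions, so confirm them against the KCache README for your version.

import java.util.Properties;
import io.kcache.Cache;
import io.kcache.KafkaCache;
import io.kcache.KafkaCacheConfig;
import org.apache.kafka.common.serialization.Serdes;

public class PersistentKCacheExample {

  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put("kafkacache.bootstrap.servers", "localhost:9092");
    props.put("kafkacache.topic", "_example");
    // Hypothetical keys for selecting the persistent backing cache and its
    // data directory; the real names follow the kafkacache.* convention but
    // should be checked against the KCache documentation.
    props.put("kafkacache.backing.cache", "rocksdb");
    props.put("kafkacache.data.dir", "/tmp/kcache");

    Cache<String, String> cache = new KafkaCache<>(
        new KafkaCacheConfig(props), Serdes.String(), Serdes.String());
    cache.init();                      // reads the compacted topic into the cache

    cache.put("hello", "world");       // writes through to the compacted topic
    System.out.println(cache.get("hello"));

    cache.close();
  }
}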


Putting Several Event Types in the Same Topic – Revisited

The following post originally appeared in the Confluent blog on July 8, 2020.

In the article Should You Put Several Event Types in the Same Kafka Topic?, Martin Kleppmann discusses when to combine several event types in the same topic and introduces new subject name strategies for determining how Confluent Schema Registry should be used when producing events to an Apache Kafka® topic.

Schema Registry now supports schema references in Confluent Platform 5.5, and this blog post presents an alternative means of putting several event types in the same topic using schema references, discussing the advantages and disadvantages of this approach.

Constructs and constraints

Apache Kafka, which is an event streaming platform, can also act as a system of record or a datastore, as seen with ksqlDB. Datastores are composed of constructs and constraints. For example, in a relational database, the constructs are tables and rows, while the constraints include primary key constraints and referential integrity constraints. Kafka does not impose constraints on the structure of data, leaving that role to Confluent Schema Registry. Below are some constructs when using both Kafka and Schema Registry:

  • Message: a data item that is made up of a key (optional) and value
  • Topic: a collection of messages, where ordering is maintained for those messages with the same key (via underlying partitions)
  • Schema (or event type): a description of how data should be structured
  • Subject: a named, ordered history of schema versions

The following are some constraints that are maintained when using both Kafka and Schema Registry:

  • Schema-message constraints: A schema constrains the structure of the message. The key and value are typically associated with different schemas. The association between a schema and the key or value is embedded in the serialized form of the key or value.
  • Subject-schema constraints: A subject constrains the ordered history of schema versions, also known as the evolution of the schema. This constraint is called a compatibility level. The compatibility level is stored in Schema Registry along with the history of schema versions.
  • Subject-topic constraints: When using the default TopicNameStrategy, a subject can constrain the collection of messages in a topic. The association between the subject and the topic is by convention, where the subject name is {topic}-key for the message key and {topic}-value for the message value.

Using Apache Avro™ unions before schema references

As mentioned, the default subject name strategy, TopicNameStrategy, uses the topic name to determine the subject to be used for schema lookups, which helps to enforce subject-topic constraints. The newer subject-name strategies, RecordNameStrategy and TopicRecordNameStrategy, use the record name (along with the topic name for the latter strategy) to determine the subject to be used for schema lookups. Before these newer subject-name strategies were introduced, there were two options for storing multiple event types in the same topic:

  • Disable subject-schema constraints by setting the compatibility level of a subject to NONE and allowing any schema to be saved in the subject, regardless of compatibility
  • Use an Avro union

The second option of using an Avro union was preferred, but still had the following issues:

  • The resulting Avro union could become unwieldy
  • It was difficult to independently evolve the event types contained within the Avro union

By using either RecordNameStrategy or TopicRecordNameStrategy, you retain subject-schema constraints, eliminate the need for an Avro union, and gain the ability to evolve types independently. However, you lose subject-topic constraints, as now there is no constraint on the event types that can be stored in the topic, which means the set of event types in the topic can grow unbounded.

Using Avro unions with schema references

Introduced in Confluent Platform 5.5, a schema reference consists of:

  • A reference name: part of the schema that refers to an entirely separate schema
  • A subject and version: used to identify and look up the referenced schema

When registering a schema to Schema Registry, an optional set of references can be specified, such as this Avro union containing reference names:

[
  "io.confluent.examples.avro.Customer",
  "io.confluent.examples.avro.Product",
  "io.confluent.examples.avro.Payment"
]

When registering this schema to Schema Registry, an array of reference versions is also sent, which might look like the following:

[
  { 
    "name": "io.confluent.examples.avro.Customer",
    "subject": "customer",
    "version": 1
  },
  {
    "name": "io.confluent.examples.avro.Product",
    "subject": "product",
    "version": 1
  },
  {
    "name": "io.confluent.examples.avro.Order",
    "subject": "order",
    "version": 1
  }
]

As you can see, the Avro union is no longer unwieldy. It is just a list of event types that will be sent to a topic. The event types can evolve independently, similar to when using RecordNameStrategy and TopicRecordNameStrategy. Plus, you regain subject-topic constraints, which were missing when using the newer subject name strategies.

However, in order to take advantage of these newfound gains, you need to configure your serializers a little differently. This has to do with the fact that when an Avro object is serialized, the schema associated with the object is not the Avro union, but just the event type contained within the union. When the Avro serializer is given the Avro object, it will either try to register the event type as a newer schema version than the union (if auto.register.schemas is true), or try to find the event type in the subject (if auto.register.schemas is false), which will fail. Instead, you want the Avro serializer to use the Avro union for serialization and not the event type. In order to accomplish this, set these two configuration properties on the Avro serializer:

  • auto.register.schemas=false
  • use.latest.version=true

Setting auto.register.schemas to false disables automatic registration of the event type, so that it does not override the union as the latest schema in the subject. Setting use.latest.version to true causes the Avro serializer to look up the latest schema version in the subject (which will be the union) and use that for serialization; otherwise, if set to false, the serializer will look for the event type in the subject and fail to find it.
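
Concretely, a producer that sends Customer events to the all-types topic from the example later in this post might be configured as follows. This is a minimal sketch; the bootstrap servers, Schema Registry URL, and topic name are assumptions to adapt to your environment.

import java.util.Properties;
import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class UnionProducerExample {

  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
    props.put("schema.registry.url", "http://localhost:8081");
    // Serialize against the union registered for the subject, not the event type.
    props.put("auto.register.schemas", false);
    props.put("use.latest.version", true);

    Schema customerSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Customer\","
            + "\"namespace\":\"io.confluent.examples.avro\",\"fields\":["
            + "{\"name\":\"customer_id\",\"type\":\"int\"},"
            + "{\"name\":\"customer_name\",\"type\":\"string\"},"
            + "{\"name\":\"customer_email\",\"type\":\"string\"},"
            + "{\"name\":\"customer_address\",\"type\":\"string\"}]}");
    GenericRecord customer = new GenericRecordBuilder(customerSchema)
        .set("customer_id", 100)
        .set("customer_name", "acme")
        .set("customer_email", "acme@google.com")
        .set("customer_address", "1 Main St")
        .build();

    try (Producer<String, Object> producer = new KafkaProducer<>(props)) {
      // The serializer looks up the latest schema for the all-types-value
      // subject (the union) and serializes the Customer record against it.
      producer.send(new ProducerRecord<>("all-types", customer));
      producer.flush();
    }
  }
}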

Using JSON Schema and Protobuf with schema references

Now that Confluent Platform supports both JSON Schema and Protobuf, both RecordNameStrategy and TopicRecordNameStrategy can be used with these newer schema formats as well. In the case of JSON Schema, the equivalent of the name of the Avro record is the title of the JSON object. In the case of Protobuf, the equivalent is the name of the Protobuf message.

Also like Avro, instead of using the newer subject-name strategies to combine multiple event types in the same topic, you can use unions. The Avro union from the previous section can also be modeled in JSON Schema, where it is referred to as a "oneOf":

{
  "oneOf": [
     { "$ref": "Customer.schema.json" },
     { "$ref": "Product.schema.json" },
     { "$ref": "Order.schema.json }
  ]
}

In the above schema, the array of reference versions that would be sent might look like this:

[
  { 
    "name": "Customer.schema.json",
    "subject": "customer",
    "version": 1
  },
  {
    "name": "Product.schema.json",
    "subject": "product",
    "version": 1
  },
  {
    "name": "Order.schema.json",
    "subject": "order",
    "version": 1
  }
]

As with Avro, automatic registration of JSON schemas that contain a top-level oneof won’t work, so you should configure the JSON Schema serializer in the same manner as the Avro serializer, with auto.register.schemas set to false and use.latest.version set to true, as described in the previous section.

In Protobuf, top-level oneofs are not permitted, so you need to wrap the oneof in a message:

syntax = "proto3";

package io.confluent.examples.proto;

import "Customer.proto";
import "Product.proto";
import "Order.proto";

message AllTypes {
    oneof oneof_type {
        Customer customer = 1;
        Product product = 2;
        Order order = 3;
    }
}

Here are the corresponding reference versions that could be sent with the above schema:

[
  { 
    "name": "Customer.proto",
    "subject": "customer",
    "version": 1
  },
  {
    "name": "Product.proto",
    "subject": "product",
    "version": 1
  },
  {
    "name": "Order.proto",
    "subject": "order",
    "version": 1
  }
]

One advantage of wrapping the oneof with a message is that automatic registration of the top-level schema will work properly. In the case of Protobuf, all referenced schemas will also be auto registered, recursively.

You can do something similar with Avro by wrapping the union with an Avro record:

{
 "type": "record",
 "namespace": "io.confluent.examples.avro",
 "name": "AllTypes",
 "fields": [
   {
     "name": "oneof_type",
     "type": [
       "io.confluent.examples.avro.Customer",
       "io.confluent.examples.avro.Product",
       "io.confluent.examples.avro.Order"
     ]
   }
 ]
}

This extra level of indirection allows automatic registration of the top-level Avro schema to work properly. However, unlike Protobuf, with Avro, the referenced schemas still need to be registered manually beforehand, as the Avro object does not have the necessary information to allow referenced schemas to be automatically registered.

Wrapping a oneof with a JSON object won’t work with JSON Schema, since a POJO being serialized to JSON doesn’t have the requisite metadata. Instead, optionally annotate the POJO with a @Schema annotation to provide the complete top-level JSON Schema to be used for both automatic registration and serialization. As with Avro, and unlike Protobuf, referenced schemas need to be registered manually beforehand.

Getting started with schema references

Schema references are a means of modularizing a schema and its dependencies. While this article shows how to use them with unions, they can be used more generally to model the following:

  • Nested records in Avro
  • import statements in Protobuf
  • $ref statements in JSON Schema

As mentioned in the previous section, if you’re using Protobuf, the Protobuf serializer can automatically register the top-level schema and all referenced schemas, recursively, when given a Protobuf object. This is not possible with the Avro and JSON Schema serializers. With those schema formats, you must first manually register the referenced schemas and then the top-level schema. Manual registration can be accomplished with the REST APIs or with the Schema Registry Maven Plugin.

As an example of using the Schema Registry Maven Plugin, below are schemas specified for the subjects named all-types-value, customer, and product in a Maven POM.

<plugin>
  <groupId>io.confluent</groupId>
  <artifactId>kafka-schema-registry-maven-plugin</artifactId>
  <version>${confluent.version}</version>
  <configuration>
    <schemaRegistryUrls>
      <param>http://127.0.0.1:8081</param>
    </schemaRegistryUrls>
    <subjects>
      <all-types-value>src/main/avro/AllTypes.avsc</all-types-value>
      <customer>src/main/avro/Customer.avsc</customer>
      <product>src/main/avro/Product.avsc</product>
    </subjects>
    <schemaTypes>
      <all-types-value>AVRO</all-types-value>
      <customer>AVRO</customer>
      <product>AVRO</product>
    </schemaTypes>
    <references>
      <all-types-value>
        <reference>
          <name>io.confluent.examples.avro.Customer</name>
          <subject>customer</subject>
        </reference>
        <reference>
          <name>io.confluent.examples.avro.Product</name>
          <subject>product</subject>
        </reference>
      </all-types-value>
    </references>
  </configuration>
  <goals>
    <goal>register</goal>
  </goals>
</plugin>

Each reference can specify a name, subject, and version. If the version is omitted, as with the example above, and the referenced schema is also being registered at the same time, the referenced schema’s version will be used; otherwise, the latest version of the schema in the subject will be used.

Here is the content of AllTypes.avsc, which is a simple union:

[
    "io.confluent.examples.avro.Customer",
    "io.confluent.examples.avro.Product"
]

Here is Customer.avsc, which contains a Customer record:

{
 "type": "record",
 "namespace": "io.confluent.examples.avro",
 "name": "Customer",

 "fields": [
     {"name": "customer_id", "type": "int"},
     {"name": "customer_name", "type": "string"},
     {"name": "customer_email", "type": "string"},
     {"name": "customer_address", "type": "string"}
 ]
}

And here is Product.avsc, which contains a Product record:

{
 "type": "record",
 "namespace": "io.confluent.examples.avro",
 "name": "Product",

 "fields": [
     {"name": "product_id", "type": "int"},
     {"name": "product_name", "type": "string"},
     {"name": "product_price", "type": "double"}
 ]
}

Next, register the schemas above using the following command:

mvn schema-registry:register

The above command will register referenced schemas before registering the schemas that depend on them. The output of the command will contain the ID of each schema that is registered. You can use the schema ID of the top-level schema with the console producer when producing data.

Next, use the console tools to try it out. First, start the Avro console consumer. Note that you should specify the topic name as all-types since the corresponding subject is all-types-value according to TopicNameStrategy.

./bin/kafka-avro-console-consumer --topic all-types --bootstrap-server localhost:9092

In a separate console, start the Avro console producer. Pass the ID of the top-level schema as the value of value.schema.id.

./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic all-types --property value.schema.id={id} --property auto.register=false --property use.latest.version=true

At the same command line as the producer, input the data below, which represent two different event types. The data should be wrapped with a JSON object that specifies the event type. This is how the Avro console producer expects data for unions to be represented in JSON.

{ "io.confluent.examples.avro.Product": { "product_id": 1, "product_name" : "rice", "product_price" : 100.00 } }
{ "io.confluent.examples.avro.Customer": { "customer_id": 100, "customer_name": "acme", "customer_email": "acme@google.com", "customer_address": "1 Main St" } }

The data will appear at the consumer. Congratulations, you’ve successfully sent two different event types to a topic! And unlike the newer subject name strategies, the union will prevent event types other than Product and Customer from being produced to the same topic, since the producer is configured with the default TopicNameStrategy.

Summary

Now there are two modular ways to store several event types in the same topic, both of which allow event types to evolve independently. The first, using the newer subject-name strategies, is straightforward but drops subject-topic constraints. The second, using unions (or oneofs) and schema references, maintains subject-topic constraints but adds further structure and drops automatic registration of schemas in the case of a top-level union or oneof.

If you’re interested in querying topics that combine multiple event types with ksqlDB, the second method, using a union (or oneof) is the only option. By maintaining subject-topic constraints, the method of using a union (or oneof) allows ksqlDB to deal with a bounded set of event types as defined by the union, instead of a potentially unbounded set. Modeling a union (also known as a sum type) by a relational table is a solved problem, and equivalent functionality will most likely land in ksqlDB in the future.


Playing Chess with Confluent Schema Registry

Previously, the Confluent Schema Registry only allowed you to manage Avro schemas. With Confluent Platform 5.5, the schema management within Schema Registry has been made pluggable, so that custom schema types can be added. In addition, schema plugins have been developed for both Protobuf and JSON Schema.

Now Schema Registry has two main extension points:

  1. REST Extensions
  2. Schema Plugins

In reality, the schema management within Schema Registry is really just a versioned history mechanism, with specific rules for how versions can evolve. To demonstrate both of the above extension points, I’ll show how Confluent Schema Registry can be turned into a full-fledged chess engine.1

A Schema Plugin for Chess

A game of chess is also a versioned history. In this case, it is a history of chess moves. The rules of chess determine whether a move can be applied to a given version of the game.

To represent a version of a game of chess, I’ll use Portable Game Notation (PGN), a format in which moves are described using algebraic notation.

However, when registering a new version, we won’t require that the client send the entire board position represented as PGN. Instead, the client will only need to send the latest move. When the schema plugin receives the latest move, it will retrieve the current version of the game, check if the move is compatible with the current board position, and only then apply the move. The new board position will be saved in PGN format as the current version.

So far, this would allow the client to switch between making moves for white and making moves for black. To turn Schema Registry into a chess engine, after the schema plugin applies a valid move from the client, it will generate a move for the opposing color and apply that move as well.

In order to take back a move, the client just needs to delete the latest version of the game, and then make a new move. The new move will be applied to the latest version of the game that is not deleted.

Finally, in order to allow the client to play a game of chess with the black pieces, the client will send a special move of the form {player as black}. This is a valid comment in PGN format. When this special move is received, the schema plugin will simply generate a move for white and save that as the first version.

Let’s try it out. Assuming that the chess schema plugin has been built and placed on the CLASSPATH for the Schema Registry2, the following properties need to be added to schema-registry.properties3:

schema.providers=io.yokota.schemaregistry.chess.schema.ChessSchemaProvider
resource.static.locations=static
resource.extension.class=io.yokota.schemaregistry.chess.SchemaRegistryChessResourceExtension
  

The above properties not only configure the chess schema plugin, but also the chess resource extension that will be used in the next section. Once Schema Registry is up, you can verify that the chess schema plugin was registered.

$ curl http://localhost:8081/schemas/types
["CHESS","JSON","PROTOBUF","AVRO"]

Let’s make the move d4.

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "d4", "schemaType": "CHESS"}'  \
  http://localhost:8081/subjects/newgame/versions
{"id":1}

Schema Registry returns the ID of the new version. Let’s examine what the version actually looks like.

$ curl http://localhost:8081/subjects/newgame/versions/latest
{
  "subject": "newgame",
  "version": 1,
  "id": 1,
  "schemaType": "CHESS",
  "schema": "1.d4 d5"
}

Schema Registry replied with the move d5.

We can continue playing chess with Schema Registry in this fashion, but of course it isn’t the best user experience. Let’s see if a REST extension will help.

A REST Extension for Chess

A single-page application (SPA) with an actual chess board as the interface would provide a much better experience. Therefore, I’ve created a REST Extension that wraps a Vue.js SPA for playing chess. When the user makes a move on the chess board, the SPA sends the move to Schema Registry, retrieves the new board position, determines the last move played by the computer opponent, and makes that move on the board as well.

With the REST extension configured as described in the previous section, you can navigate to http://localhost:8081/index.html to see the chess engine UI presented by the REST extension. When playing a chess game, the game history will appear below the chess board, showing how the board position evolves over time.

Here is an example of the REST extension in action.

As you can see, schema plugins in conjunction with REST extensions can provide a powerful combination. Hopefully, you are now inspired to customize Confluent Schema Registry in new and creative ways. Have fun!


Building A Graph Database Using Kafka

I previously showed how to build a relational database using Kafka. This time I’ll show how to build a graph database using Kafka. Just as with KarelDB, at the heart of our graph database will be the embedded key-value store, KCache.

Kafka as a Graph Database

The graph database that I’m most familiar with is HGraphDB, a graph database that uses HBase as its backend. More specifically, it uses the HBase client API, which allows it to integrate with not only HBase, but also any other data store that implements the HBase client API, such as Google BigTable. This leads to an idea. Rather than trying to build a new graph database around KCache entirely from scratch, we can try to wrap KCache with the HBase client API.

HBase is an example of a wide column store, also known as an extensible record store. Like its predecessor BigTable, it allows any number of column values to be associated with a key, without requiring a schema. For this reason, a wide column store can also be seen as a two-dimensional key-value store.1

I’ve implemented KStore as a wide column store (or extensible record store) abstraction for Kafka that relies on KCache under the covers. KStore implements the HBase client API, so it can be used wherever the HBase client API is supported.
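
For example, a plain HBase client program needs nothing more than the two connection settings below to read and write through KStore instead of HBase. This is a minimal sketch using the standard HBase 2.x client API; whether every administrative operation is supported by KStore is an assumption worth verifying.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class KStoreExample {

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Route the standard HBase client API to Kafka via KStore.
    conf.set("hbase.client.connection.impl", "io.kstore.KafkaStoreConnection");
    conf.set("kafkacache.bootstrap.servers", "localhost:9092");

    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      TableName tableName = TableName.valueOf("test");
      if (!admin.tableExists(tableName)) {
        admin.createTable(TableDescriptorBuilder.newBuilder(tableName)
            .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
            .build());
      }

      try (Table table = conn.getTable(tableName)) {
        Put put = new Put(Bytes.toBytes("row1"));
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes("value1"));
        table.put(put);

        Result result = table.get(new Get(Bytes.toBytes("row1")));
        System.out.println(Bytes.toString(
            result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("a"))));
      }
    }
  }
}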

Let’s try to use KStore with HGraphDB. After installing and starting the Gremlin console, we install KStore and HGraphDB.

$ ./bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph

gremlin> :install org.apache.hbase hbase-client 2.2.1
gremlin> :install org.apache.hbase hbase-common 2.2.1
gremlin> :install org.apache.hadoop hadoop-common 3.1.2
gremlin> :install io.kstore kstore 0.1.0
gremlin> :install io.hgraphdb hgraphdb 3.0.0
gremlin> :plugin use io.hgraphdb
 

After we restart the Gremlin console, we configure HGraphDB with the KStore connection class and the Kafka bootstrap servers.2 We can then issue Gremlin commands against Kafka.

$ ./bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: io.hgraphdb
plugin activated: tinkerpop.tinkergraph

gremlin> cfg = new HBaseGraphConfiguration()\
......1> .set("hbase.client.connection.impl", "io.kstore.KafkaStoreConnection")\
......2> .set("kafkacache.bootstrap.servers", "localhost:9092")
==>io.hgraphdb.HBaseGraphConfiguration@41b0ae4c

gremlin> graph = new HBaseGraph(cfg)
==>hbasegraph[hbasegraph]

gremlin> g = graph.traversal()
==>graphtraversalsource[hbasegraph[hbasegraph], standard]

gremlin> v1 = g.addV('person').property('name','marko').next()
==>v[0371a1db-8768-4910-94e3-7516fc65dab3]

gremlin> v2 = g.addV('person').property('name','stephen').next()
==>v[3bbc9ce3-24d3-41cf-bc4b-3d95dbac6589]

gremlin> g.V(v1).addE('knows').to(v2).property('weight',2).iterate()
  

It works! HBaseGraph is now using Kafka as its storage backend.

Kafka as a Document Database

Now that we have a wide column store abstraction for Kafka in the form of KStore, let’s see what else we can do with it. Another database that uses the HBase client API is HDocDB, a document database for HBase. To use KStore with HDocDB, first we need to set hbase.client.connection.impl in our hbase-site.xml as follows.

<configuration>
    <property>
        <name>hbase.client.connection.impl</name>
        <value>io.kstore.KafkaStoreConnection</value>
    </property>
    <property>
        <name>kafkacache.bootstrap.servers</name>
        <value>localhost:9092</value>
    </property>
</configuration>

Now we can issue MongoDB-like commands against Kafka, using HDocDB.3

$ jrunscript -cp <hbase-conf-dir>:target/hdocdb-1.0.1.jar:../kstore/target/kstore-0.1.0.jar -f target/classes/shell/hdocdb.js -f -

nashorn> db.mycoll.insert( { _id: "jdoe", first_name: "John", last_name: "Doe" } )

nashorn> var doc = db.mycoll.find( { last_name: "Doe" } )[0]

nashorn> print(doc)
{"_id":"jdoe","first_name":"John","last_name":"Doe"}

nashorn> db.mycoll.update( { last_name: "Doe" }, { $set: { first_name: "Jim" } } )

nashorn> var doc = db.mycoll.find( { last_name: "Doe" } )[0]

nashorn> print(doc)
{"_id":"jdoe","first_name":"Jim","last_name":"Doe"}
  

Pretty cool, right?

Kafka as a Wide Column Store

Of course, there is no requirement to wrap KStore with another layer in order to use it. KStore can be used directly as a wide column store abstraction on top of Kafka. I’ve integrated KStore with the HBase Shell so that one can work directly with KStore from the command line.

$ ./kstore-shell.sh localhost:9092

hbase(main):001:0> create 'test', 'cf'
Created table test
Took 0.2328 seconds
=> Hbase::Table - test

hbase(main):003:0* list
TABLE
test
1 row(s)
Took 0.0192 seconds
=> ["test"]

hbase(main):004:0> put 'test', 'row1', 'cf:a', 'value1'
Took 0.1284 seconds

hbase(main):005:0> put 'test', 'row2', 'cf:b', 'value2'
Took 0.0113 seconds

hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value3'
Took 0.0096 seconds

hbase(main):007:0> scan 'test'
ROW                                COLUMN+CELL
 row1                              column=cf:a, timestamp=1578763986780, value=value1
 row2                              column=cf:b, timestamp=1578763992567, value=value2
 row3                              column=cf:c, timestamp=1578763996677, value=value3
3 row(s)
Took 0.0233 seconds

hbase(main):008:0> get 'test', 'row1'
COLUMN                             CELL
 cf:a                              timestamp=1578763986780, value=value1
1 row(s)
Took 0.0106 seconds

hbase(main):009:0>

There’s no limit to the type of fun one can have with KStore. 🙂

Back to Graphs

Getting back to graphs, another popular graph database is JanusGraph, which is interesting because it has a pluggable storage layer. Some of the storage backends that it supports through this layer are HBase, Cassandra, and BerkeleyDB.

Of course, KStore can be used in place of HBase when configuring JanusGraph. Again, it’s simply a matter of configuring the KStore connection class in the JanusGraph configuration.

storage.hbase.ext.hbase.client.connection.impl: io.kstore.KafkaStoreConnection
storage.hbase.ext.kafkacache.bootstrap.servers: localhost:9092

However, we can do better when integrating JanusGraph with Kafka. JanusGraph can be integrated with any storage backend that supports a wide column store abstraction. When integrating with key-value stores such as BerkeleyDB, JanusGraph provides its own adapter for mapping a key-value store to a wide column store. Thus we can simply provide KCache to JanusGraph as a key-value store, and it will perform the mapping to a wide column store abstraction for us automatically.

I’ve implemented a new storage plugin for JanusGraph called janusgraph-kafka that does exactly this. Let’s try it out. After following the instructions here, we can start the Gremlin console.

$ ./bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.tinkergraph
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.utilities
plugin activated: janusgraph.imports

gremlin>  graph = JanusGraphFactory.open('conf/janusgraph-kafka.properties')
==>standardjanusgraph[io.kcache.janusgraph.diskstorage.kafka.KafkaStoreManager:[127.0.0.1]]

gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[io.kcache.janusgraph.diskstorage.kafka.KafkaStoreManager:[127.0.0.1]], standard]

gremlin> v1 = g.addV('person').property('name','marko').next()
==>v[4320]

gremlin> v2 = g.addV('person').property('name','stephen').next()
==>v[4104]

gremlin> g.V(v1).addE('knows').to(v2).property('weight',2).iterate()
  

Works like a charm.

Summary

In this and the previous post, I’ve shown how Kafka can be used as

  • A relational database (KarelDB)
  • A graph database (HGraphDB, as well as JanusGraph)
  • A document database (HDocDB)
  • A wide column store (KStore)

I guess I could have titled this post “Building a Graph Database, Document Database, and Wide Column Store Using Kafka”, although that’s a bit long. In any case, hopefully I’ve shown that Kafka is a lot more versatile than most people realize.



Building A Relational Database Using Kafka

In a previous post, I showed how Kafka can be used as the persistent storage for an embedded key-value store, called KCache. Once you have a key-value store, it can be used as the basis for other models such as documents, graphs, and even SQL. For example, CockroachDB is a SQL layer built on top of the RocksDB key-value store and YugaByteDB is both a document and SQL layer built on top of RocksDB. Other databases such as FoundationDB claim to be multi-model, because they support several types of models at once, using the key-value store as a foundation.

In this post I will show how KCache can be extended to implement a fully-functional relational database, called KarelDB1. In addition, I will show how today a database architecture can be assembled from existing open-source components, much like how web frameworks like Dropwizard came into being by assembling components such as a web server (Jetty), RESTful API framework (Jersey), JSON serialization framework (Jackson), and an object-relational mapping layer (JDBI or Hibernate).

Hello, KarelDB

Before I drill into the components that comprise KarelDB, first let me show you how to quickly get it up and running. To get started, download a release, unpack it, and then modify config/kareldb.properties to point to an existing Kafka broker. Then run the following:

$ bin/kareldb-start config/kareldb.properties

While KarelDB is still running, at a separate terminal, enter the following command to start up sqlline, a command-line utility for accessing JDBC databases.

$ bin/sqlline
sqlline version 1.8.0

sqlline> !connect jdbc:avatica:remote:url=http://localhost:8765 admin admin

sqlline> create table books (id int, name varchar, author varchar);
No rows affected (0.114 seconds)

sqlline> insert into books values (1, 'The Trial', 'Franz Kafka');
1 row affected (0.576 seconds)

sqlline> select * from books;
+----+-----------+-------------+
| ID |   NAME    |   AUTHOR    |
+----+-----------+-------------+
| 1  | The Trial | Franz Kafka |
+----+-----------+-------------+
1 row selected (0.133 seconds)

KarelDB is now at your service.
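
Since KarelDB speaks the Avatica wire protocol, you can also connect from plain Java code using the same JDBC URL that sqlline used above. A minimal sketch, assuming the Avatica remote JDBC driver is on the classpath:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class KarelDbJdbcExample {

  public static void main(String[] args) throws Exception {
    // Same Avatica URL and credentials that sqlline used above.
    try (Connection conn = DriverManager.getConnection(
             "jdbc:avatica:remote:url=http://localhost:8765", "admin", "admin");
         Statement stmt = conn.createStatement()) {

      stmt.execute("create table books (id int, name varchar, author varchar)");
      stmt.execute("insert into books values (1, 'The Trial', 'Franz Kafka')");

      try (ResultSet rs = stmt.executeQuery("select * from books")) {
        while (rs.next()) {
          System.out.println(rs.getInt("ID") + " | "
              + rs.getString("NAME") + " | " + rs.getString("AUTHOR"));
        }
      }
    }
  }
}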

Kafka for Persistence

At the heart of KarelDB is KCache, an embedded key-value store that is backed by Kafka.  Many components use Kafka as a simple key-value store, including Kafka Connect and Confluent Schema Registry.  KCache not only generalizes this functionality, but provides a simple Map based API for ease of use.  In addition, KCache can use different implementations for the embedded key-value store that is backed by Kafka.

In the case of KarelDB, by default KCache is configured as a RocksDB cache that is backed by Kafka. This allows KarelDB to support larger datasets and faster startup times. KCache can also be configured to use an in-memory cache instead of RocksDB if desired.
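To give a flavor of the Map-based API, here is a minimal sketch of using KCache directly, based on the same calls that appear later in this post; the bootstrap server is a placeholder, and KarelDB performs this wiring itself:

import io.kcache.KafkaCache;
import io.kcache.KafkaCacheConfig;
import org.apache.kafka.common.serialization.Serdes;

import java.util.Properties;

public class KCacheExample {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Point the cache at an existing Kafka broker
        props.put("kafkacache.bootstrap.servers", "localhost:9092");

        // A Kafka-backed cache with String keys and values
        KafkaCache<String, String> cache =
            new KafkaCache<>(new KafkaCacheConfig(props), Serdes.String(), Serdes.String());
        cache.init();  // reads the backing Kafka topic into the local cache

        // The cache behaves like a java.util.Map
        cache.put("greeting", "hello");
        System.out.println(cache.get("greeting"));  // prints "hello"

        cache.close();
    }
}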

Avro for Serialization and Schema Evolution

The Kafka community has largely adopted Apache Avro as its de facto data format, and for good reason.  Not only does Avro provide a compact binary format, but it also has excellent support for schema evolution.  Such support is why the Confluent Schema Registry has chosen Avro as the first format for which it provides schema management.

KarelDB uses Avro both to define relations (tables) and to serialize the data for those relations.  By using Avro, KarelDB gets schema evolution for free when executing an ALTER TABLE command.

sqlline> !connect jdbc:avatica:remote:url=http://localhost:8765 admin admin 

sqlline> create table customers (id int, name varchar);
No rows affected (1.311 seconds)

sqlline> alter table customers add address varchar not null;
Error: Error -1 (00000) : 
Error while executing SQL "alter table customers add address varchar not null": 
org.apache.avro.SchemaValidationException: Unable to read schema:
{
  "type" : "record",
  "name" : "CUSTOMERS",
  "fields" : [ {
    "name" : "ID",
    "type" : "int",
    "sql.key.index" : 0
  }, {
    "name" : "NAME",
    "type" : [ "null", "string" ],
    "default" : null
  } ]
}
using schema:
{
  "type" : "record",
  "name" : "CUSTOMERS",
  "fields" : [ {
    "name" : "ID",
    "type" : "int",
    "sql.key.index" : 0
  }, {
    "name" : "NAME",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "ADDRESS",
    "type" : "string"
  } ]
}

sqlline> alter table customers add address varchar null;
No rows affected (0.024 seconds)

As you can see above, when we first try to add a column with a NOT NULL constraint, Avro rejects the schema change, because adding a new field with a NOT NULL constraint would cause deserialization to fail for older records that don’t have that field. When we instead add the same column with a NULL constraint, the ALTER TABLE command succeeds.

Because Avro is used for deserialization, a field (without a NOT NULL constraint) that is added to a schema will be appropriately populated with its default value, or with null if the field is optional. This is all handled automatically by the underlying Avro framework.

Another important aspect of Avro is that it defines a standard sort order for data, as well as a comparison function that operates directly on the binary-encoded data, without first deserializing it. This allows KarelDB to efficiently handle key range queries, for example.
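Avro exposes this comparison through its BinaryData class; below is a minimal sketch that compares two binary-encoded keys for a hypothetical single-field key schema (KarelDB’s actual key encoding is derived from the table definition):

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryData;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class AvroBinaryCompare {

    // Serialize a record to raw Avro binary (no framing)
    static byte[] encode(Schema schema, GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical key schema with a single int field
        Schema schema = SchemaBuilder.record("Key").fields()
            .requiredInt("ID")
            .endRecord();

        GenericRecord k1 = new GenericData.Record(schema);
        k1.put("ID", 1);
        GenericRecord k2 = new GenericData.Record(schema);
        k2.put("ID", 2);

        // Compare the binary-encoded keys directly, without deserializing them
        int cmp = BinaryData.compare(encode(schema, k1), 0, encode(schema, k2), 0, schema);
        System.out.println(cmp < 0);  // true, since the key 1 sorts before the key 2
    }
}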

Calcite for SQL

Apache Calcite is a SQL framework that handles query parsing, optimization, and execution, but leaves out the data store. Calcite allows relational expressions to be pushed down to the data store for more efficient processing. Otherwise, Calcite can process the query using a built-in enumerable calling convention, which allows the data store to be represented as a set of tuples that can be accessed through an iterator interface. An embedded key-value store is a perfect representation for such a set of tuples, so KarelDB will handle key lookups and key range filtering (using Avro’s sort order support) but otherwise defer query processing to Calcite’s enumerable convention. One nice aspect of the Calcite project is that it continues to develop optimizations for the enumerable convention, which will automatically benefit KarelDB moving forward.

Calcite supports ANSI-compliant SQL, including some newer functions such as JSON_VALUE and JSON_QUERY.

sqlline> create table authors (id int, json varchar);
No rows affected (0.132 seconds)

sqlline> insert into authors 
       > values (1, '{"name":"Franz Kafka", "book":"The Trial"}');
1 row affected (0.086 seconds)

sqlline> insert into authors 
       > values (2, '{"name":"Karel Capek", "book":"R.U.R."}');
1 row affected (0.036 seconds)

sqlline> select json_value(json, 'lax $.name') as author from authors;
+-------------+
|   AUTHOR    |
+-------------+
| Franz Kafka |
| Karel Capek |
+-------------+
2 rows selected (0.027 seconds)

Omid for Transactions and MVCC

Although Apache Omid was originally designed to work with HBase, it is a general framework for supporting transactions on a key-value store. In addition, Omid uses the underlying key-value store to persist metadata concerning transactions. This makes it especially easy to integrate Omid with an existing key-value store such as KCache.

Omid actually requires a few features from the key-value store, namely multi-versioned data and atomic compare-and-set capability. KarelDB layers these features atop KCache so that it can take advantage of Omid’s support for transaction management. Omid utilizes these features of the key-value store in order to provide snapshot isolation using multi-version concurrency control (MVCC). MVCC is a common technique used to implement snapshot isolation in other relational databases, such as Oracle and PostgreSQL.
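To make these two requirements concrete, here is a highly simplified sketch (illustrative only, not KarelDB’s actual implementation) of a multi-versioned store with a compare-and-set primitive layered on top of an ordinary map:

import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative only: multi-versioned cells plus compare-and-set,
// roughly the two capabilities that Omid needs from the key-value store.
public class VersionedStore {

    // key -> (version timestamp -> value)
    private final Map<String, NavigableMap<Long, byte[]>> cells =
        new ConcurrentSkipListMap<>();

    // Snapshot read: the latest version of a cell at or below the given timestamp
    public byte[] get(String key, long timestamp) {
        NavigableMap<Long, byte[]> versions = cells.get(key);
        if (versions == null) return null;
        Map.Entry<Long, byte[]> entry = versions.floorEntry(timestamp);
        return entry != null ? entry.getValue() : null;
    }

    // Write a new version of a cell at the given timestamp
    public synchronized void put(String key, long timestamp, byte[] value) {
        cells.computeIfAbsent(key, k -> new ConcurrentSkipListMap<>()).put(timestamp, value);
    }

    // Atomic compare-and-set: write only if the latest version still matches what we expect
    public synchronized boolean checkAndPut(
            String key, long expectedLatest, long newTimestamp, byte[] value) {
        NavigableMap<Long, byte[]> versions = cells.get(key);
        long latest = (versions == null || versions.isEmpty()) ? -1L : versions.lastKey();
        if (latest != expectedLatest) {
            return false;  // a concurrent writer got there first; the caller must retry or abort
        }
        put(key, newTimestamp, value);
        return true;
    }
}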

Below we can see an example of how rolling back a transaction will restore the state of the database before the transaction began.

sqlline> !autocommit off

sqlline> select * from books;
+----+-----------+-------------+
| ID |   NAME    |   AUTHOR    |
+----+-----------+-------------+
| 1  | The Trial | Franz Kafka |
+----+-----------+-------------+
1 row selected (0.045 seconds)

sqlline> update books set name ='The Castle' where id = 1;
1 row affected (0.346 seconds)

sqlline> select * from books;
+----+------------+-------------+
| ID |    NAME    |   AUTHOR    |
+----+------------+-------------+
| 1  | The Castle | Franz Kafka |
+----+------------+-------------+
1 row selected (0.038 seconds)

sqlline> !rollback
Rollback complete (0.059 seconds)

sqlline> select * from books;
+----+-----------+-------------+
| ID |   NAME    |   AUTHOR    |
+----+-----------+-------------+
| 1  | The Trial | Franz Kafka |
+----+-----------+-------------+
1 row selected (0.032 seconds)

Transactions can of course span multiple rows and multiple tables.

Avatica for JDBC

KarelDB can actually be run in two modes, as an embedded database or as a server. In the case of a server, KarelDB uses Apache Avatica to provide RPC protocol support. Avatica provides both a server framework that wraps KarelDB, as well as a JDBC driver that can communicate with the server using Avatica RPC.

One advantage of using Kafka is that multiple servers can all “tail” the same set of topics. This allows multiple KarelDB servers to run as a cluster, with no single point of failure. In this case, one of the servers will be elected as the leader while the others will be followers (or replicas). When a follower receives a JDBC request, it will use the Avatica JDBC driver to forward the request to the leader. If the leader fails, one of the followers will be elected as a new leader.
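For example, a Java client can talk to a KarelDB server over plain JDBC; here is a minimal sketch, assuming the Avatica JDBC driver is on the classpath and the server is running at the URL used by sqlline earlier:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class KarelDbJdbcExample {

    public static void main(String[] args) throws Exception {
        // Same URL and credentials that sqlline used earlier
        String url = "jdbc:avatica:remote:url=http://localhost:8765";
        try (Connection conn = DriverManager.getConnection(url, "admin", "admin");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select * from books")) {
            while (rs.next()) {
                System.out.println(rs.getInt("ID") + ": " + rs.getString("NAME")
                    + " by " + rs.getString("AUTHOR"));
            }
        }
    }
}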

Database by Components

Today, open-source libraries have achieved what component-based software development was hoping to do many years ago. With open-source libraries, complex systems such as relational databases can be assembled by integrating a few well-designed components, each of which specializes in one thing that it does particularly well.

Above I’ve shown how KarelDB is an assemblage of several existing open-source components:

  1. KCache – Kafka-based persistence
  2. Apache Avro – serialization and schema evolution
  3. Apache Calcite – SQL parsing, optimization, and execution
  4. Apache Omid – transactions and MVCC
  5. Apache Avatica – JDBC functionality

Currently, KarelDB is designed as a single-node database, which can be replicated, but it is not a distributed database. Also, KarelDB is a plain-old relational database, and does not handle stream processing. For a distributed, stream-relational database, please consider using KSQL instead, which is production-proven.

KarelDB is still in its early stages, but give it a try if you’re interested in using Kafka to back your plain-old relational data.

Building A Relational Database Using Kafka

Machine Learning with Kafka Graphs

As data has become more prevalent with the rise of cloud computing, mobile devices, and big data systems, methods to analyze that data have become more and more advanced, with machine learning and artificial intelligence algorithms epitomizing the state of the art. There are many ways to use machine learning to analyze data in Kafka. Below I will show how machine learning can be performed with Kafka Graphs, a graph analytics library that is layered on top of Kafka Streams.

Graph Modeling

As described in “Using Apache Kafka to Drive Cutting-Edge Machine Learning”, a machine learning lifecycle is composed of two phases: modeling and prediction. That article goes on to describe how to integrate models from libraries like TensorFlow and H2O, whether through RPC or by embedding the model in a Kafka application. With Kafka Graphs, the graph is the model. Therefore, when using Kafka Graphs for machine learning, there is no need to integrate with an external machine learning library for modeling.

Recommender Systems

As an example of machine learning with Kafka Graphs, I will show how Kafka Graphs can be used as a recommender system1. Recommender systems are commonly used by companies such as Amazon, Netflix, and Spotify to predict the rating or preference a user would give to an item. In fact, the Netflix Prize was a competition that Netflix started to determine if an external party could devise an algorithm that could provide a 10% improvement over Netflix’s own algorithm. The competition resulted in a wave of innovation in algorithms that use collaborative filtering2, which is a method of prediction based on the ratings or behavior of other users in the system.

Singular Value Decomposition

Singular value decomposition (SVD) is a type of matrix factorization popularized by Simon Funk for use in a recommender system during the Netflix competition. When using Funk SVD3, also called regularized SVD, the user-item rating matrix is viewed as the product of two lower-dimensional matrices, one with a row for each user, and another with a column for each item. For example, a 5×5 ratings matrix might be factored into a 5×2 user-feature matrix and a 2×5 item-feature matrix.

\begin{bmatrix}
    r_{11} & r_{12} & r_{13} & r_{14} & r_{15} \\
    r_{21} & r_{22} & r_{23} & r_{24} & r_{25} \\
    r_{31} & r_{32} & r_{33} & r_{34} & r_{35} \\
    r_{41} & r_{42} & r_{43} & r_{44} & r_{45} \\
    r_{51} & r_{52} & r_{53} & r_{54} & r_{55}
\end{bmatrix}
=
\begin{bmatrix}
    u_{11} & u_{12} \\
    u_{21} & u_{22} \\
    u_{31} & u_{32} \\
    u_{41} & u_{42} \\
    u_{51} & u_{52}
\end{bmatrix}
\begin{bmatrix}
    v_{11} & v_{12} & v_{13} & v_{14} & v_{15} \\
    v_{21} & v_{22} & v_{23} & v_{24} & v_{25}
\end{bmatrix}

Matrix factorization with SVD actually takes the form of

R = U \Sigma V^T

where \Sigma is a diagonal matrix of weights. The values in the rows of the user-feature matrix and the columns of the item-feature matrix are referred to as latent factors. The exact meanings of the latent factors are usually not discernible. For a movie, one latent factor might represent a specific genre, such as comedy or science fiction, while for a user, one latent factor might represent gender and another might represent age group. The goal of Funk SVD is to extract these latent factors in order to predict the values of the user-item rating matrix.
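In its simplest form, the predicted rating for user u and item i is then just the dot product of the corresponding latent factor vectors, where k is the number of latent factors (2 in the example above):

\hat{r}_{ui} = p_u \cdot q_i = \sum_{f=1}^{k} p_{uf} \, q_{if}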

While Funk SVD can only accommodate explicit interactions, in the form of numerical ratings, a team of researchers from AT&T enhanced Funk SVD to additionally account for implicit interactions, such as likes, purchases, and bookmarks. This enhanced algorithm is referred to as SVD++.4 During the Netflix competition, SVD++ was shown to generate more accurate predictions than Funk SVD.
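For reference, the SVD++ prediction is commonly written with additional baseline and implicit-feedback terms:

\hat{r}_{ui} = \mu + b_u + b_i + q_i^T \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)

where \mu is the global average rating, b_u and b_i are the user and item biases, N(u) is the set of items with which user u has implicitly interacted, and y_j is an additional factor vector learned for each such item.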

Machine Learning on Pregel

Kafka Graphs provides an implementation of the Pregel programming model, so any algorithm written for the Pregel programming model can easily be supported by Kafka Graphs. For example, there are many machine learning algorithms written for Apache Giraph, an implementation of Pregel that runs on Apache Hadoop, so such algorithms are eligible to be run on Kafka Graphs as well, with only minor modifications. For Kafka Graphs, I’ve ported the SVD++ algorithm from Okapi, a library of machine learning and graph mining algorithms for Apache Giraph.

Running SVD++ on Kafka Graphs

In the rest of this post, I’ll show how you can run SVD++ using Kafka Graphs on a dataset of movie ratings. To set up your environment, install git, Maven, and Docker Compose. Then run the following steps:

git clone https://github.com/rayokota/kafka-graphs.git

cd kafka-graphs

mvn clean package -DskipTests

cd kafka-graphs-rest-app

docker-compose up
 

The last step above will launch Docker containers for a ZooKeeper instance, a Kafka instance, and two Kafka Graphs REST application instances. The application instances will each be assigned a subset of the graph vertices during the Pregel computation.

For our data, we will use the librec FilmTrust dataset, which is a relatively small set of 35497 movie ratings from users of the FilmTrust platform. The following command will import the movie ratings data into Kafka Graphs:

java \
  -cp target/kafka-graphs-rest-app-1.2.2-SNAPSHOT.jar \
  -Dloader.main=io.kgraph.tools.importer.GraphImporter \
  org.springframework.boot.loader.PropertiesLauncher 127.0.0.1:9092 \
  --edgesTopic initial-edges \
  --edgesFile ../kafka-graphs-core/src/test/resources/ratings.txt \
  --edgeParser io.kgraph.library.cf.EdgeCfLongIdFloatValueParser \
  --edgeValueSerializer org.apache.kafka.common.serialization.FloatSerializer
 

The remaining commands will all use the Kafka Graphs REST API. First we prepare the graph data for use by Pregel. The following command will group edges by the source vertex ID, and also ensure that topics for the vertices and edges have the same number of partitions.

curl -H "Content-type: application/json" -d '{ "algorithm":"svdpp",
  "initialEdgesTopic":"initial-edges", "verticesTopic":"vertices",
  "edgesGroupedBySourceTopic":"edges", "async":"false" }' \
  localhost:8888/prepare
 

Now we can configure the Pregel algorithm:

curl -H "Content-type: application/json" -d '{ "algorithm":"svdpp",
  "verticesTopic":"vertices", "edgesGroupedBySourceTopic":"edges", 
  "configs": { "random.seed": "0" } }' \
  localhost:8888/pregel
 

The above command will return a hexadecimal ID to represent the Pregel computation, such as a8d72fc8. This ID is used in the next command to start the Pregel computation.

curl -H "Content-type: application/json" -d '{  "numIterations": 6 }' \
  localhost:8888/pregel/{id}
 

You can now examine the state of the Pregel computation:

curl -H "Content-type: application/json" localhost:8888/pregel/{id}
 

Once the above command shows that the computation is no longer running, you can use the final state of the graph for predicting user ratings. For example, to predict the rating that user 2 would give to item 14, run the following command, using the same Pregel ID from previous steps:

java \
  -cp target/kafka-graphs-rest-app-1.2.2-SNAPSHOT.jar \
  -Dloader.main=io.kgraph.tools.library.SvdppPredictor \
  org.springframework.boot.loader.PropertiesLauncher localhost:8888 \
  {id} --user 2 --item 14
 

The above command will return the predicted rating, such as 2.3385806. You can predict other ratings by using the same command with different user and item IDs.

Summary

The Kafka ecosystem provides several ways to build a machine learning system. Besides the various machine learning libraries that can be directly integrated with a Kafka application, Kafka Graphs can be used to run any machine learning algorithm that has been adapted for the Pregel programming model. Since Kafka Graphs is a library built on top of Kafka Streams, we’ve essentially turned Kafka Streams into a distributed machine learning platform!

Machine Learning with Kafka Graphs

Fun with Confluent Schema Registry Extensions

The Confluent Schema Registry often serves as the heart of a streaming platform, as it provides centralized management and storage of the schemas for an organization. One feature of the Schema Registry that deserves more attention is its ability to incorporate pluggable resource extensions.

In this post I will show how resource extensions can be used to implement the following:

  1. Subject modes. For example, one might want to “freeze” a subject so that no further changes can be made.
  2. A Schema Registry browser. This is a complete single-page application for managing and visualizing schemas in a web browser.

Along the way I will show how to use the KCache library that I introduced in my last post.

Subject Modes

The first resource extension that I will demonstrate is one that provides support for subject modes. With this extension, a subject can be placed in “read-only” mode so that no further changes can be made to the subject. Also, an entire Schema Registry cluster can be placed in “read-only” mode. This may be useful, for example, when using Confluent Replicator to replicate Schema Registry from one Kafka cluster to another. If one wants to keep the two registries in sync, one could mark the Schema Registry that is the target of replication as “read-only”.

When implementing the extension, we want to associate a mode, either “read-only” or “read-write”, to a given subject (or * to indicate all subjects). The association needs to be persistent, so that it can survive a restart of the Schema Registry. We could use a database, but the Schema Registry already has a dependency on Kafka, so perhaps we can store the association in Kafka. This is a perfect use case for KCache, which is an in-memory cache backed by Kafka. Using an instance of KafkaCache, saving and retrieving the mode for a given subject is straightforward:

public class ModeRepository implements Closeable {

    // Used to represent all subjects
    public static final String SUBJECT_WILDCARD = "*";

    private final Cache<String, String> cache;

    public ModeRepository(SchemaRegistryConfig schemaRegistryConfig) {
        KafkaCacheConfig config =
            new KafkaCacheConfig(schemaRegistryConfig.originalProperties());
        cache = new KafkaCache<>(config, Serdes.String(), Serdes.String());
        cache.init();
    }

    public Mode getMode(String subject) {
        if (subject == null) subject = SUBJECT_WILDCARD;
        String mode = cache.get(subject);
        if (mode == null && subject.equals(SUBJECT_WILDCARD)) {
            // Default mode for top level
            return Mode.READWRITE;
        }
        return mode != null ? Enum.valueOf(Mode.class, mode) : null;
    }

    public void setMode(String subject, Mode mode) {
        if (subject == null) subject = SUBJECT_WILDCARD;
        cache.put(subject, mode.name());
    }

    @Override
    public void close() throws IOException {
        cache.close();
    }
}

Using the ModeRepository, we can provide a ModeResource class that provides REST APIs for saving and retrieving modes:

public class ModeResource {

    private static final int INVALID_MODE_ERROR_CODE = 42299;

    private final ModeRepository repository;
    private final KafkaSchemaRegistry schemaRegistry;

    public ModeResource(
        ModeRepository repository, 
        SchemaRegistry schemaRegistry
    ) {
        this.repository = repository;
        this.schemaRegistry = (KafkaSchemaRegistry) schemaRegistry;
    }

    @Path("/{subject}")
    @PUT
    public ModeUpdateRequest updateMode(
        @PathParam("subject") String subject,
        @Context HttpHeaders headers,
        @NotNull ModeUpdateRequest request
    ) {
        Mode mode;
        try {
            mode = Enum.valueOf(
                Mode.class, request.getMode().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new RestConstraintViolationException(
                "Invalid mode. Valid values are READWRITE and READONLY.", 
                INVALID_MODE_ERROR_CODE);
        }
        try {
            if (schemaRegistry.isMaster()) {
                repository.setMode(subject, mode);
            } else {
                throw new RestSchemaRegistryException(
                    "Failed to update mode, not the master");
            }
        } catch (CacheException e) {
            throw Errors.storeException("Failed to update mode", e);
        }

        return request;
    }

    @Path("/{subject}")
    @GET
    public ModeGetResponse getMode(@PathParam("subject") String subject) {
        try {
            Mode mode = repository.getMode(subject);
            if (mode == null) {
                throw Errors.subjectNotFoundException();
            }
            return new ModeGetResponse(mode.name());
        } catch (CacheException e) {
            throw Errors.storeException("Failed to get mode", e);
        }
    }

    @PUT
    public ModeUpdateRequest updateTopLevelMode(
        @Context HttpHeaders headers,
        @NotNull ModeUpdateRequest request
    ) {
        return updateMode(ModeRepository.SUBJECT_WILDCARD, headers, request);
    }

    @GET
    public ModeGetResponse getTopLevelMode() {
        return getMode(ModeRepository.SUBJECT_WILDCARD);
    }
}

Now we need a filter to reject requests that attempt to modify a subject when it is in read-only mode:

@Priority(Priorities.AUTHORIZATION)
public class ModeFilter implements ContainerRequestFilter {

    private static final Set<ResourceActionKey> subjectWriteActions = 
        new HashSet<>();

    private ModeRepository repository;

    @Context
    ResourceInfo resourceInfo;

    @Context
    UriInfo uriInfo;

    @Context
    HttpServletRequest httpServletRequest;

    static {
        initializeSchemaRegistrySubjectWriteActions();
    }

    private static void initializeSchemaRegistrySubjectWriteActions() {
        subjectWriteActions.add(
            new ResourceActionKey(SubjectVersionsResource.class, "POST"));
        subjectWriteActions.add(
            new ResourceActionKey(SubjectVersionsResource.class, "DELETE"));
        subjectWriteActions.add(
            new ResourceActionKey(SubjectsResource.class, "DELETE"));
        subjectWriteActions.add(
            new ResourceActionKey(ConfigResource.class, "PUT"));
    }

    public ModeFilter(ModeRepository repository) {
        this.repository = repository;
    }

    @Override
    public void filter(ContainerRequestContext requestContext) {
        Class resource = resourceInfo.getResourceClass();
        String restMethod = requestContext.getMethod();
        String subject = uriInfo.getPathParameters().getFirst("subject");

        Mode mode = repository.getMode(subject);
        if (mode == null) {
            // Check the top level mode
            mode = repository.getMode(ModeRepository.SUBJECT_WILDCARD);
        }
        if (mode == Mode.READONLY) {
            ResourceActionKey key = new ResourceActionKey(resource, restMethod);
            if (subjectWriteActions.contains(key)) {
                requestContext.abortWith(
                    Response.status(Response.Status.UNAUTHORIZED)
                        .entity("Subject is read-only.")
                        .build());
            }
        }
    }

    private static class ResourceActionKey {

        private final Class resourceClass;
        private final String restMethod;

        public ResourceActionKey(Class resourceClass, String restMethod) {
            this.resourceClass = resourceClass;
            this.restMethod = restMethod;
        }

        ...
    }
}

Finally, the resource extension simply creates the mode repository and then registers the resource and the filter:

public class SchemaRegistryModeResourceExtension 
    implements SchemaRegistryResourceExtension {

    private ModeRepository repository;

    @Override
    public void register(
        Configurable<?> configurable,
        SchemaRegistryConfig schemaRegistryConfig,
        SchemaRegistry schemaRegistry
    ) {
        repository = new ModeRepository(schemaRegistryConfig);
        configurable.register(new ModeResource(repository, schemaRegistry));
        configurable.register(new ModeFilter(repository));
    }

    @Override
    public void close() throws IOException {
        repository.close();
    }
}

The complete source code listing can be found here.

To use our new resource extension, we first copy the extension jar (and the KCache jar) to ${CONFLUENT_HOME}/share/java/schema-registry. Next we add the following to ${CONFLUENT_HOME}/etc/schema-registry/schema-registry.properties:

kafkacache.bootstrap.servers=localhost:9092
resource.extension.class=io.yokota.schemaregistry.mode.SchemaRegistryModeResourceExtension

Now after we start the Schema Registry, we can save and retrieve modes for subjects via the REST API:

$ curl -X PUT -H "Content-Type: application/json" \
  http://localhost:8081/mode/topic-key --data '{"mode": "READONLY"}'
{"mode":"READONLY"}

$ curl localhost:8081/mode/topic-key
{"mode":"READONLY"}

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{ "schema": "{ \"type\": \"string\" }" }' \
  http://localhost:8081/subjects/topic-key/versions                                          
Subject is read-only.

It works!

A Schema Registry Browser

Resource extensions can not only be used to add new REST APIs to the Schema Registry; they can also be used to add entire web-based user interfaces. As a demonstration, I’ve developed a resource extension that provides a Schema Registry browser by bundling a single-page application based on Vue.js, which resides here. To use the Schema Registry browser, place the resource extension jar in ${CONFLUENT_HOME}/share/java/schema-registry and then add the following properties to ${CONFLUENT_HOME}/etc/schema-registry/schema-registry.properties:1

resource.static.locations=static
resource.extension.class=io.yokota.schemaregistry.browser.SchemaRegistryBrowserResourceExtension

The resource extension merely indicates which URLs should use the static resources:

public class SchemaRegistryBrowserResourceExtension 
    implements SchemaRegistryResourceExtension {

    @Override
    public void register(
        Configurable<?> configurable,
        SchemaRegistryConfig schemaRegistryConfig,
        SchemaRegistry kafkaSchemaRegistry
    ) throws SchemaRegistryException {
        configurable.property(ServletProperties.FILTER_STATIC_CONTENT_REGEX, 
            "/(static/.*|.*\\.html|.*\\.js)");
    }

    @Override
    public void close() {
    }
}

Once we start the Schema Registry and navigate to http://localhost:8081/index.html, we will be greeted with the home page of the Schema Registry browser.

From the Entities dropdown, we can navigate to a listing of all subjects.

If we view a specific subject, we can see the complete version history for the subject, with schemas nicely formatted.

I haven’t shown all the functionality of the Schema Registry browser, but it supports all the APIs that are available through REST, including creating schemas, retrieving schemas by ID, etc.

Hopefully this post has inspired you to create your own resource extensions for the Confluent Schema Registry. You might even have fun. 🙂

Fun with Confluent Schema Registry Extensions