I was surprised to see LevelDB (https://code.google.com/p/leveldb/) was missing from the list of storage solutions you tried, because it seems optimal for your use-case. Were you aware of it?
I'm not sure it's the optimal match for the use-case. Sparkey is for "mostly static" datasets, where the on-disk structures are generated by a batch process and pushed to servers that give consumers read-only access to the data.
LevelDB, on the other hand, supports concurrent writes and provides features to handle data consistency and cheap gradual reindexing.
Right, this is actually the first commit. The history was squashed before publishing: we had to remove sensitive Spotify-specific things from it, and it seemed easiest to just do one big squash.
The problem is that so many projects exist for so many things. It is hard to find the one that matches, which is why many people reinvent the wheel.