There is no listed equivalent of RecordIO. What do people use for high-reliability journals?
When I needed something like RecordIO to store market data, I couldn't find anything. So I implemented https://github.com/romkatv/ChunkIO. I later learned of https://github.com/google/riegeli (work in progress), which could've saved me a lot of time if only I had found it earlier. I think my ChunkIO is better though.
I suppose you mean "exactly" in a figurative way. Riegeli is definitely inspired by RecordIO and is meant as a successor to it, but it's not RecordIO.
> Is there a reason that doesn't meet your requirements?
I need to store timeseries with fast lookup by timestamp. Riegeli doesn't support this out of the box. If I had discovered it before I built ChunkIO, I probably would've pulled the low-level code out of it and added timeseries support on top. Or maybe not. Reliability is very important to me and it's risky to use work-in-progress software that may or may not have any production footprint (I'm no longer with Google so I don't know if they use it internally.)
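To make "fast lookup by timestamp" concrete, here is a minimal sketch of one common approach: keep records sorted by time, group them into chunks, and binary-search on the first timestamp of each chunk. This is an illustration only, not a description of how ChunkIO or Riegeli work; the `chunks` layout and the `records_from` helper are hypothetical, and the in-memory list stands in for chunks stored at known file offsets.

```python
import bisect

# Hypothetical layout: (first timestamp in chunk, records in chunk),
# with records sorted by timestamp across the whole sequence.
chunks = [
    (100, [(100, "a"), (105, "b")]),
    (110, [(110, "c"), (118, "d")]),
    (120, [(120, "e"), (127, "f")]),
]
first_timestamps = [ts for ts, _ in chunks]


def records_from(t):
    """Yield records with timestamp >= t, skipping chunks that end before t."""
    # Start one chunk before the first chunk whose first timestamp is >= t,
    # because that earlier chunk may still contain records at or after t.
    i = max(bisect.bisect_left(first_timestamps, t) - 1, 0)
    for _, records in chunks[i:]:
        for ts, value in records:
            if ts >= t:
                yield ts, value


result = list(records_from(112))
```

The lookup touches a logarithmic number of chunk headers plus at most one partially relevant chunk, instead of scanning the file from the start.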
I don't understand. RecordIO doesn't support lookup of any kind; it is a linear format. The interface of Riegeli looks to me exactly like the interface to RecordIO. All they've done is remove support for Google's abstract File* storage interface so it can be used by the public.
What you are describing sounds like SSTable. Perhaps you could benefit from LevelDB.
This format looks somewhat underpowered. If one record is corrupted, there is no way to read anything after it. For the same reason there is no lookup/sharding support, such as finding the first record that starts in the second half of the file. If a writer crashes, a new writer instance cannot append to an existing file without reading its whole content and truncating after the last readable record.
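A minimal sketch of the failure mode, assuming a plain length-prefixed layout with a CRC per record and no sync markers. The header layout and the helper names are made up for illustration and are not the actual format under discussion.

```python
import io
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def write_record(out, payload: bytes) -> None:
    out.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    out.write(payload)


def read_records(data: bytes):
    """Return (records, offset just past the last readable record)."""
    records = []
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        payload = data[start:start + length]
        if len(payload) != length or zlib.crc32(payload) != crc:
            # Framing is lost: nothing in the file marks where the
            # next record begins, so reading has to stop here.
            break
        records.append(payload)
        pos = start + length
    return records, pos


buf = io.BytesIO()
for msg in (b"alpha", b"bravo", b"charlie"):
    write_record(buf, msg)
good = buf.getvalue()

# Flip one bit in the length field of the second record.
second = HEADER.size + len(b"alpha")
bad = bytearray(good)
bad[second] ^= 0x40
bad = bytes(bad)

records, valid_end = read_records(bad)
```

After the bit flip only the first record is recoverable, even though the third record is intact on disk. The same scan is what a restarted writer would need to run over the entire file to find `valid_end` before it could truncate and append.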