Yep. I’ve been working in the collaborative editing space for over a decade - with both OT algorithms and CRDT algorithms. The modern CRDTs I’ve been developing are significantly faster and smaller than any OT system I made 5+ years ago.
Much more complex, though. (~3k LOC for a good, high-performance text CRDT merging algorithm, vs 300 LOC for a good text OT algorithm.)
Sorry, I'm not sure I properly understood OT as an acronym. Did you mean "operational transformation"? Because I thought that OT was a method of making operation-based CRDTs[0].
But I'm probably wrong in at least one way! Hoping to learn.
Usually the difference is that operational transform algorithms create an ordered global list of all the changes (typically on a centralized server). They flatten operations using a heuristic "transform" function.
CRDTs don't reorder the changes, but guarantee that when all changes are merged (in any order) the final result will be the same. For example, MAX() is a complete CRDT merging function.
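To make that concrete, here's a tiny sketch (TypeScript, names are mine) of a max-register: because the merge is commutative, associative and idempotent, replicas can apply merges in any order and still converge.

    // Max-register: merge() keeps the larger value. Illustrative only.
    type MaxRegister = number;

    function merge(a: MaxRegister, b: MaxRegister): MaxRegister {
      return Math.max(a, b);
    }

    // merge(merge(3, 7), 7) === merge(7, merge(7, 3)) === 7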
But the distinction gets much more blurry at the edges. You can make OT algorithms which work without a central server, and CRDTs which use operation reordering to guarantee merge consistency.
Could you go into detail a bit more on operation-based CRDTs and merge consistency?
I'm currently struggling with moving a document-merge use case to Yjs while leveraging its updates for efficient real-time rebasing. Its insert/delete world view (state-based) seems to make this practically impossible.
Quick and dirty: OTs require a centralized controller and a (potentially complicated) merge strategy to shepherd conflicting changes. CRDTs, on the other hand, apply commutative operations, so they don't need to be centralized, though nodes that communicate infrequently might get very out of sync.
I have always found CRDTs a lot easier to wrap my brain around. Agreed with Joseph that optimizing them is non-trivial. In Teletype for Atom, we indexed fragments in splay trees, and each text fragment was actually a member of two splay trees at the same time: one for the full document, another for the pieces of the original insertion as they'd been split apart. We would find the insertion fragment in the insertion tree, then walk parent pointers in the document tree to locate it. In Zed, we have two trees, but can't do this double embedding because we use persistent data structures where parent pointers are not an option. So we instead have insertion trees that map subslices of inserted text to fragment ordered identifiers, which we use for lookup in the main document B-tree. These fragment identifiers were themselves inspired by another CRDT paper called Logoot, but we don't actually send them over the wire.
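For anyone trying to picture that structure, here's an illustrative sketch of the two-index idea in TypeScript. These are made-up type names, not Zed's actual code: one index maps a slice of an original insertion to an ordered fragment identifier, and the other holds fragments in document order keyed by those identifiers.

    // Illustrative types only (assumed names, not Zed's code).
    type InsertionId = { agent: string; seq: number };
    type FragmentId = string; // dense, ordered identifier (Logoot-inspired)

    interface InsertionIndex {
      // insertion id + offset within that insertion -> fragment id
      lookup(ins: InsertionId, offset: number): FragmentId;
    }

    interface DocumentIndex {
      // fragments kept in document order, keyed by FragmentId, so a
      // remote edit anchored at (insertion, offset) can be located
      // without walking parent pointers.
      find(id: FragmentId): { text: string; visible: boolean };
    }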
Both OT and CRDT approaches have lots of obscure edge cases. In both cases, I can't write a correct algorithm without a fuzzer to save myself. (And I don't trust anyone else to, either).
CRDTs aren't that complex on the surface - this messy file[1] implements 4 different list CRDT algorithms in a few hundred lines of code. And CRDTs are pretty simple for JSON structures, which is not at all true for OT algorithms.
But text CRDTs in particular need a whole lot more thought around optimization because of how locations work. Locations in list/text CRDTs generally work by giving every item a globally unique ID and referencing that ID with every change. So, inserts work by saying "Insert between ID xxx and yyy". But if you have a text document which needs to store a GUID for every keystroke which has ever happened, disk and memory usage becomes crazy bad. And if you don't have a clever indexing strategy, looking up IDs will bottleneck your program.
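Roughly, the operations look something like this (a hedged sketch with made-up field names, not any particular library's wire format):

    // Every inserted character gets a globally unique id; later
    // operations reference those ids rather than integer offsets.
    type Id = { agent: string; seq: number };

    type Insert = {
      id: Id;              // id of the newly inserted character
      afterId: Id | null;  // character it goes after (null = document start)
      char: string;
    };

    type Delete = {
      targetId: Id;        // id of the character being removed
    };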
In diamond types (my CRDT), I solve that with a pancake stack of optimizations which all feed into each other. IDs are (agent, sequence number) pairs so I can run-length encode them. Internally those IDs get flattened into integers, and I use a special hand-written B-tree which internally run-length encodes all the items it stores. I've iterated on Yjs's file format to get the file size down to ~0.05 bytes of overhead per inserted character in some real-world data sets, which is wild given each inserted character needs to store about 8 different fields. (Insert position, ID (agent + seq), parents, the inserted character, and so on.)
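To show why (agent, sequence number) IDs run-length encode so well, here's a sketch of the idea (not diamond types' actual encoding): typing "abc" yields (alice,10), (alice,11), (alice,12), which collapse into a single run.

    // Illustrative run-length encoding of sequential ids.
    type IdRun = { agent: string; seqStart: number; len: number };

    function pushId(runs: IdRun[], agent: string, seq: number): void {
      const last = runs[runs.length - 1];
      if (last && last.agent === agent && seq === last.seqStart + last.len) {
        last.len += 1; // extend the existing run
      } else {
        runs.push({ agent, seqStart: seq, len: 1 });
      }
    }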
CRDTs aren't that complex. Making them small and fast is the challenge. Automerge has been around for years and still takes hundreds of megabytes of RAM to process a simple editing trace for a 10-page text document, even in their Rust implementation. Diamond types is orders of magnitude faster, and uses ~1 MB of RAM for the same test.
The current version of diamond types is many times faster than any OT algorithm I've written in the past thanks to all those optimizations. Now that I've been down this road, I could optimize an OT algorithm in the same way if necessary, but you don't really need to.
Other kinds of CRDTs don't need these optimizations (e.g. the sort you'd use for a normal database). There are two reasons: 1. When editing text, you make an edit with each keystroke, so you have a lot of edits. 2. Merging list items correctly is orders of magnitude more complex than merging a database entry. If you just want CRDTs for editing a database of records, that'd probably only take a few dozen lines.
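For example, a per-field last-writer-wins register is roughly all the merge logic a record-style CRDT needs (an illustrative sketch with assumed names, ties broken deterministically by replica id):

    // LWW register: highest timestamp wins; replica id breaks ties.
    type Lww<T> = { value: T; timestamp: number; replica: string };

    function mergeLww<T>(a: Lww<T>, b: Lww<T>): Lww<T> {
      if (a.timestamp !== b.timestamp) {
        return a.timestamp > b.timestamp ? a : b;
      }
      return a.replica > b.replica ? a : b; // deterministic tie-break
    }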
How would you recommend an interested student get up to speed? There's so much development in so many fields, but often no idea where to even begin.
Given the interest in CRDTs, it'd be a great help if someone wants to do a much more interactive exploration of those algorithms. I feel like there's an interactive guide just waiting to be written which could help people understand this stuff.