Every browser uses different markup to represent the contents of a contenteditable element. One browser uses a div for text and splits it in two when Enter is pressed, another inserts <br/>, the next one uses paragraphs, etc.
JSON is not here for resemblance, but for normalization. You enforce a single representation of things from the start, and when the user presses Enter (or another text composition event happens) you prevent the default action, update the document model (JSON), and then update the HTML.
This approach is much saner and more predictable than trying to trust the browser's HTML.
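A minimal sketch of that loop, to make the idea concrete. Everything here is hypothetical and only for illustration: the `Block` shape, `splitBlock`, and `render` are made-up names, not any real editor's API, and the model is reduced to a flat list of paragraphs.

```typescript
// Hypothetical document model: a flat list of paragraphs (the "JSON").
type Block = { type: "paragraph"; text: string };
type Doc = Block[];

// The model's answer to "Enter": split block `index` at character `offset`.
// Returns a new document; the input is not mutated.
function splitBlock(doc: Doc, index: number, offset: number): Doc {
  const block = doc[index];
  if (!block) return doc; // unknown block: leave the model untouched
  const at = Math.max(0, Math.min(offset, block.text.length));
  return [
    ...doc.slice(0, index),
    { type: "paragraph", text: block.text.slice(0, at) },
    { type: "paragraph", text: block.text.slice(at) },
    ...doc.slice(index + 1),
  ];
}

// Escape text so model content cannot inject markup.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// One model always renders to one markup shape, whatever the browser.
function render(doc: Doc): string {
  return doc.map((b) => `<p>${escapeHtml(b.text)}</p>`).join("");
}
```

In a browser you would presumably wire this to a `beforeinput` listener: when `event.inputType` is `"insertParagraph"`, call `event.preventDefault()`, run `splitBlock` on the model, and write `render(doc)` back into the editable element. Mapping the DOM selection to `index` and `offset`, and restoring the caret afterwards, is left out here.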
Still, that does not mean the editor cannot output HTML. CKEditor 5 implements a custom data model and works the way you described (it cancels native actions or extracts changes from them, applies those changes to its data model, and re-renders the DOM if needed), but it still outputs HTML. Thanks to that, you don't need a processor to use that content on your page. Plus, there are cases where converting a custom data model to HTML is actually a tricky job (e.g. inline formatting), and even Editor.js chose HTML for those.
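To illustrate why inline formatting is the tricky part: if a model stores formatting as ranges over the text, ranges can overlap, and naively wrapping each range produces badly nested tags. The sketch below is one way to handle it, by cutting the text at every range boundary and wrapping each segment separately. It is not how CKEditor 5 or Editor.js do it; the `Mark` type and `renderInline` are hypothetical names made up for this example.

```typescript
// Hypothetical inline mark: a tag applied to the range [start, end) of the text.
type Mark = { type: "b" | "i"; start: number; end: number };

// Escape text so model content cannot inject markup.
function escapeText(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderInline(text: string, marks: Mark[]): string {
  // Collect every offset where the set of active marks can change.
  const cuts = new Set<number>([0, text.length]);
  for (const m of marks) {
    cuts.add(m.start);
    cuts.add(m.end);
  }
  const points = Array.from(cuts)
    .filter((p) => p >= 0 && p <= text.length)
    .sort((a, b) => a - b);

  let html = "";
  for (let i = 0; i + 1 < points.length; i++) {
    const from = points[i];
    const to = points[i + 1];
    // Marks that cover this whole segment, in a fixed order so nesting is stable.
    const active = marks
      .filter((m) => m.start <= from && m.end >= to)
      .map((m) => m.type)
      .sort();
    const open = active.map((t) => `<${t}>`).join("");
    const close = active.slice().reverse().map((t) => `</${t}>`).join("");
    html += open + escapeText(text.slice(from, to)) + close;
  }
  return html;
}

// Bold covers "hello w", italic covers "o world": the two ranges overlap.
const html = renderInline("hello world", [
  { type: "b", start: 0, end: 7 },
  { type: "i", start: 4, end: 11 },
]);
// html === "<b>hell</b><b><i>o w</i></b><i>orld</i>"
```

The output is well nested but more fragmented than what a person would write by hand, and merging adjacent identical tags back together is extra work. That is the kind of friction that makes storing inline content as HTML a reasonable shortcut.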