One interesting thing I got in replies is the Unison language (content-addressed functions: a function is identified by its AST). Also, I recommend checking out the Dion language demo (an experimental project that stores the program as an AST).
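To make "function is defined by AST" concrete, here is a rough sketch using Python's standard `ast` and `hashlib` modules. It is not how Unison actually computes its hashes; the `content_hash` helper and the choice to erase only the function name are my own assumptions for illustration.

```python
import ast
import hashlib


class _Anonymize(ast.NodeTransformer):
    """Erase function names so the hash depends only on signature and body."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.name = "_"
        return node


def content_hash(source: str) -> str:
    """Hash the AST of `source` (hypothetical helper, not Unison's scheme)."""
    tree = _Anonymize().visit(ast.parse(source))
    # ast.dump leaves out line/column attributes by default, and comments
    # never reach the AST, so layout and comments do not affect the digest.
    canonical = ast.dump(tree)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = content_hash("def add(x, y):\n    return x + y\n")
b = content_hash("def plus(x, y):   # same logic, new name\n        return x + y\n")
c = content_hash("def add(x, y):\n    return x - y\n")

print(a == b)  # same structure, different name and layout
print(a == c)  # different body
```

Renaming the function or reformatting it keeps the hash, while changing the body changes it. A real system would also have to normalize local variable names and references to other definitions, which this sketch skips.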
In general I think there's a missing piece between text and storage. Structural editing is likely a dead end, writing text seems superior, but storage format as text is just fundamentally problematic.
I think we need a good bridge that allows editing via text, but storage in something like a structured database (I'd go as far as to say a relational database, maybe). This would unlock a lot of IDE-like features for simple programmatic usage, or for manipulating language semantics in some interesting ways, but the challenge is of course how to keep the mapping between the textual input and the stored structure in shape.
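As a toy illustration of the "relational database" idea, assuming Python source and the standard `ast` and `sqlite3` modules: the `nodes` table and the `store_ast` helper below are made up for this sketch, not an existing tool or schema.

```python
import ast
import sqlite3


def store_ast(conn, source):
    """Flatten the AST of `source` into a `nodes` table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "id INTEGER PRIMARY KEY, parent INTEGER, kind TEXT, name TEXT)"
    )

    def walk(node, parent):
        # Keep an identifier when the node carries one (function name, variable).
        name = getattr(node, "name", None) or getattr(node, "id", None)
        cur = conn.execute(
            "INSERT INTO nodes (parent, kind, name) VALUES (?, ?, ?)",
            (parent, type(node).__name__, name),
        )
        node_id = cur.lastrowid
        for child in ast.iter_child_nodes(node):
            walk(child, node_id)

    walk(ast.parse(source), None)


conn = sqlite3.connect(":memory:")
store_ast(
    conn,
    "def area(r):\n    return pi() * square(r)\n\ndef square(x):\n    return x * x\n",
)

# "IDE-like features" become plain queries over the stored structure.
functions = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM nodes WHERE kind = 'FunctionDef' ORDER BY id"
    )
]
call_count = conn.execute(
    "SELECT COUNT(*) FROM nodes WHERE kind = 'Call'"
).fetchone()[0]
print(functions, call_count)
```

This only covers the text-to-storage direction. Comments and formatting are lost on the way in, which is exactly the mapping problem mentioned above.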
> Dion language demo (experimental project which stores program as AST).
Michael Franz [1] invented slim binaries [2] for the Oberon System. Slim binaries were program (or module) ASTs compressed with some kind of LZ-family algorithm. At the time they were much smaller than Java's JAR files, despite JAR being a ZIP archive.
I'm quite sure I've read your article before and I've thought about this one a lot. Not so much from the Git perspective, but about the textual representation still being the "golden source" for what the program is when interpreted or compiled.
Of course text is so universal and allows for so many ways of editing that it's hard to give up. On the other hand, while text is great for input, it comes with overhead and core issues (most are already in the article, but I'm writing them down anyway):
1. Substitutions such as renaming a symbol: ensuring the correctness of the operation pretty much requires parsing the text into a graph representation first. Otherwise you let go of the guarantee of correctness and perform a plain text search/replace.
2. Alternative representations requiring full and correct re-parsing such as:
- overview of flow across functions
- viewing graph based data structures, of which there tend to be many in a larger application
- imports graph and so on...
3. Querying structurally equivalent patterns is hard when they have multiple equivalent textual representations, and search in general is somewhat limited.
4. Merging and diffing text give fewer guarantees than merging graphs or trees.
5. Correctness checks, such as detecting cyclic imports or ensuring the validity of the program itself, all happen at build time, unless the IDE effectively keeps a duplicate program graph that is continuously re-parsed from the changes and is not equivalent to the eventual execution model.
6. Parsing text is also a permanent overhead on execution and build speed as applications grow. Yes, parsing methods are quite fast these days and the hardware is far better, but having a correct program graph at hand is always faster than parsing, creating & verifying a new one.
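Point 1 is easy to show with a small sketch in Python using the standard `ast` module. The `RenameVariable` class is hypothetical and ignores scoping, so it is an illustration of the difference rather than a real refactoring tool.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename identifier nodes called `old`; strings and longer names stay untouched."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        if node.arg == self.old:
            node.arg = self.new
        return node


source = 'def show(id):\n    print("id:", id, width)\n'

# Plain text replace also rewrites the string literal and the inside of `width`.
naive = source.replace("id", "key")

# The tree-based rename only touches nodes that really are the identifier `id`.
tree = RenameVariable("id", "key").visit(ast.parse(source))
structural = ast.unparse(tree)  # needs Python 3.9+

print(naive)
print(structural)
```

The text version turns `width` into `wkeyth` and changes the string literal, while the tree version leaves both alone. A correct rename would additionally need scope resolution, which is where a persistent program graph would help.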
I think input as text is a must-have to start with no matter what, but what if the parsing step was performed immediately on stop symbols rather than later and merged with the program graph immediately rather than during a separate build step?
Or what if it was like a "staging" step? E.g., write a separate function that gets parsed into the program model immediately, then try executing it, and merge it into the main program graph later, with the merge performing all the checks necessary to ensure the main program graph remains valid? I think it'd be more difficult to learn, but I think having these operations and a program graph as a database would give so much when it comes to editing, verifying and maintaining more complex programs.
Why would structural editing be a dead end? It has nothing to do with storage format. At least in the meaning of the term I am familiar with, it is about how you navigate and manipulate semantic units of code instead of manipulating characters of the code, for example pressing some shortcut keys to invert the nesting of AST nodes, wrap an expression inside another, or change the order of expressions, all at the press of a button or key combo. I think you might be referring to something else or a different definition of the term.
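To show what I mean by "wrap an expression inside another", here is a minimal sketch of such a command in Python with the standard `ast` module. The `wrap_returns` function is an invented example, not taken from any particular editor.

```python
import ast


def wrap_returns(source, wrapper):
    """Wrap every returned expression in a call to `wrapper` (hypothetical editor command)."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Return) and node.value is not None:
            # Replace `return <expr>` with `return wrapper(<expr>)`.
            node.value = ast.Call(
                func=ast.Name(id=wrapper, ctx=ast.Load()),
                args=[node.value],
                keywords=[],
            )
    return ast.unparse(ast.fix_missing_locations(tree))  # needs Python 3.9+


edited = wrap_returns("def f(x):\n    return x + 1\n", "round")
print(edited)
```

The edit is expressed on the tree (a `Return` node gets a new `Call` parent for its value), and the text is regenerated afterwards, independent of how the program is stored.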
I'm referring to UIs that allow you to do structural editing only and usually store only the structural shape of the program (e.g. no whitespace or indentation). I think at this point nobody uses them for programming; they're pretty frustrating to use because they don't allow edits that break the semantic text structure too much.
I guess the most used one is the styles editor in Chrome DevTools, and that one is only really useful for small tweaks; even just adding new properties is already a pretty frustrating experience.
[edit] Otherwise I agree that structural editing à la IDE shortcuts is useful, I use that a lot.
Some very bright JetBrains folks were able to solve most of those issues. Check out their MPS IDE [1]; its structured/projectional editing experience is in a class of its own.
I would say structural editing is not a dead end, because as you mention projects like Unison and Smalltalk show us that storing structures is compatible with having syntax.
The real problem is that we need a common way of storing parse tree structures so that we can build a semantic editor that works on the syntax of many programming languages.
I have to respond to your point, though. Whether a 30% cut is excessive depends on whether devs feel like they are getting a good deal. As far as I can tell, game developers don't seem to complain about Steam's cut very much; it seems like the value you get is worth it.
For example, in this thread https://www.reddit.com/r/Steam/comments/10wvgoo/do_you_think... the majority seems positive about it, even though people debate. When the Apple tax is brought up, there's almost never even a discussion; it's pretty universally hated.
Apple seems to have an almost adversarial relationship with its developers. I deploy to the App Store and I feel like I'm getting screwed, even compared to Google, which takes the same cut but does behave a lot more nicely toward its developers.
There's so many ways this benchmark can go wrong that there's pretty much no way I can trust this conclusion.
> All the loops call a dummy function DATA.doSomethingWithValue() for each array element to make sure V8 doesn't optimize out something too much.
This is probably the most worrying comment - what is "too much?" How are you sure it doesn't change between different implementations? Are you sure V8 doesn't do anything you don't expect? If you don't look into what's actually happening in the engine, you have no idea at this point. Either you do the real work and measure that, or do the fake work but verify that the engine does what you think it does.
There are a lot of "probably"s in the article. I was also suspicious that the author didn't say they did any pre-measurement runs of the code to ensure that it was warmed up first. Nor did they e.g. use V8 arguments with Node (like --trace-opt) to check what was actually happening.
Claude Code in particular seems to use very few redundant comments. That or it's just better at obeying the standing instruction I give it to not create them, something other assistants seem to blithely ignore.
No, you can't always do that. We have workarounds for platform bugs that were even fixed, because we get users with old devices that can't upgrade. You cannot fork a phone of a random person on the other side of the world. Once a platform bug is out, it can stay out in the wild for a very long time.
Our website codebase contains a workaround for a bug in the native Android file picker in Samsung One UI. How are you supposed to solve this by "deploying your own platform?"
So, when the operating system gives you an invalid file, it magically becomes valid, because your UI code is in a different file. Sure, that sounds plausible.
I also find that phrase super misleading. I've been using a different heuristic that seems to work better for me - "comments should add relevant information that is missing." This works against redundant comments but also isn't ambiguous about what "why" means.
There might be a better one that also takes into account whether the code does something weird or unexpected for the reader (like the duplicate clear call from the article).
I like this framing, but might add to it: "comments should add relevant information that is missing and which can't easily be added by refactoring the code".
It might look ok from the user's point of view, but a lot of the problems fall on web developers, who have to work around a bunch of these issues to make their pages work in Safari.
This is such nonsense, and everyone who’s a web developer knows you’re not being honest here, but just to make it even clearer for anyone else, here’s a chart showing the number of bugs that only occur in a single browser.
> This is such nonsense and everyone who’s a web developer knows you’re not being honest
And in your opinion "being honest" is speaking for every web dev out there?
I've been a web dev for 25 years (god I'm old) and Safari has not been a major pain for me.
You keep bandying wpt.fyi results around without even understanding what they mean. E.g. Safari only passes 8 out of 150 accelerometer tests. So? Does it affect every web dev? Lol no. But it does pass 57 out of 57 accessibility tests, which is significantly more important.
Late on a lot of standards, quirky in many ways and just a lot of bugs, especially around images and videos. Also positioning issues. They recently broke even position fixed, which broke a ton of web pages on iOS, including apple.com
I like this, especially because it focuses on the actual problem these contributions cause, not the AI tools themselves.
I especially like the term "extractive contribution." That captures the issue very well and covers even non-AI instances of the problem which were already present before LLMs.
Making reviewer-friendly contributions is a skill of its own and makes a big difference.