The list goes on and on and yet in practice these problems are solvable and the i...

aforwardslash · on March 27, 2023

> and the impedance is just really not a big deal

I'm quite happy you haven't come across major issues with this. If you develop clean-architecture solutions that stay live as products for years, this is a major problem (e.g. table X is now a separate full-blown service; table Y is an external materialized table with no inserts, as inserts now go into an external messaging system such as Kafka).

> "Impedance mismatch" is just a thought-terminating cliché.

So are the whole ORM advantages. My personal distaste for ORMs doesn't even start with the obvious technical drawbacks; it starts with the fact that a developer should have a solid grasp of the domain he is working in, which, more often than not, ORM advocates lack. If you can't model data from a storage perspective (which, btw, is often the bottleneck of your application), you sure as hell won't do a good job modelling it in a business domain.

> Yet we find solutions and workarounds and the severity of these problems is generally overrated outside of purely theoretical contexts.

Ahh yes, the typical "let's not get theoretical" argument. ORMs are usually crap, and in Python they are actual crap. If Django is a good example for you, good for you. If you ever have a look at Entity Framework you'll be amazed. Try to use a schema-first approach with any mainstream ORM and you'll quickly realize all you do is workarounds because of assumptions and limitations. Thing is, for my daily work, these problems are actual problems. So much so that we don't use Django or ORMs.

s17n · on March 27, 2023

ORMs are just not performant unless you reason about all the code at the level of "what queries are going to be generated and when", which makes the ORM an unhelpful layer of obfuscation over the layer of abstraction that you're actually reasoning at.

This is quite different from high level vs assembly where you can easily go your whole life without ever learning assembly language or how a compiler works.

Or to put it another way, the difference between the two situations is that an ORM API is not a higher level language than SQL. Transpiling between two languages of comparable expressiveness (SQL is actually more expressive but no need to go there) adds an extra source of problems without gaining you much.
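To make the "what queries are going to be generated and when" point concrete, here is a minimal sketch using only the standard-library `sqlite3` module. The `authors`/`books` schema and the `lazy_titles`/`joined_titles` helpers are hypothetical; the lazy version only mimics how a lazily loaded relation typically behaves and is not any particular ORM's implementation.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1'), (4, 3, 'C1');
        """
    )
    return conn


def count_selects(conn, fn):
    """Run fn(conn) and count the SELECT statements SQLite executed."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        result = fn(conn)
    finally:
        conn.set_trace_callback(None)
    selects = [s for s in seen if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)


def lazy_titles(conn):
    """One query for the authors, then one more query per author."""
    out = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        out[name] = [title for (title,) in rows]
    return out


def joined_titles(conn):
    """The same result from a single JOIN."""
    out = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    ).fetchall()
    for name, title in rows:
        out.setdefault(name, []).append(title)
    return out


conn = make_db()
lazy, lazy_count = count_selects(conn, lazy_titles)
joined, joined_count = count_selects(conn, joined_titles)
print(lazy == joined, lazy_count, joined_count)
```

Both functions return the same dictionary; the difference is only visible once you count statements, which is exactly the level the comment says you end up reasoning at.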

Daishiman · on March 27, 2023

> ORMs are just not performant unless you reason about all the code at the level of "what queries are going to be generated and when", which makes the ORM an unhelpful layer of obfuscation over the layer of abstraction that you're actually reasoning at.

You're assuming I use the ORM to not reason about SQL or not think about performance. This isn't true; first of all because even if you write SQL, SQL performance is not immediately obvious for any but the simplest of indexed queries. In no storage system do you ever get away from reasoning about this.

Second because SQL is actually a mediocre abstraction layer over your data storage. You can't really compose SQL queries; in an ORM taking a base Query object and adding a bunch of various `filter()` statements automatically does the right thing. Basic queries are much shorter visually; ORMs deal with the abstraction of table and column renames that mean rewriting all your SQL in other systems. I feel like you're just trotting out "reasons" out of a blog post from people whose priorities aren't the ones that people like us who write CRUD systems day in and day out do.

Again, you're talking about theoretical disadvantages which I have only really encountered about a half dozen times in over a decade of using Django even in performance-sensitive areas. Rewriting one ORM query out of a hundred is not a problem, especially if I had to rewrite the SQL in the first place.

aforwardslash · on March 28, 2023

> Second because SQL is actually a mediocre abstraction layer over your data storage

Actually, SQL is not an abstraction layer for data storage. It is a query language based on set theory. Even DDL doesn't impose any limitations on you or make any assumptions regarding storage - just data representation. You're conflating SQL RDBMSs with the SQL language, just like many ORMs do. Quick example: you can use SQL with Apache Drill to query log files in CSV - SQL language, none of the RDBMS stuff.

> You can't really compose SQL queries; in an ORM taking a base Query object and adding a bunch of various `filter()` statements automatically does the right thing. Basic queries are much shorter visually

Of course you can. But even without needing to explain to you how one of the most famous anti-patterns works (EAV - entity-attribute-value), the easiest way of doing it is by using a query builder in your own development language - you just filter based on the specific code conditions.
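As a minimal sketch of that idea (not any specific library's API), a hand-rolled builder can append `WHERE` clauses conditionally and keep the values as bound parameters. The `build_task_query` function, the `tasks` table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3


def build_task_query(user_id=None, min_minutes=None, status=None):
    """Return (sql, params) with one WHERE clause per filter that is set."""
    clauses, params = [], []
    if user_id is not None:
        clauses.append("user_id = ?")
        params.append(user_id)
    if min_minutes is not None:
        clauses.append("minutes >= ?")
        params.append(min_minutes)
    if status is not None:
        clauses.append("status = ?")
        params.append(status)

    sql = "SELECT id, user_id, minutes, status FROM tasks"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY id", params


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, user_id INTEGER, minutes INTEGER, status TEXT);
    INSERT INTO tasks VALUES (1, 1, 45, 'done'), (2, 1, 10, 'open'), (3, 2, 90, 'done');
    """
)

# Only the filters that were actually requested end up in the SQL text.
sql, params = build_task_query(user_id=1, min_minutes=30)
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

Values never get interpolated into the SQL string, so the conditional filtering stays safe against injection while the query text remains plain, inspectable SQL.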

Oh, and "compose" is a terrible term, especially when SQL actually allows you to use SELECT * FROM (SELECT QUERY), LATERAL and UNION - all of these allow you to perform proper composition, years ahead of what most ORMs give you.
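A rough illustration of that kind of composition, run through the standard-library `sqlite3` module: one query is reused as a derived table inside another, after being built from two tables with `UNION ALL`. SQLite has no `LATERAL`, so that part is left out; the `tasks`/`archived_tasks` tables and the `totals_per_user` helper are assumptions made for the example.

```python
import sqlite3

# A base query that can be embedded in other queries as a derived table.
ALL_TASKS = (
    "SELECT user_id, minutes FROM tasks "
    "UNION ALL "
    "SELECT user_id, minutes FROM archived_tasks"
)


def totals_per_user(conn, min_total):
    """Aggregate over the base query, then filter the aggregate."""
    sql = f"""
        SELECT user_id, total
        FROM (
            SELECT user_id, SUM(minutes) AS total
            FROM ({ALL_TASKS}) AS all_tasks
            GROUP BY user_id
        ) AS per_user
        WHERE total >= ?
        ORDER BY user_id
    """
    return conn.execute(sql, (min_total,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (user_id INTEGER, minutes INTEGER);
    CREATE TABLE archived_tasks (user_id INTEGER, minutes INTEGER);
    INSERT INTO tasks VALUES (1, 30), (1, 45), (2, 10);
    INSERT INTO archived_tasks VALUES (1, 15), (2, 20), (3, 5);
    """
)

totals = totals_per_user(conn, 30)
print(totals)
```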

s17n · on March 27, 2023

> In no storage system do you ever get away from reasoning about this.

Agreed

> Second because SQL is actually a mediocre abstraction layer over your data storage.

This is fair but adding yet another layer only makes things worse

> I feel like you're just trotting out "reasons" out of a blog post

Rest assured that I haven't read a blog post or indeed anything on the topic; in fact I'm shockingly ignorant in general.

aforwardslash · on March 28, 2023

> This isn't true; first of all because even if you write SQL, SQL performance is not immediately obvious for any but the simplest of indexed queries. In no storage system do you ever get away from reasoning about this.

True, but an ORM adds yet another layer of cruft to debug and monitor, with arguably few benefits for such an advanced user. Also, in some databases, EXPLAIN is your friend: it may not give you execution time (it does, but let's assume you're right just for the sake of it), but it does give you the planned "execution cost". 0.1 is better than 0.5, and so on. It will also tell you if your indexes are actually being used (are you a MySQL user? that would explain a lot).
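For a quick feel of the index part, here is a small sketch using SQLite's `EXPLAIN QUERY PLAN` through the standard-library `sqlite3` module. SQLite reports the chosen access path rather than numeric cost estimates like the ones PostgreSQL's `EXPLAIN` prints, and the exact wording of the plan varies between SQLite versions; the `tasks` table, the `idx_tasks_user` index and the `plan` helper are hypothetical.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, user_id INTEGER, minutes INTEGER)"
)
query = "SELECT minutes FROM tasks WHERE user_id = ?"

# Without an index the planner has to scan the whole table.
before = plan(conn, query, (1,))

# With an index on the filtered column the plan should mention it by name.
conn.execute("CREATE INDEX idx_tasks_user ON tasks (user_id)")
after = plan(conn, query, (1,))

print(before)
print(after)
```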

Regarding performance, you have 3 main areas that are costly: query execution, result retrieval time, and result serialization & transformation. The first two are characteristics of the database design and of the specific system; the last one is purely framework-dependent. If you're a Python buff, you'll quickly realize that your application will spend a non-trivial amount of time getting around proxy class implementations for models in the deserialization phase of the data - in some cases, more than the query and transport time itself. All because you e.g. wanted to compute the minutes a given user spent on a given task over a year, counted in hours or melons or whatever.
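A small sketch of that last point, assuming a hypothetical `tasks` table: the same total is computed once by hydrating every row into a Python object and once by letting the database aggregate. The `Task` dataclass only stands in for an ORM model and is far cheaper than real proxy classes, so treat this as an illustration of the shape of the problem, not a benchmark.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Task:
    """Stand-in for an ORM model (hypothetical)."""
    id: int
    user_id: int
    minutes: int


def hours_via_objects(conn, user_id):
    """Materialize one object per row, then sum in Python."""
    rows = conn.execute(
        "SELECT id, user_id, minutes FROM tasks WHERE user_id = ?", (user_id,)
    ).fetchall()
    tasks = [Task(*row) for row in rows]
    return sum(task.minutes for task in tasks) / 60


def hours_via_sql(conn, user_id):
    """Let the database aggregate; a single row crosses the wire."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(minutes), 0) FROM tasks WHERE user_id = ?", (user_id,)
    ).fetchone()
    return total / 60


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, user_id INTEGER, minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO tasks (user_id, minutes) VALUES (?, ?)",
    [(1, 30) for _ in range(1000)] + [(2, 15) for _ in range(10)],
)
conn.commit()

print(hours_via_objects(conn, 1), hours_via_sql(conn, 1))
```

The first version builds a thousand objects to produce one number; the second asks the database for the number directly.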

> that people like us who write CRUD systems day in and day out do

I write a shit-ton of CRUD systems (in Python), and none of them use Django, because interoperability is often desired, and in some cases a requirement. Using Django and an ORM to design some stuff doesn't make it wrong, but it doesn't make it right either. Want to design database CRUDs without code? There are plenty of tools for that, ranging from BPM tools like Bonita BPM (OSS) to OutSystems.

> Again, you're talking about theoretical disadvantages which I have only really encountered about a half dozen times in over a decade of using Django even in performance-sensitive areas.

Or - or - we're discussing problems you don't have because you work in a very narrow field where Django is actually a good fit. They exist, and nothing against it. But the mustard is on the nose if you tell me you've been spending "over a decade" with Django. In "over a decade", I've written a shit ton of libraries in at least 4 languages on this specific topic (database middlewares, query builders, object mappers, etc), and the driver for all of them was to solve problems you describe as "one in a hundred".

> Rewriting one ORM query out of a hundred is not a problem

Actually, it is. Because it's not *one* query, it's one endpoint - it may be using a dozen models, in a dozen child subroutines, to compute a specific value. It may be using a function from a different app for a specific computation. The local computation may have certain characteristics the replacing query doesn't, because it runs server-side - sum of times; sum of decimals; sum of floats. And now every time someone edits that model assuming "that's it", they either have a crappy abstraction model and realize they also need to edit some method's SQL queries by hand (oh, and patch the migrations, if necessary), or they're just playing whack-a-mole with the assorted array of custom functions breaking in unit testing. In the end - assuming all things are equal - I would very much prefer that architecture and concern delegation weren't affected by some performance-related refactorings.
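The "sum of decimals; sum of floats" part can be shown in a few lines of plain Python: the same values accumulated as binary floats and as `Decimal` give different answers, so moving a sum between application code and a query can change results if the numeric types differ on each side. The values are made up for illustration, and the explicit loop is deliberate, since the built-in `sum()` may use compensated summation for floats on newer Python versions.

```python
from decimal import Decimal

# Ten entries of 0.1, as they might arrive from a text or NUMERIC column.
amounts = ["0.1"] * 10

# Naive left-to-right accumulation in binary floating point.
float_total = 0.0
for amount in amounts:
    float_total += float(amount)

# The same accumulation with exact decimal arithmetic.
decimal_total = Decimal("0")
for amount in amounts:
    decimal_total += Decimal(amount)

print(float_total == 1.0, decimal_total == Decimal("1"))
```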