> my rule of thumb is that anything that can be done inside the database should be done inside the database
For non-distributable (or at least non-multi-master) DBs, you are often using your most expensive and least distributable resource to do your cheapest and most distributable work. My rule of thumb is that unless you need atomicity or the performance (as profiled) makes a real difference, do it outside the DB. Also has the side effect of avoiding system lock-in (at the expense of language lock-in which I think is ok).
Granted, it's subjective, but I've had more trouble scaling DBs than any other system, and it's usually the CPU/memory demands of business logic pushing the boundaries.
I don't often find myself in a situation where I don't need atomicity for the work I do. So I'm speaking from a point of view that reflects that.
Even if I didn't need it for a project, I would probably start there because, you know, that's the hammer I have, so SQL is the nail I'm going to start with.
I haven't had problems scaling DBs though. DB throughput has never caused a problem for me. What has caused problems is business logic put in what I consider the wrong places.
Business logic belongs in the database structure wherever possible because that's the single source of truth for your data. That's where all of your controls for data integrity belong.
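To make the "controls live with the data" point concrete, here's a minimal sketch using Python's stdlib `sqlite3` and a hypothetical `accounts`/`transfers` schema (the table and column names are illustrative, not from the thread). The constraint is enforced no matter which client issues the write:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs foreign keys opted in
conn.executescript("""
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)  -- the rule lives with the data
);
CREATE TABLE transfers (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id),
    amount     INTEGER NOT NULL CHECK (amount > 0)
);
""")

conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")

# Any client -- an API, a batch job, your boss at an interactive prompt --
# hits the same CHECK; no code path can sneak a negative balance in.
rejected = False
try:
    conn.execute("UPDATE accounts SET balance = -5 WHERE id = 1")
except sqlite3.IntegrityError:
    rejected = True
```

An application-layer validation function would have to be duplicated (and kept in sync) across every one of those access paths; the schema constraint cannot be bypassed by any of them.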
For any app that has longevity, you are going to have multiple methods of accessing the data. There will be APIs and crappy junk that wants your data. You have to account for this. There will be users logging directly into your database. Sometimes it will be your boss.
Start with an ACID approach to your CRUD app, and you probably won't go wrong. At least not soon. And you have options for scaling when you need to. You can scale vertically or horizontally or both.
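A small sketch of what "start with an ACID approach" buys you, again with stdlib `sqlite3` and an illustrative `accounts` table (not from the thread): a transfer is two updates wrapped in one transaction, so either both apply or neither does, even when a constraint trips halfway through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
INSERT INTO accounts VALUES (1, 100), (2, 0);
""")

def transfer(db, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with db:  # sqlite3 connection as context manager = one transaction
            db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                       (amount, dst))
            db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                       (amount, src))
    except sqlite3.IntegrityError:
        return False  # overdraft tripped the CHECK; the credit rolled back too
    return True

ok1 = transfer(conn, 1, 2, 60)   # succeeds: 40 / 60
ok2 = transfer(conn, 1, 2, 500)  # fails: account 2 is NOT left credited
```

Without the transaction, the failed second transfer would leave account 2 holding money that account 1 never had, which is exactly the class of bug you'd otherwise re-implement guards against in every client.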
> For non-distributable (or at least non-multi-master) DBs, you are often using your most expensive and least distributable resource to do your cheapest and most distributable work.
I see your logic, but I think this is misleading, and often the reason why people do inefficient things.
The pattern I see regularly is that when the database does more of the thinking, its planner reduces the amount of actual work: it touches fewer rows on disk, sends less data over the network, and needs fewer round trips. For any extensive report, you're likely to pay a significant cost just to distribute the work, often more than the cost of doing the work itself.
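As a toy illustration of that data-movement argument, here's a sketch with stdlib `sqlite3` and a made-up `orders` table: the same per-customer totals computed by shipping every row to the application versus letting the database aggregate and ship back only the summaries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(10_000)],
)

# Application-side: all 10,000 rows cross the DB boundary before we aggregate.
rows = conn.execute("SELECT customer_id, total FROM orders").fetchall()
app_side = {}
for cid, total in rows:
    app_side[cid] = app_side.get(cid, 0.0) + total

# Database-side: one scan inside the engine, 100 summary rows shipped back.
db_side = dict(conn.execute(
    "SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id"
))
```

Same answer either way; with an in-process SQLite file the difference is small, but over a real network connection the 10,000-row version pays for serialization and transfer of two orders of magnitude more data.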
Of course, as you say it's subjective. Scaling databases does become necessary at some point, but sometimes that point arrives early because developers haven't embraced what the database can do; I think this is part of why NoSQL has gained in popularity.