Fuzzy automated reviews should always run in an interactive loop with a developer on their workstation, and contain enough context to quickly assess whether they are valid.
When developers create a PR, they already feel they are "done", and they have likely already shifted their focus to another task. False positives are horrible at this point, especially when they keep changing with each new push of commits.
I have also seen situations where sales opted into Microsoft early on. As sales grew relative to engineering, that forced the rest of the company to standardize on Microsoft products so they could get better rates and “save money”.
Yeah, that was my point: it doesn't matter if it's left or right, because the only ideology Meta et al. speak is USD, so they will kiss the ring of whoever is in power in the EU at any given moment, far left or far right. The same way many of them also kissed the ring of the CCP or Saudi Arabia while flying the pride flag in the West.
They don't really care about the ideologies they preach; they just virtue-signal however needed to appease the mobs and governments in power so they can be allowed to extract wealth.
Which is awesome for those of us who move to sparsely populated places to get a quiet environment :)
I'd much rather they put them in areas that are already ruined by traffic rumble; at least the difference would be minimal, instead of "silence" vs "rumble".
I'm guessing that means adding indirection for what you're actually processing, in that case? So I guess the counter-case would be when you don't want/need that indirection.
If I understand what you're saying, instead of doing:
- Create job with payload (maybe big) > Put in queue > Let worker take from queue > Done
You're suggesting:
- Create job with ID of payload (stored elsewhere) > Put in queue > Let worker take from queue, then resolve ID to the data needed for processing > Done
Is that more or less what you mean? I can definitely see use cases for both; it heavily depends on the situation, but more indirection isn't always better, nor are big payloads always OK.
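To make the two variants concrete, here's a rough sketch (Python, with a plain in-memory queue standing in for the real broker and a dict standing in for wherever the payload is persisted; all names are made up for illustration):

    import json
    import queue

    broker = queue.Queue()   # stand-in for SQS/RabbitMQ/etc.
    payload_store = {}       # stand-in for a DB or blob store

    def process(payload: dict) -> None:
        print("processing", payload)

    # Variant 1: the payload travels inside the message (fine while payloads stay small)
    def enqueue_inline(payload: dict) -> None:
        broker.put(json.dumps({"type": "inline", "payload": payload}))

    # Variant 2: the payload is persisted elsewhere, only its ID travels
    def enqueue_by_id(payload_id: str, payload: dict) -> None:
        payload_store[payload_id] = payload
        broker.put(json.dumps({"type": "ref", "payload_id": payload_id}))

    def worker() -> None:
        msg = json.loads(broker.get())
        if msg["type"] == "inline":
            payload = msg["payload"]
        else:
            # this extra lookup is the indirection being discussed
            payload = payload_store[msg["payload_id"]]
        process(payload)

The trade-off is visible right there: variant 2 keeps messages tiny, but adds a lookup (and a failure mode) on the worker side.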
- Persist payload in db > Queue with id > Process via worker.
Pushing the payload directly to the queue can be tricky. Any queue system will usually have limits on payload size, for good reasons. Plus, if you have already committed to the db, you can guarantee the data is not lost and can be processed again however you want later. But if your queue is having issues, or the enqueue fails, you might lose the payload forever.
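A minimal sketch of that ordering (sqlite standing in for the real database, an in-memory queue for the broker; the jobs table and its columns are invented for illustration):

    import json
    import queue
    import sqlite3

    db = sqlite3.connect(":memory:")   # stand-in for the real database
    db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, status TEXT)")
    broker = queue.Queue()             # stand-in for the real queue

    def submit(payload: dict) -> None:
        # Commit the payload first: even if enqueueing fails afterwards,
        # nothing is lost and a sweeper can re-enqueue rows still 'pending'.
        cur = db.execute(
            "INSERT INTO jobs (payload, status) VALUES (?, 'pending')",
            (json.dumps(payload),),
        )
        db.commit()
        # Only the tiny id goes on the queue, well under any message-size limit.
        broker.put(cur.lastrowid)

    def worker() -> None:
        job_id = broker.get()
        (raw,) = db.execute("SELECT payload FROM jobs WHERE id = ?", (job_id,)).fetchone()
        payload = json.loads(raw)
        # ... do the actual work with payload ...
        db.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
        db.commit()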
yes and no; as the sibling comment mentions, sometimes a message bus is used (Kafka, for example), but Netflix is (was?) all-in on HTTP (low-latency gRPC, HTTP/3, wrapped in nice type-safe SDK packages)
but ideally you don't break the glass and reach for a microservices architecture if you don't need the scalability afforded by very deep decoupling
which means ideally you have separate databases (separate schemas, and likely even different kinds of data store), and through the magic of minimally overlapping "bounded contexts" you don't need a lot of data to be sent over (the client SDK will pick what it needs, for example)
... of course serving a content recommendation request for a Netflix user (which results in a cascade of requests to various microservices, e.g. profile, rights management data, CDN availability, plus metadata for the results, image URLs, etc.) doesn't need durability, so no Kafka (or other message bus), but when the user changes their profile, that might be something that gets "broadcast"
(and durable, "replayable" queues help, because then services can be put into read-only mode to keep serving traffic while new instances are starting up, and those will catch up. And of course it's useful for debugging too, at least compared to HTTP logs, which usually don't have the body/payload logged.)
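For the "catch up" part, the usual shape with Kafka is just a consumer group resuming from its last committed offset. A rough sketch with kafka-python; the topic, group, and handler below are made up:

    from kafka import KafkaConsumer   # pip install kafka-python

    def apply_profile_change(raw: bytes) -> None:
        # placeholder for whatever the service actually does with the event
        print("applying", raw)

    # A new instance joins the same consumer group and resumes from the last
    # committed offset, so it replays everything published while it was down.
    consumer = KafkaConsumer(
        "profile-changes",                  # hypothetical topic
        group_id="profile-readmodel",       # hypothetical consumer group
        bootstrap_servers="localhost:9092",
        enable_auto_commit=False,
        auto_offset_reset="earliest",       # brand-new groups start from the beginning
    )

    for msg in consumer:
        apply_profile_change(msg.value)
        consumer.commit()                   # record progress so a restart resumes here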
I have been doing this for at least a decade now and it is a great pattern, but think of an ETL pipeline where you fetch a huge JSON payload, store it in the database, and then transform it and load it into another model. I had a use case where I wanted to process the JSON payload and pass it down the pipeline before storing it in the useful model. I didn't want to store the intermediate JSON anywhere. I benchmarked it for this specific use case.
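If it's useful, the generator version of that idea looks roughly like this (Python; the JSON shape, field names, and the final persistence step are all invented for illustration):

    import json
    from typing import Iterator

    def extract(raw_json: str) -> Iterator[dict]:
        # parse the big payload once and yield record by record,
        # so the intermediate form never gets persisted anywhere
        for record in json.loads(raw_json)["items"]:
            yield record

    def transform(records: Iterator[dict]) -> Iterator[dict]:
        for r in records:
            yield {"id": r["id"], "total": r["price"] * r["qty"]}

    def save_to_useful_model(row: dict) -> None:
        # stand-in for writing to the target model
        print("loading", row)

    def load(rows: Iterator[dict]) -> None:
        for row in rows:
            save_to_useful_model(row)

    # load(transform(extract(fetch_payload())))  # fetch_payload() is whatever does the HTTP call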
...well, that's good for scaling the queue, but this means the worker needs to load all relevant state/context from some DB (which might be sped up with a cache, but then things are getting really complex)
ideally you pass the context that's required for the job (let's say it's less than 100 KB), which I don't think counts as large JSON, but request rate (load) can make even 512 bytes too much, therefore "it depends"
but in general passing around large JSONs on the network/memory is not really slow compared to writing them to a DB (WAL + fsync + MVCC management)
If you ask around the Magnificent 7, a lot of the talk rhymes with "we're converting Opex into Capex", translated: "we're getting rid of people to invest in data centers (to hopefully be able to get rid of even more people over time)."
There are tons of articles online about this, here's one:
They're all doing it, Microsoft, Google, Oracle, xAI, etc. Those nuclear power plants they want to build, that's precisely to power all the extra data centers.
If anything, everyone hopes to outsource data validation (the modern equivalent to bricklayers under debt slavery).