boarush's comments | Hacker News

Changing the definition of drop-in definitely has me concerned, and makes me take this no more seriously than other projects open-sourced by Cloudflare, particularly the ones focused on more critical parts of their systems, e.g. Pingora and ecdysis.

While neither I nor the company I work for is directly impacted by this outage, I wonder how long Cloudflare can take these hits and keep apologizing for them. I truly appreciate them being transparent about it, but businesses care more about SLAs and uptime than about the incident report.


I’ll take clarity and actual RCAs over Microsoft’s approach of not notifying customers and keeping their status page green until enough people notice.

One thing I do appreciate about Cloudflare is that they actually use their status page. That’s not to say these outages are okay. They aren’t. However, I’m pretty confident that a lot of providers would have a big paper trail of outages if they were as honest as Cloudflare, or more so. At least from what I’ve noticed, especially this year.


Azure straight up refuses to show me whether there's even an incident, even when I literally can't access shit.

But the last few months have been quite rough for Cloudflare, with a few outages on their Workers platform that didn't quite make the headlines either. Can't wait for Code Orange to reach production.


Bluntly: they expended that credit a while ago. Those that can will move on. Those that can't have a real problem.

As for your last sentence:

Businesses really do care about the incident reports because they give good insight into whether they can trust the company going forward. Full transparency and a clear path to non-repetition due to process or software changes are called for. You be the judge of whether or not you think that standard has been met.


I might be looking at it differently, but aren't decisions about which service provider to use made by management? Incident reports never reach that level, in my experience.


Every company that relies on its suppliers and has mature management maintains internal supplier scorecards as part of its risk assessment, more so for suppliers that are hard to replace. They will of course all have their own thresholds for action, but what has happened recently with CF exceeds most of the thresholds for management comfort that I'm aware of.

Incident reports themselves are highly technical, so will not reach management because they are most likely simply not equipped to deal with them. But the CTOs of the companies will take notice, especially when their own committed SLAs are endangered and their own management asks them for an explanation. CF makes them all look bad right now.


In my experience, the gist of it does reach management when it's an existing vendor, especially if management is tech literate.

Because management wants to know why the graphs all went to zero, and the engineers have nothing else to do but relay the incident report.

This builds management's perception of the vendor, and if the perception is that the vendor doesn't tell them shit or doesn't even seem to know there's an outage, then management can decide to switch vendors.


Not all em dashes are ChatGPT. Good writers use them wherever required.


Reading this really makes me wonder how Chrome actually optimizes for the plethora of devices running V8 (under Chrome). It definitely involves tricky trade-offs to get great performance everywhere.


I believe it is to make sure that the product remains compliant with the data guarantees that Workspace provides. You aren't paying for the latest and the greatest features, you're paying for the support and compliance guarantees your business expects.


You can also reply to incoming emails, from what I know; you just cannot initiate an email directly, to prevent the obvious abuse. I wonder how they plan to mitigate that apart from keeping the pricing sane.


I don't think this is just the occasional 503s, and it is not just Claude Code. Their console is also down.


Anthropic has by far been the most unreliable provider I've ever seen. Daily incidents, and this one seems to have taken down all their services. Can't even log in to the Console.


Maybe they have vibe-coded their own stack!

But less tongue-in-cheek, yeah Anthropic definitely has reliability issues. It might be part of trying to move fast to stay ahead of competitors.


The tongue-in-cheek jokes are kind of obvious, but even without the snark I think it is worth asking why the supposed 100x productivity boost from Claude Code I keep hearing about hasn't actually resulted in reliability improvements, even from developers who presumably have effectively-unlimited token budgets to spend on improving their stack.


I love how people like Simon Willison and Pete Steinberger spend all this effort trying to be skeptical of their own experiences and arrive at nuanced takes like “50% more productive, but that’s actually a pretty big deal, but the nature of the increase is complicated” and y’all just keep repeating the brainrotted “100x, juniors are cooked” quote you heard someone say on LinkedIn.


AI gives you what you ask for. If you don't understand your true problems, and you ask it to solve the wrong problems, it doesn't matter how much compute you burn, you're still gonna fail.


They have. Claude Code was their internal dev tool, and it shows.


And yet, even dogfooding their own product heavily, it's still a giant janky pile. The prompt work is solid, the focus on optimizing tools was a good insight, and the model makes a good agent, but the actual Claude Code software is pretty shameful as the most visible product of a billion-dollar company.


What artifact are you evaluating to come to this conclusion? Is the implementation available?


The source for one of the initial versions leaked a while ago, and let’s just say it’s not very good architecturally speaking, specifically when compared with the Gemini CLI, which is open source.

The point of Claude Code is deep integration with the Claude models, not the actual CLI as a piece of software, which is quite buggy (it also has some great features, of course!)

At least for me, if I didn’t have to put in the work to modify the Gemini CLI to work reliably with Claude (or at least to get similar performance), I wouldn’t use the Claude Code CLI (and I say this while paying $200 per month to Anthropic, because the models are very good).


A. I use it daily to take advantage of the plan inference discount.

B. Let's just say I didn't write the most robust javascript decompilation/deminification engine in existence solely as an academic exercise :)


Share, if you please.


https://github.com/sibyllinesoft/arachne

There is a lot more stuff (both released and still cooking) on my products page (https://sibylline.dev/products). I will be doing a few drops this week, including hopefully something pretty huge (benchmark validation is killing me, but I'm almost ready to cut a release).


I've been paying for the $20/m plan from Anthropic, Google, and OpenAI for the past few months (to evaluate which one I want to keep and to have a backup for outages and overages).

Gemini never goes down, OpenAI used to go down once in a while but is much more stable now, and Anthropic almost never goes a full week without throwing an error message or suffering downtime. It's a shame because I generally prefer Claude to the others.


Same here, but for API access to the big three instead of their web/app products, and Gemini also shows greater uptime.

But even when the API is up, all three have quite high API failure rates, such as tool calls not responding with valid JSON, or API calls timing out after five minutes with no response.

Definitely need robust error handling and retries with exponential backoff, because maybe one in twenty-five calls fails and then succeeds on retry.
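The retry pattern described above can be sketched roughly like this (a minimal illustration; the helper name and the simulated flaky call are made up, and in practice you'd catch only retryable error types):

```python
import random
import time

def call_with_retries(fn, max_attempts=5, base_delay=1.0):
    """Retry fn() with exponential backoff plus jitter.

    fn stands in for any flaky API call (e.g. an LLM request that
    occasionally times out). Hypothetical helper, not a real SDK API.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # Sleep base_delay * 2^attempt, plus jitter to avoid
            # synchronized retry storms across clients.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# Simulate the "fails, then succeeds on retry" case from the comment:
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("simulated API timeout")
    return "ok"

result = call_with_retries(flaky, base_delay=0.01)
```

The jitter term matters when many clients retry at once; without it they all hammer the API on the same schedule.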


Invalid JSON and other formatting issues are more a matter of model behavior, I would say, since no model guarantees that level of conformance to the schema. I wouldn't necessarily lump them in with API downtime.
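That distinction suggests validating model output separately from transport errors. A minimal sketch (the function and the expected keys are illustrative, not any particular SDK's schema):

```python
import json

def parse_tool_call(raw, required_keys=("name", "arguments")):
    """Parse a model's tool-call output defensively.

    Returns (payload, error). Malformed or truncated JSON is treated as
    a retryable *model* error, distinct from an API outage.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc}"
    missing = [k for k in required_keys if k not in payload]
    if missing:
        return None, f"missing keys: {missing}"
    return payload, None

# A well-formed tool call parses cleanly:
good, err = parse_tool_call('{"name": "search", "arguments": {"q": "uptime"}}')
# A truncated response (common model failure mode) is flagged, not crashed on:
bad, err2 = parse_tool_call('{"name": "search", "arguments": ')
```

Keeping these two failure classes separate also keeps your uptime metrics honest: schema violations count against the model, timeouts against the API.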


A lot of people might be discovering their preference for Claude.


All the AI labs are, but Anthropic is the worst. Anyone serious about running Claude in prod is using Bedrock or Vertex. We've been pretty happy with Vertex.


I wonder why they haven't invested a lot more in the inference stack. Is it really that different from Google, OpenAI, and the open-weight models?


OpenAI used to be just as bad if not worse.

But they've stabilized the past 5 months.


Have you used Bitbucket?


A core research library for MATLAB that I used in a course project used to be on Bitbucket, though thankfully I didn't have to deal with much collaboration there.


Using Workers is now what Cloudflare recommends by default, with "Static Assets" to host all the static content for your website. Pages, as I understand, is already built on the Workers platform, so it's all just simplifying the DX for Cloudflare's platform and giving more options to choose what rendering strategy you use for your website.
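For reference, hosting static content on Workers is mostly a matter of pointing Wrangler at an assets directory; a minimal sketch of a `wrangler.toml` (names and dates here are placeholders, and the exact options may differ by Wrangler version):

```toml
name = "my-site"                      # placeholder project name
main = "src/index.js"                 # optional Worker script for dynamic routes
compatibility_date = "2024-09-01"     # example date, pick your own

[assets]
directory = "./public"                # static files served directly by the platform
```

Requests that match a file under `./public` are served as static assets; anything else falls through to the Worker script, which is what lets you mix rendering strategies on one platform.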


Had me confused for a second too, but I think it is the former that they meant.

K8s has complexity that really isn't needed at even decently large scales, if you've put in enough effort to architect a solution that makes the right calls for your business.


Yeah, sorry, double negatives.

People got burnt by Kubernetes, and that pissed in the well of enthusiasm for experimenting with distributed systems.


Because people, especially DevOps, thought k8s was some magic, when all it really does is make the mechanics easier.

If your architecture is poor, k8s won’t help you.
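The "mechanics" k8s handles are things like replica counts and rolling updates, declared rather than scripted. A minimal sketch of a Deployment manifest (names and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web            # placeholder name
spec:
  replicas: 3          # k8s keeps three pods running, replacing failed ones
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27   # example image
          ports:
            - containerPort: 80
```

This buys you self-healing and rollouts for free, but note it says nothing about how your services are partitioned or talk to each other: the architectural decisions are still yours.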

