handoflixue · 2026-02-25T01:08:04 1771981684

> The core problem is that risks were not being identified (systematically or in response to expert feedback) and prioritised.

Or the person who wrote the article just wasn't involved in that loop, or otherwise disagreed on what threat models mattered.

angry_octet · 2026-02-26T09:49:31 1772099371

It seems much more like a compliance and auditing goal: meeting some objective of knowing who is in the office at what time, which informs office space leasing decisions, return-to-office mandates, decisions about charging for staff parking, etc. Personnel protection seems almost an afterthought.

Protecting JIRA auth tokens is quite likely low down the list for IT security. Making sure your workers are not remote North Koreans is indeed a security benefit of secured physical offices with regular on-site work.

But the author did have a deeper point -- visible security theatre gets lots of money and management attention, while meaningful expert driven changes are mired in bureaucracy.

handoflixue · 2026-02-27T02:27:35 1772159255

I still challenge whether his proposal was actually "meaningful, expert driven changes" - is this actually a serious threat vector? How would you actually exploit it, without having access to dozens of other vectors? Can you even meaningfully resolve that vulnerability when you have people walking in off the streets due to a lack of physical security?

handoflixue · 2026-02-24T05:37:49 1771911469

Testing some subset X does not mean the test is rigged unless they failed to disclose that.

But also:

GPT 5.2 Thinking, Standard Effort: Walk - https://chatgpt.com/share/699d38cb-e560-8012-8986-d27428de8a...

I'm assuming "GPT 5.2 Thinking" is, in fact, a thinking model?

randomtoast · 2026-02-24T10:01:14 1771927274

The problem is that you haven't used the API; you've used your ChatGPT subscription, with personality, memories, and possibly customization. I can see, for instance, that your ChatGPT answers with emojis, while my ChatGPT subscription never does.

If you ask GPT 5.2 with high reasoning effort through the API, you get 10 out of 10: drive.

handoflixue · 2026-02-27T02:25:29 1772159129

If it doesn't work at all on the most popular pricing plan (subscription), AND it doesn't work through the most popular way of accessing it (web), then it seems fair to say there's a problem.

And the problem is NOT that I'm using a product in the advertised, intended way.

handoflixue · 2026-02-24T05:34:55 1771911295

Critically, that's not the question that was asked. It's not "My car is 50m away", it's "The Car Wash Is 50 Meters Away"

Which hopefully explains why everyone is assuming that "washing your car" does in fact mean "taking your car to the car wash"

handoflixue · 2026-02-24T05:31:42 1771911102

Claude Code has an entire tool for the LLM to ask clarifying questions - it'll give you three pre-written responses or you can respond with your own text.

handoflixue · 2026-02-24T05:29:20 1771910960

Oh wow, Sonnet still isn't handling it well:

Opus 4.6: Drive (https://claude.ai/share/d57fef01-df32-41f2-b1dc-07de7916bdc7)

Opus 4.5: Drive (https://claude.ai/chat/a590cac1-100a-490b-b0a2-df6676e1ae99)

Opus 3.0: Walk (https://claude.ai/chat/372c144c-d6eb-43f5-b7ea-fd4c51c681db)

Sonnet 4.6: Walk (https://claude.ai/share/1f2a80f3-4741-40a5-8a05-7349ea1a17e5)

Sonnet 4.5: Walk (https://claude.ai/share/905afeb6-ffc9-4b4b-a9ee-4481e5cfd527)

Favorite answer, using my default custom instructions: "Drive. Walking there means... leaving your car at home? Walk it there on a leash? Walk if you want the exercise, but you're bringing the car either way."

randomtoast · 2026-02-24T10:03:04 1771927384

This is because thinking is not enabled. Of course the results are disappointing.

handoflixue · 2026-02-27T02:29:42 1772159382

It seems entirely fair to evaluate a product based on the baseline that the company itself offers.

handoflixue · 2026-02-24T03:04:04 1771902244

Curious to see the link to him saying that back in 2014, didn't realize he'd been saying it for quite that long!

handoflixue · 2026-02-23T05:54:54 1771826094

I'd hardly call decentralization a "hypothetical" issue: we've already seen governments are willing to issue gag orders so that we can't even find out what they're doing inside major companies. That's clearly a lot easier to do when there's a single central point of control.

If there's a single central point of control, then that also means an outage takes everything offline, instead of just 1-2 tools. That also makes it a bigger target for attackers.

jen20 · 2026-02-23T18:39:39 1771871979

It doesn't even need to be an attacker - Cloudflare themselves have managed to take down impressive portions of the internet more times than should be acceptable, just this year.

handoflixue · 2026-02-20T04:29:16 1771561756

See, the problem is, "obviously harmless" varies by person: if you think it is obviously harmless to ban an entire political party, which ostensibly won a legitimate election, and certainly had a lot of popular support... well then, of course we should also ban whichever current political party you consider most evil, right? And then the next most evil political party, and so on, until people have the freedom that comes from knowing only Good, Proper, State-Sanctioned Political Parties exist!

And of course, once it's illegal to agitate against violence, we just have to redefine violence: for instance, posting about Nazis puts them in danger, and they're all white, so clearly you're a racist for opposing Nazis.

These aren't hypothetical examples: the people defending Free Speech have watched these slippery slopes get pulled out again and again. Misgendering a trans person is a "hate crime", reporting on the location of gestapo agents is "inciting violence", protesting against the state is "terrorism"

And fundamentally, this is a lever that gets wielded by whoever is in power: even if you agree with the Left censoring Nazi salutes, are you equally comfortable with the Right censoring child mutilation sites (also known as "Trans resources")?

SURELY "child mutilation" is "obviously harmless" to ban, right?

hananova · 2026-02-20T11:07:26 1771585646

Child mutilation is obviously harmless to ban of course. Though calling trans resources that is equally obviously disingenuous.

Maybe Americans should take a break from criticizing the EU and fix their own shit first. It's incredibly frustrating to constantly see far right goons swing around "freedom of speech" as if that term hasn't been a fig leaf for ages. In the US, if you do something that the powers that be dislike that is covered by freedom of speech, they'll manufacture something else to hit you with. At least here in the EU, when you get investigated for something that freedom of expression covers, you'll at least get acquitted eventually.

handoflixue · 2026-02-20T04:14:37 1771560877

We encourage people to be safe about plenty of things they aren't responsible for. For example, part of being a good driver is paying attention and driving defensively so that bad drivers don't crash into you / you don't make the crashes they cause worse by piling on.

That doesn't mean we're blaming good drivers for causing the car crash.

handoflixue · 2026-02-20T04:11:17 1771560677

Does it actually cut both ways? I see tons of harassment directed at people who use AI, but I've never seen the anti-AI crowd actively targeted.

nekal · 2026-02-20T05:00:33 1771563633

Anti-AI people are treated in a condescending way all the time. Then there is Suchir Balaji.

Since we are in a Matplotlib thread: People on the NumPy mailing list that are anti-AI are actively bullied and belittled while high ranking officials in the Python industrial complex are frolicking at AI conferences in India.

minimaxir · 2026-02-20T04:25:36 1771561536

It happens to a lesser extent, and in a way that blurs the line between harassment and trolling: I've retracted my comment.

tovej · 2026-02-20T06:34:59 1771569299

I see it all the time. If you're anti-AI your boss may call you a luddite and consider you not fit for promotion.