Hacker News | juanre's comments

Reports of MCP's demise have been greatly exaggerated, but a CLI is indeed the right choice when the interface to the LLM is not a chat in a browser window.

For example, I built https://claweb.ai to enable agents to communicate with other agents. They run aw [1], an OSS Go CLI that manages all the details. This means they can have sync chats (not impossible with MCP, but very difficult). It also enables signing messages and (coming soon) e2ee. This would be, as far as I can tell, impossible using MCP.
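The message-signing idea can be sketched in a few lines. This is a toy illustration only: it uses a shared-secret HMAC as a stand-in for the public-key signatures a real system like aw would presumably use, and all names here are hypothetical, not aw's actual API.

```python
import hmac
import hashlib
import json

def sign_message(secret: bytes, sender: str, body: str) -> dict:
    """Attach an authentication tag so the receiver can verify the sender."""
    payload = json.dumps({"from": sender, "body": body}, sort_keys=True)
    tag = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return {"from": sender, "body": body, "sig": tag}

def verify_message(secret: bytes, msg: dict) -> bool:
    """Recompute the tag and compare in constant time; fails on any tampering."""
    payload = json.dumps({"from": msg["from"], "body": msg["body"]}, sort_keys=True)
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg["sig"])
```

Verification is cheap for the receiving agent, which matters when every inter-agent message has to be checked.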

[1] https://github.com/awebai/aw


"The system they built feels slightly foreign even as it functions correctly." This is exactly the same issue that engineers who become managers have. You are further away from the code; your understanding is less grounded, and it feels disconnected.

When software engineers become agent herders, their day-to-day starts to resemble that of a manager more than that of an engineer.


Exactly. As a manager and sometimes a developer, "vibe coding" has been looking more and more like my day job (in a good way: it's nice not to have to do all the dirty work for your pet projects), and it's all about having the same discipline in terms of:

* thinking about the big picture
* knowing how you can verify that the code matches the big picture.

In both cases, sometimes you are pleasantly surprised, and sometimes you discover that the thing you told the one writing the code to do, three times, was still not done.


Engineering is not "dirty work."

Management is not "engineering."


Do you view it as an issue at all that when everyone takes on a more manager-like role, no human remains who has the hands-on experience and understanding of the system?

And like good management, the solution is to define clear domain boundaries, quality requirements, and a process that enables iterative improvement both within and across domains.

The key is what we consider good code. Simon’s list is excellent, but I’d push back on this point:

> it does only what’s needed, in a way that both humans and machines can understand now and maintain in the future

We need to start thinking about what good code is for agents, not just for humans.

For a lot of the code I’m writing I’m not even “vibe coding” anymore. I’m having an agent vibe code for me, managing a bunch of other agents that do the actual coding. I don’t really want to look at the code, just as I wouldn’t want to look at the output of a C compiler the way my dad did in the late ’80s.

Over the last few decades we’ve evolved a taste for what good code looks like. I don’t think that taste is fully transferable to the machines that are going to take over the actual writing and maintaining of the code. We probably want to optimize for them.


Shameless plug: https://beadhub.ai allows you to do exactly that, but with several agents in parallel. One of them is in the role of planner, which takes care of the source-of-truth document and the long term view. They all stay in sync with real-time chat and mail.

It's OSS.

Real-time work is happening at https://app.beadhub.ai/juanre/beadhub (beadhub is a public project at https://beadhub.ai so it is visible).

Particularly interesting (I think) is how the agents chat with each other, which you can see at https://app.beadhub.ai/juanre/beadhub/chat


This sounds very plausible. Arguably MCPs are already a step in that direction: give the LLMs a way to use services that is text-based and easy for them. Agents that look at your screen and click on menus are a cool but clumsy and very expensive intermediate step.

When I use Telegram to talk to the OpenClaw instance on my spare Mac, I am already choosing a new interface over whatever the designers of the apps it uses built. Why keep the human-facing version as is? Why not make an agent-first interface (which will not involve having to "see" windows), and a validation interface for the human minder?


I've thought about this a lot, even before LLMs - so much of the modern web especially is so slow and bloated. I want the airline to give me an API to query flights and one to book; I don't need 400 nested DIVs of styled components vomited at me on every pageview. But everyone considers API access to be "commercial" and is afraid someone else will make money without them getting an extra cut.

It does not matter which of the scenarios is correct. What matters is that it is perfectly plausible that what actually happened is what the OP is describing.

We do not have the tools to deal with this. Bad agents are already roaming the internet. It is almost a moot point whether they have gone rogue, or they are guided by humans with bad intentions. I am sure both are true at this point.

There is no putting the genie back in the bottle. It is going to be a battle between aligned and misaligned agents. We need to start thinking very fast about how to coordinate aligned agents and keep them aligned.


> There is no putting the genie back in the bottle.

Why not?


I cannot see how.


Ban AI products that cause harm? Did we forget that governments can regulate what companies are allowed to do?


If we stop using these things, and pass laws to clarify how the notion of legal responsibility interacts with the negligent running of semi-automated computer programs (though I believe there's already applicable law in most jurisdictions), then AI-enabled abusive behaviour will become rare.


The Roman empire declined and fell. Many inventions were lost.


I’m in Steve’s demographic, showing similar symptoms, and I’m as worried as he is about how we’re going to cope.

It’s a matter of opportunity cost. It used to be that when I rested for an hour, I lost an hour of output. Now, when I rest for an hour, I lose what used to be a day of output.

I need to rewire my brain and learn how to split the difference. There’s no point in producing a lot of output if I don’t have time to live.

The idea that you’ll get to enjoy the spoils when you grow up is false. You won’t. Just produce 5x and take some time off every day. You may even be more likely to reflect, and end up producing the right thing.


I have been using a simpler version of this pattern, with a coordinator and several more or less specialized agents (e.g., backend, frontend, db expert). It really works, but I think the key is the coordinator. It decreases my cognitive load, and generally manages to keep track of what everyone is doing.
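The coordinator pattern described above can be reduced to a routing-plus-ledger sketch. Everything here (agent names, the Task fields) is illustrative, not any real framework's API.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    domain: str          # e.g. "backend", "frontend", "db"
    description: str

@dataclass
class Coordinator:
    # The ledger is what reduces cognitive load: one place that
    # answers "what is everyone doing right now?"
    log: list = field(default_factory=list)

    AGENTS = {
        "backend": "backend-agent",
        "frontend": "frontend-agent",
        "db": "db-expert",
    }

    def dispatch(self, task: Task) -> str:
        """Route a task to the matching specialist, or a generalist fallback."""
        agent = self.AGENTS.get(task.domain, "generalist-agent")
        self.log.append((agent, task.description))
        return agent
```

The fallback route matters in practice: tasks that fit no specialist still need an owner, or they silently stall.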


I have not tried Gas Town yet, but Steve's beads https://github.com/steveyegge/beads (used by Gas Town) has been a game-changer, on the order of what Claude Code was when it arrived.


Do you have any workflow tips or write up with beads?


My workflow tends to be very simple: start a session; ask the agent "what's next", which prompts it to check beads; and, more often than not, ask it to just pick up whichever bead "makes the most sense".

In Claude Code I have a code-reviewer agent, and I often remind cc to run the code reviewer before closing any bead. It works surprisingly well.

I used to monitor context and start afresh when it reached ~80%, but I stopped doing that. Compacting is not as disruptive as it used to be, and with beads agents don't lose track.

I spent some time trying to measure the productivity change due to beads, analysing cc and codex logs and linking them to deltas and commits in git [1]. But I did not fully believe the result (a 5x increase when using beads; there has to be some hidden variable), and I moved on to other things.

Part of the complexity is that these days I often work on two or three projects at the same time, so attribution is difficult.

[1] Analysis code is at https://github.com/juanre/agent-taylor
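The core attribution step in an analysis like that can be sketched simply: bucket each git commit by whichever agent session was active when it was made. This is a hypothetical reconstruction of the idea, not the actual agent-taylor code; timestamps are plain epoch seconds, and real cc/codex logs would need parsing first.

```python
def attribute_commits(commits, sessions):
    """Assign commits to the agent session active at commit time.

    commits:  list of (timestamp, sha) pairs
    sessions: list of (start_ts, end_ts, label) triples
    Returns {label: [sha, ...]}; commits outside any session
    go under "unattributed".
    """
    out = {}
    for ts, sha in commits:
        label = next(
            (l for start, end, l in sessions if start <= ts <= end),
            "unattributed",
        )
        out.setdefault(label, []).append(sha)
    return out
```

The "unattributed" bucket is exactly where the multi-project problem mentioned above shows up: commits made while two sessions overlap, or none, are where attribution gets murky.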


I built https://github.com/juanre/llmemory and I use it both locally and as part of company apps. Quite happy with the performance.

It uses PostgreSQL with pgvector, hybrid BM25, multi-query expansion, and reranking.

(It's the first time I've shared it publicly, so I'm sure there'll be quirks.)
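The hybrid part of a setup like this usually comes down to merging two rankings: one from vector similarity, one from BM25. A common way to do that is reciprocal-rank fusion (RRF), sketched below. This is illustrative; llmemory's actual fusion logic may differ.

```python
def rrf_merge(rankings, k=60):
    """Merge several ranked lists of doc ids via reciprocal-rank fusion.

    Each document scores 1 / (k + rank + 1) in every list it appears in;
    k=60 is the commonly used damping constant from the original RRF paper.
    Returns doc ids sorted by fused score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization across the two retrievers, which is why it is popular for combining cosine distances with BM25 scores that live on incomparable scales; a reranker can then re-order the fused top-N.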

