Hacker News | gavmor's comments

Hm... any idea how I might script `git worktree` into the new-pane action?

Currently experimenting with agent-of-empires for tmux+worktrees to parallelize code changes.


No built-in way to override new-pane actions right now, but the CLI (see `cmux --help`) can automate all parts of cmux.

So you can make your own script that can make new panels/workspaces and just invoke it from the terminal:

  git worktree add -b my-branch ../repo-my-branch
  ws=$(cmux new-workspace 2>&1 | awk '{print $2}')
  cmux send --workspace "$ws" "cd ../repo-my-branch && claude"
  cmux send-key --workspace "$ws" Enter

I think we should make this easier though, open to suggestions!
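
For anyone who wants to reuse the steps above, here is a minimal sketch that wraps them in two shell functions. The cmux subcommands are copied from the snippet above, including its assumption that the workspace id is the second field of `cmux new-workspace` output. The function names `worktree_dir` and `new_worktree_workspace`, the `<repo>-<branch>` sibling directory layout, and the use of an absolute path are illustrative assumptions, not anything cmux provides.

```shell
#!/usr/bin/env bash
# Sketch only: function names and directory layout are assumptions.

# Print the sibling directory for a branch's worktree,
# e.g. /src/app + fix/login -> /src/app-fix-login
worktree_dir() {
  local top="$1" branch="$2" safe
  # Flatten slashes so the branch maps to a single directory name
  safe=$(printf '%s' "$branch" | tr '/' '-')
  printf '%s-%s\n' "$top" "$safe"
}

# Create a worktree for a new branch, then start a command
# (default: claude) inside a fresh cmux workspace.
new_worktree_workspace() {
  local branch="$1" cmd="${2:-claude}"
  local top dir ws
  top=$(git rev-parse --show-toplevel) || return 1
  dir=$(worktree_dir "$top" "$branch")
  git worktree add -b "$branch" "$dir" || return 1
  # Assumes the workspace id is the second field, as in the snippet above
  ws=$(cmux new-workspace 2>&1 | awk '{print $2}')
  cmux send --workspace "$ws" "cd '$dir' && $cmd"
  cmux send-key --workspace "$ws" Enter
}
```

After sourcing the file from inside a repository, `new_worktree_workspace my-branch` would run the same four steps. The absolute path is used so the `cd` does not depend on the new workspace's starting directory.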

I've had plenty of success with skills juggling various entities via CLI.

LocalGPT uses Landlock LSM.

I feel he has been laudably even-keeled about the whole thing.

This seems backwards, somehow. Like you're asking for an nth view and an nth API, and services are being asked to provide accessibility bridges redundant with our extant offerings.

Sites are now expected to duplicate effort by manually defining schemas for the same actions, like re-describing a button's purpose in JSON when it's already semantically marked up?


No, I don't think you're thinking about this right. It's more like hacker news would expose an MCP when you visit it that would present an alternative and parallel interface to the page, not "click button" tools.

You're both right. The page can expose MCP tools via a form element, which is as simple as adding an attribute to an existing form and completely aligns with existing semantic HTML, e.g. submitting an HN "comment". Additionally, the page can define tools in JavaScript that aren't tied to forms, e.g. YouTube could provide a transcript MCP tool defined in JS which fetches the video's transcript.

https://developer.chrome.com/blog/webmcp-epp


I think that rest and html could probably be already used for this purpose BUT html is often littered with elements used for visual structure rather than semantics.

In an ideal world html documents should be very simple and everything visual should be done via css, with JavaScript being completely optional.

In such a world agents wouldn’t really need a dedicated protocol (and websites would be much faster to load and render, besides being much lighter on cpu and battery)


> html could probably be already used for this purpose

You’re right, and it already is, and tools like playwright MCP can easily parse a webpage to use it and get things done with existing markup today.

> BUT html is often littered with elements used for visual structure rather than semantics.

This actually doesn’t make much of a difference to a tool like playwright because it uses a snapshot of the accessibility tree, which only looks at semantic markup, ignoring any presentation.

> In such a world agents wouldn’t really need a dedicated protocol

They still do though, because they can work better when given specific tools. WebMCP could provide tools not available on the page. Say an agent hits the dominoes.com landing page. The page could provide an order_pizza tool that the agent could interact with, saving a bunch of navigation, clicks, scrolling and whatnot. It calls the order_pizza tool with “Two large pepperoni pizzas for John at <address>”, and the whole process is done.


I see two things that are totally different from where we are today:

1. This is a contextual API built into each page. Historically sites can offer an API, but that API is a parallel experience, a separate machine-to-machine channel, that doesn't augment or extend the actual user session. The MCP API offered here is one offered by the page (not the server/site), in a fully dynamic manner (what's offered can reflect what the state of the page is), that layers atop the user session. That's totally different.

2. This opens an expectation that sites have a standard means of control available. This has two subparts:

2a. There are dozens of different API systems to pick from to expose your site. GitHub got halfway from REST to GraphQL then turned back. Some sites use ttrpc or capnweb or gproto. There hasn't actually been one accepted way for machines to talk to your site; there's been a fractal maze of offerings on the web. This is one consistent offering mirroring what everyone is already using now anyways.

2b. Offering APIs for your site has gone out of favor in general. When an API is available, it has often had high walls and barriers. But now the people putting their fingers in that leaky dam are patently, clearly Not Going To Make It: the LLMs will script & control the browser if they have to, and it's much, much less pain to just lean in to what users want to do, and to expose a good WebMCP API that your users can enjoy to be effective & get shit done, like they have wanted to do all along. If WebMCP takes off at all, it will reset expectations: that the internet is for end users, and that their agency & their ability to work your site as they please via their preferred modalities is king. WebMCP directs us towards an RFC 8890 compliant future, by directly enabling site agency. https://datatracker.ietf.org/doc/html/rfc8890


To guard against this, the best course of action is probably modularization and composition, right? The Unix philosophy, ie building small, focused tools out of small, focused tools.


yes - i've thought that could work. returning to a more protected object oriented programming model (with hard-defined interfaces) could be a way - "make these changes but restrict yourself to this object" etc.


Better than summaries would be fact-checking: corroboration or counter-narratives.


Good point. Can you elaborate a little bit more? Do you mean corroboration within the same discussion or across multiple discussions?


Things I look for in a UI library:

1. Clean, expressive interface, 2. Extensive documentation.

That being said, good on you for shipping! I would like to try it just for the mystery factor.


Thanks! It'll get there eventually, for sure. Feedback like the stuff in this thread helps a lot.


Ouch, harsh words!

But, what would be your stack of choice? Or, what stack gives you the most confidence?


How is this different from any other agentic harness?

