Hacker News | jswny's comments

Package a skill with your CLI itself and give users instructions on how to install the skill properly. That allows the agent to read the instructions in a context efficient way when it wants to use the CLI


MCP loads all tools immediately. A CLI does not, because it's not auto-exposed to the agent; you have more control over how the agent learns which tools exist, and how to deliver that context.


Accurate for naive MCP client implementations, but a proxy layer with inference-time routing solves exactly this control problem. BM25 semantic matching on each incoming query exposes only 3-5 relevant tool schemas to the agent rather than loading everything upfront - the 44K token cold-start cost that the article cites mostly disappears because the routing layer is doing selection work. MCPProxy (https://github.com/smart-mcp-proxy/mcpproxy-go) implements this pattern: structured schemas stay for validation and security quarantine, but the agent only sees what's relevant per query rather than the full catalog. The tradeoff isn't MCP vs CLI - it's routing-aware MCP vs naive MCP, and the former competes with CLI on token efficiency while retaining the organizational benefits the article argues for.
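For illustration, here is a toy version of that routing idea: pure-Python BM25 scoring over tool descriptions, returning only the top-k matches per query. This is a minimal sketch, not MCPProxy's actual implementation, and the tool names are made up.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class ToolRouter:
    """Score tool descriptions against a query with BM25; expose only top-k."""

    def __init__(self, tools, k1=1.5, b=0.75):
        # tools: {tool_name: one-line description}
        self.docs = {name: tokenize(desc) for name, desc in tools.items()}
        self.avgdl = sum(len(d) for d in self.docs.values()) / len(self.docs)
        self.k1, self.b = k1, b
        n = len(self.docs)
        df = Counter()
        for toks in self.docs.values():
            df.update(set(toks))
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def route(self, query, top_k=3):
        q = tokenize(query)
        scores = {}
        for name, toks in self.docs.items():
            tf, dl = Counter(toks), len(toks)
            s = 0.0
            for term in q:
                if term not in tf:
                    continue
                f = tf[term]
                s += self.idf.get(term, 0.0) * f * (self.k1 + 1) / (
                    f + self.k1 * (1 - self.b + self.b * dl / self.avgdl))
            scores[name] = s
        ranked = sorted(scores, key=scores.get, reverse=True)
        # Only matching tools make it into the agent's context.
        return [n for n in ranked if scores[n] > 0][:top_k]

tools = {
    "github_create_issue": "create a new issue in a github repository",
    "github_merge_pr": "merge an open pull request in a github repository",
    "slack_post_message": "post a message to a slack channel",
    "jira_search": "search jira tickets by keyword",
}
router = ToolRouter(tools)
# github_create_issue ranks first; jira_search (no overlap) is excluded.
print(router.route("open a github issue about the login bug"))
```

Only the schemas for the returned names would then be surfaced to the agent, which is where the cold-start savings come from.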


It does not have to load all tools. Just as you can hide the details behind a CLI, you can implement the same in the MCP server and client.

Just follow the widely accepted pattern (all you need is 3 tools in front):

- listTools - list/search tools
- getToolDetails - get input arguments for the given tool name
- execTool - execute the given tool name with input arguments

HasMCP, a remote MCP framework, follows/allows this pattern.
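A rough sketch of that three-tool front in Python, with a hypothetical in-memory registry standing in for the server-side catalog (a real server would expose these three as MCP tools):

```python
# The agent only ever sees three tool schemas; the full catalog stays server-side.

REGISTRY = {
    "weather.get": {
        "description": "Get current weather for a city",
        "schema": {"city": "string"},
        "fn": lambda args: f"Sunny in {args['city']}",
    },
    "calc.add": {
        "description": "Add two numbers",
        "schema": {"a": "number", "b": "number"},
        "fn": lambda args: args["a"] + args["b"],
    },
}

def list_tools(query=""):
    """listTools: list/search names and one-line descriptions (cheap on context)."""
    return {
        name: entry["description"]
        for name, entry in REGISTRY.items()
        if query.lower() in (name + " " + entry["description"]).lower()
    }

def get_tool_details(name):
    """getToolDetails: return the input schema only once the agent picks a tool."""
    return {"name": name, "schema": REGISTRY[name]["schema"]}

def exec_tool(name, args):
    """execTool: execute the named tool with the given arguments."""
    return REGISTRY[name]["fn"](args)

print(list_tools("weather"))          # only matching tools
print(get_tool_details("calc.add"))   # schema loaded on demand
print(exec_tool("calc.add", {"a": 2, "b": 3}))  # -> 5
```

Full schemas are fetched lazily per tool, so the per-turn context cost stays flat no matter how many tools sit behind the front.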


I’m not a technical person but I’ve seen people share various tips and tricks to get around the MCP context issues. There’s also this from Anthropic:

https://www.anthropic.com/engineering/code-execution-with-mc


You can solve the same problem by giving subsets of MCP tools to subagents so each subagent is responsible for only a subset of tools.

Or...just don't slam 100 tools into your agent in the first place.


>Or...just don't slam 100 tools into your agent in the first place.

But I can do that with CLIs, so that's a negative for MCP?


You've missed the point and hyperfocused on the story around context, not on why an org would want to have centralized servers exposing MCP endpoints instead of CLIs.


I would want to know what point I missed. I can have 100 CLIs but not 100 MCP tools.

100 MCP tools will bloat the context whereas 100 CLIs won't. Which part do you disagree with?


1. The part where you are providing 100 tools instead of a few really flexible tools

2. The part where you think your agent is going to know how to use 100 CLI tools that are not already in its training dataset without using extra turns walking the help content to dump out command names and schemas

3. The part where, without a schema defining the inputs, the LLM wastes iterations trying to correct the input format.

4. The part where, not having the full picture of the tools, the odds of it picking the same tools, or the right tools, come down to gambling that it outputs the right keywords to trigger the tool.

5. The part where you forgot to mention that for your agent to know that your 100 CLI tools exist, you had to either provide it in context directly, provide it in context in a README.md, or have it output the directory listing and send that off to the LLM to evaluate before picking the tool and then possibly expanding the man pages for several tools and sub commands using several turns.

Don't get me wrong, CLIs are great if they're already in the LLM's training set (`git`, for example). Not so great if they're not, because it will need to walk the man pages anyway.


> The part where you are providing 100 tools instead of a few really flexible tools

I'm not sure how that solves the issue. The shape of each individual tool will be different enough that you will need different schemas - something you will be passing each time with MCP and something you can avoid with a CLI. Also, CLIs can be flexible too.

> The part where you think your agent is going to know how to use 100 CLI tools that are not already in its training dataset without using extra turns walking the help content to dump out command names and schemas

By CLIs we mean SKILLS.md, so it won't require this hop.

> The part where, without a schema defining the inputs, the LLM wastes iterations trying to correct the input format.

What do we lose by one iteration? We lose a lot by passing all the tool shapes on each turn.

> The part where, not having the full picture of the tools, your odds of it picking the same tools or the right tools is completely gambling that it outputs the right keywords to trigger the tool to be used.

we will use skills

> The part where you forgot to mention that for your agent to know that your 100 CLI tools exist, you had to either provide it in context directly, provide it in context in a README.md, or have it output the directory listing and send that off to the LLM to evaluate before picking the tool and then possibly expanding the man pages for several tools and sub commands using several turns.

skills



"to know what tools you have access to, read the dockerfile"?


MCP is fine, particularly remote MCP, which is the lowest-friction way to get access to some hosted service with auth handled for you.

However, MCP is context bloat and not very good compared to CLIs + skills mechanically. With a CLI you get the ability to filter/pipe (regular Unix bash) without having to expand the entire tool call every single time in context.

CLIs also let you use heredoc for complex inputs that are otherwise hard to escape.

CLIs can easily generate skills from their --help output, with agent-specific instructions added on top. That means you can give the agent all the instructions it needs: how to use the tools, what tools exist, lazy-loaded, and without bloating the context window with all the tools upfront (yes, I know tool search in Claude partially solves this).
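For example, a sketch of wrapping a CLI's --help output into a skill-style doc. The function name and skill layout here are made up for illustration, and real skill files have their own frontmatter conventions:

```python
import os
import subprocess
import sys

def skill_from_help(command, extra_instructions=""):
    """Capture a CLI's --help output and wrap it as a lazy-loadable skill doc."""
    result = subprocess.run(command + ["--help"], capture_output=True, text=True)
    help_text = result.stdout or result.stderr  # some CLIs print help to stderr
    name = os.path.basename(command[0])
    return "\n".join([
        f"# Skill: {name}",
        "",
        "## When to use",
        extra_instructions or f"Use the `{name}` CLI for the tasks described below.",
        "",
        "## Usage (from --help)",
        "```",
        help_text.strip(),
        "```",
    ])

# Example: generate a skill doc for the Python interpreter itself
# (a stand-in for whatever CLI you actually ship).
doc = skill_from_help([sys.executable],
                      extra_instructions="Use for running Python scripts.")
print(doc.splitlines()[0])
```

The generated file sits on disk and costs zero context until the agent decides it needs that tool.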

CLIs also don't have to run persistent processes like MCP servers do, but they can if needed.


but you need to _install_ a CLI. with MCP, you just configure!


Plenty of MCPs require you to install and run them locally, though like I said, remote MCP has a real advantage over CLIs.


You just paste in a web link to a skill. Your agent is smart enough to know how to use it or save it.


agree!


Can you explain what's so different about Pro?

I've used every frontier model, and I had Pro a while ago, but it seemed to just be the same models served faster at the time.


It's a different model and designed to 'think very hard' about issues. It's basically a 'very extended thinking mixed with research' type of solution.

While the 'research' solutions tend to go very wide and come back with a 'paper' the Pro model seems to do an exhaustive amount of thinking combined with research, and tries to integrate findings. I think it goes down a lot of rabbit holes.

I find it's by far the best way to find solutions to hard problems, but it typically does require a 'hard problem' in order to shine.

And it takes an enormous amount of time. It could be essentially a form of 'saturating the problem with tokens'. It's OAI's most expensive model by far; a prompt usually costs me $1-3 if paying per token.


How do you get sub agents to work?


Add this to config.toml:

  [features]
  collab = true


That's just spawning multiple parallel explore agents instructed to look at different things, and then compiling the results.

That's pretty basic functionality in Claude Code.
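Mechanically that's a fan-out over disjoint subtasks plus a compile step. A toy sketch, with plain threads standing in for subagents and a stub function in place of a real agent call:

```python
from concurrent.futures import ThreadPoolExecutor

def explore(topic):
    # Stand-in for a subagent instructed to investigate one thing.
    return f"findings about {topic}"

topics = ["auth flow", "db layer", "api routes"]

# Fan out: one worker per disjoint topic.
with ThreadPoolExecutor(max_workers=len(topics)) as pool:
    results = list(pool.map(explore, topics))

# Compile: merge per-topic findings into one report.
report = "\n".join(results)
print(report)
```

The interesting part in the real feature is the splitting step (picking non-overlapping topics) and the merge, not the parallelism itself.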


Sounds like I should probably switch to claude code cli. Thanks for the info. :)


I added tests to an old project a few days ago. I spent a while carefully speccing everything out, and there was a lot of tedious work. Aiming for 70% coverage meant that a few thousand unit tests were needed.

I wrote up a technical plan with Claude code and I was about to set it to work when I thought, hang on, this would be very easy to split into separate work, let's try this subagent thing.

So I asked Claude to split it up into non- overlapping pieces and send out as many agents as it could to work on each piece.

I expected 3 or 4. It sent out 26 subagents. Drudge work that I estimate would have optimistically taken me several months was done in about 20 minutes. Crazy.

Of course it still did take me a couple of days to go through everything and feel confident that the tests were doing their job properly. Asking Claude to review separate sections carefully helped a lot there too. I'm pretty confident that the tests I ended up with were as good as what I would have written.


What’s the point of having a public GitHub repo with PRs enabled if they will never merge any of them?


They merge bugfixes and documentation, and they allow discussion in employee PRs.


How do you know/toggle which API path you are using?


How does this work for models that aren't OpenAI models?


It wouldn’t work for other models if it’s encoded in a latent representation of their own models.


The problem is that only Deno can type check single file scripts. Otherwise with Node and Bun you need a project to use tsc. Python can type check single file scripts (even with PEP 723 deps) with ty. Otherwise, I love TS for scripting, especially with Bun shell


Fair point. If I'm writing scripts, I normally opt for a JS file with JSDoc annotations if I want autocomplete for types (you don't need a tsconfig/tsc installed in WebStorm). Most projects I've encountered with TS scripts were already TS projects to begin with.

