Package a skill with your CLI itself and give users instructions on how to install the skill properly. That lets the agent read the instructions in a context-efficient way when it wants to use the CLI.
MCP loads all tools immediately. A CLI does not, because it isn't auto-exposed to the agent; you get more control over how the agent learns which tools exist and how that context is delivered.
Accurate for naive MCP client implementations, but a proxy layer with inference-time routing solves exactly this control problem. BM25 keyword matching on each incoming query exposes only 3-5 relevant tool schemas to the agent rather than loading everything upfront; the 44K-token cold-start cost the article cites mostly disappears because the routing layer is doing the selection work. MCPProxy (https://github.com/smart-mcp-proxy/mcpproxy-go) implements this pattern: structured schemas stay for validation and security quarantine, but the agent only sees what's relevant per query rather than the full catalog. The tradeoff isn't MCP vs CLI; it's routing-aware MCP vs naive MCP, and the former competes with CLI on token efficiency while retaining the organizational benefits the article argues for.
It does not have to load all tools. Just as you can hide the details behind a CLI, you can implement the same lazy loading in an MCP server and client.
Just follow the widely accepted pattern (all you need is three tools in front):
- listTools - List/search tools
- getToolDetails - Get input arguments for the given tool name
- execTool - Execute the given tool with input arguments
HasMCP, a remote MCP framework, follows/allows this pattern.
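For concreteness, here is a minimal sketch of the same three-call facade, written as a shell dispatcher over a directory of tool definitions. Everything here (`facade`, the `tools/` and `bin/` layout, the `echo_upper` tool) is a made-up illustration of the pattern, not part of HasMCP or any real MCP server:

```shell
#!/bin/sh
# Hypothetical setup: one JSON schema per tool, one executable per tool.
mkdir -p tools bin
printf '{"name":"echo_upper","args":{"text":"string"}}\n' > tools/echo_upper.json
printf '#!/bin/sh\ntr a-z A-Z\n' > bin/echo_upper && chmod +x bin/echo_upper

# The only three entry points the agent ever sees; full schemas are
# fetched lazily, one tool at a time, instead of loaded upfront.
facade() {
  case "$1" in
    list)    ls tools | sed 's/\.json$//' ;;            # listTools
    details) cat "tools/$2.json" ;;                     # getToolDetails
    exec)    name="$2"; shift 2; "./bin/$name" "$@" ;;  # execTool
  esac
}

facade list                           # -> echo_upper
facade details echo_upper             # -> the schema JSON
echo "hello" | facade exec echo_upper # -> HELLO
```

The agent pays for one tool's schema only when it actually decides to call that tool, which is the whole point of the three-tool front.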
You've missed the point and hyperfocused on the story around context rather than why an org would want centralized servers exposing MCP endpoints instead of CLIs:
1. The part where you are providing 100 tools instead of a few really flexible tools
2. The part where you think your agent is going to know how to use 100 CLI tools that are not already in its training dataset without using extra turns walking the help content to dump out command names and schemas
3. The part where, without a schema defining the inputs, the LLM wastes iterations trying to correct the input format.
4. The part where, not having the full picture of the tools, the odds of it picking the same tools, or the right tools, come down to gambling that it outputs the right keywords to trigger the tool.
5. The part where you forgot to mention that, for your agent to know your 100 CLI tools exist, you had to either provide them in context directly, provide them in context via a README.md, or have it dump the directory listing for the LLM to evaluate before picking a tool, and then possibly expand the man pages for several tools and subcommands over several turns.
Don't get me wrong, CLIs are great if they're already in the LLM's training set (`git`, for example). Not so great if they're not, because it will need to walk the man pages anyway.
> The part where you are providing 100 tools instead of a few really flexible tools
I'm not sure how that solves the issue. The shape of each individual tool will be different enough that you'll need different schemas: something you pass every time with MCP and something you can avoid with a CLI. Also, CLIs can be flexible too.
> The part where you think your agent is going to know how to use 100 CLI tools that are not already in its training dataset without using extra turns walking the help content to dump out command names and schemas
By CLIs we mean CLIs plus a SKILLS.md, so it won't require this hop.
> The part where, without a schema defining the inputs, the LLM wastes iterations trying to correct the input format.
What do we lose with one extra iteration? We lose far more by passing all the tool schemas on every turn.
> The part where, not having the full picture of the tools, your odds of it picking the same tools or the right tools is completely gambling that it outputs the right keywords to trigger the tool to be used.
We will use skills for that.
> The part where you forgot to mention that for your agent to know that your 100 CLI tools exist, you had to either provide it in context directly, provide it in context in a README.md, or have it output the directory listing and send that off to the LLM to evaluate before picking the tool and then possibly expanding the man pages for several tools and sub commands using several turns.
MCP is fine, particularly remote MCP, which is the lowest-friction way to get access to a hosted service with auth handled for you.
However, MCP is context bloat and mechanically not very good compared to CLIs + skills. With a CLI you can filter/pipe (regular Unix bash) without having to expand the entire tool-call result in context every single time.
CLIs also let you use heredoc for complex inputs that are otherwise hard to escape.
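A small illustration of both points, with `printf` and `cat` standing in for a real CLI (no actual tool is assumed):

```shell
# Pipe: keep only the lines you care about instead of dragging the
# whole tool output back into context.
printf '%s\n' 'open: fix login bug' 'closed: old task' 'open: add tests' \
  | grep '^open:' | sed 's/^open: //' > open_issues.txt

# Heredoc: multi-line input with quotes and $vars, no escaping needed.
cat > note.txt <<'EOF'
Line one with "quotes" and $HOME left untouched.
Line two.
EOF
```

The agent only ever sees the two filtered lines and the file it wrote, not the intermediate payloads.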
CLIs can easily generate skills from the `--help` output and add agent-specific instructions on top. That means you can give the agent everything it needs: how to use the tools, what tools exist, all lazy-loaded, without bloating the context window with every tool upfront (yes, I know tool search in Claude partially solves this).
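That generation step can be as simple as the sketch below; the stub `mytool` and its help text are made up so the example is self-contained, and a real CLI would slot in its place:

```shell
# Stub CLI standing in for a real tool (hypothetical).
cat > mytool <<'EOF'
#!/bin/sh
echo "usage: mytool [--json] <command>"
EOF
chmod +x mytool

# Seed the skill file from the tool's own help text, then layer on
# agent-specific guidance that --help alone wouldn't carry.
{
  echo '# mytool'
  echo
  ./mytool --help
  echo
  echo 'Agent notes: prefer --json output; confirm before destructive commands.'
} > SKILL.md
```

The skill file is only read when the agent decides to use the tool, so none of this occupies context upfront.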
CLIs also don't have to run persistent processes like MCP does, but they can if needed.
It's a different model and designed to 'think very hard' about issues. It's basically a 'very extended thinking mixed with research' type of solution.
While the 'research' solutions tend to go very wide and come back with a 'paper' the Pro model seems to do an exhaustive amount of thinking combined with research, and tries to integrate findings. I think it goes down a lot of rabbit holes.
I find it's by far the best way to find solutions to hard problems, but it typically does require a 'hard problem' in order to shine.
And it takes an enormous amount of time. It could essentially be a form of 'saturating the problem with tokens'. It's OAI's most expensive model by far; a prompt usually costs me $1-3 if paying per token.
I added tests to an old project a few days ago. I spent a while carefully speccing everything out, and there was a lot of tedious work. Aiming for 70% coverage meant a few thousand unit tests were needed.
I wrote up a technical plan with Claude Code and was about to set it to work when I thought: hang on, this would be very easy to split into separate pieces of work, let's try this subagent thing.
So I asked Claude to split it up into non-overlapping pieces and send out as many agents as it could to work on each piece.
I expected 3 or 4. It sent out 26 subagents. Drudge work that I estimate would have optimistically taken me several months was done in about 20 minutes. Crazy.
Of course it still did take me a couple of days to go through everything and feel confident that the tests were doing their job properly. Asking Claude to review separate sections carefully helped a lot there too. I'm pretty confident that the tests I ended up with were as good as what I would have written.
The problem is that only Deno can type check single-file scripts. With Node and Bun you need a project set up to use tsc. Python can type check single-file scripts (even with PEP 723 deps) with ty. Otherwise, I love TS for scripting, especially with Bun Shell.
Fair point. If I'm writing scripts, I normally opt for a JS file with JSDoc annotations if I want autocomplete for types (you don't need a tsconfig/tsc installed in WebStorm). Most projects I've encountered with TS scripts were already TS projects to begin with.