Hacker News

Will this work for Cowork as well?


This is not at all an MCP server you'd want to use as a regular tool, since it's about low-level context-window management. To be honest, it's trivial to do this yourself, and I have no idea why OP decided to make an MCP server for it, as it's completely useless in that form.

As a matter of fact, I think this is not a problem at all, as Anthropic makes caching extremely easy: you just set your preferred cache breakpoint on the last message, and Anthropic will automatically cache the prefix under the hood. Every distinct message is another potential cache point — roughly, they first look up a hash of all messages; if that isn't found, they look up a hash of all messages minus the last one, and so on.
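To make the "set a breakpoint on the last message" point concrete, here's a minimal sketch of a Messages API request payload with prompt caching. Field names follow Anthropic's documented `cache_control` API; the model string and document placeholder are illustrative assumptions, and no actual API call is made:

```python
# Sketch of an Anthropic Messages API payload using prompt caching.
# The "cache_control" field is Anthropic's documented mechanism; the
# model name and long_document contents here are just placeholders.

long_document = "...several thousand tokens of reference text..."

payload = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "You answer questions about the attached document.",
        },
        {
            "type": "text",
            "text": long_document,
            # Everything up to and including this block becomes a
            # cacheable prefix; later requests sharing that prefix are
            # billed at the cheaper cache-read rate instead of being
            # reprocessed from scratch.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    "messages": [
        {"role": "user", "content": "Summarize section 2."},
    ],
}

# The server matches on the longest cached prefix, which is why the
# commenter describes it as hashing all messages, then all minus one.
print(payload["system"][1]["cache_control"]["type"])
```

The point being: the caller just marks a breakpoint in the request body; no tool, MCP or otherwise, has to run for the cache to be used.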

It’s really a non-problem.


No. Claude.ai is a consumer product — you have no access to the API layer underneath it. cache_control is an API-level feature only. This plugin works exclusively when you're making direct Anthropic API calls, either through the SDK in your own code or through MCP-compatible clients like Claude Code, Cursor, Windsurf, etc.


How would it work when you’re making Anthropic API calls? Wouldn’t an LLM have to invoke this — that is, the LLM would somehow need to invoke this MCP tool (via a tool call, i.e., a response from the LLM) before the request is ever sent to Anthropic?

I am so confused why you chose an MCP server to solve this. Wouldn’t a regular API at least have some merit in how it could be used, in that it doesn’t require an LLM to invoke it?
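The chicken-and-egg problem being raised here can be sketched with plain data structures. The message shapes loosely follow Anthropic's tool-use format; the tool name `set_cache_breakpoint` and the `toolu_01` id are invented for illustration:

```python
# Sketch of why an MCP tool costs an extra model round trip: the tool
# invocation IS a model response, so it can only happen after a full
# request has already gone to the API. Tool name and id are made up.

transcript = []

# Turn 1: the client sends the user request to the model.
transcript.append({"role": "user", "content": "Answer using the big doc."})

# The model's reply is the tool invocation — it arrives only after a
# complete request/response cycle, i.e. after the context has already
# been processed once without the cache breakpoint in place.
transcript.append({
    "role": "assistant",
    "content": [{
        "type": "tool_use",
        "id": "toolu_01",
        "name": "set_cache_breakpoint",
        "input": {"position": "last_message"},
    }],
})

# Turn 2: the client executes the MCP tool and returns the result,
# which triggers yet another API call before any real answer appears.
transcript.append({
    "role": "user",
    "content": [{
        "type": "tool_result",
        "tool_use_id": "toolu_01",
        "content": "breakpoint recorded",
    }],
})

# One whole assistant turn was consumed just to invoke the tool.
extra_calls = sum(1 for m in transcript if m["role"] == "assistant")
print(extra_calls)
```

A plain HTTP API (or a client-side hook) could set the breakpoint before the first request goes out, with no model turn spent on it — which is the merit the commenter is pointing at.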



