Hacker News | elephanlemon's comments

Agree. I’d like more fine-grained control of context and compaction. If you spend time debugging in the middle of a session, once you’ve fixed the bugs you ought to be able to remove everything related to fixing them from context and continue as you had before you encountered them. (Right now, depending on your IDE, this can be quite annoying to do manually. And I’m not aware of any that let you snip it out if you’ve worked with the agent on other tasks afterwards.)

I think agents should manage their own context too. For example, if you’re working with a tool that dumps a lot of logged information into context, those logs should get pruned out after one or two more prompts.

Context should be thought of as something that can be freely manipulated, rather than a stack that can only have things appended to or removed from the end.


Yeah, the fact that we have treated context as immutable baffles me. It’s not like human working memory keeps a perfect history of everything done over the last hour. It shouldn’t be that complicated to train a secondary model that just runs online compaction. E.g.: it runs a tool call, the model determines what’s germane to the conversation and prunes the rest; or some task gets completed, so just leave a stub in the context that says "completed x", with a tool available to see the details of x if it becomes relevant again.

That's pretty much the approach we took with context-mode. Tool outputs get processed in a sandbox, only a stub summary comes back into context, and the full details stay in a searchable FTS5 index the model can query on demand. Not trained into the model itself, but gets you most of the way there as a plugin today.
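The stub-plus-searchable-index pattern described above can be sketched with SQLite's FTS5 (assuming your Python's SQLite build includes the FTS5 extension; the function names and stub format here are illustrative, not context-mode's actual API):

```python
import sqlite3

def store_and_stub(conn, tool_name, output, max_stub=120):
    """Keep the full tool output in an FTS5 index; return only a short stub
    suitable for placing in the model's context."""
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS tool_log USING fts5(tool, output)")
    conn.execute("INSERT INTO tool_log VALUES (?, ?)", (tool_name, output))
    conn.commit()
    stub = output[:max_stub].replace("\n", " ")
    return f"[{tool_name}: {len(output)} chars stored; stub: {stub}...]"

def search_log(conn, query):
    """Full-text search over stored outputs, so the model can pull details
    back on demand instead of carrying them in context."""
    return [row[0] for row in conn.execute(
        "SELECT output FROM tool_log WHERE tool_log MATCH ?", (query,))]

conn = sqlite3.connect(":memory:")
# A failing test run produces a huge log; only the stub enters context.
stub = store_and_stub(conn, "pytest",
                      "FAILED tests/test_auth.py::test_login - TimeoutError\n" + "x" * 5000)
hits = search_log(conn, "TimeoutError")
```

The model sees `stub` (a one-liner) instead of ~5 KB of log, and can query the index later if the failure becomes relevant again.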

Is it because of caching? If the context changes arbitrarily every turn then you would have to throw away the cache.

Oh that's quite a nice idea - agentic context management (riffing on agentic memory management).

There's some challenges around the LLM having enough output tokens to easily specify what it wants its next input tokens to be, but "snips" should be able to be expressed concisely (i.e. the next input should include everything sent previously except the chunk that starts XXX and ends YYY). The upside is tighter context, the downside is it'll bust the prompt cache (perhaps the optimal trade-off is to batch the snips).
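A snip expressed as "drop everything from the message containing XXX through the message containing YYY" is cheap for the model to emit and cheap to apply. A minimal sketch, assuming messages are the usual role/content dicts (the stub wording and marker scheme are made up for illustration):

```python
def apply_snip(messages, start_marker, end_marker):
    """Remove the contiguous span of messages from the first one containing
    start_marker through the next one containing end_marker, inclusive,
    leaving a short stub in their place."""
    start = end = None
    for i, m in enumerate(messages):
        if start is None and start_marker in m["content"]:
            start = i
        if start is not None and end_marker in m["content"]:
            end = i
            break
    if start is None or end is None:
        return messages  # markers not found; leave context untouched
    stub = {"role": "system",
            "content": f"[snipped {end - start + 1} messages: debugging detour]"}
    return messages[:start] + [stub] + messages[end + 1:]

msgs = [
    {"role": "user", "content": "implement the feature"},
    {"role": "assistant", "content": "XXX hit a bug in the parser"},
    {"role": "assistant", "content": "fixed it, tests pass YYY"},
    {"role": "user", "content": "continue"},
]
trimmed = apply_snip(msgs, "XXX", "YYY")
```

Batching several snips before resending, as suggested above, amortizes the one-time prompt-cache miss across all of them.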


Good point on prompt cache invalidation. Context-mode sidesteps this by never letting the bloat in to begin with, rather than snipping it out after. Tool output runs in a sandbox, a short summary enters context, and the raw data sits in a local search index. No cache busting because the big payload never hits the conversation history in the first place.


> For example, if you’re working with a tool that dumps a lot of logged information into context

I've set up a hook that blocks directly running certain common tools and instead tells Claude to pipe the output to a temporary file and search that for relevant info. There's still some noise where it tries to run the tool once, gets blocked, then runs it the right way. But it's better than before.


That's exactly what context-mode does for tool outputs. Instead of dumping raw logs and snapshots into context, it runs them in a sandbox and only returns a summary. The full data stays in a local FTS5 index so you can search it later when you need specifics.

Interesting. I’m having trouble finding anything on Gemini being profitable though, do you happen to have a source?

Here's one, basically AI is driving 15% of Google's profits at the end of 2025.

https://advergroup.com/gemini-hits-650-million-users/

I didn't really realize how big Gemini was until I saw that Qualia was using it: they apparently used 0.01% of Gemini's total tokens (100 billion) in about 3 months. They're in production in the title and escrow industry, so that's a great deal of data going through Gemini. Unlike some chat subscription, this is all API driven, which I doubt Google is charging at a loss for.

https://www.qualia.com/qualia-clear/

Unlike OpenAI, Google has an actual business model, not just strange circular deals.

Edit: I miswrote "majority of" instead of "15% of" Google's profits.


> Here's one, basically AI is driving 15% of Google's profits at the end of 2025. https://advergroup.com/gemini-hits-650-million-users/

This does not at all tell us Gemini is profitable or driving 15% of Google's profits. The article does not mention profits even once. It then goes on to bizarrely compare Gemini's monthly active users to OpenAI's weekly active ones.


Indeed, that article doesn't support a single part of that claim.

It kinda feels like an LLM-generated article that another LLM picked as a "citation", and then no human bothered to check if it actually said what the LLM said it did.

And, really, advergroup.com? Who cites an advertising agency as if it's a reliable resource?

https://advergroup.com/digital-marketing/

"AdverGroup Web Design and Creative Media Solutions is a full service advertising agency that delivers digital marketing services. We manage Google Ad Word campaigns and/or Meta Ad Campaigns for local clients in Chicago, Las Vegas and their surrounding suburbs."

So credible a resource on Gemini's performance/profitability... /sarc

But yeah, it doesn't even actually say anything about profits, let alone attribute any specific percentage of profits to Gemini. It's just vague marketing copy.


“You’re in luck if you’ve been hankering to have your wall connected to wifi.”


It’s so they can begin selling you a subscription to allow you to hang a picture.


Great news, there’s finally going to be sufficient motivation for people to both build out and use open source alternatives.


Interesting how pedantic he is!

> Then, too, Orwell had the technophobic fixation that every technological advance is a slide downhill. Thus, when his hero writes, he 'fitted a nib into the penholder and sucked it to get the grease off'. He does so 'because of a feeling that the beautiful creamy paper deserved to be written on with a real nib instead of being scratched with an ink-pencil'.

> Presumably, the 'ink-pencil' is the ball-point pen that was coming into use at the time that 1984 was being written. This means that Orwell describes something as being 'written' with a real nib but being 'scratched' with a ball-point. This is, however, precisely the reverse of the truth. If you are old enough to remember steel pens, you will remember that they scratched fearsomely, and you know ball-points don't.

> This is not science fiction, but a distorted nostalgia for a past that never was. I am surprised that Orwell stopped with the steel pen and that he didn't have Winston writing with a neat goose quill.


I don't think it's pedantic; he's trying to make a broad point about Orwell's mentality, using the detail as the defining example.


Intel Arc seems to be well liked, this seems to just be bad writing by Reuters. Unclear what is news here exactly as Demmers was hired a month ago…


“Gemini 3 Pro was often overloaded, which produced long spans of downtime that 2.5 Pro experienced much less often”

I was unclear if this meant that the API was overloaded or if he was on a subscription plan and had hit his limit for the moment. Although I think that the Gemini plans just use weekly limits, so I guess it must be API.


Gemini CLI has a specific "model is overloaded" error message which is distinct from "you're out of quota", so I suspect whatever tools they're using for this probably have something similar, and they're referring to that.


Double hyphen converts to em dash in Microsoft Word and I think some other places. I was taught that it was incorrect to use a hyphen in place of a dash, so I’ve always used em dashes -- sometimes I’ll just use two hyphens if the software doesn’t convert, like a forum :).


In Microsoft Word, double hyphens convert to em dashes. Seems to be the case on the iOS keyboard as well.


If you click through the lectures they are mentioned in several of them.

