When LLMs process tokens, each token is first converted to an embedding vector. (This token-to-vector mapping is learned during training.)
Since a token itself carries no information about whether it has "authority" or not, I'm proposing to inject this information into a reserved dimension of that embedding vector. This needs to be done during both post-training and inference. Think of it as adding color or flavor to a token, so that it is always clear to the LLM what comes from the system prompt, what comes from the user, and what is untrusted data.
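To make that concrete, here is a minimal PyTorch sketch of what the injection could look like at the embedding layer. Everything in it is illustrative: the class name, the SYSTEM/USER/DATA tags, the choice of dimension 0, and the values 1/0/-1 are assumptions I'm making, not something specified above.

    import torch
    import torch.nn as nn

    # Hypothetical provenance tags: where each token came from.
    SYSTEM, USER, DATA = 0, 1, 2

    class AuthorityTaggedEmbedding(nn.Module):
        """Token embedding that overwrites one reserved dimension with an
        authority signal, so system-prompt, user, and untrusted-data tokens
        stay distinguishable. The model would have to be post-trained with
        this channel present before it could learn to respect it."""

        def __init__(self, vocab_size: int, d_model: int, reserved_dim: int = 0):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.reserved_dim = reserved_dim
            # Fixed value written into the reserved dimension per source.
            self.register_buffer("authority_values", torch.tensor([1.0, 0.0, -1.0]))

        def forward(self, token_ids: torch.Tensor, authority: torch.Tensor) -> torch.Tensor:
            # token_ids: (batch, seq); authority: (batch, seq) of SYSTEM/USER/DATA
            x = self.embed(token_ids)
            x[..., self.reserved_dim] = self.authority_values[authority]
            return x

    # A system prompt, a user message, and retrieved (untrusted) data in one sequence:
    emb = AuthorityTaggedEmbedding(vocab_size=32000, d_model=512)
    tokens = torch.randint(0, 32000, (1, 6))
    auth = torch.tensor([[SYSTEM, SYSTEM, USER, USER, DATA, DATA]])
    out = emb(tokens, auth)  # out[..., 0] now carries each token's provenance

The serving stack would have to apply the same tagging at inference time, and the model would have to be post-trained with the channel present; otherwise that dimension is just noise to it.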
This is really insightful, thanks. I hadn't understood that there was room in the vector space that you could reserve for such purposes.
The response from tempaccsoz5 seems apt then: since this injection is learned during post-training, it would need to overfit in order to be watertight.