When LLMs process tokens, each token is first converted to an embedding vector. (This token-to-vector mapping is learned during training.)
Since a token itself carries no information about whether it has "authority" or not, I'm proposing to inject this information into a reserved dimension of that embedding vector. This needs to be done during both post-training and inference. Think of it as adding color or flavor to a token, so that it is always clear to the LLM what comes from the system prompt, what comes from the user, and what is untrusted data.
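To make that concrete, here is a minimal PyTorch sketch of what the injection could look like at the embedding layer. Everything in it is illustrative: the class name, the SYSTEM/USER/DATA tags, the choice of dimension 0, and the values 1/0/-1 are assumptions I'm making, not something specified above.

    import torch
    import torch.nn as nn

    # Hypothetical provenance tags: where each token came from.
    SYSTEM, USER, DATA = 0, 1, 2

    class AuthorityTaggedEmbedding(nn.Module):
        """Token embedding that overwrites one reserved dimension with an
        authority signal, so system-prompt, user, and untrusted-data tokens
        stay distinguishable. The model would have to be post-trained with
        this channel present before it could learn to respect it."""

        def __init__(self, vocab_size: int, d_model: int, reserved_dim: int = 0):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            self.reserved_dim = reserved_dim
            # Fixed value written into the reserved dimension per source.
            self.register_buffer("authority_values", torch.tensor([1.0, 0.0, -1.0]))

        def forward(self, token_ids: torch.Tensor, authority: torch.Tensor) -> torch.Tensor:
            # token_ids: (batch, seq); authority: (batch, seq) of SYSTEM/USER/DATA
            x = self.embed(token_ids)
            x[..., self.reserved_dim] = self.authority_values[authority]
            return x

    # A system prompt, a user message, and retrieved (untrusted) data in one sequence:
    emb = AuthorityTaggedEmbedding(vocab_size=32000, d_model=512)
    tokens = torch.randint(0, 32000, (1, 6))
    auth = torch.tensor([[SYSTEM, SYSTEM, USER, USER, DATA, DATA]])
    out = emb(tokens, auth)  # out[..., 0] now carries each token's provenance

The serving stack would have to apply the same tagging at inference time, and the model would have to be post-trained with the channel present; otherwise that dimension is just noise to it.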
This is really insightful, thanks. I hadn't understood that there was room in the vector space that you could reserve for such purposes.
The response from tempaccsoz5 seems apt then: since this injection is learned during post-training, it would need to overfit in order to be watertight.