
riidom 23 days ago, on: Right-sizes LLM models to your system's RAM, CPU, ...

LM Studio has an option on model load that I believe does what you're describing here: "K Cache Quantization Type" (and a similar one for "V"). It's marked as experimental, and the description says its effect is hard to predict. I've never tried it myself, though.
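
For a rough sense of what such a setting could save, here is a back-of-the-envelope sketch in Python. It assumes the option maps to GGML-style cache formats (f16 at 2 bytes per element, q8_0 at 34 bytes per 32 elements, q4_0 at 18 bytes per 32 elements); that mapping is my assumption, not something the LM Studio UI states. The function name and the model dimensions are made up for illustration, and this only counts the cache itself, not weights or compute buffers.

```python
# Rough KV cache size estimate. Assumption: the cache types correspond to
# GGML-style block formats, given here as (bytes per block, elements per block).
BLOCK_FORMATS = {
    "f16": (2, 1),     # plain 16-bit floats
    "q8_0": (34, 32),  # 32 int8 values + one fp16 scale
    "q4_0": (18, 32),  # 32 4-bit values + one fp16 scale
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   k_type="f16", v_type="f16"):
    """Estimate the bytes needed for the K and V caches at full context."""
    # Number of cached elements for K alone (V has the same count).
    elements = n_layers * n_kv_heads * head_dim * context_len
    total = 0
    for cache_type in (k_type, v_type):
        block_bytes, block_elems = BLOCK_FORMATS[cache_type]
        total += elements * block_bytes // block_elems
    return total


if __name__ == "__main__":
    # Hypothetical model shape: 32 layers, 8 KV heads, head size 128, 8192 context.
    shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192)
    for cache_type in ("f16", "q8_0", "q4_0"):
        size = kv_cache_bytes(**shape, k_type=cache_type, v_type=cache_type)
        print(f"{cache_type}: {size / 2**20:.0f} MiB")
```

With that hypothetical shape, f16 comes out to 1 GiB, and the quantized types to roughly half and a bit over a quarter of that. It says nothing about the quality impact, which is the part the option's description calls hard to predict.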