Picked the wrong one. LoRA, Low-Rank Adaptation of LLMs (https://arxiv.org/pdf/2106.09685.pdf), adapts the weights of a big neural network to a target task (here, answering instructions). It doesn't touch the weights of the original model; instead, it adds the product of two low-rank matrices to selected layers, and only the weights of those two matrices are learnable. This is what lets you adapt big models on (relatively) low-memory GPUs: the frozen base model takes no optimizer state, and the trainable part is tiny.
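To make the idea concrete, here is a minimal sketch of a LoRA-style layer in PyTorch. This is not the paper's reference implementation; the class name `LoRALinear` and the defaults for `r` and `alpha` are illustrative choices. The base linear layer is frozen, and the trainable update is the product of two small matrices, scaled by `alpha / r` as in the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear and add a trainable low-rank update (sketch)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # original weights stay untouched
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        # B starts at zero so training begins exactly at the base model's behavior.
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the learnable low-rank correction B·A·x.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# Usage: swap selected layers (e.g. attention projections) for wrapped versions.
layer = LoRALinear(nn.Linear(768, 768), r=8)
y = layer(torch.randn(2, 768))
```

Only `A` and `B` receive gradients, so the optimizer state and the saved checkpoint cover a tiny fraction of the full model's parameters.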