Is there any benefit to fine-tuning a model on your corpus before using it to ge...

gunalx · on Nov 1, 2024

Yes, especially if you work in a poorly supported language and/or have specific data pairs you want to match that fall outside ordinary text.

Training your own fine-tune takes very little time and few GPU resources, and you can easily outperform even SOTA models on your specific problem with a smaller model/vector space.
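To make "fine-tuning on matched pairs" a bit more concrete, here is a rough sketch of one common objective for this kind of training: a contrastive loss with in-batch negatives. This is not necessarily what I or anyone else in the thread used. The function name `in_batch_contrastive_loss`, the `scale` value, and the toy 2-d vectors are all made up for illustration, and a real run would compute this on the model's output embeddings inside a training framework.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(queries, documents, scale=20.0):
    """Mean cross-entropy over a batch of (query, document) pairs.

    documents[i] is treated as the positive for queries[i]; every other
    document in the batch acts as a negative. `scale` is an assumed
    temperature-like multiplier on the similarities.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * cosine(q, d) for d in documents]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the true pair.
        total += log_sum_exp - logits[i]
    return total / len(queries)


# Toy embeddings (hypothetical): each query points near its own document.
toy_queries = [[1.0, 0.0], [0.0, 1.0]]
toy_docs_aligned = [[0.9, 0.1], [0.1, 0.9]]
# Same documents in the wrong order, so each query is paired with a mismatch.
toy_docs_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = in_batch_contrastive_loss(toy_queries, toy_docs_aligned)
loss_swapped = in_batch_contrastive_loss(toy_queries, toy_docs_swapped)
print(f"aligned pairs: {loss_aligned:.4f}, swapped pairs: {loss_swapped:.4f}")
```

The loss is low when each query sits closest to its own document and high when it does not, which is the signal that pulls your domain-specific pairs together during fine-tuning.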

Then again, for general English text and basic fuzzy search, I would not really expect large performance gains.