Not sure if this will help, but a while ago I was thoroughly confused by all the AI options (and by other people's advice), so I spent a while experimenting. I now sometimes build these systems commercially. For a basic-yet-functional knowledge base that you can expand with whatever tooling you want:
- Don't use LlamaIndex/LangChain etc. They're fine for getting started quickly, but you'll soon get frustrated when you try to do something different.
- Suck in all your files using public libraries and convert them to text. Remove obvious crap like stray line breaks. Don't worry about it too much.
- Use Postgres as the vector DB. It's cheap.
- OpenAI is fine, and the docs are great. GPT-3.5 gives fine results, and the cheapest embedding model is fine.
- Spend some time optimising the prompts. That's the most important thing.
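To make the steps above concrete, here's a minimal sketch of the same flow with no framework: clean the text, chunk it, embed, retrieve the closest chunks, and build a prompt. It is not the code behind the write-up linked below. Everything named here (`clean_text`, `chunk_words`, `toy_embed`, `top_k`, `build_prompt`, the chunk sizes, the prompt wording) is made up for illustration. `toy_embed` is a hashed bag-of-words stand-in so the example runs offline; in a real setup you'd swap it for a call to an embedding model and keep the vectors in Postgres (for example with the pgvector extension) instead of a Python list.

```python
import hashlib
import math
import re


def clean_text(raw: str) -> str:
    """Collapse line breaks and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping chunks of up to `size` words (assumes overlap < size)."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step) if words[i:i + size]]


def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Stand-in embedding (NOT a real model): hashed bag of words, unit length."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with the highest cosine similarity to the question."""
    q = toy_embed(question)
    scored = [(sum(a * b for a, b in zip(q, toy_embed(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the prompt; this wording is the part worth iterating on."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below. If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    docs = [
        "The mud weight for the well\nwas increased   to 1.4 sg.",
        "Lunch is served at noon in the canteen.",
    ]
    chunks = [c for d in docs for c in chunk_words(clean_text(d))]
    question = "What mud weight was used for the well?"
    print(build_prompt(question, top_k(question, chunks, k=1)))
```

The string that `build_prompt` returns is what you'd send to the chat model. Keeping each step a plain function is the point: when you want to do something different (another chunking rule, another store, another prompt), you change one function rather than fight a framework.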
I wrote up the basics for my specific niche here, with cost/time breakdowns: https://superstarsoftware.co.uk/ai-for-drilling-engineers/
It costs about $4 per month for hosting (and only that much because I couldn't face setting up Postgres on my other server) and < $1 per 50GB of text/xlsx/etc. embedded (as in: dirt cheap).
I basically made it as a showcase for potential customers. I was half thinking of open-sourcing it, decent frontend included, so people can get up and running quickly, but I'm not sure there's much appetite since it's basic.