I looked into this problem a while back and haven’t looked at it since. The base ai...

lyu07282 · on May 26, 2024

Isn't Whisper the speech-to-text model by OpenAI? Which model did you mean?

maddynator · on May 27, 2024

Yeah, that's correct. I meant this one: https://voicebox.metademolab.com/

yagudaev · on May 28, 2024

Yeah I tried a bunch of them and OpenAI's TTS was by far the best.

Outside of that, it's a standard tech stack: Next.js, Postgres, TailwindCSS.

It is still early days for ML TTS, and it will be exciting to see the compute requirements drop so that it can run on the device. OSS models show some promise, but they are still not there from a quality perspective.