Yeah, I tried a bunch of them, and OpenAI's TTS was by far the best.
Outside of that, it's a standard tech stack: Next.js, Postgres, and TailwindCSS.
It is still early days for ML TTS, and it will be exciting to see the compute requirements drop enough for it to run on-device. OSS models show some promise, but they are still not there from a quality perspective.
The base AI model sounded like Whisper AI from Meta. Did you train the voice yourself, or is it one of the defaults?
I am always curious about what copyright issues products like this run into. Also, what's the stack like for building something like this?