
Just looked at the Jasper project. The voice synthesizer sounds like crap compared to a product like Siri, and I suspect the voice recognition is worse too. I've coded some stuff using voice synthesis and recognition for fun, and it seems the open-source options haven't gotten any better in the past 15 years.


Well, I think CMUSphinx and Julius have actually improved by leaps and bounds. Jasper is a thin layer built on top of those systems (Sphinx, in this case).
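For a sense of what that layer looks like, here's a minimal sketch of driving PocketSphinx directly (this assumes the older pocketsphinx Python bindings, which expose a LiveSpeech iterator; Jasper's own wiring differs):

    # Minimal sketch: continuous recognition straight from the microphone
    # using CMU PocketSphinx and its bundled en-us model.
    from pocketsphinx import LiveSpeech

    # LiveSpeech opens the default mic and yields one hypothesis per
    # detected utterance, using the default acoustic and language models.
    for phrase in LiveSpeech():
        print(phrase)  # str(phrase) is the best-scoring transcription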

I guess I'm just easy to impress - I think the fact that we have off-the-shelf open source software that does voice recognition with any kind of quality is amazing.

And again, a lot of the time, it's about how well your model is trained. One of the ideas I had was to work training into the product, and have the user say sentences over time (maybe ~5 a week, at random times) to help the machine understand better (something like "hey, if you have a minute, let's work on our communication").
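Roughly, the prompt side could look like the sketch below. Everything here is hypothetical: the prompt list, the record_utterance helper, and the output directory. The idea is just to bank audio/transcript pairs that Sphinx-style acoustic model adaptation could consume later.

    # Hypothetical sketch of the "let's work on our communication" idea:
    # occasionally ask the user to read a known sentence, save the audio
    # alongside its transcript, and later run acoustic-model adaptation
    # on the accumulated pairs.
    import os
    import random
    import time
    import wave

    import pyaudio  # assumes PyAudio is installed for mic capture

    PROMPTS = [
        "open the garage door",
        "what is the weather tomorrow",
        "play some jazz in the living room",
    ]

    def record_utterance(path, seconds=4, rate=16000, chunk=1024):
        """Record a short 16 kHz mono clip from the default microphone."""
        pa = pyaudio.PyAudio()
        stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                         input=True, frames_per_buffer=chunk)
        frames = [stream.read(chunk) for _ in range(int(rate / chunk * seconds))]
        sample_width = pa.get_sample_size(pyaudio.paInt16)
        stream.stop_stream()
        stream.close()
        pa.terminate()
        with wave.open(path, "wb") as wf:
            wf.setnchannels(1)
            wf.setsampwidth(sample_width)
            wf.setframerate(rate)
            wf.writeframes(b"".join(frames))

    def collect_training_pair(out_dir="adaptation_data"):
        """Prompt for one sentence and save the audio plus its transcript."""
        os.makedirs(out_dir, exist_ok=True)
        sentence = random.choice(PROMPTS)
        print("Hey, if you have a minute, please read: '%s'" % sentence)
        stamp = int(time.time())
        record_utterance(os.path.join(out_dir, "%d.wav" % stamp))
        with open(os.path.join(out_dir, "%d.txt" % stamp), "w") as f:
            f.write(sentence)

Scheduling a call to collect_training_pair a handful of times a week would give you the ~5 sentences mentioned above without making it feel like a chore.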

Also, even if the voice is a little bit computer-y, I don't mind. Sure, you won't match something like Siri on features right out of the gate, but that's fine with me.

Also, there's an opportunity for advancement in that area: if I were to build a site that collected voice/utterance samples from voice artists and had users pay a little bit for sample packs, I think we could fund the open-source teams doing the work to build more realistic voices.

I think of Jasper as a first step. But then again, I am biased: I like the idea and have been wanting to do it myself for a long time.



