Hacker News | boriscal's comments

For anyone on classic DF, I've been running a fork of this visualizer: https://github.com/DFHack/stonesense


I feel so out of the loop with all these new language models. This whole space is moving so fast.


I've been on the waitlist for a while now, but this might be what you're looking for: https://lottielab.com


How does this benchmark against AWS's offering [0]?

[0] https://aws.amazon.com/transcribe/


I have done some tests for a hobby project (transcribing local fire station radio dispatch), and Whisper is incomparably better than AWS Transcribe.

Radio dispatch audio is low quality and very hard to understand. AWS Transcribe was essentially useless and produced incomprehensible transcriptions. Whisper was 95+% accurate, with maybe 1 incorrect word per few sentences, and it was often easy to tell what the correct word should have been.
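
For anyone who wants to put a number on "95+% accurate" for their own audio, a common way is word error rate (WER): the word-level edit distance between a hand-corrected reference transcript and the model output, divided by the reference length. Below is a rough sketch in plain Python, not what was used for the tests above. The function name and the sample sentences are made up for illustration, and the normalization (lowercasing and whitespace splitting only) is an assumption you would probably want to extend.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; normalization here is just
    lowercasing and whitespace splitting.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example sentences, not real dispatch audio.
reference = "engine twelve respond to structure fire on main street"
hypothesis = "engine twelve respond to structure fire on maine street"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.3f}, word accuracy: {1 - wer:.3f}")
```

One wrong word in a nine-word sentence gives a WER of 1/9, so "1 incorrect word per few sentences" would land well under 5% on this measure.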


Interesting, that's good to know. My personal experience was that Transcribe handled accents very poorly, but if Whisper can handle radio static, maybe it can also decipher a thick Scottish accent.


The original blog post announcing Whisper had a demo ("Accent" under the drop-down) of it transcribing a Scottish accent: https://openai.com/blog/whisper/


Whisper has done extremely well for me with a number of challenging audio transcriptions: background noise, heavy accents, music playing... you name it.

