
Some people (like me) are primarily verbal processors:

- I am dictating this message through macOS's voice to text right now

- I am a huge user of Google Assistant

- I prefer to call people versus texting them

- I tend to call restaurants instead of using something like Toast to order takeout (although this is partly because online services sometimes add a surcharge to the price, and sometimes I need to ask questions about dietary restrictions, etc.)

Generally, wherever possible, I will use a voice interface rather than a text-based one to get my point across. It's just faster and more convenient for me. I'm pretty neutral on the consumption side: I read and listen to audiobooks in roughly equal amounts.

All that to say that, just like there are people out there who prefer text UIs, there are also people who prefer voice interfaces.



I use Superwhisper (no affiliation, just a happy user), which runs a local Whisper model, to create most of my email drafts and post-meeting notes. I find Whisper more accurate than Mac’s built-in speech-to-text, plus I’m faster at speaking than typing.

Sometimes, I even ‘talk’ into Cursor’s chat window instead of typing. The only downside? It can get a bit annoying for others when you're talking to yourself all day.


I'm looking for something like this that runs on Linux. Best thing I've found is LiveCaptions, but its output is janky. I can't just use it to type in any old text field, and its output requires substantial editing after the fact.

I guess I understand that a lot of things are being developed for Apple silicon specifically. It's just frustrating that despite hours of searching, I'm not finding anything decent.


Talon Voice is good and runs on Linux.

https://talonvoice.com/


This looks really powerful for controlling the system with different scripts, but what if all I want it to do is let me narrate something and print out the sentences as close to real-time as possible? It's really just good STT that I'm looking for out of it.


The Talon Voice dev created his own STT model that's very performant. The transcription quality is... good, but not world-class. It's better than anything that came out before Whisper IMO, but the newest generation of models can do things like inferring punctuation and words outside of their vocabulary (although the downside of the new generation of STT models is that they can sometimes hallucinate words that are very different from what you said).

It's a bit overkill to use Talon for just voice dictation, but that is 90% of what I use it for, and it's pretty good at it.


Interesting! I'll give Superwhisper a try.



