Speech-to-text is a great feature, and it is a machine-inference application whether or not it fits under the "Generative AI" banner. The ergonomics of typing on your phone are, and have always been, bad. I switched to a Pixel because its dictation is greatly superior to iOS's. They probably do need to be in that feature race because they have been looking pretty shabby of late.
About speech-to-text, are there models or apps that do speech-to-text but use a bit of AI to infer around the "umms" and "uhhs"?
Like if I'm doing a stream-of-consciousness talk about something while I'm on a hike, there are loads of utterances that would be converted to stuff I'd need to edit out if it were a blog post.
Or better yet, have me be able to say "oh no, remove that last thing I said about the leprechauns."
I was talking specifically about Generative AI, not text-to-speech or speech-to-text, since I don't consider those to be generative AI in the colloquial sense that's pushed nowadays. I don't want to be all pedantic about it and start splitting hairs over what technically is and isn't; I'm just keeping to the mainstream frame of what the device manufacturers claim to be gen-AI.
It's being shoved down your throat, but even with such a close view you have no idea what it is. And don't want to discuss it because that's "pedantic"?
Where did I say you can't discuss it? Feel free to discuss it if you want. Why do you need my permission? You'll just do it without me, since you seem to have a chip on your shoulder for no reason and I don't want to reward such attitudes.
I just clarified the gen-AI meaning I used in the context of my comment, which is also the context manufacturers are referring to, and not the scientific definitions the AI experts are thinking about, since your average consumer has no idea about ML and transformers and all the inner workings of what they call AI.
This has to have been a wry joke, otherwise it's insane.
Example of a non-strenuous dictation task: if I am driving, my Android will read my texts and allow me to reply by voice, a speech-to-text and text-to-speech task that is damned handy.
Individual non-strenuous tasks still add up; it’s the total amount per day that matters. If you’re just dictating on the drive to work then it’s no big deal, but just because X is fine doesn’t automatically mean 2X is fine.
Conversation between multiple people doesn’t involve one person speaking continuously for hours. With dictation, you can spend far more hours per day speaking than is normal; that’s the risk, not simply talking an extra 20 minutes per day.