Hacker News

Ok what's turn detection?


Turn detection is deciding when a person has finished talking and expects the other party in a conversation to respond. In this case, the other party in the conversation is an LLM!
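
To make that concrete, here is a minimal sketch of the most naive approach: wait until the speaker has been quiet for a while. This is only an illustration, not necessarily how any particular project or API does it. The function name, the per-frame energy input, and both thresholds are assumptions made up for this example.

```python
from typing import Iterable, Optional


def detect_end_of_turn(
    frame_energies: Iterable[float],
    energy_threshold: float = 0.01,   # assumed: frames below this count as silence
    min_silence_frames: int = 25,     # assumed: e.g. 25 frames of 20 ms = 500 ms
) -> Optional[int]:
    """Return the index of the frame where the turn is judged finished, or None.

    A turn is considered finished once some speech has been heard and it is
    followed by `min_silence_frames` consecutive silent frames.
    """
    heard_speech = False
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            # Speech frame: any pause seen so far was mid-turn, so reset.
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Silence after speech: count toward the end-of-turn timeout.
            silent_run += 1
            if silent_run >= min_silence_frames:
                return i
    return None
```

A fixed silence timeout like this cannot tell a thinking pause from a finished sentence, which is the gap a dedicated turn detection model would be trying to close.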


Oh I see. Not like segmenting a conversation where people speak in turn. Thanks.


Speaker diarization is also still a tough problem for free models.


huh. how is analyzing conversations in the manner you described NOT the way to train such a model?


Did you reply to the wrong comment? No one is talking about training here.


Detecting when one participant in a conversation has finished talking.

It’s a big deal for handling human speech when interacting with LLM systems.


It’s often called endpoint detection (in ASR).


Yes, weird that they didn't use that term for this project.


I've talked about this a lot with friends.

Endpoint detection (along with phrase endpointing and end of utterance) is a term from the academic literature for this and related problems.

Very few people who are doing "AI Engineering" or even "Machine Learning" today know these terms. In the past, I argued that we should use the existing academic language rather than invent new terms.

But then OpenAI released the Realtime API and called this "turn detection" in their docs. And that was that. It no longer made sense to use any other verbiage.


Re SEO, I note "utterance" only occurs once, in a perhaps-ephemeral "Things to do" description.

To help with "what is?" and SEO, perhaps something like "Turn detection (aka [...], end of utterance)"... ?


Thanks for the explanation. I guess it makes some sense, considering many people with no NLP background are using those models now…



