Turn detection is deciding when a person has finished talking and expects the other party in a conversation to respond. In this case, the other party in the conversation is an LLM!
Endpoint detection (and phrase endpointing, and end of utterance) are terms from the academic literature about this, and related, problems.
Very few people who are doing "AI Engineering" or even "Machine Learning" today know these terms. In the past, I argued that we should use the existing academic language rather than invent new terms.
But then OpenAI released the Realtime API and called this "turn detection" in their docs. And that was that. It no longer made sense to use any other verbiage.