
Not clear whether it's due to Groq or to Mixtral, but the confident hallucinations are there.


We run the open source models that everyone else has access to. What we're trying to show off is our low latency and high throughput, not the model itself.


But if the model is useless/full of hallucinations, why does the speed of its output matter?

"generate hallucinated results, faster"


No, it is "do whatever you were already doing with ML, faster"

This question seems to come either from a place of deep confusion or from bad faith. This post is about hardware. The hardware is model independent.* Any issues with models, like hallucinations, will be identical whether the model runs on this platform or on a bunch of Nvidia GPUs. Performance in terms of hardware speed and efficiency is orthogonal to performance in terms of model accuracy and hallucinations. Progress on one axis can be made independently of the other.

* Technically no, but close enough


Well ok, Groq provides lower-latency, cheaper access to the same models of questionable quality.

Is this not a lipstick-on-a-pig scenario? I suppose that's more of a question for the pig buyers.


Okay. How about this: Someone posts to HN about an amazing new battery technology, which they demo by showing an average-sized smartphone endlessly scrolling TikTok for over 500 hours on a single charge.

Then someone comments that TikTok is a garbage fire and a horrible corrupting influence, yadda yadda, all that stuff. They ask: what is the point of making phones last longer just to watch TikTok? They say this improved efficiency in battery tech is just putting lipstick on a pig.

That's you in this thread. That's the kind of irrelevant non-contribution you are making here.


Perhaps your analogy reveals more than you intended.

What does it tell you about the new technology if the best vehicle to demonstrate it is TikTok?


Batteries are useful. The majority of LLMs are not?


They’re probably in the business of being a hardware provider. The best outcome would be Microsoft buying a lot of their chips so that ChatGPT is actually sped up. It’s basically model independent.


Mixtral 8x7B is competitive with ChatGPT 3.5 Turbo, so I'm not sure why you are being so dismissive.

Check the leaderboard at https://chat.lmsys.org/



In the top left corner you can change the model to Llama2 70B.


I asked it to come up with name ideas for a company and it hallucinated them successfully :) I think the trick is knowing which prompts are likely to yield results that aren't hallucinated. In other contexts it's a feature.


A bit of a softball, don't you think? The initial message suggests: "Are you ready to experience the world's fastest Large Language Model (LLM)? We'd suggest asking about a piece of history."

So I did.


My comment was about general experience with LLMs. Obviously your experience can differ.



