We run the open source models that everyone else has access to. What we're trying to show off is our low latency and high throughput, not the model itself.
No, it is "do whatever you were already doing with ML, faster"
This question seems either to come from a place of deep confusion or to be asked in bad faith. This post is about hardware. The hardware is model independent.* Any issues with models, like hallucinations, will be identical whether the model is run on this platform or on a bunch of Nvidia GPUs. Performance in terms of hardware speed and efficiency is orthogonal to performance in terms of model accuracy and hallucinations. Progress on one axis can be made independently of the other.
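To make that concrete: you can measure speed without ever looking at answer quality. Here's a minimal sketch (not the vendor's actual benchmark) that times latency and tokens/second against any OpenAI-compatible chat endpoint; the URL, model name, and key are placeholders, and the numbers it prints say nothing about whether the answers are correct.

```python
# Hypothetical benchmark sketch: latency and throughput for any
# OpenAI-compatible endpoint. Endpoint URL, key, and model name are
# placeholders, not a real service.
import time
import requests

API_URL = "https://example-inference-host/v1/chat/completions"  # placeholder endpoint
API_KEY = "YOUR_KEY_HERE"                                        # placeholder key
MODEL = "some-open-source-model"                                 # any hosted model

def measure(prompt: str) -> None:
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    start = time.perf_counter()
    resp = requests.post(
        API_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60,
    )
    elapsed = time.perf_counter() - start
    resp.raise_for_status()
    # Completion token count reported by the API; speed metric only,
    # says nothing about accuracy or hallucinations.
    completion_tokens = resp.json()["usage"]["completion_tokens"]
    print(f"total latency: {elapsed:.2f}s")
    print(f"throughput:    {completion_tokens / elapsed:.1f} tokens/s")

if __name__ == "__main__":
    measure("Tell me about a piece of history.")
```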
Okay. How about this: someone posts to HN about an amazing new battery technology, which they demo by showing an average-sized smartphone endlessly scrolling TikTok for over 500 hours on a single charge.
Then someone comments that TikTok is a garbage fire and a horrible corrupting influence, yadda yadda, all that stuff. They ask: what is the point of making phones last longer just to watch TikTok? They say this improved efficiency in battery tech is just putting lipstick on a pig.
That's you in this thread. That's the kind of irrelevant non-contribution you are making here.
They’re probably in the business of being the hardware provider. The best outcome would be if Microsoft bought a lot of their chips, so that ChatGPT actually gets sped up. It’s basically model independent.
I asked it to come up with name ideas for a company and it hallucinated them successfully :) I think the trick is to know which prompts are likely to yield results that are not likely to be hallucinated. In other contexts it's a feature.
A bit of a softball, don't you think? The initial message suggests: "Are you ready to experience the world's fastest Large Language Model (LLM)? We'd suggest asking about a piece of history"