
Do we actually know that Gemini Pro is < 20 billion params? Their paper revealed that Nano-1 is 1.8B and Nano-2 is 3.25B, but they seemed to omit any details about Gemini Pro or Ultra, and I couldn't find any further information online beyond rumors and wildly varying speculation.


Not the person you replied to, but… Microsoft apparently revealed that GPT-3.5 Turbo is 20 billion parameters. Gemini Pro seems to perform only slightly better than GPT-3.5 Turbo on some benchmarks, and worse on others. If Gemini Pro is significantly larger than 20 billion, that would be embarrassing for Google. If it is significantly smaller, that would be good for Google.

It seems reasonable to me to assume it’s somewhere in the neighborhood of 20 billion, but I agree it is worthwhile to recognize that we don’t actually know.


I don't think it would necessarily be embarrassing for Google, because Gemini Pro is multimodal while GPT-3.5 Turbo is text-only. Given this difference, it wouldn't seem too unrealistic to me if Gemini Pro were bigger, but it seems like we just don't know.


Being multimodal doesn’t seem to require much of a size penalty: https://github.com/dlyuangod/TinyGPT-V

Even so, Google treats the Gemini Pro Vision model as a separate model from Gemini Pro, so it could have separate parameters that are dedicated to vision (like CogVLM does, sketched below), and those wouldn't impact the size of the model as far as text tasks are concerned.
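
For anyone curious what "separate vision parameters" means in practice, here's a minimal PyTorch sketch of the idea, assuming a CogVLM-like design. All names and shapes are illustrative, not CogVLM's actual code:

    import torch
    import torch.nn as nn

    class VisualExpertAttention(nn.Module):
        # Toy single-head attention with a CogVLM-style "visual expert":
        # image tokens route through their own QKV projection while text
        # tokens keep the original language-model weights.
        def __init__(self, d_model):
            super().__init__()
            self.text_qkv = nn.Linear(d_model, 3 * d_model)    # original LM weights
            self.vision_qkv = nn.Linear(d_model, 3 * d_model)  # extra vision-only weights
            self.out = nn.Linear(d_model, d_model)

        def forward(self, x, is_image):
            # x: (batch, seq, d_model); is_image: (batch, seq) boolean mask
            qkv = torch.where(is_image.unsqueeze(-1),
                              self.vision_qkv(x),
                              self.text_qkv(x))
            q, k, v = qkv.chunk(3, dim=-1)
            attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
            return self.out(attn @ v)

    # On a text-only prompt the mask is all False, so only text_qkv affects
    # the output: the vision parameters add to the total parameter count but
    # not to the text results. (This naive version still evaluates both
    # projections; a real implementation would skip the unused one.)
    layer = VisualExpertAttention(d_model=64)
    tokens = torch.randn(1, 8, 64)
    out = layer(tokens, is_image=torch.zeros(1, 8, dtype=torch.bool))

The point being: a headline parameter count can include weights that a text-only benchmark never exercises, which muddies size comparisons against a text-only model like GPT-3.5 Turbo.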


You're right - we don't know yet, but as mentioned in other replies, 20B is a reasonable guesstimate that would still make the point about an unfair comparison.


I asked Gemini Pro how big it was, and it said 175 billion parameters, though that could have been a hallucination.



