Do we actually know that Gemini Pro is < 20 billion params? Their paper revealed that Nano-1 is 1.8B and Nano-2 is 3.25B, but it omitted any details about Gemini Pro or Ultra, and I couldn't find anything further online beyond rumors and widely varying speculation.
Not the person you replied to, but… Microsoft apparently revealed that GPT-3.5 Turbo is 20 billion parameters. Gemini Pro seems to perform only slightly better than GPT-3.5 Turbo on some benchmarks, and worse on others. If Gemini Pro is significantly larger than 20 billion, that would be embarrassing for Google; if it is significantly smaller, that would be good for Google.
It seems reasonable to me to assume it’s somewhere in the neighborhood of 20 billion, but I agree it is worthwhile to recognize that we don’t actually know.
I don't think it would necessarily be embarrassing for Google, because Gemini Pro is multimodal while GPT-3.5 Turbo is text-only. Given that difference, it wouldn't seem unrealistic to me if Gemini Pro were bigger, but it seems like we just don't know.
Even so, Google treats Gemini Pro Vision as a separate model from Gemini Pro, so it could have separate parameters dedicated to vision (as CogVLM does), and those wouldn't affect the effective size of the model as far as text tasks are concerned. A rough sketch of that idea is below.
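
To make "separate parameters dedicated to vision" concrete, here's a minimal PyTorch sketch of the CogVLM-style idea. This is a hypothetical simplification (CogVLM's visual expert also duplicates the attention projections, not just the FFN), and all names here are mine, not from any real model: image tokens are routed through their own weights, so for text-only input the vision parameters never influence the output.

    import torch
    import torch.nn as nn

    class VisualExpertFFN(nn.Module):
        """Toy CogVLM-style 'visual expert' (hypothetical simplification):
        image tokens get their own FFN weights, while text tokens keep
        the base LLM's original weights."""

        def __init__(self, d_model: int, d_ff: int):
            super().__init__()
            self.text_ffn = nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            self.vision_ffn = nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))

        def forward(self, x: torch.Tensor, is_image: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq, d_model); is_image: (batch, seq) bool mask.
            # Route each token through the expert matching its modality.
            return torch.where(is_image.unsqueeze(-1),
                               self.vision_ffn(x),
                               self.text_ffn(x))

    # Text-only input: the vision expert's weights never influence the
    # output, so for text benchmarks the model behaves as if it only
    # had the text-path parameters.
    layer = VisualExpertFFN(d_model=64, d_ff=256)
    x = torch.randn(1, 6, 64)
    is_image = torch.zeros(1, 6, dtype=torch.bool)  # no image tokens
    y = layer(x, is_image)  # identical to layer.text_ffn(x)

The point for the parameter-count debate: under this kind of design, a headline figure that includes vision weights overstates the capacity actually used for text tasks.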