Google Unveils Groundbreaking Gemini AI Model: A Multimodal Breakthrough

Google has announced the launch of Gemini, its most capable and general AI model to date. Gemini is multimodal, meaning it can understand, operate across, and combine different types of information, including text, images, audio, video, and code. It is also Google's most flexible model, able to run efficiently on everything from data centers to mobile devices.

“Gemini marks a milestone in AI development,” said Sundar Pichai, CEO of Google and Alphabet. “This model has the potential to radically change how we use technology. It can be used to create new applications that are more intuitive and helpful, as well as to improve our daily activities.”

The Gemini model was trained on a broad range of data, including text, images, audio, and code, using Google's supercomputing infrastructure, which enables it to learn to perform complex tasks.

Tests conducted by Google show that Gemini surpasses state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development.

In particular, the Gemini model:

  • Achieved a score of 90.0% on the MMLU (Massive Multitask Language Understanding) benchmark, becoming the first model to outperform human experts on that test.
  • Scored 59.4% on the MMMU (Massive Multi-discipline Multimodal Understanding) benchmark, which consists of multimodal tasks spanning multiple domains and requiring deliberate reasoning.
  • Surpassed previous state-of-the-art models on image-understanding benchmarks, without assistance from Optical Character Recognition (OCR) systems.

“Gemini is a groundbreaking model that has the potential to change the way we use technology,” said Jeff Dean, Director of Research at Google AI. “We are excited to see how people around the world will use this model to create new and innovative applications.”

The Gemini model is now being made available to developers and enterprise customers, and Google plans to make it available to a wider audience next year.

Starting today, Bard will use Gemini Pro. It will be available in English in more than 170 countries and territories, and Google plans to expand to additional languages, locations, and capabilities in the near future.

Starting December 13, developers and enterprise customers will be able to access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI.
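
For developers, a call to Gemini Pro through the Gemini API might look roughly like the Python sketch below. It assumes the google-generativeai SDK and an API key created in Google AI Studio; the model name, prompt, and key are illustrative placeholders, and exact package and method names should be checked against the official documentation.

  # Minimal sketch: calling Gemini Pro via the Gemini API from Python.
  # Assumes the google-generativeai SDK (pip install google-generativeai)
  # and an API key created in Google AI Studio.
  import google.generativeai as genai

  # Placeholder credential -- replace with your own key from Google AI Studio.
  genai.configure(api_key="YOUR_API_KEY")

  # "gemini-pro" is the text model name referenced at launch.
  model = genai.GenerativeModel("gemini-pro")

  # Send a single text prompt and print the model's response.
  response = model.generate_content("Explain in one sentence what a multimodal model is.")
  print(response.text)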