In recent developments, Google has introduced Gemini, a family of models positioned as a direct rival to OpenAI's GPT-4. The Gemini platform comprises three distinct models, each varying in size and capability. While Gemini Ultra, the most advanced model, tailored for "highly complex tasks," is not yet publicly available, Google claims it surpasses GPT-4 in multiple domains, including knowledge in fields such as history and law, Python code generation, and tasks demanding multi-step reasoning.

In a groundbreaking achievement, Gemini Ultra outperformed GPT-4 on the Massive Multitask Language Understanding test (MMLU), often likened to the "SATs for AI models." The MMLU, however, goes well beyond typical college prep exams by covering a broad spectrum of 57 subjects, including math, physics, history, law, medicine, and ethics. It evaluates both world knowledge and problem-solving capabilities. According to Google, Gemini Ultra achieved a remarkable score of 90% on the MMLU, surpassing GPT-4's score of 86.4%.
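To make the headline numbers concrete, benchmarks like the MMLU boil down to accuracy over multiple-choice questions, typically computed per subject and then averaged. The sketch below is illustrative only, with made-up data; it is not Google's or the benchmark's actual evaluation code, and the function name and data layout are assumptions for the example.

```python
from collections import defaultdict

def mmlu_style_score(results):
    """Compute per-subject accuracy and a macro average.

    results: list of (subject, predicted_choice, correct_choice) tuples,
    one per multiple-choice question. Illustrative, not official code.
    """
    per_subject = defaultdict(lambda: [0, 0])  # subject -> [correct, total]
    for subject, predicted, correct in results:
        per_subject[subject][1] += 1
        if predicted == correct:
            per_subject[subject][0] += 1
    subject_acc = {s: c / t for s, (c, t) in per_subject.items()}
    # Macro average: each subject weighs equally regardless of question count.
    overall = sum(subject_acc.values()) / len(subject_acc)
    return subject_acc, overall

# Toy example: two subjects, four made-up questions.
results = [
    ("physics", "B", "B"),
    ("physics", "C", "A"),
    ("law", "D", "D"),
    ("law", "A", "A"),
]
subject_acc, overall = mmlu_style_score(results)
print(subject_acc)          # {'physics': 0.5, 'law': 1.0}
print(round(overall, 2))    # 0.75
```

A score such as "90% on the MMLU" is the result of exactly this kind of aggregation across the benchmark's 57 subjects, though published reports may differ in how subjects are weighted.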

What makes Gemini Ultra's accomplishment even more noteworthy is that it stands as the first model to outperform human experts on the MMLU. Human experts score approximately 89.8%, according to Google's technical report on Gemini.

Reflecting on this achievement, Kevin Roose noted on The New York Times tech podcast Hard Fork that just a few years ago, a model scoring 90% on the MMLU and surpassing the human-expert threshold would have been considered indicative of Artificial General Intelligence (AGI). AGI is a theoretical form of artificial intelligence exhibiting complex human capabilities such as common sense and consciousness.

While GPT-4 did outperform Gemini Ultra in the evaluation of common sense reasoning abilities for everyday tasks, Google emphasizes that Gemini possesses a unique advantage as it is natively multimodal. This means it was specifically designed to process various types of data, including text, audio, code, images, and video, from the ground up. In contrast, other multimodal models were created by combining text-only, vision-only, and audio-only models in a less optimal manner, according to Oriol Vinyals, the Vice President of Research for Google’s DeepMind.

Furthermore, Google asserts that Gemini’s design allows it to understand inputs more effectively than existing multimodal models. Researchers from the SemiAnalysis blog also suggest that Gemini is likely to outperform GPT-4 due to its sheer computing power.

Despite the high expectations set by Gemini Ultra, it remains to be seen how the trio of Gemini models will fare against OpenAI, which already holds an advantage in consumer awareness.

Early feedback on the less advanced Gemini Pro, accessible through Google's chatbot Bard, has been generally positive. However, concerns about accuracy and hallucinations have emerged, including instances where it directed users to Google Search for answers to controversial questions. The competition between Gemini and OpenAI awaits further exploration and scrutiny as the landscape of advanced AI models continues to evolve.
