For the second consecutive month, Google’s AI models have outperformed those from OpenAI.
The AI industry has only picked up the pace since the spring of 2025. Over the past month, the major players in the field – OpenAI, Google, Anthropic (Claude), Perplexity, and Mistral – have all announced updates to their AI tools, and new models were either launched or opened up to a broader audience. But when it comes to raw performance, who actually stands out? That is what Chatbot Arena aims to determine by ranking the top AI models currently available. Here are the main trends from May 2025.
Generative AI Models: Gemini Holds On to the Top Spot
Google was clearly the most active company in AI in May 2025. Its annual Google I/O event was almost entirely devoted to artificial intelligence, from images and video to online search and Gemini. That investment appears to be paying off: for the second month running, Gemini models lead the pack in the Chatbot Arena.
The top two spots go to Gemini 2.5 Pro and Gemini 2.5 Flash. These models, rolled out from late March 2025 onward, are positioned against OpenAI’s “o” series in the field of chain-of-thought reasoning, where the model breaks a task into intermediate steps before answering.
Top 10 Most Powerful AI Models in May 2025
Since the end of 2024, OpenAI’s models have been consistently outpaced by their competitors, but they remain firmly in the top 5. This month, o3 and ChatGPT-4o sit in 3rd and 4th place, respectively, while GPT-4.5 ranks 6th. The much-anticipated GPT-5, which Sam Altman has described as imminent, could turn the tide and let the company reclaim the top spot.
The rest of the top 10 includes several familiar names, such as Grok and DeepSeek, as well as a newcomer: Hunyuan-TurboS. Developed by Tencent, the Chinese giant behind WeChat, this model also incorporates a chain-of-thought reasoning mechanism.
Here are the 10 most powerful AI models in May 2025, according to Chatbot Arena:
1. Gemini 2.5 Pro: 1,446 (Elo score)
2. Gemini 2.5 Flash: 1,418
3. OpenAI o3: 1,409
4. ChatGPT-4o: 1,405
5. Grok-3: 1,399
6. GPT-4.5: 1,394
7. Gemini 2.5 Flash (previous version): 1,387
8. DeepSeek V3: 1,368
9. GPT-4.1: 1,365
10. Hunyuan-TurboS: 1,356
Chatbot Arena: Ranking Criteria
Chatbot Arena, developed by the Large Model Systems Organization (LMSYS), measures the performance of AI models based on user judgments. On the platform, users compare the answers of two anonymized models to the same prompt and vote for the one they find better. Because voters do not know which model produced which answer until after they vote, this head-to-head format helps limit brand bias in the evaluation.
The results of these matchups are used to assign each model an Elo score, the rating system well known from chess and competitive gaming. The score moves with performance: beating a higher-rated opponent earns more points than beating a weaker one, while losing to a lower-rated model costs more points than losing to a stronger one.
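To make the scoring concrete, here is a minimal Python sketch of a classic Elo update applied to a single user vote. The K-factor of 32 and the 400-point scale are standard chess defaults assumed here for illustration, not figures published by LMSYS, and Chatbot Arena’s actual leaderboard computation may differ.

```python
# Minimal sketch of a classic Elo update for one head-to-head vote.
# K-factor and 400-point scale are conventional defaults (assumptions),
# not the exact parameters used by the Chatbot Arena leaderboard.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one user vote."""
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta


# Example: a 1,400-rated model beats a 1,446-rated one and gains more
# points than it would for beating an equally rated opponent.
new_a, new_b = update_elo(1400, 1446, a_won=True)
print(round(new_a, 1), round(new_b, 1))  # ~1418.1 and ~1427.9
```

As the example shows, an upset win against a higher-rated model shifts roughly 18 points, whereas a win against an equal opponent would shift only half the K-factor, which is what drives the ranking movements described above.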

Jordan Park writes in-depth reviews and editorial opinion pieces for Touch Reviews. With a background in UI/UX design, Jordan offers a unique perspective on device usability and user experience across smartphones, tablets, and mobile software.