Google Launches Gemini 2.5 Pro Experimental AI Model

On Tuesday, Google introduced Gemini 2.5, a new family of AI reasoning models designed to pause and "think" before generating responses.

To kick off this new series, Google is rolling out Gemini 2.5 Pro Experimental, a multimodal AI model that the company claims is its most advanced yet. Starting today, developers can access it through Google AI Studio, while subscribers to the $20-a-month Gemini Advanced plan can use it in the Gemini app.

Going forward, Google says all its AI models will incorporate built-in reasoning capabilities.

Since OpenAI debuted its first AI reasoning model, o1, in September 2024, the industry has been in a race to match or surpass its performance. Today, companies like Anthropic, DeepSeek, Google, and xAI all offer AI reasoning models, which leverage extra computing power to fact-check and analyze problems before delivering responses.

These reasoning models have significantly improved performance on math and coding tasks, and many believe they will play a crucial role in the development of AI agents: autonomous systems capable of completing tasks with minimal human intervention. However, this increased reasoning ability comes at a higher computational cost.

Google has explored reasoning models before, launching a “thinking” version of Gemini last December. But with Gemini 2.5, the company is making its most serious push yet to challenge OpenAI’s “o” series.

Google claims Gemini 2.5 Pro surpasses its previous AI models, and some of its competitors' models, on multiple benchmarks. The company says the model is specifically designed to excel at building visually compelling web apps and at agentic coding applications.

On the Aider Polyglot benchmark, which measures code editing abilities, Gemini 2.5 Pro scored 68.6%, outperforming the top models from OpenAI, Anthropic, and DeepSeek. However, on SWE-bench Verified, which evaluates software development skills, it scored 63.8%, beating OpenAI’s o3-mini and DeepSeek’s R1 but falling short of Anthropic’s Claude 3.7 Sonnet, which led with 70.3%.

On Humanity’s Last Exam, a multimodal test covering math, humanities, and natural sciences, Gemini 2.5 Pro achieved 18.8%, surpassing most competing flagship models.

One of its standout features is its massive 1-million-token context window, allowing it to process around 750,000 words at once—longer than the entire Lord of the Rings book series. Google also plans to double this capacity to 2 million tokens soon.
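
The token and word figures above imply a ratio of roughly 0.75 words per token. As a rough illustration only, the short Python sketch below applies that ratio to estimate how many words a given context window can hold. The ratio is an assumption derived from the article's own numbers, not a figure published for Gemini's tokenizer, and the function names are hypothetical. Actual ratios vary by tokenizer, language, and content.

```python
# Assumption: about 0.75 English words per token, the ratio implied by
# the article's figures (1,000,000 tokens ~ 750,000 words). This is not
# an official tokenizer statistic.
WORDS_PER_TOKEN = 0.75


def estimated_words(context_tokens: int,
                    words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough word capacity of a context window of the given token size."""
    return int(context_tokens * words_per_token)


def fits_in_context(word_count: int, context_tokens: int,
                    words_per_token: float = WORDS_PER_TOKEN) -> bool:
    """True if a text of word_count words is estimated to fit."""
    return word_count <= estimated_words(context_tokens, words_per_token)


if __name__ == "__main__":
    print(estimated_words(1_000_000))  # current window: 750000
    print(estimated_words(2_000_000))  # planned window: 1500000
```

Under the same assumption, doubling the window to 2 million tokens would raise the estimate to about 1.5 million words.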

As for API pricing, Google hasn’t released details yet but promises to share more information in the coming weeks.
