Google Unveils Agentic Vision: Revolutionizing Image Analysis in Gemini

February 16, 2026

With Agentic Vision, Gemini now actively analyzes images, zooming, annotating, and performing calculations to ground its responses in verifiable visual evidence.

In a blog post published on Tuesday, January 27, 2026, Google announced the launch of Agentic Vision in Gemini 3 Flash. The new feature “combines visual reasoning and code execution to base responses on visual evidence”. Here are the key takeaways.

Agentic Vision: Enhancing Google’s Image Analysis Precision

Agentic Vision “transforms image understanding from a static act into an active process”, the company explains. To achieve this, the model operates on a Think-Act-Observe loop.

Initially, the model analyzes the user’s query and the image to formulate a multi-step plan (Think). It then generates Python code to manipulate the image (crop, rotate, annotate) or to perform calculations and count elements (Act). The transformed image is added to the model’s context, allowing it to examine the new data with greater precision before generating its final response (Observe).
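
To make the loop concrete, here is a minimal, hypothetical sketch of that control flow. The `Step` type, the `model.next_step`/`model.finalize` interface, and the code runner are illustrative assumptions, not Google's actual implementation.

```python
from dataclasses import dataclass
from PIL import Image

@dataclass
class Step:
    code: str | None    # Python snippet the model wants to run, if any
    answer: str | None  # final answer once the model is confident

def run_generated_code(code: str, image: Image.Image) -> Image.Image:
    # Assumed runner: executes model-generated Pillow code that reads
    # `image` and leaves its output in `result`. A real system would
    # sandbox this execution.
    scope = {"image": image}
    exec(code, scope)
    return scope["result"]

def think_act_observe(model, image: Image.Image, query: str, max_steps: int = 3) -> str:
    """Hypothetical agentic vision loop (not Google's code)."""
    context: list = [image, query]
    for _ in range(max_steps):
        step: Step = model.next_step(context)  # Think: plan the next action
        if step.answer is not None:
            return step.answer                 # confident final response
        image = run_generated_code(step.code, image)  # Act: transform the image
        context.append(image)                  # Observe: new evidence in context
    return model.finalize(context)             # fall back to the best answer so far
```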

Leading AI models like Gemini typically process the world in a single static glance. If they fail to perceive a specific detail, such as a serial number on an electronic chip or a distant traffic sign, they are forced to guess, Google notes in its blog post.

According to Google, the new approach improves accuracy by 5 to 10% on its vision benchmarks.

New Features: Zoom and Inspection, Image Annotation, Visual Mathematics

Agentic Vision unlocks three key capabilities that enhance the accuracy and reliability of visual analysis:

  • Zoom and Inspection: the model automatically zooms in on fine details for deeper examination, ensuring that small but critical visual elements are not overlooked.
  • Image Annotation: the model executes Python code to draw directly on the image and mark the elements it has identified; the annotations serve as a visual check on the accuracy of its results (see the sketch after this list).
  • Visual Mathematics and Graphical Representation: the model analyzes complex charts and generates visualizations by executing Python code, replacing probabilistic guesses with verifiable computation.
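
As an illustration of the kind of Python the model might generate in its Act step, the following sketch zooms in on a region of interest and draws an annotation box with Pillow. The file names and crop coordinates are made up for the example.

```python
from PIL import Image, ImageDraw

# Hypothetical model-generated "Act" code: zoom in on a region of
# interest, then mark it so the result can be visually verified.
img = Image.open("board.jpg")  # placeholder file name

# Zoom and Inspection: crop a suspected serial-number region and
# enlarge it 4x so fine print becomes legible.
box = (420, 310, 620, 360)  # assumed region of interest
zoomed = img.crop(box).resize((800, 200))
zoomed.save("zoomed.png")

# Image Annotation: outline the same region on the full image as a
# verification that the right element was inspected.
annotated = img.copy()
ImageDraw.Draw(annotated).rectangle(box, outline="red", width=4)
annotated.save("annotated.png")
```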

How to Access Agentic Vision

Developers can use Agentic Vision through the Gemini API in Google AI Studio and Vertex AI to integrate the capability directly into their applications. They can also try it in the Google AI Studio playground by enabling “Code Execution” under the Tools section.
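
For illustration, here is a minimal request with the google-genai Python SDK that enables the Code Execution tool on an image query. The model identifier `gemini-3-flash` is assumed from the announcement and may differ from the name exposed by the API.

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

with open("chart.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-flash",  # assumed identifier; check the current model list
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "What is the exact value of the tallest bar? Zoom in to verify.",
    ],
    config=types.GenerateContentConfig(
        # Enabling Code Execution lets the model run Python to crop,
        # annotate, or compute over the image before answering.
        tools=[types.Tool(code_execution=types.ToolCodeExecution())],
    ),
)
print(response.text)
```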

For the general public, the rollout begins in the Gemini app: select “Thinking” from the model menu to use the feature.
