With Agentic Vision, Gemini now actively analyzes images by zooming, annotating, and performing calculations to base its responses on verifiable visual evidence.
In a blog post published on Tuesday, January 27, 2026, Google announced the launch of Agentic Vision in Gemini 3 Flash. The new feature “combines visual reasoning and code execution to base responses on visual evidence”. Here are the key takeaways.
Agentic Vision: Enhancing Google’s Image Analysis Precision
Agentic Vision “transforms image understanding from a static act into an active process”, the company explains. To achieve this, the model operates on a Think-Act-Observe loop.
Initially, the model analyzes the user’s query and the image to formulate a multi-step plan (Think). It then generates Python code to manipulate the image (crop, rotate, annotate) or to perform calculations and count elements (Act). The transformed image is added to the model’s context, allowing it to examine the new data with greater precision before generating its final response (Observe).
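To make the loop concrete, here is a purely illustrative sketch of the kind of Python the model might generate during the “Act” step. The file name, crop box, and label below are invented for illustration and do not come from Google’s post.

```python
# Illustrative only: zoom into a region of interest and annotate it,
# the way Agentic Vision's generated code might during the "Act" step.
from PIL import Image, ImageDraw

img = Image.open("board_photo.jpg")  # hypothetical input image

# Act: crop to the region of interest and enlarge it for closer inspection.
region = img.crop((420, 310, 620, 410))          # (left, upper, right, lower)
zoomed = region.resize((region.width * 4, region.height * 4))

# Act: draw a box and label on the original image as a visual check.
draw = ImageDraw.Draw(img)
draw.rectangle((420, 310, 620, 410), outline="red", width=4)
draw.text((420, 280), "serial number?", fill="red")

# Observe: the transformed images would be added back to the model's context.
zoomed.save("zoomed_region.png")
img.save("annotated.png")
```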
Leading AI models like Gemini typically process the world in a single static glance. If they fail to perceive a specific detail—such as a serial number on an electronic chip or a distant traffic sign—they are forced to guess, Google laments in its blog post.
According to Google’s vision benchmarks, this new approach improves accuracy by 5 to 10%.
New Features: Zoom and Inspection, Image Annotation, Visual Mathematics
Agentic Vision unlocks three key capabilities that enhance the accuracy and reliability of visual analysis:
- Zoom and Inspection: the model zooms in on fine details on its own for deeper examination. This ensures that small but critical visual elements are not overlooked.
- Image Annotation: the model executes Python code to draw directly on the image and mark identified elements. Visual annotations serve as verification to ensure the accuracy of results.
- Visual Mathematics and Graphical Representation: the model analyzes complex charts and generates visualizations by executing Python code, replacing probability-based guesses with verifiable execution (see the sketch after this list).
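As an example of that third capability, the sketch below shows the idea behind “visual mathematics”: values the model claims to have read from a chart are re-computed and re-plotted with code instead of being estimated from pixels. The numbers and labels are invented for illustration.

```python
# Hypothetical sketch: verify chart readings by recomputing and re-plotting them.
import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [4.2, 5.1, 4.8, 6.3]  # values supposedly read from the source chart

# Compute the quantities the answer depends on, rather than eyeballing them.
total = sum(revenue)
growth = (revenue[-1] - revenue[0]) / revenue[0] * 100

# Re-plot the data as a verifiable artifact the model can inspect.
plt.bar(quarters, revenue)
plt.title(f"Re-plotted data: total {total:.1f}, Q1->Q4 growth {growth:.0f}%")
plt.savefig("verified_chart.png")
```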
How to Access Agentic Vision
Developers can use Agentic Vision through the Gemini API in Google AI Studio and Vertex AI to integrate the capability directly into their applications. They can also try it in the Google AI Studio playground by enabling “Code Execution” under the Tools section.
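For developers, a minimal sketch using the google-genai Python SDK might look like the following. The model identifier “gemini-3-flash”, the sample image, and the prompt are assumptions based on this article; check the official documentation for the released model name and tool configuration.

```python
# Minimal sketch: send an image to Gemini with code execution enabled,
# so the model can run its own Python during analysis.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # or set the GOOGLE_API_KEY env var

with open("circuit_board.jpg", "rb") as f:  # hypothetical input image
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-flash",  # assumed identifier for Gemini 3 Flash
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Read the serial number printed on the smallest chip.",
    ],
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())],
    ),
)
print(response.text)
```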
For the general public, deployment begins in the Gemini application; to use the feature, select “Thinking” from the model menu.

Jordan Park writes in-depth reviews and editorial opinion pieces for Touch Reviews. With a background in UI/UX design, Jordan offers a unique perspective on device usability and user experience across smartphones, tablets, and mobile software.