Google introduces built-in computer use tool in Gemini 3.5 Flash for GUI automation
By Mateo Quiros
Summary
Google has introduced a built-in computer use tool in Gemini 3.5 Flash, enabling AI agents to interact with graphical user interfaces (GUIs) by controlling mouse and keyboard actions. The tool allows developers to build agents that can navigate screens, click buttons, fill forms, and perform multi-step tasks across desktop and mobile environments. The article details the tool's architecture, performance benchmarks, safety features, and integration with the Gemini API, positioning it as a significant advancement in agentic AI capabilities.
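The interaction pattern described here (the model looks at the screen, proposes a mouse or keyboard action, and a safety check runs before the action is carried out) can be illustrated with a small observe-act loop. This is a minimal sketch under stated assumptions, not the Gemini API: `Action`, `FakeScreen`, `run_agent`, `scripted_model`, and the `is_safe` callback are hypothetical names invented for illustration, and the "model" is a scripted stub rather than a real call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One GUI action proposed by the model (hypothetical shape)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeScreen:
    """Stand-in for a real desktop or browser; it only records actions."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> str:
        # A real environment would return image bytes; this returns a summary.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def run_agent(
    screen: FakeScreen,
    propose: Callable[[str], Action],
    is_safe: Callable[[Action], bool],
    max_steps: int = 10,
) -> List[str]:
    """Observe the screen, ask for an action, check it, then apply it."""
    for _ in range(max_steps):
        action = propose(screen.screenshot())
        if action.kind == "done":
            break
        if not is_safe(action):
            # Assumed policy: stop the task when an action is flagged.
            screen.log.append("blocked")
            break
        screen.apply(action)
    return screen.log


def scripted_model() -> Callable[[str], Action]:
    """Stub that replays a fixed plan instead of calling a real model."""
    plan = iter([
        Action("click", x=120, y=80),
        Action("type", text="hello"),
        Action("done"),
    ])
    return lambda _screenshot: next(plan)


log = run_agent(FakeScreen(), scripted_model(), is_safe=lambda a: True)
print(log)
```

In a real agent, `propose` would send the screenshot to the model and parse the returned action, and `is_safe` would stand in for whatever safety checks the platform and the developer apply. Stopping on a flagged action is one possible policy; asking the user for confirmation is another.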
Key quotes
Computer use is now a built-in tool supported in Gemini 3.5 Flash, delivering our best performance yet for agentic computer use tasks.
The computer use tool enables the model to see and interact with graphical user interfaces (GUIs) — just like a human would — by controlling mouse and keyboard actions.
We've designed the computer use tool with safety at its core, including a dedicated safety classifier that monitors for potentially harmful actions.
Developers can now build agents that can navigate screens, click buttons, fill forms, and perform multi-step tasks across desktop and mobile environments.
This represents a significant step toward more capable and autonomous AI agents that can interact with the digital world on behalf of users.