Google introduces built-in computer use tool in Gemini 3.5 Flash for GUI automation
By Mateo Quiros
Summary
Google has introduced a built-in computer use tool in Gemini 3.5 Flash, enabling AI agents to interact with graphical user interfaces (GUIs) by controlling mouse and keyboard actions. The tool allows developers to build agents that can navigate screens, click buttons, fill forms, and perform multi-step tasks across desktop and mobile environments. The article details the architecture, including screenshot capture, action prediction, and execution, along with performance benchmarks showing state-of-the-art results on computer use evaluation tasks. It also covers safety considerations, rate limits, and practical implementation guidance for developers.
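The capture, predict, execute cycle mentioned in the summary can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions: `Action`, `run_agent`, and the stub callables are hypothetical names invented for this example, not part of the Gemini API, and the "model" is a scripted stand-in rather than a real call to Gemini 3.5 Flash.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action record; the kinds used here are illustrative only."""
    kind: str          # e.g. "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    goal: str,
    capture: Callable[[], bytes],
    predict: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat screenshot -> predicted action -> execution until done or max_steps."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                       # 1. observe the screen
        action = predict(goal, screenshot, history)  # 2. ask the model for the next action
        if action.kind == "done":
            break
        execute(action)                              # 3. perform the mouse/keyboard action
        history.append(action)
    return history


# Stand-ins so the sketch runs without a real screen or model (assumptions).
def fake_capture() -> bytes:
    return b"fake-screenshot"


def scripted_predict(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [Action("click", x=120, y=40), Action("type", text="hello"), Action("done")]
    return script[len(history)]


executed: List[Action] = []
history = run_agent("fill the form", fake_capture, scripted_predict, executed.append)
```

In a real agent, `predict` would wrap a model request and `execute` would drive a browser or operating-system automation layer; the `max_steps` cap is one simple way to keep a multi-step task bounded.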
Key quotes
Computer use is now a built-in tool supported in Gemini 3.5 Flash, delivering our best performance yet for agentic computer use tasks.
The computer use tool enables the model to see and interact with a computer screen, performing actions like clicking buttons, filling out forms, and navigating through applications.
We've designed the computer use tool with safety in mind, including rate limiting and action confirmation mechanisms to prevent unintended behaviors.
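The quotes do not say how rate limiting or action confirmation are implemented. As a rough sketch of the general idea on the client side, the hypothetical `GuardedExecutor` below asks for confirmation before actions flagged as risky and enforces a minimum gap between actions. Every name, the risky-action set, and the interval are assumptions made for illustration, not Google's mechanism.

```python
import time
from typing import Callable, Iterable, Optional


class GuardedExecutor:
    """Hypothetical wrapper: confirm risky actions and space out execution."""

    def __init__(
        self,
        execute: Callable[[str], None],
        confirm: Callable[[str], bool],
        risky_kinds: Iterable[str],
        min_interval_s: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._execute = execute
        self._confirm = confirm
        self._risky = set(risky_kinds)
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def __call__(self, kind: str) -> bool:
        # Action confirmation: skip risky actions unless explicitly approved.
        if kind in self._risky and not self._confirm(kind):
            return False
        # Rate limiting: wait until min_interval_s has passed since the last action.
        if self._last is not None:
            wait = self._min_interval_s - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        self._execute(kind)
        return True


ran = []
guard = GuardedExecutor(
    ran.append,
    confirm=lambda kind: False,          # stand-in for a user who declines
    risky_kinds={"submit_payment"},      # illustrative choice of risky action
    min_interval_s=0.0,
)
guard("click")            # allowed: not in the risky set
guard("submit_payment")   # blocked: confirmation was declined
```

Passing `clock` and `sleep` as parameters keeps the wrapper easy to test with a fake clock; a production setup would replace the `confirm` lambda with a real prompt to the user.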
