Google introduces built-in computer use tool in Gemini 3.5 Flash for GUI automation

By Mateo Quiros

6h ago · 5 min read · News

Summary

Google has introduced a built-in computer use tool in Gemini 3.5 Flash, enabling AI agents to interact with graphical user interfaces (GUIs) by controlling mouse and keyboard actions. The tool allows developers to build agents that can navigate screens, click buttons, fill forms, and perform multi-step tasks across desktop and mobile environments. The article details the architecture, including screenshot capture, action prediction, and execution, along with performance benchmarks showing state-of-the-art results on computer use evaluation tasks. It also covers safety considerations, rate limits, and practical implementation guidance for developers.
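The capture, predict, execute cycle described in the summary can be sketched as a simple loop. This is a minimal illustration, not Google's API: `Action`, `run_agent`, `scripted_predictor`, and the three callables are hypothetical names, and the model call is replaced by a scripted stub so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical UI action; the real tool's action schema may differ."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    capture: Callable[[], bytes],
    predict: Callable[[bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Repeat screenshot -> predicted action -> execution until done."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                 # 1. capture the current screen
        action = predict(screenshot, history)  # 2. ask the model for the next action
        if action.kind == "done":              # model signals the task is finished
            break
        execute(action)                        # 3. apply the action to the GUI
        history.append(action)
    return history


def scripted_predictor(script: List[Action]) -> Callable[[bytes, List[Action]], Action]:
    """Stand-in for the model: replays a fixed list of actions, then reports done."""
    steps = iter(script)

    def predict(screenshot: bytes, history: List[Action]) -> Action:
        return next(steps, Action("done"))

    return predict
```

In a real agent, `capture` would grab a screenshot, `predict` would call the model, and `execute` would drive the mouse and keyboard. The `max_steps` cap is an assumption added here so a confused model cannot loop forever.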

Source

Hacker News · Google introduces built-in computer use tool in Gemini 3.5 Flash for GUI automation · blog.google

Key quotes

Computer use is now a built-in tool supported in Gemini 3.5 Flash, delivering our best performance yet for agentic computer use tasks.
The computer use tool enables the model to see and interact with a computer screen, performing actions like clicking buttons, filling out forms, and navigating through applications.
We've designed the computer use tool with safety in mind, including rate limiting and action confirmation mechanisms to prevent unintended behaviors.
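The rate limiting and action confirmation mentioned in the last quote could look roughly like the gate below. This is a sketch under stated assumptions, not how Gemini implements it: `ActionGate`, the `RISKY_KINDS` set, and the sliding-window policy are all hypothetical choices made for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque

# Assumption: which action kinds need explicit confirmation is decided by the integrator.
RISKY_KINDS = {"submit_form", "delete"}


class ActionGate:
    """Allows an action only if it is within the rate limit and, when risky, confirmed."""

    def __init__(
        self,
        max_actions: int,
        per_seconds: float,
        confirm: Callable[[str], bool],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_actions = max_actions
        self.per_seconds = per_seconds
        self._confirm = confirm
        self._clock = clock
        self._stamps: Deque[float] = deque()  # times of recently allowed actions

    def allow(self, kind: str) -> bool:
        now = self._clock()
        # Drop timestamps that have fallen out of the sliding window.
        while self._stamps and now - self._stamps[0] >= self.per_seconds:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_actions:
            return False                      # rate limit reached
        if kind in RISKY_KINDS and not self._confirm(kind):
            return False                      # user declined a risky action
        self._stamps.append(now)
        return True
```

An agent loop would call `gate.allow(action.kind)` before executing each action and skip or pause when it returns `False`. The injectable `clock` is only there to make the limiter easy to test.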
Snippet from the RSS feed
A look at the built-in computer use tool in Gemini 3.5 Flash.
