What browser tasks would you delegate to an agent that can plan and execute real UI actions? Google AI has introduced Gemini 2.5 Computer Use, a specialized model that plans and executes real UI actions in a live browser through a new action API. It is now available in public preview through Google AI Studio and Vertex AI. The model targets automated web testing and UI automation, with human-validated gains on web and mobile benchmarks, plus a safety layer that requires human confirmation for risky steps.
What actually ships?
Developers get a new computer_use tool that returns function calls such as click_at, type_text_at, or drag_and_drop. Client code executes each action (e.g., with Playwright or Browserbase), captures a fresh screenshot and URL, and loops until the task completes or a safety rule blocks it. The supported action space comprises 13 predefined actions—open_web_browser, wait_5_seconds, go_back, go_forward, search, navigate, click_at, hover_at, type_text_at, key_combination, scroll_document, scroll_at, and drag_and_drop—and can be extended with custom functions (e.g., open_app, long_press_at, go_home) for non-browser surfaces.
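The loop above can be sketched roughly as follows. This is an illustrative stand-in, not the real google-genai SDK: fake_model and execute_action are hypothetical placeholders for the Gemini 2.5 Computer Use model call and a Playwright/Browserbase executor, and the "done" terminal signal is an assumption for the sketch.

```python
def fake_model(history):
    """Stand-in for the model: returns the next UI action as a function call."""
    script = [
        {"name": "open_web_browser", "args": {}},
        {"name": "type_text_at", "args": {"x": 320, "y": 88, "text": "laptop deals"}},
        {"name": "click_at", "args": {"x": 540, "y": 88}},
        {"name": "done", "args": {}},  # hypothetical terminal signal
    ]
    return script[len(history)]

def execute_action(call):
    """Stand-in executor: would drive Playwright/Browserbase here, then
    return a fresh observation (screenshot + current URL) for the model."""
    return {"screenshot": b"<png>", "url": "https://example.com"}

def run_agent(max_steps=10):
    history = []
    for _ in range(max_steps):
        call = fake_model(history)           # model proposes the next action
        if call["name"] == "done":
            break
        observation = execute_action(call)   # client performs the UI action
        history.append((call, observation))  # feed the result back in
    return [c["name"] for c, _ in history]

print(run_agent())  # ['open_web_browser', 'type_text_at', 'click_at']
```

The essential contract is that the model never touches the browser itself; the client owns execution, which is what makes the per-step safety check enforceable.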
What are the scope and limits?
The model is optimized for web browsers; Google states it is not yet optimized for desktop OS-level control, though mobile scenarios can be driven through the same loop using custom actions. A built-in per-step safety service can block prohibited actions and requires user confirmation for “high-stakes” operations such as payments, sending messages, or accessing sensitive information.
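A client-side confirmation gate for those high-stakes steps could look like the sketch below. The action names and the requires_confirmation flag are illustrative assumptions, not the exact fields the API returns; the point is that confirmation happens in the client before execution.

```python
# Hypothetical set of high-stakes actions; the real safety service
# decides this server-side and flags the function call accordingly.
HIGH_STAKES = {"submit_payment", "send_message", "read_sensitive_data"}

def gate(call, confirm):
    """Execute only if the action is safe or the human explicitly confirms.
    `confirm` is a callable that asks the user and returns True/False."""
    if call["name"] in HIGH_STAKES or call.get("requires_confirmation"):
        if not confirm(call):
            return "blocked"
    return "executed"

# Usage: auto-deny anything high-stakes in an unattended run.
print(gate({"name": "click_at"}, confirm=lambda c: False))        # executed
print(gate({"name": "submit_payment"}, confirm=lambda c: False))  # blocked
```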
How does it perform?
- Online-Mind2Web (official): 69.0% pass@1, judged by human-majority vote and validated by the benchmark organizers.
- Browserbase matched harness: leads competing computer-use APIs in both accuracy and latency on Online-Mind2Web and WebVoyager under identical time/step/environment constraints. Google’s model card lists 65.7% (Online-Mind2Web) and 79.9% (WebVoyager) for the Browserbase runs.
- Latency/quality trade-off (Google figure): 70%+ accuracy at ~225 s median latency on the Browserbase Online-Mind2Web harness. Treat this as Google-reported with human evaluation.
- AndroidWorld (mobile): 69.7%, Google-measured via the same API with custom mobile actions, browser actions excluded.
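The AndroidWorld runs rely on custom mobile actions (open_app, long_press_at, go_home, per the action-space description above). As a sketch, such actions could be declared in the common OpenAPI-style function-calling shape; the exact wiring into the computer_use tool may differ in the real SDK, and the parameter schemas here are assumptions.

```python
# Hypothetical declarations for the custom mobile actions named in
# the article; schema shape follows generic function-calling JSON.
CUSTOM_MOBILE_ACTIONS = [
    {
        "name": "open_app",
        "description": "Open a mobile app by package/bundle id.",
        "parameters": {
            "type": "object",
            "properties": {"app_id": {"type": "string"}},
            "required": ["app_id"],
        },
    },
    {
        "name": "long_press_at",
        "description": "Long-press at screen coordinates.",
        "parameters": {
            "type": "object",
            "properties": {"x": {"type": "integer"}, "y": {"type": "integer"}},
            "required": ["x", "y"],
        },
    },
    {
        "name": "go_home",
        "description": "Return to the device home screen.",
        "parameters": {"type": "object", "properties": {}},
    },
]

print([a["name"] for a in CUSTOM_MOBILE_ACTIONS])
```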

Early production signals
- Automated UI test repair: Google’s payments platform team reports the model rehabilitates >60% of failed automated UI test executions. This figure comes from the public report rather than the main blog post.
- Operational speed: Poke.com (an early external tester) reports workflows often run ~50% faster than the next best alternative.
Gemini 2.5 Computer Use, a constrained API that exposes 13 documented UI actions and requires a client-side executor, is available in preview through Google AI Studio and Vertex AI. Google’s materials and model card report state-of-the-art results in web and mobile control, while Browserbase’s matched harness shows 65.7% on Online-Mind2Web at the lowest latency. The scope is browser-centric, with per-step safety checks and confirmation. Together, these data points support a measured evaluation for UI testing and web ops.
Check out the technical details and the GitHub page for tutorials, code, and notebooks.


