Google integrates computer use into Gemini 3.5 Flash to power next-generation AI agents

Google has announced a major upgrade to its Gemini ecosystem, integrating computer use capabilities directly into Gemini 3.5 Flash. The capability was previously available only in a separate Gemini 2.5 computer use model; it is now built into the main Flash model itself.

The update allows AI agents to go beyond text generation. They can now see interfaces, reason through tasks and take actions across browsers, mobile systems and desktop environments. This effectively turns the model into an execution layer, not just a conversational assistant.

Google said the goal is to support long-horizon workflows, in which multiple steps must be completed over time. Examples include enterprise automation tasks such as continuous software testing, repetitive digital operations and complex knowledge work inside professional tools.

Enterprise automation and AI agents expand across browser and desktop environments

With this integration, Gemini-powered agents can interact directly with software interfaces instead of relying only on APIs. That means they can click, navigate, input data and perform actions across real applications.
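
The observe, decide, act cycle described above can be pictured with a small sketch. It is not Gemini API code: the `Action` type, the `FakeScreen` environment and the scripted `decide` function are all hypothetical stand-ins, assuming only that a model proposes one interface action per step.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape)."""
    kind: str            # "click", "type", or "done"
    target: str = ""     # name of the UI element
    text: str = ""       # text to enter, for "type" actions


@dataclass
class FakeScreen:
    """Stand-in for a real browser or desktop; records what happened."""
    fields: dict = field(default_factory=dict)
    clicks: list = field(default_factory=list)

    def observe(self) -> dict:
        # A real agent would capture a screenshot here.
        return {"fields": dict(self.fields), "clicks": list(self.clicks)}

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.clicks.append(action.target)
        elif action.kind == "type":
            self.fields[action.target] = action.text
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def decide(observation: dict) -> Action:
    """Scripted placeholder for the model call (assumption, not a real API)."""
    if "email" not in observation["fields"]:
        return Action("type", target="email", text="user@example.com")
    if "submit" not in observation["clicks"]:
        return Action("click", target="submit")
    return Action("done")


def run_agent(screen: FakeScreen, max_steps: int = 10) -> int:
    """Loop: observe the screen, ask for an action, apply it, repeat."""
    for step in range(max_steps):
        action = decide(screen.observe())
        if action.kind == "done":
            return step
        screen.apply(action)
    raise RuntimeError("step budget exhausted before task finished")


screen = FakeScreen()
steps_taken = run_agent(screen)
```

The step budget in `run_agent` is a design choice for the sketch: a bounded loop keeps a confused agent from acting indefinitely.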

This shift is important for enterprise users because many workflows still depend on legacy systems that do not have modern API access. Computer use allows AI systems to bridge that gap by operating software the way a human would.

Developers can access this capability through the Gemini API and the Gemini Enterprise Agent Platform. Google also launched a demo environment through Browserbase to let developers test how agents behave in real-time environments.

The update positions Gemini more strongly in the growing market for AI agents that can perform tasks end-to-end rather than just respond with information.

Google adds safety controls to reduce prompt injection risks in live AI systems

Alongside the new capability, Google placed heavy emphasis on safety design. One of the key risks in computer use systems is prompt injection, where hidden or malicious instructions inside web pages or apps manipulate an AI agent.

To address this, Google introduced targeted adversarial training methods designed to make the system more resistant to these attacks. This training helps the model distinguish between trusted instructions and potentially harmful embedded prompts.

The company also added two optional enterprise safeguard systems. One requires explicit user confirmation before sensitive actions are completed. The other automatically stops tasks if it detects indirect prompt injection attempts during execution.
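
To make the two safeguards concrete, here is a minimal sketch of how a developer might wire similar checks into their own agent loop. Everything in it is an assumption for illustration: the `SENSITIVE_KINDS` list, the `run_with_safeguards` helper and the toy `naive_detector` are hypothetical and do not reflect how Google's safeguards are implemented or configured.

```python
from typing import Callable, Iterable

SENSITIVE_KINDS = {"purchase", "delete", "send_email"}  # illustrative list


class TaskHalted(Exception):
    """Raised when a step is flagged as a suspected injection."""


def run_with_safeguards(
    steps: Iterable[dict],
    confirm: Callable[[dict], bool],
    looks_injected: Callable[[dict], bool],
) -> list:
    """Execute steps, pausing for confirmation and halting on suspicion."""
    executed = []
    for step in steps:
        if looks_injected(step):
            # Second safeguard: stop the whole task, not just this step.
            raise TaskHalted(f"stopped before step: {step['kind']}")
        if step["kind"] in SENSITIVE_KINDS and not confirm(step):
            continue  # first safeguard: user declined, skip this action only
        executed.append(step["kind"])
    return executed


def naive_detector(step: dict) -> bool:
    # Toy string match for the demo; a production detector would be
    # far more sophisticated than this.
    return "ignore previous instructions" in step.get("page_text", "").lower()


plan = [
    {"kind": "navigate", "page_text": "Welcome"},
    {"kind": "purchase", "page_text": "Checkout"},
    {"kind": "click", "page_text": "Thanks"},
]

# The user declines every confirmation, so the purchase is skipped.
approved = run_with_safeguards(
    plan, confirm=lambda step: False, looks_injected=naive_detector
)
```

In this sketch a declined confirmation skips only the sensitive step, while a suspected injection aborts the entire task, mirroring the difference between the two safeguards described above.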

Google recommends that developers combine these protections with sandboxed environments, human verification steps and strict access control policies. These layers are meant to reduce risk when AI agents operate in live business systems.
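
One simple form a strict access control policy could take is a host allowlist that the agent's navigation requests must pass. The sketch below is an assumption about how a developer might do this, not something Google prescribes; `ALLOWED_HOSTS` and `is_allowed` are hypothetical names, and the example hosts are placeholders.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"intranet.example.com", "tickets.example.com"}  # hypothetical


def is_allowed(url: str) -> bool:
    """Permit only HTTPS URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Matching the exact hostname, rather than checking whether the URL merely contains an allowed name, means a lookalike such as `intranet.example.com.evil.test` is rejected.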
