Gemini 3.5 Flash Adds Native Computer Use for Agentic Workflows

Key Takeaways

  • Integrates computer use natively into Gemini 3.5 Flash, enabling agents to interact with desktop, mobile, and browser interfaces.
  • Enhances enterprise automation and software testing by allowing models to perform complex, long-horizon tasks across professional applications.
  • Introduces safety features, including targeted adversarial training and two optional enterprise safeguards, to guard against unauthorized actions and prompt injection.

Introducing computer use in Gemini 3.5 Flash

Google DeepMind has announced that computer use is now a built-in tool in Gemini 3.5 Flash. Because the capability ships in the main Gemini Flash model, developers building agentic workflows get a more streamlined way to create agents that interact across browser, mobile, and desktop environments.

Expanding Agentic Capabilities

Computer use was previously available only as a standalone Gemini 2.5 computer use model. With native integration, Gemini 3.5 Flash can see, reason, and take action across platforms. This builds on the model’s existing strengths in function calling and grounding tools such as Search and Maps. Google says the result is improved performance on complex, long-horizon tasks, including enterprise automation, continuous software testing, and knowledge work across professional applications.

Developers and enterprises can access these capabilities through the Gemini API and the Gemini Enterprise Agent Platform. Example applications include analyzing application features and auditing documentation for accessibility issues.
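
The "see, reason, and take action" cycle can be pictured as a simple loop. The Python sketch below is illustrative only and does not call the Gemini API: `Action`, `FakeEnvironment`, `run_agent`, and `scripted_policy` are hypothetical names, and the scripted policy stands in for the model so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape)."""
    kind: str          # e.g. "click", "type", or "done"
    target: str = ""
    text: str = ""


@dataclass
class FakeEnvironment:
    """Stand-in for a browser or desktop session; it only records actions."""
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real environment would return image data; this returns a summary.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        self.log.append((action.kind, action.target, action.text))


def run_agent(
    goal: str,
    env: FakeEnvironment,
    propose: Callable[[str, str], Action],
    max_steps: int = 10,
) -> bool:
    """Observe, ask the policy for the next action, apply it, and repeat.

    Returns True if the policy signals completion within max_steps.
    """
    for _ in range(max_steps):
        observation = env.screenshot()          # see
        action = propose(goal, observation)     # reason
        if action.kind == "done":
            return True
        env.apply(action)                       # act
    return False


def scripted_policy(goal: str, observation: str) -> Action:
    """Placeholder for the model: a fixed two-step script, then done."""
    steps = {
        "screen after 0 actions": Action("click", target="search box"),
        "screen after 1 actions": Action("type", target="search box", text=goal),
    }
    return steps.get(observation, Action("done"))
```

In a real agent, `propose` would send the screenshot and goal to the model and parse the action it returns. The `max_steps` cap is a design choice of this sketch that keeps a long-horizon task from running indefinitely.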

Prioritizing Safety and Security

To address the risks that agents face in live environments, such as prompt injection, Google has applied targeted adversarial training for computer use in Gemini 3.5 Flash. This defense-in-depth strategy is supported by two optional enterprise safeguard systems. These systems allow organizations to require explicit user confirmation for sensitive or irreversible actions and to automatically halt tasks when an indirect prompt injection is detected.

Google encourages developers to complement these built-in safeguards with their own security measures. Recommended best practices include secure sandboxing, human-in-the-loop verification, and strict access controls. Further details are available in the official best practices documentation.
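
As an illustration of the human-in-the-loop pattern described above, the following Python sketch gates each proposed action behind two checks on the application side. It is not Google's safeguard implementation: `ProposedAction`, `guarded_execute`, `TaskHalted`, the `SENSITIVE_KINDS` set, and the keyword-based `looks_like_injection` heuristic are all assumptions made for the example, and real injection detection is considerably more involved.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed policy: which action kinds count as sensitive or irreversible.
SENSITIVE_KINDS = {"purchase", "delete", "send_email"}


class TaskHalted(Exception):
    """Raised when the task should stop instead of continuing."""


@dataclass(frozen=True)
class ProposedAction:
    kind: str
    detail: str


def looks_like_injection(page_text: str) -> bool:
    """Toy keyword heuristic, for illustration only."""
    markers = ("ignore previous instructions", "disregard your instructions")
    lowered = page_text.lower()
    return any(marker in lowered for marker in markers)


def guarded_execute(
    action: ProposedAction,
    page_text: str,
    confirm: Callable[[ProposedAction], bool],
    execute: Callable[[ProposedAction], None],
) -> str:
    """Run an action only if it passes both checks.

    Halts the task on suspected injection, and asks the user before any
    action whose kind is in SENSITIVE_KINDS.
    """
    if looks_like_injection(page_text):
        raise TaskHalted("possible indirect prompt injection in page content")
    if action.kind in SENSITIVE_KINDS and not confirm(action):
        return "declined"
    execute(action)
    return "executed"
```

Here `confirm` could be a prompt shown to the user, and `execute` the function that performs the step inside a sandboxed session. Checking for injection before asking for confirmation means a compromised page never reaches the user as a confirmation request.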

Getting Started with Development

Developers interested in exploring these capabilities can begin by testing the technology in a demo environment hosted by Browserbase. For those ready to move into development, reference implementations and comprehensive documentation are available through the Gemini API and the Gemini Enterprise Agent Platform. These resources are designed to help users integrate computer use into their own custom agent workflows.
