Microsoft Research has unveiled Fara1.5, a new family of computer-use agent (CUA) models designed to navigate web browsers and perform tasks autonomously. Available in 4B, 9B, and 27B parameter sizes, these models are built upon Qwen3.5 base checkpoints and integrated with MagenticLite, a sandboxed browser interface. By processing screenshots and executing mouse and keyboard inputs, Fara1.5 aims to bridge the gap between intent and action in complex digital environments.
Performance Benchmarks and Capabilities
The Fara1.5-27B model achieves a 72% task success rate on the Online-Mind2Web benchmark. This score surpasses several prominent competitors, including OpenAI’s Operator at 58.3%, Google’s Gemini 2.5 Computer Use at 57.3%, and Yutori’s Navigator n1 at 64.7%. The 9B variant scores 63.4%, a 29.3-point improvement over the predecessor Fara-7B’s 34.1% success rate.
Beyond Online-Mind2Web, the models have been evaluated on WebVoyager, where the 27B, 9B, and 4B variants achieved scores of 88.6%, 86.6%, and 80.8%, respectively. These evaluations, which were averaged over three independent runs using Browserbase to ensure session stability, highlight the competitive edge of the Fara1.5 family against peers like MolmoWeb 8B, GUI-Owl-1.5 8B, and Holo2 8B.
Architecture and Synthetic Training
The Fara1.5 models operate through an observe-think-act loop, analyzing conversation history alongside the three most recent browser screenshots to determine the next action. The action space encompasses standard input commands, web-specific tasks like searching, and meta-actions such as memorizing facts or requesting user clarification. This design allows the agents to manage longer horizons and collaborate more effectively with human users.
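The loop described above can be sketched roughly as follows. This is a minimal illustration, not Microsoft's implementation: the names `Action`, `AgentState`, `run_episode`, and the `observe`, `policy`, and `act` callables are all hypothetical stand-ins for the browser and the model. The only details taken from the article are the observe-think-act ordering and the window of three most recent screenshots.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List

# The article states that the three most recent screenshots are used.
MAX_SCREENSHOTS = 3


@dataclass
class Action:
    # Illustrative action kinds: "click", "type", "memorize", "ask_user", "stop".
    kind: str
    argument: str = ""


@dataclass
class AgentState:
    history: List[str] = field(default_factory=list)
    # A bounded deque silently drops the oldest screenshot once full.
    screenshots: Deque[bytes] = field(
        default_factory=lambda: deque(maxlen=MAX_SCREENSHOTS)
    )


def run_episode(
    task: str,
    observe: Callable[[], bytes],
    policy: Callable[[List[str], List[bytes]], Action],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> AgentState:
    """Run a hypothetical observe-think-act loop until "stop" or max_steps."""
    state = AgentState(history=[f"user: {task}"])
    for _ in range(max_steps):
        # Observe: capture the current browser view.
        state.screenshots.append(observe())
        # Think: the policy sees the full history but at most three screenshots.
        action = policy(state.history, list(state.screenshots))
        state.history.append(f"agent: {action.kind} {action.argument}".strip())
        if action.kind == "stop":
            break
        # Act: apply the chosen action to the browser.
        act(action)
    return state
```

Keeping the conversation history unbounded while capping the screenshots is one plausible way to trade context length against visual recency; the article does not say how the real models manage this.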
Training for these models was supported by FaraGen1.5, a synthetic data pipeline that generated approximately two million training samples. To address the challenges of gated domains (sites that require authentication or involve irreversible actions), the team created six functional app clones known as FaraEnvs. These environments, which include Mail, Calendar, Stream, ML, Stay, and Scheduler, were built using GitHub Copilot CLI. The training mix consisted of 60% web trajectories and 12.8% synthetic environments, with the remainder split among form filling, grounding, and visual question answering.
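As a rough back-of-the-envelope check on the training mix, the snippet below works out what the stated shares would mean in sample counts. It assumes the percentages apply to sample counts and treats "approximately two million" as exactly 2,000,000; both are assumptions made for illustration. The article does not break down the remaining share among form filling, grounding, and visual question answering, so it is kept as a single bucket.

```python
# Assumption: "approximately two million" is treated as exactly 2,000,000.
TOTAL_SAMPLES = 2_000_000

# Shares stated in the article; the rest is not broken down.
MIX = {
    "web_trajectories": 0.60,
    "synthetic_environments": 0.128,
}


def remainder_share(mix: dict) -> float:
    """Share left over for form filling, grounding, and VQA combined."""
    return round(1.0 - sum(mix.values()), 6)


def approx_counts(total: int, mix: dict) -> dict:
    """Approximate per-category sample counts under the stated assumptions."""
    counts = {name: round(total * share) for name, share in mix.items()}
    counts["other (form filling, grounding, VQA)"] = total - sum(counts.values())
    return counts


if __name__ == "__main__":
    print(remainder_share(MIX))
    for name, count in approx_counts(TOTAL_SAMPLES, MIX).items():
        print(f"{name}: ~{count:,}")
```

Under these assumptions, the unspecified categories together account for a little over a quarter of the mix.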
Safety and User Interaction
Safety is a core component of the Fara1.5 framework. The agents are trained to pause and solicit user input in three specific scenarios: when a task requires missing personal information, when the task description is ambiguous, or when an irreversible action is about to be performed. All agent actions within the MagenticLite interface are logged and auditable, with the sandboxed browser serving as a security boundary between the agent and the user’s machine. These measures align with Microsoft’s Responsible AI Policy, ensuring that the agents operate within defined safety parameters while executing complex web-based workflows.
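The three pause conditions can be pictured as a gate in front of each action. The sketch below is purely illustrative: according to the article, this behavior is trained into the models, whereas the code here uses explicit flags and rules. All names (`PauseReason`, `ProposedStep`, `pause_reason`, `execute_with_gate`) are hypothetical, and the `ask_user` callback is assumed to return `True` only when the user has resolved the issue and approved the step.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class PauseReason(Enum):
    MISSING_PERSONAL_INFO = "missing personal information"
    AMBIGUOUS_TASK = "ambiguous task description"
    IRREVERSIBLE_ACTION = "irreversible action"


@dataclass
class ProposedStep:
    description: str
    # Hypothetical flags; a trained agent would infer these from context.
    needs_personal_info: bool = False
    task_is_ambiguous: bool = False
    is_irreversible: bool = False


def pause_reason(step: ProposedStep) -> Optional[PauseReason]:
    """Return why the agent should pause, or None if it may proceed."""
    if step.needs_personal_info:
        return PauseReason.MISSING_PERSONAL_INFO
    if step.task_is_ambiguous:
        return PauseReason.AMBIGUOUS_TASK
    if step.is_irreversible:
        return PauseReason.IRREVERSIBLE_ACTION
    return None


def execute_with_gate(
    step: ProposedStep,
    ask_user: Callable[[ProposedStep, PauseReason], bool],
    perform: Callable[[ProposedStep], None],
    audit_log: List[str],
) -> bool:
    """Perform the step unless a pause condition applies and the user declines."""
    reason = pause_reason(step)
    if reason is not None:
        audit_log.append(f"paused: {step.description} ({reason.value})")
        if not ask_user(step, reason):
            audit_log.append(f"skipped: {step.description}")
            return False
    perform(step)
    audit_log.append(f"performed: {step.description}")
    return True
```

Every branch appends to `audit_log`, which loosely mirrors the article's point that agent actions are logged and auditable.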
