Google DeepMind’s optimized AI model runs directly on robots | The Verge

Google DeepMind has developed an on-device version of its Gemini Robotics AI model, enabling robots to perform tasks without an internet connection. This new Vision-Language-Action (VLA) model offers similar dexterous capabilities to the original model released in March, but is optimized to run directly on robots.

The on-device model is designed to be small and efficient enough to run offline while performing nearly as well as the flagship Gemini Robotics model. It lets robots generalize to new situations, understand commands, and carry out tasks that require fine motor skills.

The on-device model can perform various tasks out of the box and can adapt to new situations with a relatively small number of demonstrations. Google trained the model on its ALOHA robot but successfully adapted it to different robot types, including Apptronik's Apollo humanoid robot and the Franka FR3 robot.

While the hybrid Gemini Robotics model remains more powerful, Google says it is impressed by how capable the on-device model is. The company positions it as an option for applications with poor connectivity or for companies with strict security requirements. To support the launch, Google is releasing a software development kit (SDK) for the on-device model.

This is the first time Google DeepMind has offered an SDK for one of its VLA models, and it lets developers evaluate and fine-tune the model. The on-device model and its SDK will initially be available only to a select group of trusted testers while Google continues to address safety concerns. The release is a step toward more versatile and accessible robots that can operate with limited or no internet access.

Together, on-device operation and the accompanying SDK give developers a way to explore and refine the capabilities of the Gemini Robotics model.