Google DeepMind has launched Gemini Robotics On-Device, a new language model designed to run locally on robots without an internet connection. The model builds on the company's earlier Gemini Robotics model and can control a robot's movements. Developers can use natural language prompts to control the model and fine-tune it for various applications.
Google claims that in benchmark tests Gemini Robotics On-Device performs nearly as well as its cloud-based counterpart and outperforms other on-device models in general benchmarks. In demonstrations, robots performed tasks such as unzipping bags and folding clothes. The model was initially trained for ALOHA robots and was later adapted to a bi-arm Franka FR3 robot and to Apptronik's Apollo humanoid robot.
The Franka FR3 robot, in particular, handled tasks and objects it had not previously encountered, such as industrial belt assembly. To support development, Google DeepMind is also releasing a Gemini Robotics SDK, which lets developers train robots on new tasks through demonstrations using the MuJoCo physics simulator.
Other companies are developing similar models as the field of robotics and AI grows rapidly. Nvidia is creating a platform for humanoid robot foundation models, Hugging Face is developing open models and datasets, and Korean startup RLWRLD is working on foundational models for robots.