Gemini AI can now turn photos into videos | The Verge

Google's Gemini AI can now turn still photos into short video clips. The photo-to-video feature, powered by Google's Veo 3 video model, lets users upload a reference image and generate an eight-second video. It goes beyond simple animation by adding AI-generated audio, including background noise, environmental sounds, and synchronized speech.

The feature is available to Google AI Ultra and Pro subscribers in select regions, starting today on the web and rolling out to mobile devices throughout the week. To use it, Gemini users select the "tools" option in the prompt bar, choose "video," and upload a photo.

Users can then add a text description of the movement they want, along with audio directions for dialogue, sound effects, and ambient noise. Google says the audio will be synchronized with the visuals. The resulting videos are delivered as MP4 files at 720p resolution in a 16:9 landscape format.

Google suggests several creative uses for the feature: animating everyday objects, bringing drawings and paintings to life, or adding movement to nature scenes. All generated videos will carry a visible watermark indicating they are AI-generated, alongside an invisible SynthID digital watermark.

A similar capability already exists in Flow, Google's generative AI filmmaking tool, which is also launching in 75 additional countries today. Bringing photo-to-video into Gemini means users can animate their photos without switching applications.

The synchronized audio gives users more to work with than silent animation alone, and the rollout across web and mobile makes the feature available to a broader user base.