Gemini Omni

last release: May 19, 2026

powered by Gemini Omni Flash

goblin vibe check:

basically nano banana for video, with enough world knowledge to matter for creators

google deepmind's multimodal creative model family for generating and editing video from text, image, audio, and video inputs. the first release, gemini omni flash, starts with conversational video creation and editing inside gemini, google flow, and youtube creation surfaces.

key features

creates video from text, image, audio, video, or mixed multimodal inputs

edits videos through step-by-step natural conversation while preserving scene continuity

supports avatar-style video creation from a user's own voice and likeness

combines Gemini reasoning with video generation for physics-aware and knowledge-grounded edits

spec & usage

available through the Gemini app and Google Flow for Google AI Plus, Pro, and Ultra subscribers

also rolling out at no cost to YouTube Shorts and YouTube Create users

developer and enterprise API access was described at launch as coming in the following weeks

limitations

first release starts with video; broader output modalities are described as coming later

audio and speech editing capabilities are being tested cautiously for responsible-use reasons

scope:

visual, tool, video, consistency, cloud, paid, multimodal, model

launch: May 19, 2026

