Gemini Omni
last release: May 19, 2026
powered by: Gemini Omni Flash
goblin vibe check:
basically nano banana for video, with enough world knowledge to matter for creators
google deepmind's multimodal creative model family for generating and editing video from text, image, audio, and video inputs. the first release, gemini omni flash, starts with conversational video creation and editing inside gemini, google flow, and youtube creation surfaces.
key features
creates video from text, image, audio, video, or mixed multimodal inputs
edits videos through step-by-step natural conversation while preserving scene continuity
supports avatar-style video creation from a user's own voice and likeness
combines Gemini reasoning with video generation for physics-aware and knowledge-grounded edits
outputs include SynthID watermarking and C2PA Content Credentials
spec & usage
available through the Gemini app and Google Flow for Google AI Plus, Pro, and Ultra subscribers
also rolling out at no cost to YouTube Shorts and YouTube Create users
developer and enterprise API access was described as coming in the following weeks
limitations
first release starts with video; broader output modalities are described as coming later
audio and speech editing capabilities are being tested cautiously for responsible-use reasons
scope:
visual, tool, video, consistency, cloud, paid, multimodal, model