Start building in a flash with Gemini 2.0 Flash, packed with new features and improvements ⚡ Starting today, developers can test Gemini 2.0 Flash Experimental via the Gemini API in Google AI Studio and Vertex AI, taking advantage of improved spatial understanding, multimodal, and reasoning performance. And with the new Multimodal Live API, users can create interactive experiences that respond in real time, like in the demo below. Learn more → https://goo.gle/4gsj9W3
Gemini 2.0 Flash is a major step forward! 🙌 The improvements in spatial understanding and multimodal performance are impressive. Excited to dive into this and see how it opens up new possibilities for developers. Can't wait to explore more!
Incredible leap forward! 🚀 These spatial and multimodal capabilities will transform how we build AI solutions. Let's go! ⚡
Gemini 2.0 Flash is built on Trillium, Google's sixth-generation TPUs, which powered 100% of it. As of today it is available to all Gemini users so that they can build with it too, along with the Deep Research feature. It also supports multimodal output, such as natively generated images mixed with text and steerable multilingual text-to-speech (TTS) audio.
Free live multimodal model usage is crazy good. My previous project can be revived thanks to this low-latency audio-to-text model :D I hope there will be more WebSocket examples in the cookbook.
#Gemini 2.0 is a game-changer for builders and innovators. Google Cloud’s commitment to empowering creativity with cutting-edge AI tools is unmatched. Can’t wait to see how this transforms projects across industries. Thanks to the team for making big ideas a reality.
Heard about this today at the AI Summit in NY. Amazing work, Google Cloud. So now I don't need LiveKit or ElevenLabs to drive my user experience.
Does it have a built-in audio splitter, or does one need to bring a custom one?
Beautiful. Can't wait to try out the streaming audio.
Ok, I must give it to Google - this one launch is equal to 12 days of launches... 😇🤷‍♂️ Amazing responsive video and audio interaction! A killer feature in the browser, sending quite a few startups back to square one. Truly multimodal input and output: text and visual input and output are truly one modality now. A very impressive demonstration by Google of what they are best at - building amazing AI technology. If only they knew how to package and deliver it in an easy-to-use product, they would be unstoppable.