Google Cloud’s Post

Start building in a flash with Gemini 2.0 Flash—packed with new features and improvements ⚡ Starting today, developers can test Gemini 2.0 Flash Experimental via the Gemini API in Google AI Studio and Vertex AI, taking advantage of improved spatial understanding, multimodal and reasoning performance. And with the new Multimodal Live API, users can create interactive experiences that respond in real time, like in the demo below. Learn more → https://goo.gle/4gsj9W3

Yariv Adan

Early stage AI startups investor - reach out to me with opportunities! | Ex Senior Director @ Google AI, Google Zurich Site Lead

3d

Ok, I must give it to Google - this one launch is equal to 12 days of launches... 😇🤷‍♂️ Amazing responsive video and audio interaction! Killer feature in the browser - sending quite a few startups back to square one. Truly multimodal input and output: text and visual input and output are truly one modality now. Very impressive demonstration by Google of what they are best at - building amazing AI technology. If only they knew how to package and deliver it in an easy-to-use product, they would be unstoppable.

Gemini 2.0 Flash is a major step forward! 🙌 The improvements in spatial understanding and multimodal performance are impressive. Excited to dive into this and see how it opens up new possibilities for developers. Can't wait to explore more!

Incredible leap forward! 🚀 These spatial and multimodal capabilities will transform how we build AI solutions. Let's go! ⚡

Vikas Gupta ☁️

Having 22+ yr exp.- Sales & BDM in DC, Cloud & Cybersecurity,IMS ,Prof. Services HySc - AWS/Azure/GCP Staff Aug.(GCC -L1/L2/L3) /NOC, Cymmetri-IAM/PAM l Forcepoint-DLP/SASE/ZTNA l Seceon's MSOC (aiSIEM/aiXDR) l SOAR

3d

Built on Trillium, the sixth-generation TPUs, which powered 100% of Gemini 2.0 Flash. As of today it's available to all Gemini users so that they can build with it too, with the Deep Research feature, and it supports multimodal output such as natively generated images mixed with text and steerable multilingual text-to-speech (TTS) audio.

Google - Gemini 2.0 is delivering an abundance of features to users. Excited to learn about their experiences firsthand!

Free live multimodal model usage is crazy good. My previous project can be revived thanks to this low-latency audio-to-text model :D I hope there will be more WebSocket examples in the cookbook.

#Gemini 2.0 is a game-changer for builders and innovators. Google Cloud’s commitment to empowering creativity with cutting-edge AI tools is unmatched. Can’t wait to see how this transforms projects across industries. Thanks to the team for making big ideas a reality.

Jerome A. White

Senior Project Manager | Digital Transformation | Growth & Strategy Leader: Market Development and Value Creation (Cloud, AI, Agile)

2d

Heard about this today at the AI Summit in NY. Amazing work, Google Cloud. So now I don't need LiveKit or ElevenLabs to drive my user experience.

Ankit Pareek

Director of Technology Generative AI, Conversational AI, NLP, Intelligent Automation

2d

Does it have a built-in audio splitter, or does one need to bring a custom one?

Imran Bashir

GenAI, MLOps, IoT | Architect | Engineer | Consultant ◈ I Partner with Tech Leaders To Solve IoT, GenAI & Video Analytics Complexities, Architect Advanced Systems, Cut Costs and Fast-Track Deployments - Let's Talk !

2d

Beautiful. Can't wait to try out the streaming audio.
