2001: A Space Odyssey is over 50 years old, yet its talking AI, HAL 9000, remains iconic. The reality of talking AIs, however, has been fairly disappointing – Siri, Alexa and Google Assistant are useful, but they rarely feel smart. That is changing, and changing fast.

Google is setting the stage (literally) for its annual I/O conference (you can watch the livestream here; it starts at 17:00 UTC) and has shared a teaser of what’s to come – its Gemini AI has learned new tricks.

It’s a “multi-modal” AI, meaning it can seamlessly combine text, audio and imagery. That allows it to run on a Pixel phone, see the world through the phone’s camera and hold a conversation about what’s going on around it. Check out the brief demo below:

In case you missed it, check out OpenAI’s GPT-4o demo, which shows off very similar skills – an AI talking to a human and discussing their surroundings in real time. A battle of AIs is brewing, especially now that Apple wants to integrate ChatGPT into iOS 18 (Samsung, Oppo and OnePlus have picked Gemini).

What does that mean for the Google Assistant? Well, we don’t know – Google I/O might be its farewell as Gemini takes over or the two might coexist for a while since Gemini is still missing some functionality (as some of you pointed out in the comments of the livestream post). That said, it looks like Google has major Gemini announcements ready, so the missing functionality might just be waiting to be introduced.

Another interesting thing to consider – we hear that the Pixel 8 will get on-device Gemini Nano after all (just like the Pro), and the Pixel 9 series is bound to have even more compute power on tap. How much of that Gemini demo is running on device and how much happens in the cloud?

This year’s Google I/O is not to be missed. Besides the AI news, Google will also launch Android 15, and we expect to hear about AI assistant integration, satellite communication and other developments that have risen to prominence since I/O 2023.
