The key announcements from Google’s I/O conference
14 May 2024, 21:44
The tech giant has been laying out its plans for AI-powered tech.
Google has been showcasing a wide range of new AI-powered products, as the tech giant stakes its claim to be one of the major players of the AI era.
During a two-hour keynote to open its annual I/O developer conference, in which the term “AI” was mentioned more than 120 times, the firm laid out how it would further integrate generative AI into its most well-known products.
Here is a look at the key announcements from the event.
– Rebuilding search for generative AI
Google Search, the most well-known part of the company’s business, is being revamped for the generative AI era, Google confirmed.
In the coming weeks, it will begin rolling out AI Overviews, a new type of search result powered by AI.
Powered by Google’s Gemini AI model, AI Overviews is able to understand longer and more complex queries and provide a range of suggestions in response, breaking queries into key parts and offering multiple perspectives from the web.
Google says the aim is to “take the legwork” out of searching by providing a wider range of content in one place in new-look search results.
It will begin rolling out in the US this week, with support for more countries set to follow shortly.
– Project Astra
The name for Google’s vision for the “future of AI assistants”, Project Astra is the first experiment the company has done on a so-called “universal agent”.
The idea is an AI assistant that is useful at any point in everyday life and has real-time conversational capabilities while also being multimodal – able to take in any combination of text, audio and visual input and respond accurately.
The demonstration during I/O showed someone scanning an office space with their smartphone camera while asking Astra to identify objects in the office, offer creative suggestions on things it saw and even remind the user where they had left items they had misplaced.
– Veo and Imagen 3
Two major new content creation tools from the tech giant, enabling users to create either video or images based on text inputs.
Google said Veo was its “most capable video generation model to date” and can generate 1080p videos over a minute in length in a range of cinematic and visual styles.
Veo has an advanced understanding of natural language and cinematic terms, Google said, allowing for more creative control.
Similarly for Imagen 3, Google said its improved understanding of language meant the app would be able to incorporate smaller details from longer prompts accurately and could also now better render text.
The company suggested this could allow for the better generation of personalised messages and title slides in presentations, previously an issue for image generation models.
– Gemini in Workspace
In a key update for consumers, Google said it was bringing its Gemini 1.5 Pro AI model to its Workspace suite of apps, meaning users will soon be able to use Gemini within some of their most commonly used apps.
For example, in Gmail, users will be able to ask Gemini to summarise all recent emails from a specific sender or on a certain topic, in order to catch up on anything they have missed.
Elsewhere, the company demonstrated a new feature coming to its Photos app, called “Ask Photos”.
Here, Gemini will be able to search a user’s photo album for specific single images based on text prompts for what they need – for example that photo of your home wifi network name and password or car number plate – and quickly surface it to the user.
– Gemini on Android
As part of what Android boss Sameer Samat said was a “multi-year journey to reimagine Android with Gemini at its core”, Google said it was making its AI model the new AI assistant on its mobile operating system.
Much of the AI work would be done on-device, Google said, for better security, and a number of new tools would also be steadily rolled out, including a new scam detection feature that would analyse voice calls in real time and alert users if Gemini believed a call could be a scam.
Multimodality support via Gemini Nano was also coming to Google’s own Pixel devices, it was confirmed, allowing users to use text, image and visual prompts to interact with Gemini.