Google I/O 2024 was packed with 121 AI references and countless generative features

On Tuesday, Google held the opening keynote to its 2024 Google I/O developer conference where a cadre of executives showcased new experiments, platforms, and services that all revolve around artificial intelligence.

Here are some of the highlights from the two-hour presentation headlined by Google CEO Sundar Pichai and accompanied by a host of other executives.

Putting AI officially in Search

Since announcing Bard and its newer Gemini large language model platform, Google has been flirting with generative results in its Google Search experience for a while now. While Gemini-produced results briefly appear at the top of the more traditional Google Search UI, the experience has remained relatively familiar. After I/O, however, the search experience is set to get an official AI upgrade.

According to Liz Reid, head of Google Search, the platform will use a custom version of Gemini to reorganize its results page with real-time additional context, an updated AI overview feature, and new multimodal capabilities.

We’ve heard that users find search more helpful than ever. At this point, we’ve served billions of queries. And what we hear again and again is that people like this combination of insights, mixed with the ability to dive deeper to hear from human perspectives and different authoritative sources.

Liz Reid, Head of Google Search

The Gemini-powered Google Search won’t be a full-on switch but an amalgamation of the current search experience and the newer generative results. Reid said traditional search results will appear when they are sufficient, while the new AI overview will surface for more complex prompts whose answers are scattered across results, such as trip planning, meal prep, workout routines, and more.

Circle to Search gets broader distribution

Samsung announced Circle to Search when it launched its Galaxy S24 lineup, and the feature made it to Google’s own Pixel hardware months later. Now it’s coming to more Android devices with even more abilities.

Currently, Circle to Search can search the web for items highlighted on screen with a circle, scribble, or tap. Thanks to Google’s new LearnLM AI model, the feature will now also be able to help with homework.

Google demoed complex problem solving with step-by-step instructions, all triggered by a simple Circle to Search motion, for problems that include symbolic formulas, diagrams, graphs, and more.

Currently, Circle to Search is supported on over 100 million devices, and Google expects that number to at least double by the end of the year.

AI gets into video

Several generative video models have popped up over the past few months, with OpenAI’s Sora capturing the lion’s share of headlines right now.

However, Google has now announced its response to Sora with its very own generative video AI model, Veo.

“We’re exploring features like storyboarding and generating longer scenes to see what Veo can do.”

Demis Hassabis, head of Google’s AI R&D lab DeepMind, speaking to reporters during a virtual roundtable

Google is combining its current work in video generation with its Imagen 2 image-generation models to produce short looping videos that, at the very least, look pretty impressive.

According to Google, the company has been feeding Veo tons of footage that includes content from Google Search and YouTube for a while to help the AI model generate its video samples.

Veo isn’t fully accessible to regular customers right now, but Google plans to lift the feature’s invite-only status sometime in the near future.

Google beefs up image generation

Google updated its text-to-image platform with its latest Imagen model, Imagen 3. DeepMind CEO Demis Hassabis joined the list of executives to grace the stage during the opening keynote with the news that Imagen 3 can better understand natural language text prompts.

This is our best model yet for rendering text, which has been a challenge for image-generation models.

Demis Hassabis, CEO of DeepMind

Google hasn’t made the new model fully available to the general public, but it is offering a sneak peek at it in a private preview of ImageFX.

Project IDX in Open Beta

Google unveiled the next-gen upgrade to its developer environment, which is entirely browser-based and AI-powered. Rather than a traditional IDE, Google’s IDX offers deeper Google Maps integration with geolocation features, integrated Chrome DevTools, and Lighthouse debugging.

IDX is also backed by support for Google’s Cloud Run serverless platform for better managed front-end and back-end services.

“As AI becomes more prevalent, the complexities that come with deploying all of that really becomes harder, becomes greater, and we wanted to help solve that challenge. That’s why we built Project IDX, a multi-platform development experience that makes building applications fast and easy. Project IDX makes it really frictionless to get going with your preferred framework or language with easy-to-use templates like Next.js, Astro, Flutter, Dart, Angular, Go and more.”

Jeanine Banks, Google’s VP and general manager for Developer X and the company’s head of developer relations

Google is pairing IDX with another AI-powered solution, Checks. Google Checks is the company’s compliance platform that automates end-to-end monitoring of apps; it will be generally available following the developer conference this week.

Gemini gets added to Google Maps

Predictably, Google added its Gemini LLM to its prized location app, resulting in a couple of new features such as the Places API. The Places API will let developers tie generative AI summaries of locations and places into their apps using data pulled from Google Maps.

Users of apps that leverage the Places API will no longer need to rely on static descriptions written by developers; instead, they’ll get evolving summaries generated from the location descriptions, reviews, and recommendations of Google Maps’ 300 million contributors.
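For developers, pulling one of these summaries would look roughly like a standard Places API request with a field mask. The sketch below is an assumption based on Google’s published Places API conventions, not code shown at I/O; the endpoint path and the `generativeSummary` field name are guesses at the shape of the feature.

```typescript
// Hypothetical sketch: fetching an AI-generated place summary from the
// Places API. Endpoint, headers, and field names are assumptions.

const PLACES_ENDPOINT = "https://places.googleapis.com/v1/places";

// Build request metadata; a field mask limits the response (and billing)
// to only the fields the app actually needs.
function buildPlaceRequest(placeId: string, apiKey: string) {
  return {
    url: `${PLACES_ENDPOINT}/${placeId}`,
    headers: {
      "X-Goog-Api-Key": apiKey,
      // "generativeSummary" is the assumed name of the Gemini summary field.
      "X-Goog-FieldMask": "displayName,generativeSummary,editorialSummary",
    },
  };
}

async function fetchPlaceSummary(placeId: string, apiKey: string): Promise<string> {
  const { url, headers } = buildPlaceRequest(placeId, apiKey);
  const res = await fetch(url, { headers });
  if (!res.ok) throw new Error(`Places API error: ${res.status}`);
  const place = await res.json();
  // Fall back to the static editorial summary when no generative one exists.
  return place.generativeSummary?.overview?.text
      ?? place.editorialSummary?.text
      ?? "No summary available.";
}
```

The fallback chain matters in practice: generative summaries would presumably roll out place by place, so apps still need the old static descriptions as a backstop.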

Gemini gets split into mini models

Google introduced several distinct versions of its Gemini models designed to address specific use cases, including Gemini 1.5 Pro for analyzing documents, codebases, audio recordings, video, and more. Gemini Live was also introduced to offer users a more immersive voice chat experience with Gemini. Gemini Live feels like a second attempt at reviving interactions with Google Assistant beyond asking for weather predictions and setting timers, offering a less error-prone experience and more contextual results.

The long-awaited Gemini Nano will be baked directly into Google’s Chrome desktop browser starting with version 126. Developers will soon be able to tap into the browser’s on-device model to power their own AI features, such as captioning and transcription, in products like Gmail.

“To deliver this feature, we fine-tuned our most efficient version of Gemini and optimized Chrome. Now we want to give you access to Gemini models in Chrome. Our vision is to give you the most powerful AI models in Chrome to reach billions of users without having to worry about prompt engineering, fine-tuning, capacity and cost. All you have to do is call a few high-level APIs – translate, caption, transcribe. This is a big shift for the web and we want to get it right.”

Jon Dahlke, Google’s director of product management for Chrome
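At the time of the keynote the browser surface for this was still experimental, so the sketch below invents a minimal interface to show the intended pattern: feature-detect the on-device model, use it when present, and degrade gracefully otherwise. The `AIProvider`/`AITextSession` names and method signatures are assumptions, not Chrome’s shipped API.

```typescript
// Hypothetical sketch of calling Chrome's built-in Gemini Nano model.
// The interface names below are assumptions; Chrome's real experimental
// API surface may differ.

interface AITextSession {
  prompt(input: string): Promise<string>;
}

interface AIProvider {
  createTextSession(): Promise<AITextSession>;
}

// Use the on-device model when the browser exposes one; otherwise return
// a sentinel so the caller can fall back to a server-side model.
async function summarizeOnDevice(
  ai: AIProvider | undefined,
  text: string,
): Promise<string> {
  if (!ai) return "on-device model unavailable";
  const session = await ai.createTextSession();
  return session.prompt(`Summarize in one sentence: ${text}`);
}

// In a browser this would be invoked with the (assumed) global provider:
//   summarizeOnDevice((window as any).ai, articleText);
```

Keeping the provider as a parameter rather than reaching for a global makes the feature-detection path easy to exercise outside the browser.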

Android gets beefed up with Gemini

Google didn’t speak specifically to any Android 15 related updates or news during the Google I/O keynote, but it did briefly talk about its LLM coming to the mobile platform in a broader feature release.

According to Google, Android users will soon be able to drag and drop AI-generated images into various sections of Android and apps, including Gmail and Messages.

The YouTube app will also support a new “Ask this video” prompt that provides information and context from videos in the app. For $19.99, users can extend the “Ask this” feature beyond YouTube to more enterprise-related experiences, such as “Ask this PDF,” via Gemini Advanced.

Gemini Nano will soon be coming to devices to help expand multimodal applications and process text, visual, and audio inputs in various sections of the operating system.

Gemini will also be found in Gmail, specifically to help automate actions within email such as processing e-commerce returns, expanded contextual search queries, email drafting, summarization, and attachment analysis, among other tasks.

Gemini Nano will also be used to beef up Android’s spam call detection with in-call identification. While Android is getting better at detecting spam calls before users pick up, Google also wants to help people in the midst of a call identify whether they’re about to get scammed.

The feature will listen to live call conversation patterns, match them against commonly associated scam patterns, and flag the likelihood that a call is spam or fraudulent.

This feature should be a godsend for older smartphone users who are less cynical about answering incoming phone calls.

Gemini will also be infused into Google Photos to launch an experimental feature called “Ask Photos.” Rolling out this summer, Ask Photos will help users sort through large photo libraries using natural language combined with metadata, versus the imprecise single-keyword search that exists currently.

AI for learning

Google spent quite a bit of time talking about its LearnLM platform, which the company claims is “fine-tuned” for education. According to Google, LearnLM is meant to let students conversationally interact with AI that tutors them on a range of subjects through lesson planning, new teaching ideas, content and activity suggestions, quizzes, games, and finding subject matter experts.

Google is seeding the LLM through its Google Classroom platform and select educators at the moment, and there was no mention of when it will be generally available.

New developer kit

Lastly, Google introduced a new developer kit called Firebase Genkit, aimed specifically at helping developers build AI-powered applications. Firebase Genkit is released under the Apache 2.0 license and supports applications written in JavaScript, with Go support planned for the future.

Firebase Genkit is designed to automate the quick development of applications that leverage summarizations, text translations, and image generation, among other AI-related tasks.

Sundar Pichai made a comedic note of how many times “AI” was said on stage by actually using AI to track the count: 121 times. However, what wasn’t talked about at length were the other bets Google has traditionally worked on, including Nest, ChromeOS, Android OS, Google Cloud, Google Cloud IoT Core, Google Workspace, Google Chat, Google Fit, Google Cast, Google Ads, and more.

While competitors are expected to speak at length about AI as well during their developer conferences, it’ll be interesting to see how they also balance their other bets against the onslaught of AI marketing the sector appears to be doing at the moment.
