Azure AI Studio gains new multimodal capabilities following BUILD 2024 announcement

Part of Microsoft’s multi-billion dollar investment into OpenAI includes access to its latest GPT models which now includes GPT-4o that will now be part of Azure AI starting today.

As part of the keynote speech given by Microsoft CEO Satya Nadella during the company’s annual developer conference, BUILD attendees were privy to the announcement that GPT-4o is now available in Azure AI studio as an API.

Developers can now make use of multimodal models that integrate texts, images, and audio processing within a single model.

Another multimodal experience announced during BUILD was the inclusion of Phi-3-vision into the Phi-3 family which is now home to Phi-3 small, Phi-3 medium, and Phi-3 vision. Microsoft has been working diligently on creating several specialized small language models to complement its investments in large language models and the Phi-3 family represents some of those efforts with Phi-3 vision specifically designed for personal devices.

According to Microsoft, Phi-3 vision will off “the availability to input images and text and receive text responses. For example, users can ask questions about a chart or ask an open-ended question about specific images.”

At 4.2 billion parameters, Microsoft’s new Phi-3-vision should easily support general visual reasoning tasks as well as chart/graph/table reasoning operations.

Depending on how developers implement the new APIs, users could in theory prompt AI solutions about charts or open-ended questions regarding specific images.

Microsoft also announced that its enabling new capabilities across current APIs for more multimodal experiences that include speech analytics and universal translations.

Subscribe

Related articles

Google I/O 2025 Program Lineup Unveiled: AI, Android, and Web Innovations Await

Tech enthusiasts, developers, and industry watchers—it's that time of year again! It's developer conference season, and Google has officially unveiled the Google I/O 2025 program lineup that's packed with exciting sessions covering AI, Android, web, and cloud. With the conference set to take place May 20-21, the agenda gives us a glimpse into what Google has been cooking up behind the scenes.

Hit the ice this weekend with Xbox Free Play Days

If you're a hockey fan and your team is...

The Ultimate Tech Deals for Mother’s Day: Bose Gifts That’ll Hit All the Right Notes

Mother’s Day is right around the corner, and if your mom loves music (or just appreciates a little peace and quiet), now is the perfect time to upgrade her listening experience. Thankfully, Bose is rolling out an epic series of discounts on headphones, earbuds, and speakers starting Friday, April 25—all with free two-day shipping, so you can snag the perfect gift just in time.

Create, Copilot Notebooks, and AI Agents—Microsoft’s Latest Copilot Upgrades

Microsoft is doubling down on its vision for AI-driven productivity with the Microsoft 365 Copilot Wave 2 spring release. This latest update introduces new AI-powered agents, enhanced collaboration tools, and a dedicated Agent Store, making it easier than ever for businesses to integrate AI into their workflows.

OpenAI Eyes Google Chrome Amid DOJ’s Antitrust Remedies

OpenAI’s Head of Product for ChatGPT, Nick Turley, has expressed interest in acquiring Google Chrome, should the Department of Justice (DOJ) force Google to divest its popular web browser. Turley’s comments, made during Google’s ongoing antitrust trial, highlight OpenAI’s ambition to expand its influence beyond AI-powered chatbots and into the broader internet ecosystem.

LEAVE A REPLY

Please enter your comment!
Please enter your name here

WP Twitter Auto Publish Powered By : XYZScripts.com