The San Francisco-based OpenAI introduced the new generative AI model GPT-4o on Monday. The "o" stands for "omni" and refers to the model's ability to process text, speech, and video. GPT-4o will be gradually introduced into the company's developer and consumer products over the next few weeks.
OpenAI CTO Mira Murati Explains GPT-4o Exhibits the Intelligence of GPT-4 but Enhances Capabilities Across Various Modalities and Media. "GPT-4o Can Think Beyond Language, Text, and Vision," Murati Said During a Presentation at the OpenAI Offices. "This Is of Great Significance as We Shape the Future of Human-Machine Interaction."
The previous model, GPT-4 Turbo, was designed to analyze images and text. GPT-4o extends these capabilities with language. This allows for a variety of new applications, including an enhanced user experience in the AI-powered chatbot ChatGPT.
With GPT-4o, ChatGPT Becomes More User-Friendly as Users Can Now Ask Questions and Interrupt ChatGPT During the Response. The Model Reacts in Real Time and Can Even Recognize Nuances in a User's Voice and Respond Accordingly in Various Emotional Styles, Including Singing.
Additionally, GPT-4o enhances the visual capabilities of ChatGPT. The model can now respond to questions about a photo or a screenshot, for example, "What is happening in this software code?" or "Which brand is this shirt?"
These features are expected to be further developed, according to Murati. In the future, GPT-4 or ChatGPT could, for example, enable one to "watch" a live sports game and explain the rules.
GPT-4o is also multilingual and, according to OpenAI, shows improved performance in around 50 languages. In the OpenAI API and Microsoft's Azure OpenAI service, GPT-4o is twice as fast, half as expensive, and has higher rate limits than GPT-4 Turbo.
Currently, the language functionality of GPT-4 is not yet available to all customers in the API. OpenAI plans to initially provide the new audio capabilities to a small group of trusted partners.
GPT-4o is available in the free version of ChatGPT starting today and for subscribers of the premium plans ChatGPT Plus and Team with "5x higher" message limits. The enhanced ChatGPT language experience will be available in an alpha version for Plus users in the coming months.
Additionally, OpenAI has announced a revised ChatGPT user interface on the web, which offers a "conversation-oriented" homepage and message layout. A desktop version of ChatGPT for macOS allows users to ask questions or take and discuss screenshots via a keyboard shortcut. ChatGPT Plus users will have access to the app starting today, with a Windows version to follow later in the year.
Finally, the GPT Store, OpenAI's Library and Creation Tool for Third-Party Chatbots, is Now Available for Users of the Free ChatGPT Version. Free Users Can Now Also Use Features That Were Previously Behind a Paywall, Such as a Memory Function, Which Allows ChatGPT to Save Preferences for Future Interactions, Upload Files and Photos, and Search the Web for Answers to Current Questions.