OpenAI Expands into Next-Gen Audio AI With Three New Models

Technologymagazine · 🇮🇳 India · May 11, 2026 · 504 words · By Rithula Nisha

Daily briefing

What to know about OpenAI Expands into Next-Gen Audio AI With Three New Models

OpenAI has introduced three new audio models—GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper—designed for real-time voice interaction, translation, and transcription. The article details the technical capabilities of these models and provides examples of how companies like Deutsche Telekom, Vimeo, and Priceline are integrating them into their services.

Propaganda risk 10%

Claims checked 9

Techniques found 0

Topics 0

Coverage spectrum

Coverage gap: Low Left coverage

Left 0%

Center 80%

Right 20%

5 sources compared across this story cluster. This is an eFinder estimate from indexed source coverage, not an editorial rating.

What happened

OpenAI has released three audio models designed to handle real-time voice interactions.

Why it matters

GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper could enable software systems to process spoken requests and respond whilst conversations are still taking place.

Common ground

The models target developers building applications where users need to communicate by voice rather than text.

Perspective signals

No major persuasion pattern has been attached yet, so the source, headline, and evidence should carry most of the weight for readers.

Follow-up questions

What concrete event or decision sits underneath the headline: OpenAI Expands into Next-Gen Audio AI With Three New Models?
What evidence would most clearly confirm or weaken the claim that GPT-Realtime-Translate processes speech from more than 70 input languages into 13 output languages?
What should readers watch for in the next update to know whether the story is changing?

Read the original article: https://technologymagazine.com/news/openai-expands-into-next-gen-audio-ai-with-t…

Analysis

Propaganda Score: 10% (confidence: 95%)

Low risk. This article shows minimal use of propaganda techniques.

Claims Checked

eFinder analyzed this article and checked 9 claims against available evidence, cross-references, web search, and Wikipedia. Here is what the fact-checking layer found.

Corroborated 6

Single Source 3

Claim 1: “GPT-Realtime-Translate processes speech from more than 70 input languages into 13 output languages.”

CORROBORATED

Three independent web sources (9to5Mac, AI Magazine, and another news source) confirm the specific numbers: 70+ input languages and 13 output languages.

wikipedia NEUTRAL — Google DeepMind, trading as Google DeepMind or simply DeepMind, is a British-American artificial intelligence (AI) research laboratory which serves as a subsidiary of Alphabet Inc. Founded in the UK i…
https://en.wikipedia.org/wiki/Google_DeepMind

web search NEUTRAL — GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.
https://9to5mac.com/2026/05/07/openai-has-new-voice-models-t…

web search NEUTRAL — GPT-Realtime-Translate is a new live translation model that translates speech from more than 70 input languages into 13 output languages while keeping pace with the speaker. It targets customer suppor…
https://aimagazine.com/news/new-openai-models-listen-transla…

+ 1 more evidence source

Claim 2: “Vimeo uses the model to translate product education videos as they play.”

SINGLE SOURCE

The provided evidence mentions GPT-4o and general use cases for GPT-Realtime-Translate, but there is no specific mention of 'Vimeo' using the model to translate product education videos.

web search NEUTRAL — Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time. Learn more here: https://www.openai.com/index/hello
https://www.youtube.com/watch?v=WzUnEfiIqP4

web search NEUTRAL — GPT-Realtime-Translate offers live translation from over 70 input languages into 13 output languages, keeping pace with the speaker. It is intended for use in cross-border customer support, live event…
https://www.ghacks.net/2026/05/11/openai-releases-three-new-…

web search NEUTRAL — Google's service, offered free of charge, instantly translates words, phrases, and web pages between English and over 100 other languages.
https://translate.google.com/

Claim 3: “GPT-Realtime-2 is the first voice model from OpenAI to include reasoning capabilities from its GPT-5 class architecture.”

CORROBORATED

Multiple sources explicitly state that GPT-Realtime-2 is the first voice model to incorporate GPT-5 class reasoning capabilities.

web search NEUTRAL — GPT Realtime 2 supports configurable reasoning effort. Higher reasoning effort can increase latency and output token usage.
https://developers.openai.com/api/docs/models/gpt-realtime-2

web search NEUTRAL — GPT-Realtime-2 is OpenAI's most intelligent voice model to date, bringing GPT-5-class reasoning to real-time voice interactions. Unlike earlier realtime models, it can plan, decide, use tools, recover…
https://www.datacamp.com/blog/gpt-realtime-2

web search NEUTRAL — On May 7, 2026, OpenAI shipped a voice model that scores 96.6% on the Big Bench Audio benchmark, against 81.4% for the previous GPT-Realtime-1.5. It is called GPT-Realtime-2, and it is the first voice…
https://pasqualepillitteri.it/en/news/2153/gpt-realtime-2-op…

Claim 4: “The Realtime API includes multiple layers of controls to prevent misuse.”

CORROBORATED

The gHacks Tech News source explicitly mentions that the Realtime API features 'active classifiers that can stop conversations that violate OpenAI's content policy'.

wikipedia NEUTRAL — GPT-4o ("o" for "omni") is a multilingual, multimodal generative pre-trained transformer developed by OpenAI and released in May 2024. It can process and generate text, images and audio. Upon release,…
https://en.wikipedia.org/wiki/GPT-4o

wikipedia NEUTRAL — Microsoft Copilot is a generative artificial intelligence chatbot developed by Microsoft AI, a division of Microsoft. Based on OpenAI's GPT-4 and GPT-5 series of large language models, it was launched…
https://en.wikipedia.org/wiki/Microsoft_Copilot

wikipedia NEUTRAL — The Portable Operating System Interface (POSIX) is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. In order to define a lev…
https://en.wikipedia.org/wiki/POSIX

Claim 5: “GPT-Realtime-Translate delivered 12.5% lower word error rates than any other model we tested.”

SINGLE SOURCE

While the existence of the model is corroborated, the specific statistic of '12.5% lower word error rates' attributed to BolnaAI is not found in the provided evidence. The evidence confirms the model's capabilities but not this specific benchmark result.

web search NEUTRAL — OpenAI has introduced GPT-Realtime-Translate, a new model capable of translating live speech from over 70 input languages into 13 output languages, expanding beyond the capabilities of many existing r…
https://quantumzeitgeist.com/openais-translates-languages-re…

web search NEUTRAL — Other Models Pricing: GPT-Realtime-Translate: $0.034 per minute (70+ input languages, 13 output languages). GPT-Realtime-Whisper: $0.017 per minute (streaming speech-to-text).
https://finance.biggo.com/news/202605100624_OpenAI_GPT-Realt…
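
For readers unfamiliar with the metric in Claim 5, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and a model's output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition. It is not BolnaAI's or OpenAI's evaluation code, and the article does not say whether "12.5% lower" is a relative or an absolute difference, so the `relative_reduction` helper assumes the relative reading.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # At the start of each outer iteration, prev[j] holds the edit
    # distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fractional improvement of new_wer over baseline_wer (assumed reading)."""
    return (baseline_wer - new_wer) / baseline_wer
```

With purely hypothetical figures, a baseline WER of 0.080 and a new WER of 0.070 would give a relative reduction of about 12.5%. Confirming the quoted statistic would require the actual test set and the competing models' scores, which the evidence here does not include.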

Claim 6: “Deutsche Telekom is building customer support systems where users speak in their preferred language and the model translates the conversation in real time.”

SINGLE SOURCE

The provided evidence for this claim contains information about Deutsche Bank and Deutsche Bahn, but no mention of 'Deutsche Telekom' using GPT-Realtime-Translate for customer support.

web search NEUTRAL — Deutsche Bank was founded in 1870 in Berlin.
https://en.wikipedia.org/wiki/Deutsche_Bank

web search NEUTRAL — We operate a 33,400 kilometre-long network with 5,400 stations, which are used by 450 railway companies and 50,000 trains every day. Every day, more than five million people travel with Deutsche Bahn …
https://www.deutschebahn.com/en/

web search NEUTRAL — Discover Deutsche Bank, one of the world’s leading financial services providers. News and Information about the bank and its products.
https://www.db.com/

Claim 7: “GPT-Realtime-Whisper converts speech to text as speakers talk.”

CORROBORATED

Multiple sources (gHacks, OpenAI Voice AI Models, and Realtime Audio Models) confirm that GPT-Realtime-Whisper is a streaming speech-to-text model for low-latency transcription.

web search NEUTRAL — GPT-Realtime-Whisper is a streaming speech-to-text model designed for low-latency transcription. GPT-Realtime-Whisper costs $0.017 per minute. The Realtime API features active classifiers that can stop…
https://www.ghacks.net/2026/05/11/openai-releases-three-new-…

web search NEUTRAL — Streaming Speech-to-Text with GPT-Realtime-Whisper. This streaming speech-to-text capability enables real-time understanding for live captions, meeting notes, and voice-powered workflows where imme…
https://theoutpost.ai/news-story/open-ai-launches-three-voic…

web search NEUTRAL — GPT-Realtime-Translate supports real-time speech translation across 70+ input languages and 13 output languages, breaking language barriers in global markets. GPT-Realtime-Whisper delivers ultra-low-la…
https://aihaberleri.org/en/news/realtime-audio-models-2026-o…
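
The per-minute prices quoted in the evidence snippets ($0.034 for GPT-Realtime-Translate and $0.017 for GPT-Realtime-Whisper) can be turned into a rough usage estimate. The sketch below is a back-of-the-envelope calculation only: the dictionary keys are illustrative labels rather than confirmed API identifiers, the prices are as reported by the cited pages rather than an official price list, and billing by whole minutes is an assumption.

```python
from decimal import Decimal

# Per-minute prices in USD as quoted in the evidence snippets.
# Keys are illustrative labels, not confirmed API model identifiers.
PRICE_PER_MINUTE = {
    "gpt-realtime-translate": Decimal("0.034"),
    "gpt-realtime-whisper": Decimal("0.017"),
}


def estimate_cost(model: str, minutes: int) -> Decimal:
    """Estimated audio cost in USD, assuming billing by whole minutes."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return PRICE_PER_MINUTE[model] * minutes
```

At these rates, an hour of streaming transcription comes to about $1.02 and an hour of live translation to about $2.04. `Decimal` is used so the currency arithmetic does not pick up binary floating-point rounding.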

Claim 8: “GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper could enable software systems to process spoken requests and respond whilst conversations are still taking place.”

CORROBORATED

Three independent sources explicitly name GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper and describe their real-time processing capabilities.

web search NEUTRAL — All three models — GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper — are available now through the OpenAI Realtime API, which is generally available starting today.
https://www.marktechpost.com/2026/05/08/openai-releases-thre…

web search NEUTRAL — The update includes GPT-Realtime-2 for active reasoning, GPT-Realtime-Translate for live multilingual speech-to-speech workflows, and an upgraded GPT-Realtime-Whisper for ultra-low latency streaming t…
https://i10x.ai/news/openai-realtime-api-gpt-realtime-2-tran…

web search NEUTRAL — GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.
https://openai.com/index/advancing-voice-intelligence-with-n…

Claim 9: “OpenAI has released three audio models designed to handle real-time voice interactions.”

CORROBORATED

Multiple independent sources (MarkTechPost, OpenAI API docs, and other tech news sites) confirm the release of three real-time audio models.

web search NEUTRAL — OpenAI Group PBC, doing business as OpenAI, is an American artificial intelligence (AI) research organization headquartered in San Francisco, consisting of a for-profit public benefit corporation (PBC…
https://en.wikipedia.org/wiki/OpenAI

web search NEUTRAL — OpenAI is launching a new $4 billion company to embed its AI into corporate businesses The OpenAI Deployment Company, backed by TPG and other private equity firms, will acquire consulting firm ...
https://qz.com/openai-deployment-company-launch-tpg-tomoro-0…

web search NEUTRAL — 6 hours ago · Topline OpenAI cofounder and former chief scientist Ilya Sutskever confirmed Monday a $7 billion stake in OpenAI during his testimony for the high-stakes trial between Elon Musk and the …
https://www.forbes.com/sites/aliciapark/2026/05/11/ilya-suts…

Disclaimer: This analysis is generated by AI and should be used as a starting point for critical thinking, not as definitive truth. Claims are verified against publicly available sources. Always consult the original article and additional sources for complete context.