OpenAI launches ChatGPT advanced voice mode and releases a dataset containing 14 languages

Recently, OpenAI has taken an important step in the globalization of AI.

Not only did the company launch ChatGPT's Advanced Voice Mode (AVM), but it also released a multilingual dataset of 14 languages to evaluate the performance of language models.

Both initiatives aim to increase the global accessibility and usefulness of AI technology.

OpenAI has announced that it is expanding AVM to more paying users. The audio feature, which lets users talk with ChatGPT more naturally, will initially roll out to ChatGPT Plus and Team customers. Enterprise and Edu customers will begin gaining access next week.

As part of the rollout, AVM received a design revamp. The feature is now represented by an animated blue sphere, replacing the animated black dots that OpenAI used when it demonstrated the technology in May.

When AVM becomes available to a user, a pop-up will appear next to the voice icon in the ChatGPT app.

In addition, ChatGPT has added 5 new voices that users can try, namely: Arbor, Maple, Sol, Spruce, and Vale.

That brings ChatGPT's total number of voices to nine, almost as many as Google's Gemini Live.

The names are all inspired by nature, perhaps because the goal of AVM is to make using ChatGPT feel more natural.

It's worth noting that the "Sky" voice that OpenAI showed during its Spring Update does not appear in this release. The reason is that the actor Scarlett Johansson raised objections.

Johansson, who played an AI system in the movie Her, claimed that Sky's voice was too similar to her own.

In response, OpenAI quickly took the Sky voice down, saying it never intended to imitate Johansson's voice, even though several employees had referenced the film in tweets at the time.

OpenAI told the media that it has made a series of improvements since launching the AVM alpha test.

ChatGPT's voice feature is now better at understanding accents, and conversations are smoother and faster than before.

In addition, OpenAI has extended some of ChatGPT's customization features to AVM, such as letting users personalize how ChatGPT responds to them.

However, ChatGPT's video and screen-sharing capabilities are not part of this rollout. Those capabilities are meant to let GPT-4o process visual and auditory information at the same time. For now, OpenAI has not provided a timeline for when these multimodal features will be released.

In addition to Advanced Voice Mode, OpenAI also released the Multilingual Massive Multitask Language Understanding (MMMLU) dataset on the open data platform Hugging Face.

This new evaluation dataset builds on the MMLU benchmark.

MMLU, which was originally available only in English, tests an AI system's knowledge across 57 subject areas such as math, law, and computer science. The new MMMLU dataset covers 14 languages, including Chinese, Arabic, German, and Bengali.

By covering such a diverse set of languages, including some with limited training data, OpenAI has set a new benchmark for evaluating multilingual AI capabilities.

This benchmark could lead to more equitable global access to the technology. The AI industry has long been criticized for failing to develop language models that understand the languages spoken by millions of people around the world.

Until recently, AI research focused on English and a handful of widely spoken languages, resulting in many low-resource languages being overlooked.

OpenAI chose to include languages such as Swahili and Yoruba, which, despite their large numbers of speakers, are often overlooked in AI research. It's also a sign that AI technology is moving in a more inclusive direction.

To ensure the accuracy of the MMMLU dataset, OpenAI hired professional human translators. The resulting translations are more accurate than those in comparable datasets that rely on machine translation, especially for languages with fewer training resources.

By relying on human expertise, OpenAI ensures that the dataset provides a more reliable basis for evaluating multilingual AI models.

For enterprises, the MMMLU dataset provides an opportunity to benchmark their own AI systems in a global context.

As companies expand into international markets, the ability to deploy AI solutions that can understand multiple languages becomes critical.

Whether it's customer service, content moderation, or data analytics, AI systems that perform well in multiple languages can provide a competitive advantage by reducing communication friction and improving the user experience.
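As a rough illustration of what benchmarking a system across languages can look like, the sketch below scores multiple-choice answers separately for each language. It is only a minimal example under stated assumptions: the `Item` record, its field names, the language tags, and the toy data are hypothetical and do not reflect the actual layout of the MMMLU dataset; the only thing taken from the benchmark is the idea of grading a predicted answer letter against the correct one.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One multiple-choice question (hypothetical field names)."""
    language: str  # illustrative tag, e.g. "sw" for Swahili
    subject: str   # e.g. "law" or "math"
    answer: str    # correct choice: "A", "B", "C" or "D"


def accuracy_by_language(items, predictions):
    """Return {language: fraction of questions answered correctly}."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, predicted in zip(items, predictions):
        total[item.language] += 1
        # Normalize the model's answer so "c" and " C " both count as "C".
        if predicted.strip().upper() == item.answer:
            correct[item.language] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Made-up toy data, only to show the shape of the inputs.
items = [
    Item("de", "law", "A"),
    Item("de", "math", "C"),
    Item("sw", "law", "B"),
    Item("sw", "math", "D"),
]
predictions = ["A", "c", "B", "A"]

scores = accuracy_by_language(items, predictions)
for language, score in sorted(scores.items()):
    print(f"{language}: {score:.0%}")
```

Reporting one score per language, rather than a single average, makes it easier to spot a system that does well in widely spoken languages but falls behind in lower-resource ones.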

In addition to the release of the MMMLU dataset, OpenAI has also launched the OpenAI Academy program to further its commitment to global AI accessibility.

According to the announcement, the Academy aims to invest in developers and mission-driven organizations that are using AI to solve critical problems in their communities, especially in low- and middle-income countries.

The Academy will provide training, technical mentorship, and $1 million in Application Programming Interface (API) credits to ensure local AI talent has access to cutting-edge resources.

By supporting developers who understand the unique social and economic challenges of their region, OpenAI hopes to empower communities to build AI applications tailored to local needs.

Resources:

https://techcrunch.com/2024/09/24/openai-rolls-out-advanced-voice-mode-with-more-voices-and-a-new-look/

https://venturebeat.com/ai/openai-tackles-global-language-divide-with-massive-multilingual-ai-dataset-release/

Operation/Typesetting: He Chenlong
