Swimming Fish from Au Fei Temple

QbitAI | WeChat official account QbitAI

What a showstopper! iFLYTEK Spark (Xinghuo) showed off its speech recognition, and the audience broke into thunderous applause.

Three people talk at the same time with background music playing on top. Even under such heavy interference, the large model understands and hears everything clearly and converts it to text in real time. The "cocktail party" problem of speech recognition is no longer a problem.

iFLYTEK Spark 4.0 tops eight leaderboards, and its speech recognition demo runs under deliberate interference

Well, all I caught was "Peking duck" at the end. Who could follow the rest...

I have to admit that iFLYTEK's press conferences, held every few months, are always packed with substance, and this one brought surprises too.

iFLYTEK Spark 4.0 has arrived. All seven core capabilities of the base model have improved, it ranks first on eight benchmark lists, and it is benchmarked across the board against GPT-4 Turbo.

In addition, the iFLYTEK Spark app, the Desk client, and the speech model have all received a number of upgrades.

Let's see what is new this time.

How strong is iFLYTEK Spark 4.0? No. 1 on eight leaderboards

First, let's look at what is new in the base model, iFLYTEK Spark 4.0. The upgrades are mainly in these areas:

Basic capabilities: text generation, language understanding, knowledge Q&A, logical reasoning, math, code, and multimodal capabilities have all been upgraded and are benchmarked across the board against GPT-4 Turbo.
Image and text recognition keeps improving as well, especially understanding of complex layouts, text recognition that draws on the semantics of the surrounding text, and recognition of specialized symbols. In industries such as scientific research, finance, healthcare, and law, iFLYTEK says it outperforms GPT-4o.

Complex instruction following, complex logical reasoning, spatial reasoning, mathematics, and multimodal understanding based on logical relationships have also improved. For example, given several images, the model can work out the logical relationships among their contents. These improvements can speed up the practical adoption of large models.

Across 12 mainstream Chinese and English test sets from China and abroad, Spark V4.0 took first place on 8, including Chinese and English tests of comprehension and reasoning, comprehensive ability, and mathematics.

However, Liu Qingfeng admitted that there is still a gap in code and multimodal capabilities this time.

It is worth noting that Spark's general long-text capability has also been upgraded, and a content source-tracing function was released for the first time.

Liu Cong, president of the iFLYTEK Research Institute, gave a live demonstration. He fed the model a Chinese edition of Journey to the West and an English edition of Harry Potter and asked:

What is the difference between the Monkey King's golden-hooped staff and Harry Potter's wand?

Besides answering step by step, the model attaches a small flag to passages in the Chinese answer. Click a flag and you can see where the source is.

This can greatly reduce large-model hallucination. In effect, Spark answers your question and also tells you why it answered that way and which paragraph it drew on. You no longer need to check the full text; you only need to verify the cited sources.
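To make the idea of source tracing concrete, here is a toy Python sketch. It is not iFLYTEK's method, which the press conference did not describe. It simply assumes that each answer sentence can be linked to whichever source passage shares the most words with it. The function names and the sample passages are made-up placeholders, not quotations from either book.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def trace_sources(answer_sentences, passages):
    """Pair each answer sentence with the index of the passage sharing the most words.

    Toy heuristic for illustration only; a real system would need something
    far more robust than word overlap.
    """
    traced = []
    for sentence in answer_sentences:
        words = tokenize(sentence)
        scores = [len(words & tokenize(passage)) for passage in passages]
        best = max(range(len(passages)), key=scores.__getitem__)
        traced.append((sentence, best))
    return traced

# Placeholder passages and answer sentences (hypothetical, not book quotes).
passages = [
    "The golden staff can change its size at will and weighs thousands of pounds.",
    "The wand chooses the wizard and channels the magic of its owner.",
]
answer = [
    "The staff can change its size.",
    "A wand channels its owner's magic.",
]

for sentence, index in trace_sources(answer, passages):
    print(f"{sentence}  [source: passage {index}]")
```

The point of the sketch is the shape of the output: every sentence of the answer carries a pointer back to a passage, so a reader checks only the cited passage instead of the whole book.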

Note that this is not limited to Chinese: source tracing works for English too. The Spark model does not translate the English into Chinese. It finds the corresponding passage directly, a tracing ability trained on English itself.

Of course, the traceable sources are not limited to text; they also include audio and video.

With the base capabilities covered, and with the web version and the app fully upgraded as well, let's run a quick test.

First, the college entrance exam math that stumped a number of large models a while ago. To see how iFLYTEK Spark 4.0 handles it, we gave it the first four multiple-choice questions from Paper I of the college entrance exam:

Look at the question and give the answer to the question.

All four answers are correct, and so is the reasoning. Say what you will, it has real skill.

Next, its multimodal understanding: can it find the logical relationships across several images?

Given a cartoon, it can also clearly work out what is going on and correctly answer the question posed: after a year, will the child grow taller?

In addition, speech recognition in high-interference scenarios has made a breakthrough. Accuracy reaches 91% when two speakers overlap and 86% when three speakers overlap. In a high-noise scene at -5 dB, where the noise is already much louder than the human voice, accuracy still exceeds 90%. That explains the opening scene, where "even gibberish is recognized accurately".
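For a sense of what -5 dB means, here is a small Python sketch. It assumes the figure is a signal-to-noise ratio (SNR) measured in power, which the article does not state explicitly, and uses the standard definition SNR_dB = 10 * log10(P_speech / P_noise). The function name is illustrative.

```python
def snr_db_to_power_ratio(snr_db):
    """Convert an SNR in decibels to a speech-to-noise power ratio."""
    return 10 ** (snr_db / 10)

# Assumption: the article's "-5dB" is a power SNR.
ratio = snr_db_to_power_ratio(-5.0)   # speech power / noise power
noise_over_speech = 1 / ratio         # noise power / speech power

print(f"speech/noise power ratio: {ratio:.3f}")
print(f"noise power is about {noise_over_speech:.2f}x the speech power")
```

Under that assumption, the noise carries roughly three times the power of the speech, which fits the article's remark that the noise is much louder than the voice.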

Language identification keeps getting stronger as well. The upgraded Spark speech model supports 74 languages and dialects (37 languages and 37 dialects) with no manual switching, so you can communicate freely.

Recognition of the 37 languages is ahead of OpenAI's Whisper V3, and recognition of the 37 dialects has improved by 30% on average.

Just a few days ago, iFLYTEK, as the lead institution, won the first prize of the National Science and Technology Progress Award for the project "Key Technology and Industrialization of Multilingual Intelligent Voice".

This is the first first prize of the National Science and Technology Progress Award in the field of artificial intelligence in the decade since deep learning set off the global AI wave.

On this foundation, speech applications are being rebuilt as well. The Spark intelligent cockpit for cars has been upgraded, with "free interaction" across multiple languages and dialects and highly humanlike, multi-emotion, multimodal interaction. iFLYTEK's voice interaction products currently rank first in domestic market share and are widely exported around the world. The Spark model brings a highly intelligent interactive experience to many models from FAW, Chery, GAC, JAC, Great Wall, and other carmakers.

Featuring personalized AI assistants

With the upgrade of the base model capability, the application experience of Xinghuo in various industries and scenarios has also been further upgraded.

In iFLYTEK's own words: an AI assistant that understands you.

Compared with the earlier positioning as a "general AI assistant", Liu Qingfeng said the upgrade mainly comes at three capability levels:

Personalized expression based on user portraits;
Memory learning based on usage history;
Reinforcement learning based on personal profiles.

Specifically, when a user's personal portrait is built, the personality style can be chosen by the user or refined dynamically from dialogue and usage history, forming a personalized expression style. Combined with the user's profile, the AI assistant can then generate personalized, targeted content.

Now everyone can have a personalized assistant through the iFLYTEK Spark app or the Desk interface.

"Personal Space" has been upgraded this time. It can collect and manage all kinds of material you upload and build your own knowledge base. The large model can also do reinforcement learning based on your profile.

On stage, Liu Cong uploaded essays written by his daughter and selected AI personality labels that match her. After that, the generated copy all carried his daughter's personal style.

The iFLYTEK Spark app also has an agent feature that brings together a variety of AI assistants, including a medical assistant, an English listening and speaking assistant, a math Q&A assistant, a recording assistant, a manuscript writing assistant, a code assistant, and other practical tools you can call on at any time.

At present, the first batch of 14 agents has been launched.

Turning to specific industries, Spark, as an "AI assistant that understands you", keeps going deeper and continues to create value.

Take healthcare. The iFLYTEK Spark medical model has been upgraded again, and iFLYTEK says its core medical capabilities now surpass GPT-4 Turbo across the board, including medical knowledge Q&A, complex semantic understanding, professional document generation, diagnosis and treatment, and multi-turn dialogue.

The iFLYTEK Xiaoyi app, positioned as a personal health assistant, covers 1,600 common diseases, 2,800 common drugs, and 6,000 common examinations and tests, meeting users' health needs in the core scenarios before, during, and after a medical visit. It has accumulated 12 million downloads so far. The user approval rate is 98.8%, and nearly half of its users arrive through word-of-mouth recommendations.

You can ask it general questions directly, such as: What should I do about insomnia? Can people with gout drink soy milk?

The iFLYTEK Xiaoyi app has launched a "personal digital health space", which can be linked to health records for you and your family, including electronic medical records, examination reports, and physical examination reports. For minor ailments, it analyzes the possible causes for you. When you take medication, it gives personalized judgments about contraindications, and it can also show how your data has changed compared with previous reports.

Then there is the field of education. AI is becoming a teaching assistant for teachers and a learning assistant for students.

This time, the underlying Spark model has greatly improved in Chinese, mathematics, and English, as well as in OCR.

For teachers, iFLYTEK released the Spark intelligent grading machine, which grades automatically: papers are graded as they are scanned, on the spot.

After grading, it can also analyze how the whole class is doing and help the teacher draw up a learning path for each student.

Homework grading that used to take 90 minutes now takes 5, and learning statistics that took 60 minutes now take one, which frees up a great deal of teachers' time.
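As a quick check on what those figures imply, the short Python sketch below computes the share of time saved for each task. The minute values come from the article; the helper name and the percentage framing are ours.

```python
def percent_saved(before_min, after_min):
    """Share of the original time that is saved, in percent."""
    return (before_min - after_min) / before_min * 100

# (before, after) durations in minutes, as reported in the article.
tasks = {
    "homework grading": (90, 5),
    "learning statistics": (60, 1),
}

for name, (before, after) in tasks.items():
    saved = percent_saved(before, after)
    print(f"{name}: {before} -> {after} min, {saved:.1f}% less time")
```

By this arithmetic, grading time falls by about 94% and statistics time by about 98%, assuming the reported before and after durations cover the same work.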

For students, the AI learning machine powered by the Spark large model builds on the improved underlying capabilities to offer highly humanlike Q&A tutoring.

According to existing pilot data, children's completion rate for independent study has risen from 67% to 90%, and the problem-solving rate has risen to 93% from 72% back when they relied on video learning.

In enterprise applications, an enterprise agent platform was also released, along with enterprise assistants for business opportunities, bid evaluation, code, and more.

Meanwhile, the influence of the iFLYTEK Spark developer ecosystem continues to grow.

In the five months since iFLYTEK Spark V3.5 was released on January 30 this year, growth of the developer ecosystem has accelerated. The number of developers rose from 5.98 million to 7.02 million, adding about 1.04 million, along with more than 400,000 overseas developers and 570,000 large-model developers.
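The developer numbers can be cross-checked with a few lines of Python. The two totals come from the article; the growth percentage is our own derived figure, not something iFLYTEK reported.

```python
# Developer totals as reported in the article.
developers_before = 5_980_000   # at the Spark V3.5 release on January 30
developers_after = 7_020_000    # about five months later

new_developers = developers_after - developers_before
growth_pct = new_developers / developers_before * 100

print(f"new developers: {new_developers:,}")
print(f"growth over five months: {growth_pct:.1f}%")
```

The difference matches the reported 1.04 million new developers and works out to growth of roughly 17% over the period.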

Make large models easier to use and more practical

Across the whole press conference, iFLYTEK sent one strong signal:

Make large models easier to use and more practical.

Made concrete, that means the AI assistant.

It can be AI watching over the health of the whole family. It can be one-on-one personalized teaching that gives every child the capacity for active thinking and lifelong learning. It can also be deep enterprise scenarios such as management, where every worker can easily manage their own knowledge base.

Looking across human civilization, behind every advance there has been a great assistant, and each generation of assistants has had its own mission.

The mission of iFLYTEK is to liberate and unleash productivity.

Liu Qingfeng said: we hope that, with our capabilities, we can help every company become great and help every person become their best self.

As the "carrier" for these AI assistants, the iFLYTEK Spark app keeps empowering users and has long been changing how we work and live.

At the event, Liu Qingfeng shared a few key figures.

On Android, among all large-model apps, the iFLYTEK Spark app ranks first in the tools category by downloads, with a total of 131 million.

This means that Spark's assistants for writing, programming, work, study, daily life, parenting, translation, and more are in everyday use, and some have been called millions or even tens of millions of times.

From the perspective of the industry as a whole, though, this is not a new concept. It has appeared in many science fiction TV series and films, and only now, in the era of large models, are those scenes becoming reality.

From the ChatGPT "boyfriend" DAN that went viral earlier to GPT-4o, which set off a new round of discussion about human-computer interaction, more general-purpose AI assistants with both functional and emotional appeal have appeared, prompting people to exclaim: "Her" is really here.

But building a good AI assistant is not easy.

Many readers will have noticed that GPT Builder is set to end service in July. It was highly anticipated because "everyone can create their own GPT", yet it is shutting down less than half a year after release.

When it first came out, many people complained that some custom GPTs were no different from plain ChatGPT conversations and could not handle complex instructions...

When a large-model product faces users directly, expectations and requirements become far stricter than before. If the product's capabilities cannot meet users' needs, users and the market will quickly weed it out...

Only by continually polishing the product, addressing users' real pain points, and keeping the ecosystem open can a product keep thriving in this wave.

At least for now, the large-model products that are still alive and still serving users have passed a test. iFLYTEK's is one of them.

A recent ChatGPT decision has once again underscored how important it is for large models to be independent and controllable.

OpenAI's large models will not become the base for China's AI applications, and so they will not become the base for China's AI assistants either. Players like iFLYTEK have focused on independence and controllability from the very beginning.

To date, iFLYTEK Spark 4.0 remains the only officially certified large model that is open to the general public.

What does that mean?

It is a large model trained on a domestic computing platform, in which all the algorithms, every line of code, and all the data are independently developed and controllable.

The iFLYTEK Spark large model is built on "Feixing No. 1", China's first domestically built 10,000-card computing cluster.

Liu Qingfeng said: the capability of the large-model base determines how far development can go, and China needs to build an independent, controllable general-purpose large-model base.

The boundaries of large-model capabilities need to be understood scientifically. As those capabilities improve, it is now possible for everyone to have an AI assistant.

Spark represents a trend and is leading the way in its development.

— END —
