
Li Xingyu: What should the most ideal interaction mode of the smart cockpit look like?

Text | Li Xingyu

Vice President, Ecological Development and Strategic Planning, Horizon

Introduction

Intelligent cars are the first form of robots, and the smart cockpit is opening a new direction for human-computer interaction in the robot era. Historically, every change in the mode of interaction has reshaped the landscape of the smart-device industry: just as the shift from DOS to Windows transformed the PC industry, each new mode of human-computer interaction has opened the door to a new industry.

Human-machine interaction will change how we think about smart cars, robots in general, and artificial intelligence. Humanity's most important invention was language, the system of human-to-human interaction that gave rise to human civilization. Natural human-computer interaction may be the next cornerstone invention: combined with the autonomous decision-making of machines, it will give rise to a machine civilization, reshape the relationship between people and machines, and profoundly change how we work and live.

Key conclusions

1. The most important trend in human-computer interaction is the shift from passive response to active interaction, and from humans adapting to machines to machines continuously adapting to humans. The ultimate goal is to make machines anthropomorphic, with the Turing test as the yardstick.

2. In the human-machine co-driving stage, human-computer interaction capability must match autonomous driving capability; otherwise serious safety problems will follow. Cross-domain convergence of intelligent driving and the intelligent cockpit is the direction of development.

3. In the future, physical screens and touch will no longer be the center of cockpit interaction, but will be replaced by natural interaction + AR-HUD.

4. Speech, gestures, and eye tracking are the three pillars of natural interaction; sensors, computing power, and algorithms are its material foundation.

5. Today's cockpit is dominated by the entertainment domain, but the positioning of the entertainment domain and the safety domain (human-computer interaction and autonomous driving) will be rebalanced, and the safety domain will become the main control domain.

6. Smart-cockpit human-computer interaction is an important breakthrough point for Chinese intelligent-vehicle makers to move their brands upmarket.

What is the trend in the way humans and machines interact?

Where will the cockpit's human-machine interaction go in the future? The answer to this question may need to be sought from the history of smart device development.

The computer industry is where human-computer interaction technology originated. In fact, it was not initially called HMI but HCI, Human-Computer Interaction. The history of the PC is widely known; the following figure divides it into simple development stages:

The development of human-computer interaction on computers

It began with DOS and the keyboard: operating a command-line interface demanded a high level of professional skill, so only a small group of specialists could use it. The arrival of the mouse and the Windows operating system changed everything and made the PC user base explode. Touch then became an even simpler, more direct way to operate, and tablets such as the Surface appeared. Microsoft Cortana represents the latest mode of interaction, letting us talk to machines with our voice in a more natural way.

The history of PCs and mobile phones reflects how machine-human interaction has evolved: from complex to simple, and from abstract operation to natural interaction. The most important trend going forward is the shift from passive response to active interaction.

Extending this trend line, the ultimate goal of human-computer interaction is to make machines anthropomorphic. It can be said that the history of human-computer interaction is the history of moving from humans adapting to machines to machines continuously adapting to humans.

The development of the smart cockpit has gone through a similar process:

The development of human-machine interaction in the smart cockpit bears a lot of similarities with computers

Multimodal interaction is the ideal model for the next generation of human-machine interaction. What is multimodal interaction? Simply put, it means interacting through gestures, eye tracking, voice, and other channels. A "mode" here is analogous to a human sense, and multimodality means fusing multiple senses, corresponding to human vision, hearing, touch, smell, and taste.

But "multimodal interaction" is too technical a name; I prefer to call it natural interaction.

Gestures, for example, can be thought of as a native "mouse": different gestures can express rich semantics.

Typical gesture semantics and corresponding implementations
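As a purely illustrative sketch of such a mapping (not any production implementation; the gesture labels, command names, and threshold are hypothetical), the idea is simply that each recognized gesture resolves to a cockpit command, with low-confidence detections ignored:

from dataclasses import dataclass
from typing import Optional

@dataclass
class GestureEvent:
    label: str          # e.g. "swipe_left", "palm_push" (hypothetical labels)
    confidence: float   # classifier confidence in [0, 1]

# Hypothetical gesture-to-command table; a production system would define its own.
GESTURE_COMMANDS = {
    "swipe_left": "media.next_track",
    "swipe_right": "media.previous_track",
    "palm_push": "media.pause",
    "rotate_clockwise": "volume.up",
}

def dispatch(event: GestureEvent, min_confidence: float = 0.8) -> Optional[str]:
    """Return the cockpit command for a recognized gesture, or None to ignore it."""
    if event.confidence < min_confidence:
        return None  # drop low-confidence detections to limit false triggers
    return GESTURE_COMMANDS.get(event.label)

print(dispatch(GestureEvent("palm_push", 0.93)))  # -> media.pause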

How is natural interaction implemented?

Intelligent cars are, in essence, robots, and a robot's two most important capabilities are autonomous decision-making and human-computer interaction; lacking either one, it cannot serve humans effectively. Building intelligent human-computer interaction capability is therefore a must.

How should we measure the intelligence of human-computer interaction? One idea is the Turing test: whether a machine's interaction behavior is indistinguishable from a human's.

How is natural interaction achieved? Sensors, computing power, and algorithms are all essential; the following diagram presents this intuitively:

More and more sensors will be integrated into the cockpit. On the one hand, this keeps pushing up the cockpit's demand for computing power: the demand for AI compute will rise beyond 30 TOPS, and even to the 100 TOPS level. On the other hand, it also provides richer perceptual support.


Cockpit sensors are rapidly increasing in number and variety

AI computing enables perception of faces, expressions, gestures, speech, and other signals, making human-computer interaction more intelligent. The computation behind cockpit interaction must rely on edge computing rather than cloud computing, for three reasons: reliability, real-time performance, and privacy protection.

Personal privacy protection is probably one of the biggest challenges facing our generation in the era of AI, and it is especially salient in the private space of the cockpit. Most speech recognition today is still performed in the cloud, where biometric information such as voiceprints can easily reveal personal identity. By running edge AI computing on the vehicle side, raw biometric data such as video and audio can be stripped away and converted into semantic information before being uploaded to the cloud, effectively protecting the privacy of personal data in the car.
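A minimal sketch of this idea, under the assumption of on-device ASR and NLU models (the function names and payload fields below are hypothetical stand-ins, not Horizon's implementation):

import json

def run_local_asr(audio_frames: bytes) -> str:
    """Stand-in for an on-device speech recognizer; raw audio never leaves the car."""
    return "turn on the air conditioner"  # dummy transcript for illustration

def parse_intent(text: str) -> dict:
    """Stand-in for on-device NLU that turns the transcript into a structured intent."""
    return {"domain": "hvac", "action": "power_on"}  # dummy intent for illustration

def handle_utterance(audio_frames: bytes) -> str:
    text = run_local_asr(audio_frames)    # recognition happens on the vehicle side
    intent = parse_intent(text)           # so does intent parsing
    payload = {"intent": intent}          # no audio, voiceprint, or video included
    return json.dumps(payload)            # only this semantic payload goes to the cloud

print(handle_utterance(b"\x00" * 320))    # {"intent": {"domain": "hvac", "action": "power_on"}}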

In the era of autonomous driving, interactive intelligence must match driving intelligence

For the foreseeable future, human-machine co-driving will be the norm, and cockpit human-computer interaction is the first interface through which people understand what the autonomous driving system can do.

At present, intelligent-vehicle technology is evolving unevenly: human-computer interaction capability lags behind autonomous driving capability, which contributes to frequent autonomous driving accidents and holds back its adoption.

Human-machine co-driving keeps the human in the loop, so human-computer interaction capability must match autonomous driving capability; otherwise it creates serious safety-of-the-intended-functionality problems, and nearly all fatal autonomous driving accidents are related to this. Even without an accident, not understanding the state of the autonomous driving system can cause serious panic and anxiety.

For example, autonomous driving systems often exhibit "phantom braking" in real driving conditions. If the human-machine interface can display the system's perception results, the driver can understand that the misjudgment came from, say, a can on the road being identified as a car.

This is why Tesla displays more and more of its self-driving perception results. As autonomous driving grows more capable, users will pay more and more attention to the processes and states the system presents in a virtual 3D environment.

Human-computer interaction and autonomous driving complement each other; their respective roles are shown in the following figure:

Human-computer interaction and autonomous driving complement each other

For example, more humane parking in the future should be human-vehicle co-parking, with handovers in both directions: when the car encounters difficult conditions, it might say "I'm not sure about this" and ask the person to take over; and when a person struggles to park for a long time, the AI can suggest turning on automatic parking.

This integrated cockpit-and-parking solution improves the overall experience of cockpit interaction and parking while sharply reducing hardware cost: by time-sharing a single AI chip, it can serve both cockpit perception and APA parking perception, providing a cost-effective solution for the industry and letting intelligence reach more entry-level models. In China, Horizon and Yingchi Technology are jointly promoting this solution.
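A minimal sketch of the time-sharing idea (purely illustrative, not the Horizon/Yingchi design; mode names and task functions are hypothetical): while driving, the accelerator serves cockpit perception, and once parking is engaged the same compute budget switches to APA perception.

from enum import Enum, auto

class VehicleMode(Enum):
    DRIVING = auto()   # in-cabin perception: driver monitoring, gestures, voice
    PARKING = auto()   # APA perception: surround view, parking-slot detection

def run_cockpit_perception(frame):
    """Stand-in for the in-cabin perception models."""
    return {"workload": "cockpit", "frame": frame}

def run_parking_perception(frame):
    """Stand-in for the APA parking perception models."""
    return {"workload": "parking", "frame": frame}

def schedule(mode: VehicleMode, frame):
    """Give the single AI accelerator to whichever workload the current mode needs."""
    if mode is VehicleMode.PARKING:
        return run_parking_perception(frame)
    return run_cockpit_perception(frame)

print(schedule(VehicleMode.DRIVING, frame=0))   # cockpit perception while driving
print(schedule(VehicleMode.PARKING, frame=1))   # parking perception once APA is engaged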

At present, smart-cockpit interaction is mainly an extension of the mobile-phone Android ecosystem, centered on the physical screen. Screens keep getting bigger, even reaching 60 inches, which lets low-priority functions crowd out high-priority ones and adds information clutter that distracts the driver and hurts driving safety.

Physical screens will still exist, but my judgment is that screens and touch will no longer be the center of cockpit interaction; natural interaction plus AR-HUD will take their place. The analysis follows below.

The first reason: human-computer interaction for autonomous driving is a basic, must-have need; it belongs to the safety domain and has the highest priority. Interaction for music, games, and comfort is a nice-to-have; it belongs to the entertainment domain and only gets room to shine once the former is taken care of.

The following figure provides a brief analysis and summary of the functions of the two domains.

The division of tasks between the entertainment domain and the safety domain in the cockpit

Therefore, the positioning of the entertainment domain and the safety domain (human-computer interaction and autonomous driving) in the cockpit will be rebalanced, and the safety domain will become the main control domain.

The second reason: natural interaction plus an AR-HUD interface is safer. Communicating by voice and gesture keeps the driver's eyes from leaving the road, improving driving safety. A large cockpit screen cannot do this; the AR-HUD, by contrast, avoids the problem while also displaying the autonomous driving system's perception information.

The third reason: natural interaction is invisible, simple, and more emotionally engaging. It does not take up valuable physical space in the car, yet it is always there, giving the driver and passengers a greater sense of trust and security.

Based on the above analysis, cross-domain integration of intelligent driving and the intelligent cockpit is a fairly certain direction of development, and its end state is the vehicle's central computing platform.

Current stages of development, cutting-edge practices and challenges

Cockpit speech recognition is already widespread. Mainstream vendors mostly use end-to-end algorithms, and recognition accuracy can reach 98% under ideal laboratory conditions.

Driver monitoring systems (DMS) are spreading rapidly; it is predicted that by 2030 more than 50% of models will be equipped with in-car cameras.

The popularity of DMS is booming

The next step is a combined voice + gesture + eye tracking + AR-HUD interface, the intelligent interaction mode that corresponds to L3+ autonomous driving. The industry's leading automakers have already begun positioning themselves here, as shown in the figure below.

The natural interaction between man and machine has become a hot spot in the competition of car companies

China's domestic brands are basically on a par with leading foreign brands in this area, and they iterate faster. In 2020, Changan launched the UNI-T, which includes a number of proactive services: when you answer a phone call, the system automatically lowers the multimedia volume; when the central screen is off, gazing at it for one second wakes it up. The solution is built on the Horizon Journey 2 chip and supports interaction via speech, gestures and body posture, and facial expressions.
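As a purely illustrative sketch of how such proactive services could be expressed as event-driven rules (not Changan's or Horizon's implementation; the event names, actions, and thresholds are hypothetical):

from typing import Callable, Dict, List

rules: Dict[str, List[Callable[[dict], None]]] = {}

def on(event: str):
    """Register a handler for a cockpit event."""
    def register(handler: Callable[[dict], None]):
        rules.setdefault(event, []).append(handler)
        return handler
    return register

@on("phone.call_answered")
def duck_media_volume(ctx: dict) -> None:
    ctx["media_volume"] = min(ctx.get("media_volume", 50), 15)   # lower music during a call

@on("gaze.on_screen")
def wake_screen(ctx: dict) -> None:
    if ctx.get("gaze_duration_s", 0.0) >= 1.0 and not ctx.get("screen_on", False):
        ctx["screen_on"] = True                                  # wake screen after ~1 s of gaze

def emit(event: str, ctx: dict) -> None:
    for handler in rules.get(event, []):
        handler(ctx)

state = {"media_volume": 60, "screen_on": False, "gaze_duration_s": 1.2}
emit("phone.call_answered", state)
emit("gaze.on_screen", state)
print(state)   # {'media_volume': 15, 'screen_on': True, 'gaze_duration_s': 1.2}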

The ideal of natural interaction starts from the user experience: it must be stable, smooth, and predictable. But however rich the ideal, it has to face a lean reality, and the current challenges are numerous.

For example, misrecognition in natural interaction is still severe, and reliability and accuracy are not yet good enough across all weather and all scenarios. In gesture recognition, an inadvertent hand movement may be mistaken for a command; this is just one of countless misrecognition cases, and motion, lighting, vibration, and occlusion all pose huge engineering challenges. Fluency is another urgent problem, which will require higher-performance sensors, more computing power, and more efficient algorithms to improve step by step. Meanwhile, natural language understanding and intent understanding are still in their infancy and need innovation in algorithmic theory.
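One common way to suppress such false triggers, offered here as a generic illustration rather than a method described in this article, is temporal debouncing: a gesture label is accepted only after it persists with high confidence across several consecutive frames (the window size and thresholds below are hypothetical).

from collections import deque
from typing import Optional

class GestureDebouncer:
    def __init__(self, window: int = 8, min_hits: int = 6, min_conf: float = 0.85):
        self.history = deque(maxlen=window)   # most recent per-frame predictions
        self.min_hits = min_hits              # frames that must agree within the window
        self.min_conf = min_conf              # per-frame confidence needed to count at all

    def update(self, label: str, conf: float) -> Optional[str]:
        """Feed one frame's prediction; return a label only once it is stable."""
        self.history.append(label if conf >= self.min_conf else None)
        for candidate in set(self.history) - {None}:
            if self.history.count(candidate) >= self.min_hits:
                self.history.clear()          # fire once, then reset to avoid repeats
                return candidate
        return None

debouncer = GestureDebouncer()
result = None
for _ in range(6):
    result = debouncer.update("swipe_left", 0.90)
print(result)   # "swipe_left" only after six consistent high-confidence frames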

Human-computer natural interaction is the cornerstone invention of the robot era

Summary

In today's fierce industry competition, the intelligent cockpit has become a key lever for automakers to differentiate on function. Cockpit human-computer interaction is closely tied to human communication habits, language, and culture, so it must be highly localized. Smart-cockpit interaction is therefore an important breakthrough point for Chinese intelligent-vehicle makers to move their brands upmarket, and also an opportunity for Chinese intelligent-vehicle technology to lead the global technology trend.

The intelligent cockpit industry chain will keep extending: more players will enter the broader smart-car ecosystem, and smart-car players will in turn cross over into more robotics fields. The future of the intelligent cockpit ecosystem will revolve around "ecosystem collaboration" and "cross-border extension". This technological revolution will be disruptive, not only opening up a new industrial ecosystem but also profoundly changing how we work and live.

Appendix: About Li Xingyu

Li Xingyu is Vice President of Ecological Development and Strategic Planning at Horizon. He focuses on autonomous driving, AI chips, and edge computing, and is responsible for Horizon's strategic planning and ecosystem building.

Mr. Li Xingyu is a senior expert in the autonomous driving and automotive chip industry, with 18 years of experience in the semiconductor industry. Prior to joining Horizon, he was Senior Marketing Manager for NXP's Application Processor Automotive business and Specialist in Microelectronics Security Technology at Silan.
