
End-to-end, intelligent driving's "new favorite," is in no rush

Author: High-tech Smart Cars

Amid the disruption of end-to-end, a divide in the commercialization of autonomous driving is emerging.

It was recently reported that General Motors will inject $850 million into its self-driving subsidiary Cruise to keep it operating through the first quarter of next year, while the company also weighs Cruise's next steps in strategy and funding.

At a conference this month, General Motors Chief Financial Officer Paul Jacobson said that Cruise is at a very important stage of research and development, not only for its robotaxi concept but also for realizing personal autonomous mobility.

In October last year, a Cruise self-driving taxi struck a woman and dragged her several meters, after which Cruise's license to operate driverless cars was revoked and General Motors, its parent company, suspended all robotaxi operations.

Evidently, even after incidents of hitting fire trucks, blocking ambulances, and vehicles collectively stalling and causing traffic jams, GM continues to bet on robotaxis. In addition to the fresh investment from its parent company, Cruise has restarted autonomous-vehicle testing in the Phoenix and Dallas areas of the United States and increased testing in Houston.

Unlike earlier chills in autonomous driving, which dragged down commercialization across the whole industry, the disruption brought by new technology routes such as end-to-end appears to be widening the commercialization gap among players, and General Motors and Cruise, after years of burning cash, have reason to be anxious.

For example, Wayve, another self-driving company, has not only achieved autonomous driving in central London but also secured $1.05 billion in Series C financing from SoftBank, NVIDIA, and Microsoft.

Wayve's achievements stem largely from its newly upgraded autonomous driving architecture, built around four main components: the end-to-end deep learning system (AV2.0), the Fleet Learning Loop, the LINGO model, and the GAIA-1 world model.

This also means that, in addition to Tesla, more players, including Wayve, have preliminarily verified the feasibility of end-to-end autonomous driving, overturning the approach of continually refining hand-coded rules and promising to break the deadlock of endless long-tail scenarios.

Going forward, a training system built on large volumes of valuable data, combined with an interpretable traditional technology stack or model, is expected to raise the vehicle's end-to-end "intelligence," bringing the system's driving skill closer to that of an experienced human driver and enabling it to better handle complex driving tasks.

At present, in autonomous driving and high-end intelligent driving, the value of end-to-end has become apparent and is gradually becoming an industry consensus, reflected in the technical demonstrations and applications that many automakers and intelligent-driving Tier 1 suppliers have rolled out on both the perception and decision-making sides.

01 The window period arrives, and the gap in intelligent driving capability opens up

In essence, user experience is still the main driving force behind putting end-to-end in vehicles, and the competition centers on two points: first, solving long-tail scenarios and improving the safety of the whole system; second, making the driving style more human-like, especially under dynamic, interactive road and traffic conditions.

For example, as end-to-end deployment accelerates, urban NOA features that are more focused, have higher performance ceilings, and behave closer to human driving will drive a new round of competition in high-end intelligent driving capability.

In the traditional rule-driven modular scheme, the decision-making layer generalizes poorly and cannot cope with long-tail scenarios that were never coded for; data-driven end-to-end decision-making, by contrast, generalizes well, handles complex scenarios in particular, and has a higher ceiling.
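
To make the contrast concrete, here is a toy Python/PyTorch sketch of the two approaches: a hand-written rule branch that only covers what was anticipated, versus a small learned policy that maps scene features directly to a trajectory. Everything in it, from the thresholds to the network shape, is an illustrative assumption, not any vendor's actual implementation.

```python
import torch
import torch.nn as nn

def rule_based_decision(lead_distance_m: float, lead_speed_mps: float) -> str:
    # Every situation must be anticipated and coded by hand; anything outside
    # these branches (the long tail) falls back to a conservative default.
    if lead_distance_m < 10:
        return "brake"
    if lead_distance_m < 30 and lead_speed_mps < 5:
        return "decelerate"
    return "keep_lane"

class EndToEndPolicy(nn.Module):
    """A learned mapping from scene features straight to future waypoints."""

    def __init__(self, feature_dim: int = 256, horizon: int = 10):
        super().__init__()
        self.horizon = horizon
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 512),
            nn.ReLU(),
            nn.Linear(512, horizon * 2),  # (x, y) per future time step
        )

    def forward(self, scene_features: torch.Tensor) -> torch.Tensor:
        # scene_features: [B, feature_dim]; returns a [B, horizon, 2] trajectory
        return self.net(scene_features).view(-1, self.horizon, 2)
```

The point of the contrast is the failure mode: the rule function silently falls back to a default in situations nobody wrote a branch for, while the learned policy's behavior in those situations depends on how well its training data covered them.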

The above ceiling is largely determined by computing power and data.

This will in turn drive a new round of competition over computing platform solutions, spanning raw computing power, support for large intelligent-driving models, larger cache bandwidth, and operators customized for end-to-end.

For example, as early as 2016, Horizon took the lead in proposing the concept of end-to-end evolution for autonomous driving, and in 2022 it proposed Sparse4D, an industry-leading end-to-end algorithm for autonomous driving perception. In 2023, UniAD, the industry's first publicly published end-to-end autonomous driving model, co-authored by Horizon Robotics researchers, won Best Paper at CVPR 2023.

At the same time, Horizon's accumulated end-to-end deep learning algorithms based on interactive gaming (negotiating with other road users) have greatly improved the traffic efficiency and success rate of its intelligent driving system in complex traffic environments.

For mass-produced end-to-end applications, on the hardware side, Horizon Robotics' next-generation intelligent computing architecture BPU Nash, designed specifically for large-parameter Transformers, delivers industry-leading computing efficiency through tight software-hardware co-design, providing an optimal intelligent-computing solution for end-to-end autonomous driving and interactive gaming.

In April this year, Horizon released SuperDrive, a high-end urban solution based on Journey 6P. Relying on an end-to-end perception architecture covering dynamic elements, static elements, and OCC occupancy grids, along with data-driven interactive-game algorithms, it balances scene pass rate, traffic efficiency, and human-like behavior in any road environment.

With this architecture, SuperDrive's recall of occluded targets improves by 70%, hand-written dynamic code is cut by 90%, and network load drops by 50%, allowing algorithm vendors to iterate efficiently and keep improving the user experience; mass production is expected in 2025.
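
As a rough illustration of what a perception trunk with parallel dynamic, static, and occupancy outputs might look like, here is a hypothetical PyTorch sketch; the module names, channel sizes, and output formats are assumptions for illustration and do not reflect Horizon's actual SuperDrive design.

```python
from typing import Dict

import torch
import torch.nn as nn

class MultiTaskBEVPerception(nn.Module):
    def __init__(self, bev_channels: int = 128, num_classes: int = 10):
        super().__init__()
        # Dynamic head: per-cell object class logits (vehicles, pedestrians, ...)
        self.dynamic_head = nn.Conv2d(bev_channels, num_classes, kernel_size=1)
        # Static head: drivable-area / lane segmentation logits
        self.static_head = nn.Conv2d(bev_channels, 2, kernel_size=1)
        # OCC head: generic occupancy probability per BEV cell
        self.occ_head = nn.Conv2d(bev_channels, 1, kernel_size=1)

    def forward(self, bev: torch.Tensor) -> Dict[str, torch.Tensor]:
        # bev: [B, C, H, W] fused bird's-eye-view feature map shared by all heads
        return {
            "dynamic": self.dynamic_head(bev),
            "static": self.static_head(bev),
            "occupancy": self.occ_head(bev).sigmoid(),
        }
```

Sharing one BEV trunk across the three heads is what lets a downstream planner consume dynamic objects, static structure, and occupancy as one coherent, jointly trained representation rather than three separately coded pipelines.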

Horizon believes that by providing flexible, open, full-stack technologies that combine hardware and software, it can accelerate end-to-end mass production across the entire industry.

However, even with suitable computing platforms, the autonomous-driving large model will be reached through several stages of evolution: first converting modules into learned models, then going end-to-end, and finally realizing the full large model. On that first step, leading companies have largely completed the perception side, but planning and control have not yet been fully converted.

This also means that, building on the perception algorithms already accumulated, the shift from rule-driven to data-driven is accelerating, and the gap in intelligent driving capability among Tier 1 suppliers and automakers will widen further in the future.

For example, BEVDet, launched by Jianzhi Robotics, achieves end-to-end 3D perception for purely vision-based autonomous driving. Compared with Transformer-based image-to-BEV projection, BEVDet generalizes better and greatly reduces data requirements.
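
For context, BEVDet-style detectors belong to the family of view transforms that "lift" per-camera image features into a bird's-eye-view grid using a predicted depth distribution, rather than Transformer cross-attention. The sketch below shows only that general idea; the shapes, the precomputed bev_index mapping, and the simplified pooling are illustrative assumptions, not BEVDet's actual code.

```python
import torch
import torch.nn as nn

class LiftSplatToBEV(nn.Module):
    """Lift image features along predicted depth bins, then splat into a BEV grid."""

    def __init__(self, in_channels: int, num_depth_bins: int, bev_cells: int):
        super().__init__()
        self.num_depth_bins = num_depth_bins
        self.bev_cells = bev_cells  # flattened H_bev * W_bev grid size
        # One 1x1 conv predicts a per-pixel depth distribution plus context features.
        self.head = nn.Conv2d(in_channels, num_depth_bins + in_channels, kernel_size=1)

    def forward(self, feats: torch.Tensor, bev_index: torch.Tensor) -> torch.Tensor:
        # feats: [B, C, H, W] image features from a camera backbone
        # bev_index: [B, D*H*W] LongTensor, precomputed from camera geometry,
        #            mapping every frustum point to a flattened BEV cell index
        B, C, H, W = feats.shape
        out = self.head(feats)
        depth = out[:, : self.num_depth_bins].softmax(dim=1)   # [B, D, H, W]
        context = out[:, self.num_depth_bins :]                # [B, C, H, W]
        # "Lift": outer product places context features at every depth hypothesis.
        frustum = depth.unsqueeze(1) * context.unsqueeze(2)    # [B, C, D, H, W]
        frustum = frustum.flatten(2)                           # [B, C, D*H*W]
        # "Splat": scatter-add the frustum features into the BEV grid.
        bev = feats.new_zeros(B, C, self.bev_cells)
        bev.scatter_add_(2, bev_index.unsqueeze(1).expand(-1, C, -1), frustum)
        return bev                                             # [B, C, H_bev*W_bev]
```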

Jianzhi Robotics' latest R&D achievement, GraphAD, uses a graph model to describe the complex interactions in traffic scenes, explicitly modeling the interactive elements in the driving environment so that the model can capture relevant information more directly and efficiently, significantly improving learning efficiency and performance.

At present, the model has been successfully deployed on a mass-produced in-vehicle computing platform with real-time running performance.
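
The description of GraphAD suggests explicit, graph-style message passing among scene elements (the ego vehicle, surrounding agents, map elements). The following is a minimal conceptual sketch of one such interaction layer using masked attention over graph edges; it is an assumption for illustration, not the published GraphAD architecture.

```python
import torch
import torch.nn as nn

class InteractionGraphLayer(nn.Module):
    """One round of message passing over scene-element nodes."""

    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, nodes: torch.Tensor, adj_mask: torch.Tensor) -> torch.Tensor:
        # nodes: [N, dim] features of scene elements (ego, agents, map polylines)
        # adj_mask: [N, N] boolean, True where an interaction edge exists;
        #           assumed to include self-loops so no row is fully masked
        q, k, v = self.q(nodes), self.k(nodes), self.v(nodes)
        scores = q @ k.t() / nodes.shape[-1] ** 0.5       # pairwise interaction scores
        scores = scores.masked_fill(~adj_mask, float("-inf"))
        weights = scores.softmax(dim=-1)                  # attend only along graph edges
        return nodes + self.out(weights @ v)              # residual update per node
```

Restricting attention to explicit edges is the "graph" part: each element only exchanges information with the elements it can plausibly interact with, which is what makes the learning more direct than attending over everything.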

In addition, at this year's Beijing Auto Show, Qingzhou Zhihang (QCraft) released the Qingzhou Chengfeng MAX intelligent driving solution based on Horizon Journey 6. It adopts an end-to-end technical architecture, supports LiDAR access and a light-map mode, and can competently handle more complex urban scenes to deliver a full-scenario NOA experience.

Yuanrong Qixing (DeepRoute.ai) also showed DeepRoute IO, its soon-to-be mass-produced high-end intelligent driving platform, along with the first DeepRoute IO-based solution: an end-to-end system without high-precision maps, built on an NVIDIA DRIVE Orin system-on-chip with 200+ TOPS of computing power, one solid-state LiDAR, and 11 cameras.

Reportedly, Yuanrong Qixing's end-to-end autonomous driving solution has secured a design win from Great Wall Motors, and the company has partnered with NVIDIA, with adaptation to NVIDIA's Thor chip expected in 2025. Recently, Yuanrong also reached a cooperation with BYD, taking responsibility for BYD's end-to-end intelligent driving proof-of-concept (POC) project.

02 Fully end-to-end, the road is long and difficult

High hopes are pinned on end-to-end as the endgame of autonomous driving, but it will not be achieved overnight.

In the view of Mu Lisen, chief architect of Horizon's algorithm platform, the practical path is this: on a relatively solid foundation of engineering mass production, iterate the system rapidly to raise its upper limit, while continuing to rely on rules to guarantee the correctness of basic functions and thereby secure the lower limit of system performance.

According to Mu Lisen, this process will also drive the development of computing solutions that provide optimal computing efficiency for end-to-end.

In other words, the end-to-end model offers more human-like and flexible handling while the original models and rules ensure safety, and over the next few years the two will complement each other in intelligent driving.

After all, even Tesla's FSD V12, the end-to-end benchmark, which performed well in its earlier live demonstration and handled various scenes smoothly, still makes basic mistakes such as running red lights and hitting curbs, errors that were rare in the previous generation.

In fact, Tesla does not yet dare to rely entirely on end-to-end. Some Tesla owners found in the FSD software package that V12 is applied only to urban scenarios, while V11 is still used on highways.

This also means that raising the lower limit of the end-to-end solution until it outperforms the original traditional solution remains a milestone the industry has yet to conquer.

High-quality data, both the nourishment of end-to-end and one of its difficulties, is also crucial to construct and collect.

In autonomous driving, the data used to train a model is video of the physical world, so the model must understand more physical rules; at the same time, the industry must avoid the trap of simply throwing more data and computing power at an ever-larger model, only to hit a bottleneck where intelligent-driving capability stops improving.

Even Tesla, which already has a vast fleet of cars on the road, admits that today only about 1 kilometer out of every 10,000 kilometers of driving data is usable for training the model, and each training run consumes enormous computing power.
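
A simple way to picture that 1-in-10,000 ratio is a curation filter that keeps only clips containing events worth learning from. The triggers and thresholds below are invented for illustration and are not Tesla's (or anyone's) actual mining criteria.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DriveClip:
    distance_km: float
    had_disengagement: bool   # driver took over
    max_decel_mps2: float     # hardest braking in the clip
    saw_rare_object: bool     # e.g. an object class rarely seen in the fleet

def is_training_worthy(clip: DriveClip) -> bool:
    # Keep clips with a takeover, harsh braking, or a rare object;
    # uneventful cruising adds little new training signal.
    return clip.had_disengagement or clip.max_decel_mps2 > 4.0 or clip.saw_rare_object

def curate(clips: List[DriveClip]) -> List[DriveClip]:
    kept = [c for c in clips if is_training_worthy(c)]
    total = sum(c.distance_km for c in clips)
    useful = sum(c.distance_km for c in kept)
    print(f"kept {useful:.0f} of {total:.0f} km for training")
    return kept
```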

Tesla is also developing a more robust simulation system that generates diverse data: using video generation and prediction technology, it trains a world model to understand driving scenarios and learn driving behaviors and strategies from them, strengthening the end-to-end system.
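
Conceptually, such a world model acts as a learned simulator: it predicts the next scene state from the current state and an action, and rolling it forward produces synthetic sequences for training. The tiny latent-space sketch below stands in for that idea only; real video world models such as GAIA-1 are vastly larger generative models, and nothing here reflects Tesla's or Wayve's implementations.

```python
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    """Toy learned dynamics model rolled out to imagine driving sequences."""

    def __init__(self, latent_dim: int = 64, action_dim: int = 2):
        super().__init__()
        self.cell = nn.GRUCell(action_dim, latent_dim)    # latent state transition
        self.decode = nn.Linear(latent_dim, latent_dim)   # map latent to an "observation"

    @torch.no_grad()
    def rollout(self, state: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        # state: [B, latent_dim] initial scene latent; actions: [T, B, action_dim]
        frames = []
        for a in actions:                   # autoregressive prediction, step by step
            state = self.cell(a, state)     # imagine the next scene state
            frames.append(self.decode(state))
        return torch.stack(frames)          # [T, B, latent_dim] synthetic sequence
```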

However, the credibility of relying on a world model to clear away long-tail scenarios still needs to improve.

For example, after the early version of GAIA-1 was launched in June last year, some researchers pointed out that certain elements in the videos generated by the model would "suddenly disappear" in subsequent frames.

Although Wayve updated GAIA-1 in October of the same year, expanding the parameter scale and extending training time so that the detail and resolution of generated videos improved significantly, whether the problem of "suddenly disappearing elements" has been fully overcome remains to be verified.

Clearly, the industry still has a long way to go before fully realizing end-to-end: many technical problems remain unsolved, and the effort is extremely cash-intensive.

Nevertheless, given the broad prospects of end-to-end, capital has shown great enthusiasm. Since last year, end-to-end intelligent-driving Tier 1 suppliers, autonomous trucking companies, chip companies, synthetic-data vendors, and others have successively closed new rounds of financing, which has also inflated the industry bubble.

Perhaps in the next two or three years, after several rounds of deflating that bubble, the end-to-end landscape will become clearer as an oligopoly effect takes hold.