On the morning of October 11, at Tesla's "We Robot" press conference, an event as highly anticipated as a Spring Festival Gala of the tech world, Musk showed the world Tesla's latest achievements in autonomous driving and artificial intelligence: Cybercab, Robovan and Tesla Bot.
Cybercab is actually a driverless taxi, Robovan is a driverless bus, and Tesla Bot is Tesla's humanoid robot that can walk, run and dance.
It is worth mentioning that all three new products follow the FSD route that Tesla has long championed, and according to Musk, they mark the arrival of the era of intelligent driving for humanity.
Many people wonder: China's Apollo Go (Luobo Kuaipao) driverless taxis have been on the road for a year, so what is so special about Tesla's driverless taxis this time? And will Tesla's technology be applied to the military?
We all know that in the field of intelligent driving, there have always been two routes: lidar and pure vision.
Lidar has actually been in development for many years. As early as the 1970s, laser rangefinders became common on tanks, measuring the distance to enemy targets so that firing data could be determined.
This led some people to think further: if more laser beams are emitted, can't more area and detail be perceived? And if the lasers are rotated to scan, wouldn't the reflections give the three-dimensional shape of an area?
This is the prototype of lidar.
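The idea of turning many rangefinder readings into a 3D shape can be sketched in a few lines. This is only an illustration under simple assumptions: each laser return is a distance plus the beam's horizontal and vertical angle, and the function names and the sample sweep are made up for the example, not taken from any real lidar product.

```python
import math

def polar_to_xyz(distance_m, azimuth_deg, elevation_deg):
    """Convert one laser return (range plus beam angles) into a 3D point."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = distance_m * math.cos(el)   # projection onto the ground plane
    return (horizontal * math.cos(az),       # x: forward
            horizontal * math.sin(az),       # y: left
            distance_m * math.sin(el))       # z: up

def sweep_to_point_cloud(returns):
    """returns: iterable of (distance_m, azimuth_deg, elevation_deg)."""
    return [polar_to_xyz(d, az, el) for d, az, el in returns]

# Hypothetical sweep: 4 beams (elevations) x 8 rotation steps, every return at 10 m.
sweep = [(10.0, az, el) for el in (-3, -1, 1, 3) for az in range(0, 360, 45)]
cloud = sweep_to_point_cloud(sweep)
print(len(cloud), "points")
```

More beams and finer rotation steps simply mean more points, which is why adding laser lines gives a more detailed picture.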
Lidar later developed into three types, mechanical, semi-solid-state and solid-state, and is now widely used in intelligent driving cars.
But although the lidar route is good, it also has disadvantages.
The first is cost. A single lidar often costs thousands of yuan, and the new generation of Apollo Go vehicles carries 4 lidars. Add the cameras, millimeter-wave radars and other components, and the sensors alone cost 20,000 yuan, which greatly hinders market adoption.
The second is the reliance on algorithms. Put simply, lidar is just a sensor that tells the onboard computer what the road conditions are; the computer still has to make the driving decisions based on its algorithms.
This brings a problem: the situations encountered on the road are endless, but the algorithms are finite.
There will always be situations on the road that the algorithm did not envision. For example, a car ahead on a two-lane road stops inexplicably, and the center line is solid. The algorithm will often just stop and wait instead of crossing the line to go around the car, because obeying traffic rules carries the largest weight in its decisions.
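A toy sketch shows how this kind of deadlock can arise. It assumes a planner that scores each candidate action with hand-set penalty weights and picks the cheapest one; the action names and weights are hypothetical, and real planners are far more elaborate.

```python
def choose_action(actions, weights):
    """Pick the action with the lowest weighted penalty."""
    def cost(action):
        return sum(weights[k] * action["penalties"].get(k, 0.0) for k in weights)
    return min(actions, key=cost)["name"]

# Two options when a car is stopped ahead and the center line is solid.
ACTIONS = [
    {"name": "wait_behind_stopped_car", "penalties": {"no_progress": 1.0}},
    {"name": "cross_solid_line_to_pass", "penalties": {"rule_violation": 1.0}},
]

# Hand-written priorities: breaking a traffic rule is weighted far above delay.
RULE_FIRST = {"rule_violation": 100.0, "no_progress": 1.0}
# A human-like trade-off, for comparison only.
BALANCED = {"rule_violation": 1.0, "no_progress": 5.0}

print(choose_action(ACTIONS, RULE_FIRST))  # waits indefinitely
print(choose_action(ACTIONS, BALANCED))    # goes around the stopped car
```

As long as the rule-violation weight dominates, no amount of waiting changes the outcome, which is the behavior described above.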
So in the past year or so, we have seen many accidents involving intelligent driving (which is in fact assisted driving at most for now). Many were caused by third parties, but some happened because the intelligent driving algorithms could not cope with emergencies.
And what about the other route? That is Tesla's FSD, short for Full Self-Driving. In 2021, Tesla officially released the FSD Beta, which takes a pure vision approach.
FSD's sensors are not lidars but cameras.
In Musk's words: "Lidar is a fool's errand. Anyone relying on lidar is doomed." He also said that lidar is "like having a whole bunch of expensive appendices."
This makes it sound as if lidar is a backward technology, but it is not. The main reason Musk abandoned lidar in favor of cameras is that cameras are cheap: a lidar costs thousands of yuan, while a camera costs only a few hundred.
In fact, FSD's early route was not much different from the conventional intelligent driving route. Both were "perception + algorithm": first use the cameras to see what is ahead (whether there is a car, its speed and direction, whether the light is red or green, whether the line is solid or dashed), then make decisions according to pre-written algorithms.
This was not very advanced, so the pure vision solution initially had little advantage over the lidar route, and Tesla had no shortage of intelligent driving accidents of its own.
However, the situation changed with the release of FSD V12 in the spring of 2024. Once AI technology was applied to it, FSD underwent a radical change.
Tesla deleted more than 300,000 lines of hand-written driving algorithm code. After the deletion, FSD V12 retained only a little over 2,000 lines of C++ code, with the rest replaced by a large model.
This large model imitates how people drive. Tesla used driving data from millions of Tesla cars to let the AI learn visually, and after learning from this massive amount of data, the large model "learned to drive".
It is like your own driving: much of the time you are reacting subconsciously through muscle memory when you steer, accelerate or brake. After long, large-scale training, the AI can likewise take the images seen by the cameras and make the same judgments and control inputs as a human, with no further need for software or hardware such as high-definition maps and lidar.
Put simply, this is an evolution from rule-driven to data-driven: the driving data of countless veteran drivers is used to train an AI veteran driver.
▲The principle of end-to-end driving mode
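The shift from rule-driven to data-driven can be illustrated with a deliberately tiny stand-in. Instead of writing if/else rules, the sketch below copies whatever a human did in the most similar recorded situation (nearest-neighbor imitation). This is not Tesla's method, which uses a large neural network trained on camera video; the demonstrations, feature scales and sign conventions here are all hypothetical.

```python
import math

# Hypothetical logged demonstrations:
# (gap to car ahead in m, offset from lane center in m, positive = left)
#   -> (steering command, negative = right; acceleration command)
DEMOS = [
    ((50.0,  0.0), ( 0.0,  1.0)),   # open road, centered: accelerate
    (( 8.0,  0.0), ( 0.0, -1.0)),   # car close ahead: brake
    ((50.0,  0.5), (-0.2,  0.5)),   # drifted left: steer right
    ((50.0, -0.5), ( 0.2,  0.5)),   # drifted right: steer left
]

SCALE = (50.0, 0.5)  # assumed feature scales so both inputs count equally

def _scaled(obs):
    return tuple(value / s for value, s in zip(obs, SCALE))

def policy(obs):
    """Return the action a human took in the most similar recorded situation."""
    _, action = min(DEMOS, key=lambda demo: math.dist(_scaled(demo[0]), _scaled(obs)))
    return action

print(policy((45.0, 0.4)))   # resembles "drifted left": steer right
print(policy((10.0, 0.0)))   # resembles "car close ahead": brake
```

No driving rule appears anywhere in the code. Behavior changes only when the recorded data changes, which is also why the quality and coverage of the training data matter so much.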
In practical use, the approach taken by FSD V12 seems to be approaching perfection. FSD can smoothly go around vehicles stopped in its lane (even if that means crossing the line), turn left or right and then change lanes, stop at stop signs, and yield to pedestrians at zebra crossings. It can avoid stopped cars and vehicles that suddenly merge, even at night.
Musk therefore takes the radical view that as long as there is enough data and enough training time, FSD will eventually be able to replace human drivers entirely and achieve fully driverless operation.
Judging from the history of science and technology, whatever the latest technology is, its first application is often military.
As early as 2009, before Google's Waymo, Tesla, Uber, GM Cruise, Aptiv and Intel-Mobileye began developing intelligent driving technology, the United States had already begun exploring the use of self-driving robots to clear IEDs.
In 2018, the Defense Innovation Unit (DIU), an organization within the United States Department of Defense, persuaded Congress to increase support for dual-use and autonomous technologies in order to bring private civilian developers into DoD procurement. DIU is currently promoting a driverless vehicle program called the Ground Expeditionary Autonomous Retrofit System (GEARS), which seeks solutions from private contractors and has already been successfully tested on a Humvee.
The short-term goal of the GEARS program is to deploy unmanned vehicles as widely as possible in high-risk operations such as route clearance, explosive ordnance disposal, casualty evacuation, resupply and reconnaissance assistance. The long-term goal is to build a large-scale unmanned vehicle force in which, through manned-unmanned teaming, each soldier commands a squad of unmanned vehicles and drones, multiplying the combat effectiveness of the infantry.
There is no doubt that the United States military will be very interested in the success of FSD, and applying it on the military side cannot be ruled out.
For example, could a tank use FSD to learn how to maneuver and fight from a large volume of tank driving and combat video, and so achieve truly unmanned operations?
Or take carrier-based aircraft: take-off and landing is a hard problem for every country that has an aircraft carrier, and carrier pilots are very difficult to train. Could carrier-based aircraft apply FSD, learn from a large volume of take-off and landing footage, and then take off and land autonomously, completely freeing the pilot's hands and feet?
For another example, could an unmanned fighter apply FSD and be trained on video data from air combat aces to become an AI ace? In peacetime it could cruise in the sky for long periods, and if foreign aircraft intrude, fly over to intercept them. That would be much faster than scrambling an alert fighter from the ground.
The principle of FSD is actually very simple, so once the idea spreads in the military field, each service branch will come up with its own uses for it (just like the countless tactics that emerged after FPV drones entered the battlefield), and the result is likely to be revolutionary.
However, FSD is not perfect either, and it may still have a long way to go before it can be applied in the military.
The first problem is its training data.
Training data is what makes FSD useful, but by the same token, the quality of that data directly determines how well FSD performs.
In practice, Tesla's R&D staff have found that a large model trained on Tennessee drivers is not suitable for New York. Tennessee is vast and sparsely populated, and its drivers tend to be polite. New York's streets are extremely congested, with traffic jams every day, and if you politely yield to one car after another, you will not be able to cope at all.
Similarly, an FSD model trained in the United States may not work well when brought to China. After all, China's road conditions are far more complicated, all kinds of situations arise, and the awareness of traffic rules differs from that in the United States. Nobody in the United States dries wheat on national highways, right? And there are no motorized three-wheelers darting around, right?
In the same way, for military FSD to be effective, there must be a large amount of targeted training data. For example, to simulate a battle with the Russian army, you first need data from fighting the Russian army, right? Where would that data come from? Ukraine does have it, but Ukraine's way of fighting is not the same as the US military's!
You could use an aggressor squadron to play the enemy, but a simulation is only a simulation. It can never realistically reproduce the performance of the Russian army's active equipment, its operational tactics and practices, or its command habits. So even if a large model is trained on data from a simulated Russian army, it may not work well at all when it faces the real one.
This is just like the People's Liberation Army, which had grown used to fighting the Kuomintang's American-equipped units and thought American-equipped forces were nothing special, only to be surprised when it met the real thing on the Korean battlefield.
The second is the inherent flaw of the vision approach.
FSD's vision approach depends on how much information each captured frame contains; only then can the algorithm identify what it sees and make a decision.
Cameras with more than 100 million pixels are not uncommon today. But video usually has far fewer pixels than still photos, because frames are captured continuously and the buffer speed is limited. Take the very high-definition 8K video we see today: how many pixels does it have? 7680x4320, for a total of about 33.2 million pixels.
And the human eye? A normal human eye is commonly said to have about 576 million pixels, far more than the average camera.
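The arithmetic behind this comparison is easy to check. The snippet below takes the article's 576-million-pixel figure for the eye as given (it is a popular estimate, not a measurement made here) and compares it with one 8K frame.

```python
def megapixels(width_px, height_px):
    """Total pixel count of a frame, in millions."""
    return width_px * height_px / 1_000_000

uhd_8k = megapixels(7680, 4320)
EYE_MP = 576                      # figure quoted in the article
ratio = EYE_MP / uhd_8k

print(f"8K frame: {uhd_8k:.1f} MP; eye/8K ratio: {ratio:.1f}x")
# prints "8K frame: 33.2 MP; eye/8K ratio: 17.4x"
```

By this rough measure, the eye figure is about 17 times the pixel count of a single 8K frame.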
Because of this difference, the human eye may dimly make out a target 1 kilometer away and react to it, while the camera sees nothing at all, let alone makes a decision.
Optical zoom can solve this problem, but once zoomed in, the field of view narrows, making it hard to see in other directions.
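A rough calculation makes the trade-off between range and field of view concrete. The sketch assumes a hypothetical camera that is 1920 pixels wide, a 2-meter-wide target, and pixels spread evenly across the field of view (a simplification); none of these numbers describe an actual Tesla or military camera.

```python
import math

def pixels_on_target(target_width_m, distance_m, fov_deg, sensor_width_px):
    """Approximate number of horizontal pixels covering a target."""
    angular_size_deg = math.degrees(2 * math.atan(target_width_m / (2 * distance_m)))
    return angular_size_deg / fov_deg * sensor_width_px

# A 2 m wide vehicle seen from 1 km away.
wide = pixels_on_target(2.0, 1000.0, fov_deg=120.0, sensor_width_px=1920)
tele = pixels_on_target(2.0, 1000.0, fov_deg=12.0, sensor_width_px=1920)

print(f"wide angle: {wide:.1f} px, zoomed in: {tele:.1f} px")
```

With the wide lens the target covers fewer than two pixels, too few to recognize anything. Narrowing the view to a tenth of the angle gives ten times as many pixels on the target, but the camera then sees only a tenth of the scene.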
The third is that the camera is easily occluded.
This is easy to understand: sometimes you drive through a mud puddle, your reversing camera gets covered in muddy water, and you can't see anything.
By the same token, a military vehicle fighting with an FSD-style vision solution may not know what to do once its cameras meet mud, blood, smoke or grass.
The cameras on a fighter plane are not easily dirtied or obscured, but they are fragile: a laser fired by the enemy can blind the camera, and in severe cases burn out the CCD.
If you don't believe it, go to any concert, and you may well come across a phone camera burned out by the laser show.
Therefore, although Musk's press conference was very cool and FSD has great prospects for military application, at least for now the best place for FSD is intelligent driving, which only needs to watch the 100 meters or so of road ahead.