Elon Musk has been known for sharing a strange blend of memes and viable SpaceX or Tesla information on Twitter, including bold past opinions on the autonomous vehicle world’s radar versus vision debate. And in a new update, the Tesla CEO tweeted that it’s “better to double down on vision than do sensor fusion,” furthering his support for doing away with radar.
On Monday, Facebook software engineer Tristan Rice (@rice_fry), based in Seattle, Washington, broke down in depth how Tesla plans to replace radar, based on what recent firmware updates reveal. His analysis covers Tesla’s new neural network, which appears to be styled on a Recurrent Neural Network (RNN). According to the firmware binaries, it adds velocity and acceleration outputs alongside the existing position outputs in Tesla’s Autopilot and Full Self-Driving (FSD) systems.
We recently got some insight into how Tesla is going to replace radar in the recent firmware updates + some nifty ML model techniques
— Tristan (@rice_fry) April 12, 2021
“We recently got some insight into how Tesla is going to replace radar in the recent firmware updates + some nifty ML model techniques,” said Tristan, in a series of tweets on Monday.
“From the binaries we can see that they’ve added velocity and acceleration outputs. These predictions in addition to the existing xyz outputs give much of the same information that radar traditionally provides (distance + velocity + acceleration),” added Rice.
“For autosteer on city streets, you need to know the velocity and acceleration of cars in all directions but radar is only pointing forward. If it’s accurate enough to make a left turn, radar is probably unnecessary for the most part,” wrote the Facebook software engineer.
Rice says Tesla’s neural network can’t figure out velocity and acceleration from static images alone, since motion only shows up across successive frames. “They’ve recently switched to something that appears to be styled on a Recurrent Neural Network,” he noted.
“Net structure is unknown (LSTM? [long short-term memory]) but they’re providing the net with a queue of the 15 most recent hidden states. Seems quite a bit easier to train than normal RNNs which need to learn to encode historical data and can have issues like vanishing gradients for longer time windows,” wrote Rice.
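To make the queue idea concrete, here is a minimal sketch of a model that keeps the 15 most recent hidden states and looks at all of them on every step. Only the queue length comes from Rice's tweets. The encoder, the prediction head, and the class name `QueueRecurrentNet` are hypothetical stand-ins, since the real net structure is unknown.

```python
from collections import deque
import math

QUEUE_LEN = 15  # from the tweets: the 15 most recent hidden states


def encode_frame(frame):
    """Hypothetical per-frame encoder: maps a frame to a hidden state."""
    return [math.tanh(x) for x in frame]


def predict(history):
    """Hypothetical head that sees every queued state at once (a plain average)."""
    n = len(history)
    dim = len(history[0])
    return [sum(h[i] for h in history) / n for i in range(dim)]


class QueueRecurrentNet:
    def __init__(self, queue_len=QUEUE_LEN):
        # deque(maxlen=...) drops the oldest state automatically when full
        self.history = deque(maxlen=queue_len)

    def step(self, frame):
        self.history.append(encode_frame(frame))
        return predict(list(self.history))


net = QueueRecurrentNet()
for t in range(40):
    out = net.step([t * 0.01, 1.0])  # toy two-value "frame"
```

Because the history is handed over explicitly, nothing has to be squeezed into a single state vector that is passed from step to step, which is the part Rice describes as harder to train in a normal RNN.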
“The velocity and acceleration predictions is new, by giving the last 15 frames (~1s) of data I’d expect you can train a highly accurate net to predict velocity + acceleration based off of the learned time series,” detailed Tristan.
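A short window of positions already carries velocity and acceleration information, which a classical finite-difference estimate makes visible. The sketch below is not Tesla's method, and a trained net would presumably learn something more robust to noise. The frame rate of 15 frames per second is an assumption read off the "15 frames (~1s)" figure, and the synthetic track is invented for illustration.

```python
FPS = 15.0       # assumption: 15 frames span roughly one second
DT = 1.0 / FPS   # time between frames, in seconds


def finite_differences(positions, dt=DT):
    """Estimate (velocity, acceleration) from the last three positions."""
    if len(positions) < 3:
        raise ValueError("need at least three positions")
    p0, p1, p2 = positions[-3:]
    velocity = (p2 - p1) / dt                     # average over the last frame interval
    acceleration = (p2 - 2.0 * p1 + p0) / dt**2   # second difference
    return velocity, acceleration


def track(t):
    """Invented track: 20 m ahead, moving at 5 m/s, accelerating at 2 m/s^2."""
    return 20.0 + 5.0 * t + 0.5 * 2.0 * t * t


positions = [track(i * DT) for i in range(15)]
v, a = finite_differences(positions)
```

On this noise-free track the acceleration estimate recovers the 2 m/s^2 used to generate it, while the velocity estimate is the average over the final frame interval rather than the instantaneous value at the last frame.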
“They’ve already been using these queue-based RNNs with the normal position nets for a few months presumably to improve stability of the predictions,” explained Rice, noting, “This matches with the recent public statements from Tesla about new models training on video instead of static images.”
“To evaluate the performance compared to radar, I bet Tesla has run some feature importance techniques on the models and radar importance has probably dropped quite a bit with the new nets. See tools like https://captum.ai for more info,” added Rice.
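One simple member of the feature importance family is ablation: replace one input with a baseline value and measure how much the output moves. The sketch below is hand-rolled and does not use Captum's API. The toy model, its 0.9 and 0.1 weights, and the feature names are all hypothetical, chosen only to show what a low radar importance score would look like.

```python
def toy_model(features):
    """Hypothetical fused range estimate; the weights are made up."""
    return 0.9 * features["camera_range"] + 0.1 * features["radar_range"]


def ablation_importance(model, samples, baseline=0.0):
    """Mean absolute output change when each feature is set to a baseline."""
    scores = {}
    for name in samples[0]:
        total = 0.0
        for sample in samples:
            ablated = dict(sample)    # copy so the original sample is untouched
            ablated[name] = baseline  # knock out a single input
            total += abs(model(sample) - model(ablated))
        scores[name] = total / len(samples)
    return scores


# Invented samples: both sensors report the same range, in meters
samples = [{"camera_range": 10.0 + i, "radar_range": 10.0 + i} for i in range(5)]
scores = ablation_importance(toy_model, samples)
```

Here the camera input scores far higher than the radar input, simply because the toy model was built to lean on it. A real study would run this kind of analysis against the trained nets and real drive data.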
The software engineer still thinks radar will remain, saying, “I still think that radar is going to stick around for quite a while for highway usage since the current camera performance in rain and snow isn’t great.”
He says “[Navigate on Autopilot] often disables in mild rain. City streets might behave better since the relative rain speed is lower.”
“One other nifty trick they’ve recently added is a task to rectify the images before feeding them into the neural nets,” Rice went on to explain. “This is common in classical CV applications so surprised it only popped up in the last couple of months.”
“This makes a lot of sense since it means that the nets don’t need to learn the lens distortion. It also likely makes it a lot easier for the nets to correlate objects across multiple cameras since the movement is now much more linear,” said Rice.
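For a sense of what rectification undoes, here is a minimal sketch of a one-coefficient radial lens distortion model and its inverse, working on normalized image coordinates. The coefficient `K1` and the sample point are hypothetical, and nothing here reflects Tesla's actual camera calibration or pipeline.

```python
K1 = -0.2  # hypothetical radial coefficient (negative values give barrel distortion)


def distort(x, y, k1=K1):
    """Apply radial distortion to an ideal (undistorted) point."""
    scale = 1.0 + k1 * (x * x + y * y)
    return x * scale, y * scale


def undistort(xd, yd, k1=K1, iterations=20):
    """Invert the distortion by fixed-point iteration (fine for mild distortion)."""
    x, y = xd, yd  # start from the distorted point
    for _ in range(iterations):
        scale = 1.0 + k1 * (x * x + y * y)
        x, y = xd / scale, yd / scale
    return x, y


ideal = (0.3, 0.4)
seen = distort(*ideal)        # where the lens actually puts the point
recovered = undistort(*seen)  # rectified position
```

Rectifying a whole image applies the same inverse mapping to every pixel, so a net downstream never has to learn the lens shape on its own.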
For more background on LSTMs (Long Short-Term Memory) see https://t.co/oenyFFupPv
They're tricky to train because they need to encode history which is fed into future runs. The more times you pass the state, the more the earlier frames is diluted hence "vanishing gradients".
— Tristan (@rice_fry) April 12, 2021
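The dilution Rice describes shows up even in the simplest possible recurrence. The sketch below assumes a scalar linear state update, h_t = w * h_{t-1} + x_t, which is far simpler than an LSTM and is used here only as an illustration. The function name and the weight of 0.5 are hypothetical.

```python
def influence_of_first_input(w, steps):
    """Sensitivity of the final state to the first input after `steps` updates
    of the scalar recurrence h_t = w * h_{t-1} + x_t, which equals w ** steps."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # each pass through the state scales the old contribution by w
    return grad


after_15 = influence_of_first_input(0.5, 15)
after_1 = influence_of_first_input(0.5, 1)
```

With a weight below 1, each extra step shrinks the earliest frame's contribution, so after 15 passes almost nothing of it reaches the output or the gradient.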
Last week, Rice also detailed what Tesla Insurance knows while you’re driving, based on telemetry data, for those concerned about what the company knows about past driving records and more.
The news also follows Musk’s claim that Tesla’s FSD beta version 9.0 will have “pure vision, no radar,” with the CEO adding that the update is almost ready.
In a follow-up to Saturday’s tweet, Musk wrote, “Sensors are a bitstream and cameras have several orders of magnitude more bits/sec than radar (or lidar).”
Musk continued, “Radar must meaningfully increase signal/noise of bitstream to be worth complexity of integrating it. As vision processing gets better, it just leaves radar far behind.”
Sensors are a bitstream and cameras have several orders of magnitude more bits/sec than radar (or lidar).
Radar must meaningfully increase signal/noise of bitstream to be worth complexity of integrating it.
As vision processing gets better, it just leaves radar far behind.
— Elon Musk (@elonmusk) April 10, 2021
Tesla’s neural network continues to improve, thanks to the massive amounts of data available from owner vehicles. That data is one of Tesla’s major advantages over competitors when it comes to Autopilot and Full Self-Driving. The full picture is yet to be seen, although FSD beta testers are already demonstrating what’s possible.