I don't quite agree on this one. While I think Musk's choice to go full vision when he did was foolish because it made his product worse, his main point is not wrong: humans do drive well while using mostly vision. Assuming you can replicate the thought process of a human driver with AI, I don't see why you couldn't create a self-driving car using only vision.
That's also where I would see transformers, or another AI architecture with reasoning capabilities, shine: being able to reason about what is about to happen would let it handle edge cases much better than relying on dumb sensors.
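To make that concrete, here's a minimal sketch of what I mean (model size, feature choice, and all names here are my own assumptions, not any production system): a small transformer that reads a lead vehicle's recent trajectory and predicts its next displacement, which is the kind of "what happens next" reasoning I'm talking about.

```python
import torch
import torch.nn as nn

class TrajectoryPredictor(nn.Module):
    """Toy transformer: past (x, y) positions in, next displacement out."""
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(2, d_model)   # one (x, y) position per timestep
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 2)    # predicted next (dx, dy)

    def forward(self, track):                # track: (batch, time, 2)
        h = self.encoder(self.embed(track))
        return self.head(h[:, -1])           # read out from the last timestep

model = TrajectoryPredictor()
past = torch.randn(1, 10, 2)                 # ten observed positions (dummy data)
print(model(past))                           # one predicted displacement
```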
As a human, it would be very difficult to drive a car just by looking at sensor data. The only vehicle I can think of where we do that is a submarine. Sensor data is good for classical AI, but I don't think it will handle edge cases well.
To be a reasonable self-driving system, it should be able to decide to slow down and keep a larger safety margin because it judges the car in front to be driving erratically (e.g. due to driver impairment). Only an AI that can reason about what is going on can do that.
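For illustration only, here's a toy heuristic of what "judging erratic and backing off" could look like; the thresholds and function names are invented for this sketch, not taken from any real driving stack:

```python
from statistics import pstdev

def target_gap_s(speeds_mps, lateral_offsets_m,
                 base_gap_s=2.0, erratic_gap_s=4.0):
    """Return the desired time gap (seconds) to the lead vehicle.

    Flags the lead car as erratic when its recent speed or lateral
    offset varies too much, then widens the following gap.
    """
    erratic = pstdev(speeds_mps) > 2.0 or pstdev(lateral_offsets_m) > 0.5
    return erratic_gap_s if erratic else base_gap_s

# A lead car oscillating in speed and drifting in its lane -> larger gap.
print(target_gap_s([25, 31, 24, 33, 26], [0.1, 0.8, -0.6, 0.9, -0.7]))  # 4.0
```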
Sure, but humans do a lot more with vision than just convolutions. So maybe we need to wait for AI researchers to invent new techniques as revolutionary and as impactful as convolutions, to the point where it's believable that AI models can handle the range of exceptions humans handle. Humans are very good at learning from small data, where AI tends to be pretty terrible at one-shot learning by comparison. That's going to remain hugely relevant for edge cases. We've seen many examples now where a self-driving car crashes because too much sunlight distorted its perception of where objects are. We can either bury our heads in the sand and pretend AI models work like humans and need the exact same inputs humans do, or we can admit there are limitations to the technology and act accordingly.
I also think "dumb sensors" is unfair: there are neural network solutions for processing LIDAR data, so we are talking about a similar level of intelligence applied to both kinds of sensor.
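For example, here's a rough PointNet-style sketch (layer sizes and names are my own assumptions) of the standard trick for learning directly on raw LIDAR point clouds: a shared per-point MLP followed by an order-invariant max-pool.

```python
import torch
import torch.nn as nn

class PointCloudClassifier(nn.Module):
    """Toy point-cloud classifier in the spirit of PointNet."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.per_point = nn.Sequential(      # applied to every point identically
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
        )
        self.head = nn.Linear(256, n_classes)

    def forward(self, cloud):                # cloud: (batch, n_points, 3)
        feats = self.per_point(cloud)        # per-point features
        pooled = feats.max(dim=1).values     # pooling ignores point order
        return self.head(pooled)

model = PointCloudClassifier()
print(model(torch.randn(2, 1024, 3)).shape)  # torch.Size([2, 10])
```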
> As a human, it would be very difficult to drive a car just looking at sensor data.
What is vision if not sensor data?? Our brains have evolved to efficiently process and interpret image data. I don't see why from-scratch neural network architectures should ever be limited to the same highly specific input type.
Can't argue with this logic; more data points certainly help. I was arguing about vision vs lidar, and vision + lidar is certainly better than vision alone.