The models running on $50k GPUs will keep getting better, but the models running on commodity hardware will hit an inflection point where they're good enough for most use cases.
If I had to guess, I'd say that's probably 10 or 15 years away for desktop-class hardware and longer for mobile (maybe another 10 years on top of that).
Maybe the frontier models of 2040 are being used for more advanced things like medical research rather than generating CRUD apps or photos of kittens. If so, the average person would likely be using commodity models that are either free or extremely cheap.
OK, you can technically upload all your photos to Google Cloud and get the same semantic labeling features as the iOS Photos app, but having local, always-available, fast inference is arguably more useful and valuable to the end user.
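
As a rough illustration of what that kind of local inference looks like today, here's a minimal sketch that tags a photo entirely on-device with a small vision model via the Hugging Face transformers pipeline. The model choice, the "photo.jpg" path, and the 0.2 score cutoff are placeholder assumptions, not anything specific to iOS Photos or Google Cloud:

    # Minimal sketch: on-device photo tagging with a small image classifier.
    # Assumes: pip install transformers torch pillow
    from transformers import pipeline

    # Downloads a small ViT classifier once, then runs entirely locally;
    # no photos leave the machine after the initial model download.
    classifier = pipeline("image-classification",
                          model="google/vit-base-patch16-224")

    results = classifier("photo.jpg")  # list of {"label": ..., "score": ...}
    tags = [r["label"] for r in results if r["score"] > 0.2]
    print(tags)  # e.g. ["tabby cat", "Egyptian cat"]

The point being: something like this already fits comfortably on commodity hardware, and it only gets cheaper from here.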