Sutton is talking about a long term trend. Would Google have been able to achieve this w/o a lot of computation? I don't think it refutes the essay in any way. If anything, model compression takes even more computation. We can't scale heuristics, we can scale computation.