Matrix multiplication is not ugly, but matrices themselves are ugly, mainly because they encode an arbitrary choice of basis. There's nothing especially nice about the pixel basis for images, or about the token basis for language. But of all the things that make up modern deep learning, matrix multiplication is surely the _least_ ugly. ReLU/GELU is not pretty! Batch normalization is vomit-inducing!! ImageNet normalization? JFC!!!
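
To make the basis point concrete, here is a minimal sketch in plain Python. The matrices `A`, `B`, and `P` and the helper names are made up for illustration. It writes one linear map in two different bases: the entries change, while the trace, the determinant, and the way maps compose do not.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(m):
    """Invert a 2x2 matrix exactly, using Fractions to avoid rounding."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def det_2x2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def change_basis(m, p):
    """Express the map with matrix m in the basis given by the columns of p."""
    return matmul(matmul(inverse_2x2(p), m), p)


# Two linear maps written in the standard basis (values are arbitrary).
A = [[2, 1], [0, 3]]
B = [[1, 0], [1, 1]]

# Columns of P form a second, equally arbitrary basis.
P = [[1, 1], [1, 2]]

A_new = change_basis(A, P)
B_new = change_basis(B, P)

print("A in the standard basis:", A)
print("A in the new basis:     ", [[int(x) for x in row] for row in A_new])
print("trace:", trace(A), "vs", trace(A_new))
print("det:  ", det_2x2(A), "vs", det_2x2(A_new))

# Composition does not care about the basis: multiplying the re-expressed
# matrices gives the re-expressed product.
print("composition agrees:",
      matmul(A_new, B_new) == change_basis(matmul(A, B), P))
```

The numbers in the matrix are an artifact of the basis. The product rule is just composition of maps, which is why it survives the change untouched.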