Yes, although the trick is used in the first place in order to replace a 6-cycle division by a 1-cycle multiplication (on x86_64, assuming no CPU stalls). Roughly, in order to still make it a net positive, we have a 4-cycles budget for additional operations. That's very tight for reversing the bits of a uint32_t.
> used in the first place in order to replace a 6-cycle division by a 1-cycle multiplication
Look a generation or two back and you'll see that the division is a LOT slower than 6 cycles, too.
But it should be noted that if the modulus is a compile time constant the compiler will turn the % into a multiply and add.
The riscv bitmanip instruction set has permutes that can reverse the bits in a word (along with other useful permutations). ... still doesn't exist in an actual asic yet. Maybe some time in my life we'll get it on common platforms.