> Radix sort is theoretically O(N), Nothing theoretical about it: Sorting a list...

FeepingCreature · on Dec 2, 2022

The latency of accessing physical memory is asymptotically a function of N for sufficiently large N - ie. big enough that signal propagation delay to the storage element becomes a noticeable factor.

This is not generally relevant for PCs because the distance between cells in the DIMM does not affect timing; ie. the memory is timed based on worst-case delay.

patrec · on Dec 2, 2022

> The latency of accessing memory is not a function of N.

How could it not be, given that any physical medium has finite information density, and information cannot propagate faster than the speed of light?

And on a practical computer the size of N will determine if you can do all your lookups from registers, L1-L3, or main RAM (plus SSD, unless you disable paging).

geocar · on Dec 2, 2022

> How could it not be, given that any physical medium has finite information density, and information cannot propagate faster than the speed of light?

You could have a tape of infinite length, and if you only ever request "the next one" then clearly the latency is constant.

> And on a practical computer the size of N ...

Don't be silly. N is the size of the input set, not the size of the universe.

patrec · on Dec 2, 2022

I feel we must be talking past each other. I thought the examples so far were about random access of N units of memory. The standard convention of pretending this can be done at O(1) always struck me as bizarre, because it's neither true for practical, finite, values of N (memory hierarchies are a thing) nor asymptotically for any physically plausible idealization of reality.

aidenn0 · on Dec 2, 2022

Random access times are irrelevant for radix sort since no random access to the input data is needed.

naasking · on Dec 2, 2022

You still need to write each element to the target, which costs O(sqrt(N)), so the "strict" running time of radix sort would be O(N^1.5) in layered memory hierarchies. In flat memory hierarchies (no caching) as found in microcontrollers, it reduces back to O(N).

BeeOnRope · on Dec 3, 2022

But radix sort doesn't write randomly to the output array like that. Yes, each element ends up in the right place, but through a series of steps which each have better locality of reference than random writes.

In this way, radix sort is usually faster than an "omniscient sort" that simply scans over the input writing each element to its final location in the output array.

naasking · on Dec 2, 2022

> The author ran out of memory; They ran a program that needs 10GB of ram on a machine with only 8GB of ram in it.

No, you can clearly see the O(sqrt(N)) trend at every level of the memory hierarchy.

geocar · on Dec 2, 2022

> No, you can clearly see the O(sqrt(N)) trend at every level of the memory hierarchy.

Lines goes up, then goes down. Very much unlike sqrt.

naasking · on Dec 2, 2022

If you can't see the clear sqrt(N) behaviour in those plots, then I recommend using an online graphic calculator to see what such a graph actually looks like. The sqrt trend is plain as day.