Sure, but back then the standard was not relevant. You had your x86 computer, yo...

pjmlp · on Sept 24, 2018

When I was writing portable UNIX software in the late 90's, early 2000's, the standard was quite relevant.

wruza · on Sept 23, 2018

I can’t tell for sure, since I was too young back then and didn’t write anything complex enough, but I suspect that compilers were not evil genies back then. It was probably later gcc that first used UB as an excuse for evil optimizations.

I also don’t understand the reason behind dragging all that legacy into current standards. Do real applications of it make up even 0.1% of all usage? You cannot even buy a chip that implements segments, tagged pointers, etc. At least they could make a special mode where you can span across symbols, whatever that means, or see a pointer as the CPU sees it.

All this can be solved with simple ptr_untag(p), ptr_span(expr) and similar constructs, but instead we have to resort to outsmarting the specific compiler's logic or acquainting ourselves with complicated type systems. This has gone insane. Personally I just want my bytes in a linear address space and a way to tell which bytes point to which and which do not. I liked asm and then C, they were two of my first three languages, but what C has become today is just a horrible mess.
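
For illustration, here is a minimal sketch of what a ptr_untag-style helper could look like when written by hand today. The names ptr_tag, ptr_get_tag and ptr_untag are hypothetical, not standard C. The sketch assumes 8-byte-aligned pointers (so the low 3 bits are free) and relies on pointer/uintptr_t conversions, which are implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of any standard. Assumption: pointers are
   at least 8-byte aligned, so the low 3 bits can carry a small tag. */
#define TAG_MASK ((uintptr_t)0x7)

/* Pack a small tag into the unused low bits of an aligned pointer. */
static void *ptr_tag(void *p, unsigned tag) {
    assert(((uintptr_t)p & TAG_MASK) == 0); /* pointer must be aligned */
    assert(tag <= TAG_MASK);                /* tag must fit in 3 bits */
    return (void *)((uintptr_t)p | (uintptr_t)tag);
}

/* Read the tag back without touching the address bits. */
static unsigned ptr_get_tag(const void *p) {
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Strip the tag to recover the original, dereferenceable pointer. */
static void *ptr_untag(void *p) {
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A tagged pointer must go through ptr_untag before it is dereferenced; the round trip through uintptr_t works on mainstream flat-memory targets, but the standard leaves the mapping up to the implementation.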

jcranmer · on Sept 23, 2018

> You cannot even buy a chip that implements segments

Not only can you buy a chip that implements segments, the computer you wrote that statement on probably has such a chip.

wruza · on Sept 23, 2018

It is ARM-based, so I believe it doesn’t. If you mean x86, then the segment registers are there, but they have not actually been used (except for the fs/gs utility cases) for almost a couple of decades, afaik. The segmented memory model is simply slow, cumbersome and unnecessary in the presence of a decent PMMU.

Edit: though strictly speaking I was obviously wrong on that, clarifications are welcome.

slededit · on Sept 23, 2018

While it may be “cumbersome” from the programmer’s perspective - it’s certainly a lot faster than an MMU.

There’s no universe where traversing a page table (actually a tree) in memory is faster than an offset and a bounds check.
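
As a rough sketch of the "offset and a bounds check" side of that comparison, segment translation can be modelled as one compare and one add. The segment_t type and seg_translate function are made up for illustration; real x86 descriptors carry more fields (granularity, access rights) than this simplified model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical model of a segment descriptor. */
typedef struct {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset within the segment */
} segment_t;

/* Translate (segment, offset) to a linear address.
   Returns false on a limit violation, leaving *linear untouched. */
static bool seg_translate(const segment_t *seg, uint32_t offset,
                          uint32_t *linear) {
    assert(seg != NULL && linear != NULL);
    if (offset > seg->limit) {
        return false;              /* the bounds check */
    }
    *linear = seg->base + offset;  /* the offset add, no memory lookup */
    return true;
}
```

Once the descriptor is loaded, the translation itself touches no memory; the cost being argued about in this thread is in loading descriptors, not in this step.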

wruza · on Sept 23, 2018

What are you talking about? On x86, page tables have been cached in the TLB since their introduction. No MMU at all (80286) means that you're subject to fragmentation, and swapping segments is as expensive as a syscall since you have to look up the descriptor through GDTR. x86 segments are scarce, expensive and cumbersome resources invented to cover a bus width mismatch. Today that is not even an issue. Even the TSS is shared, one per CPU; that is how useless and slow its hardware part is.

We could imagine more segment registers and bigger descriptor tables, but that would be just a poor man's manual TLB (manual always failed in CPUs). If you are concerned with constant checks, you may order yourself a CPU with BOUND instruction support and put it everywhere with the same result. Oh, it is already in x86-64, never mind.

>actually a tree

It is a 2-level "tree", afair: directory and table. Please do your homework and research why the segmented model was kicked out of the software arena by virtually everyone involved.
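
To make the "directory and table" walk concrete, here is a software model of the classic 32-bit layout with 4K pages (10 bits of directory index, 10 bits of table index, 12 bits of offset). The types and the page_walk name are invented for this sketch; a real walk is done by hardware on physical addresses and only happens on a TLB miss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 1024u   /* 2^10 entries per level */
#define PRESENT 0x1u    /* low bit marks a valid mapping */

/* Hypothetical in-memory model: a directory of pointers to tables. */
typedef struct { uint32_t entry[ENTRIES]; } page_table_t;
typedef struct { page_table_t *table[ENTRIES]; } page_dir_t;

/* Walk both levels. Returns 0 and fills *paddr on success, -1 on a fault. */
static int page_walk(const page_dir_t *dir, uint32_t vaddr, uint32_t *paddr) {
    assert(dir != NULL && paddr != NULL);
    uint32_t di  = vaddr >> 22;             /* top 10 bits: directory index */
    uint32_t ti  = (vaddr >> 12) & 0x3FFu;  /* next 10 bits: table index */
    uint32_t off = vaddr & 0xFFFu;          /* low 12 bits: byte in page */

    const page_table_t *pt = dir->table[di];  /* first dependent load */
    if (pt == NULL) {
        return -1;
    }
    uint32_t pte = pt->entry[ti];             /* second dependent load */
    if ((pte & PRESENT) == 0) {
        return -1;
    }
    *paddr = (pte & ~0xFFFu) | off;  /* frame base plus page offset */
    return 0;
}
```

The two dependent loads are what a TLB hit avoids, which is where both sides of this argument meet.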

slededit · on Sept 24, 2018

Spoken like a programmer who has never had to fix TLB misses as a bottleneck. It’s actually quite common as the TLB is rather small. Not even enough for 2MB worth of 4K pages.
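
The arithmetic behind that claim is easy to sketch. The helper names below are made up, and any entry count you plug in is an assumption about a particular CPU, not a figure from this thread; the only thing taken from the comment is that covering 2MB with 4K pages needs 512 entries.

```c
#include <assert.h>
#include <stdint.h>

/* True when x is a nonzero power of two, as page sizes are. */
static int is_pow2(uint64_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

/* Bytes of address space a TLB can map without missing ("TLB reach"). */
static uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_size) {
    assert(is_pow2(page_size));
    return entries * page_size;
}

/* TLB entries needed to cover a working set, rounding up to whole pages. */
static uint64_t entries_needed(uint64_t working_set, uint64_t page_size) {
    assert(is_pow2(page_size));
    return (working_set + page_size - 1) / page_size;
}
```

For example, a hypothetical 64-entry TLB with 4K pages reaches 256KB, while the same 2MB working set fits in a single entry with a 2MB huge page.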

You can use huge pages but it has all of the drawbacks of segments and none of the benefits.

VMware used to have a super fast 32-bit hypervisor based on segments long before special instructions were added. This of course had to be reworked completely for x64.

Also Intel’s bound check instructions are still extremely rare and don’t work that well in practice. I’ve used them.

wruza · on Sept 24, 2018

A recipe for fixing TLB misses (as with any cache misses) is simple: don’t thrash your cache. Ofc I didn’t; I cannot even imagine what one does to bottleneck on the TLB – LSD? It is one of those problems like “doc, if I turn my finger 180, it hurts”.

>VMWare 32-bit before special instructions

DOS also was pretty fast, but that didn’t make it a good multi-user protected-mode OS. All these early emulations and monkey patching of guests cannot substitute hardware vt in the wild.

ChrisSD · on Sept 23, 2018

But memory isn't a linear address space. If you're forced to treat memory as strictly linear then you're denied many optimizations.

twtw · on Sept 23, 2018

> But memory isn't a linear address space

Would you mind expanding on this point a bit?

ChrisSD · on Sept 23, 2018

Modern memory is layered. You have registers, you have three layers of CPU cache, etc, etc.

At higher levels of abstraction we pretend like it's all flat but we do need to allow for optimizations to occur.

wruza · on Sept 23, 2018

So what does that have to do with linear addressing and optimization? How does the presence of registers and caches depend on [non-]linearity?

twtw · on Sept 23, 2018

Thanks for responding.

I would however argue that the existence of hierarchy in physical memory is orthogonal to the existence of a linear address space. Caches work really hard to maintain the ability of programs to use a single linear address space.

fulafel · on Sept 23, 2018

Ways in which memory is not linear: many asymmetric multiprocessing setups (eg gpu/cpu), distributed processing (eg openmp), multiprocess systems w several virtual spaces, harvard architecture systems, segmented architectures, disk vs ram. Many of these may be considered performance-motivated departures from a linear address space, especially if performance includes reliability in addition to execution speed.

(Otoh virtual memory can be used to create the appearance of single address space for any of the above, and sometimes has been - eg as/400)

wruza · on Sept 23, 2018

I think it was mostly aliasing-related, and that’s why I mentioned “tell which bytes point at which”. Managing C today is like managing that girlfriend who doesn’t say anything straight and avoids direct conversations. I’m not against optimizations, I’m against the compiler pretending to be smarter than average me and spoiling our relationship when I only have basic demands.

If not that, I’m also curious.

Too · on Sept 23, 2018

Sorry but it's the other way around. You are the one pretending to be smarter than the compiler by imagining objects are laid out sequentially in an infinite char[] when they are in fact just abstract objects laid out in whichever way the compiler thinks is best. Just keep it this way and it will be much easier to manage C; it doesn't need micromanagement and neither do girlfriends.
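
A small sketch of how the two views meet in practice: the abstract-object model still lets you look at an object's bytes, as long as you copy them out rather than read the object through a pointer of an unrelated type. The float_bits name is invented here, and the expected bit pattern in the note below assumes IEEE 754 single precision, which the C standard does not require.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide on this target. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Return the object representation of f as an integer.
   memcpy copies the bytes, so no float object is ever accessed through a
   uint32_t lvalue, which would break the aliasing rules. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

On an IEEE 754 target float_bits(1.0f) gives 0x3F800000. The cast version, *(uint32_t *)&f, is the kind of code that is undefined under the aliasing rules, whereas compilers typically turn the memcpy into a single register move.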

wruza · on Sept 23, 2018

I'm saying that I don't like this model, not that I want to see everything as char[] in it. You insist that I should just embrace the model, since char[] doesn't work in it. I know, ok? If that tautology was an argument, it completely missed the point.

Below the compiler, I'm pretty familiar with how things are laid out, since there is an ABI, packing/alignment rules, documented hardware and 15 years of dealing with them. Please don't make up magic and wisdom out of a few conventions.