Sure, but back then the standard was not relevant. You had your x86 computer, your compiler (MSVC, DJGPP, CodeWarrior, Borland...) and maybe a "how to write C/C++" book. You did have weird things like segmented addressing, but for most purposes, pointers were just memory addresses, and you could use casts to reinterpret data as you wished. In fact, bit-cast magic and pointer arithmetic were the preferred way of doing things.
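For anyone who hasn't seen the cast-based style, here is a minimal sketch of the kind of reinterpretation meant above, next to the memcpy form that current C defines. The function names are made up for illustration, and the code assumes float and uint32_t are the same size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old-school style: reinterpret the float's storage through a cast.
   It worked on the compilers of that era; under current C's aliasing
   rules the access is undefined behavior, so it stays in a comment:

       uint32_t bits = *(uint32_t *)&f;
*/

/* Assumption of this sketch: float is 32 bits wide. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Defined equivalent: copy the object representation. */
static inline uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* The reverse direction, same technique. */
static inline float bits_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Mainstream compilers usually turn these memcpy calls into a plain register move, so the defined form should not cost anything at runtime.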
It's strange that we put so much effort into learning all this arcane knowledge and now we're supposed to unlearn it...
I can’t tell for sure, since I was too young back then and didn’t write anything complex enough, but I suspect that compilers were not evil genies back then. It was probably later gcc that first used UB as an excuse for evil optimizations.
I also don’t understand the reason behind dragging all that legacy into current standards. Do real applications of it make up even 0.1% of all usage? You cannot even buy a chip that implements segments, tagged pointers, etc. At least they could make a special mode where you can span across symbols (whatever that means) or see a pointer as the CPU sees it.
All this could be solved with simple ptr_untag(p), ptr_span(expr) and similar constructs, but instead we have to resort to outsmarting a specific compiler's logic or burdening ourselves with complicated type systems. It has gone insane. Personally I just want my bytes in a linear address space and a way to tell which bytes point to which and which do not. I liked asm and then C, they were two of my first three languages, but what C has become today is just a horrible mess.
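To make the wish concrete, here is a sketch of what a ptr_untag-style helper looks like when written by hand today. ptr_tag, ptr_get_tag and ptr_untag are hypothetical names, not a real API. The code assumes that converting a pointer to uintptr_t and back round-trips and that the low bits of an aligned address are free, which mainstream flat-memory compilers document but the standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Low 2 bits hold the tag; the pointee needs at least 4-byte alignment. */
#define TAG_MASK ((uintptr_t)0x3)

/* Hypothetical helper: stash a small tag in the unused low bits. */
static inline void *ptr_tag(void *p, unsigned tag)
{
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);  /* pointer must be aligned */
    assert(tag <= TAG_MASK);         /* tag must fit in the mask */
    return (void *)(bits | tag);
}

/* Hypothetical helper: read the tag back. */
static inline unsigned ptr_get_tag(const void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Hypothetical helper: strip the tag to get a dereferenceable pointer. */
static inline void *ptr_untag(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A tagged pointer must go through ptr_untag before it is dereferenced; nothing in the type system enforces that, which is part of the complaint.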
It is ARM-based, so I believe it doesn’t. If you mean x86, then the segment registers are there, but they haven't actually been used (except for the fs/gs utility cases) for almost a couple of decades, afaik. The segmented memory model is simply slow, cumbersome and unnecessary in the presence of a decent PMMU.
Edit: though strictly speaking I was obviously wrong on that, clarifications are welcome.
What are you talking about? On x86, page tables have been cached in the TLB since their introduction. No paging MMU at all (80286) means that you're subject to fragmentation, and swapping segments is as expensive as a syscall since you have to look up the descriptor through GDTR. x86 segments are scarce, expensive and cumbersome resources invented to cover a bus width mismatch. Today that is not even an issue. Even the TSS is shared, one per CPU; that's how useless and slow its hardware part is.
We could imagine more segment registers and bigger descriptor tables, but that would just be a poor man's manual TLB (manual has always failed in CPUs). If you're concerned with constant checks, you may order yourself a CPU with BOUND instruction support and put it everywhere with the same result. Oh, it is already in x86-64, never mind.
>actually a tree
It is a 2-level "tree", afair: directory and table. Please do your homework and research why the segmented model was kicked out of the software arena by virtually everyone involved.
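For reference, this is how the two levels look on classic 32-bit x86 paging with 4K pages and no PAE: the top 10 bits of a linear address index the page directory, the next 10 index a page table, and the low 12 are the offset inside the page. The struct and function names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

struct va_parts {
    uint32_t dir;     /* index into the page directory (bits 31..22) */
    uint32_t table;   /* index into the page table     (bits 21..12) */
    uint32_t offset;  /* byte offset inside the page   (bits 11..0)  */
};

/* Split a 32-bit linear address into its three fields. */
static inline struct va_parts va_split(uint32_t va)
{
    struct va_parts p;
    p.dir    = va >> 22;
    p.table  = (va >> 12) & 0x3FFu;
    p.offset = va & 0xFFFu;
    return p;
}

/* Reassemble the address; each field must fit its bit width. */
static inline uint32_t va_join(struct va_parts p)
{
    assert(p.dir < 1024 && p.table < 1024 && p.offset < 4096);
    return (p.dir << 22) | (p.table << 12) | p.offset;
}
```

PAE and x86-64 long mode add more levels on top of the same scheme.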
Spoken like a programmer that has never had to fix TLB misses as a bottleneck. It’s actually quite common as the TLB is rather small. Not even enough for 2MB worth of 4K pages.
You can use huge pages but it has all of the drawbacks of segments and none of the benefits.
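The arithmetic behind that claim, as a sketch: TLB reach is simply entry count times page size. The entry counts in the note below are illustrative assumptions, not figures for any particular CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of address space that can be touched without a TLB miss. */
static inline uint64_t tlb_reach(uint64_t entries, uint64_t page_size)
{
    /* page sizes are powers of two */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return entries * page_size;
}

/* Entries needed to cover `bytes` of working set, rounded up. */
static inline uint64_t tlb_entries_needed(uint64_t bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}
```

Covering 2MB with 4K pages takes 512 entries, so any TLB smaller than that falls short. A hypothetical 64-entry TLB reaches 256K with 4K pages and 128MB with 2MB pages.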
VMWare used to have a super fast 32-bit hypervisor based on segments long before special instructions were added. This of course had to be reworked completely for X64.
Also Intel’s bound check instructions are still extremely rare and don’t work that well in practice. I’ve used them.
The recipe for fixing TLB misses (as with any cache misses) is simple: don’t thrash your cache. Ofc I didn’t, I cannot even imagine what one does to bottleneck on the TLB – LSD? It is one of those problems like “doc, if I turn my finger 180, it hurts”.
>VMWare 32-bit before special instructions
DOS also was pretty fast, but that didn’t make it a good multi-user protected-mode OS. All these early emulations and monkey patching of guests cannot substitute hardware vt in the wild.
I would however argue that the existence of hierarchy in physical memory is orthogonal to the existence of a linear address space. Caches work really hard to maintain the ability of programs to use a single linear address space.
Ways in which memory is not linear: many asymmetric multiprocessing setups (eg gpu/cpu), distributed processing (eg openmp), multiprocess systems with several virtual spaces, harvard architecture systems, segmented architectures, disk vs ram. Many of these may be considered performance-motivated departures from a linear address space, especially if performance includes reliability in addition to execution speed.
(Otoh virtual memory can be used to create the appearance of single address space for any of the above, and sometimes has been - eg as/400)
I think it was mostly aliasing-related, and that’s why I mentioned “tell which bytes point at which”. Managing C today is like managing that girlfriend who doesn’t say anything straight and avoids direct conversations. I’m not against optimizations, I’m against compiler pretending to be smarter than average me and spoiling our relationship when I only have basic demands.
Sorry, but it's the other way around. You are the one pretending to be smarter than the compiler by imagining objects are laid out sequentially in an infinite char[] when they are in fact just abstract objects laid out in whichever way the compiler thinks is best. Just think of it this way and it will be much easier to manage C; it doesn't need micromanagement, and neither do girlfriends.
I'm saying that I don't like this model, not that I want to see everything as char[] in it. You insist that I should just embrace the model, since char[] doesn't work in it. I know, ok? If that tautology was an argument, it completely missed the point.
Below the compiler, I'm pretty familiar with how things are laid out, since there is an ABI, packing/alignment rules, documented hardware and 15 years of dealing with them. Please don't make up magic and wisdom out of a few conventions.
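As an illustration of the conventions meant here: once the ABI fixes alignment, member offsets and padding are predictable and can be checked with offsetof. The struct is made up for the example, and since the exact numbers depend on the target ABI, the code only asserts properties the standard guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct record {
    char     tag;    /* 1 byte, usually followed by padding */
    uint32_t value;  /* placed at a multiple of its alignment */
    char     flag;   /* usually followed by tail padding */
};

/* Guaranteed: first member at offset 0, members in declaration
   order, each at a multiple of its own alignment. */
static_assert(offsetof(struct record, tag) == 0, "first member at 0");
static_assert(offsetof(struct record, value) % _Alignof(uint32_t) == 0,
              "value is aligned");
static_assert(offsetof(struct record, value) < offsetof(struct record, flag),
              "declaration order is preserved");

/* Padding bytes = total size minus the bytes the members occupy. */
static inline size_t record_padding(void)
{
    return sizeof(struct record)
         - (sizeof(char) + sizeof(uint32_t) + sizeof(char));
}
```

On a typical x86-64 ABI this struct is usually 12 bytes with 6 bytes of padding, but that is the ABI's promise, not the language's.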