Hacker News

Are there any experimental systems programming languages exploring the opposite of "pointer provenance" (and the resulting compiler complexity)?

E.g. treat memory strictly as a separate storage device, and do all computations in local register-like variables (which could be backed by a special invisible scratch space to allow variables to spill from registers to scratch space).

Basically a language which would require explicit load/stores from and to regular memory, instead of trying to provide the illusion of an infinitely large register bank.



I think LLVM IR (and other IRs) works a bit like this. There you have an infinite set of registers while memory is separate and you need to explicitly load and store between them. Unfortunately it is not very suitable for humans to write.


Compiler IRs have the same semantics as the language they came from, otherwise it wouldn't be possible to optimize away any memory writes. I forget how it works in LLVM, but in general it's a combination of just saying which language it originally was and lots of alias set metadata.


Correction: they have the same provenance issues as the languages they come from. LLVM IR allows optimizing out memory operations based on aliasing, and doesn't explicitly track provenance or aliasing in all cases, so it inherits these problems from the C it was designed to compile.


Assembly languages sound like what you describe, and they do not make the burden of memory management go away; they just shift it.

You still have to manage a separate storage device somehow (most often via file system drivers). That's why GCs are so popular: you get a smart "driver" between your code and RAM that makes life so much easier.


It's not so much about removing the burden or making life easier, but instead making manual optimization more straightforward and predictable. Currently, manual optimization often means appeasing or even fighting the compiler's optimizer passes, with small innocent changes completely changing the output of the compiler.


Now that I think about it, exposing the cache hierarchy in a programming language makes much more sense. One can imagine a programming language with explicit control over CPU caches and their hierarchy. This could also make GPU/CUDA programming more explicit, safe, and efficient.

Still, this requires the hardware to cooperate with software instead of pretending that random access memory is truly random access. This functionality just isn't there at this point.

Edit: this would require programmable caching hardware to make caching analysis and algorithms introspectable and changeable. For now, fixed caching algorithms implemented in hardware do provide lots of benefits.


DSPs and other application-specific processors expose the cache hierarchy as a set of scratchpads. This works very well for them, but not for any CPU that is shared between applications, like a server.


What you're describing is reminding me of how Itanium tried to bring VLIW to non-embedded spaces and found "make the compiler smart enough to use this well" to be much easier said than done.


I do wonder if the long-term future of programming looks like https://www.microsoft.com/en-us/research/publication/coq-wor... : providing libraries that help the programmer optimize code themselves, rather than black boxes that do it for them.


Are you looking for a language that does function local, bounded optimization, and dumps state directly to memory at various well-defined boundaries?


I think more like a medium level language somewhere between assembly and C, with a clear distinction between a "virtual register bank" and "mass storage memory". Moving data between the two would be an explicit operation, not something the compiler would ever do automatically.


So, one way you can do this is to mark all your "memory" pointers as volatile, then load them into register variables and store them back manually. This would actually allow for very aggressive optimizations in the region that you've fenced off with "register", since the compiler can assume there is no aliasing, while letting you define the boundary where you'd like writes to "go to the hardware". In C this might be a bit of boilerplate, but in C++ one could presumably RAII the boilerplate away... might be worth exploring.



