A Few Synthesizing Superoptimizer Results (regehr.org)
36 points by jcr on July 26, 2015 | 13 comments


I feel like superoptimizing LLVM code isn't the best idea.

I feel like the way other superoptimizers get good results is by taking advantage of really strange interactions among some of the less commonly understood machine opcodes. LLVM IR doesn't have those funky opportunities.


You've misunderstood how Souper works. It has already found a lot of very good optimizations, which have been incorporated into LLVM proper. Check out http://blog.regehr.org/archives/1146 and http://blog.regehr.org/archives/1192

Here he talks about the end results of Souper and Alive, which should be presented in a more user-friendly and generalized form, with some help from a solver. Hand-encoding every single edge case into the optimizer is not as good as finding better generalizations automatically.


I understand that is the approach.

Another idea I want to see tried is a source-code annotation that tells an LLVM plugin to run machine-specific superoptimization on a specific basic block and cache the result. It might be an interesting way to speed up tight loops.
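A minimal sketch of what that could look like, using Clang's existing annotate attribute (which lowers into llvm.global.annotations metadata a custom pass can read). The "superopt" marker string and the plugin that would consume it are hypothetical:

    #include <stddef.h>

    /* Hypothetical marker: a custom LLVM pass would look for this annotation
       (Clang lowers it into llvm.global.annotations metadata), superoptimize
       the function's hot blocks for the target machine, and cache the result. */
    __attribute__((annotate("superopt")))
    static unsigned dot(const unsigned *a, const unsigned *b, size_t n) {
        unsigned acc = 0;
        for (size_t i = 0; i < n; i++)  /* the tight loop the pass would target */
            acc += a[i] * b[i];
        return acc;
    }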

A third idea I had would be to use superoptimization as a way of automatically generating isel patterns from only the lowest-level descriptions of what the IR and the target opcodes do. That is: harvest a program for IR sequences, automatically find good mappings to assembly constructs, and automatically emit a backend.


It's not either/or; adding superoptimiser-derived optimisations at the IR level doesn't mean you can't also add different ones in the backend.


The problem is that superoptimised IR may obscure more useful optimisations lower down.


Is there any evidence for this?


Is there any proof that they wouldn't?


My advisor once said "Optimization means: sometimes, it is not getting worse", and that's exactly what happens if you perform any optimization at the IR level. You don't have any guarantee that your result is optimal (with respect to whatever metric). That's exactly the reason why Massalin introduced the term superoptimization: at the machine level you can provide such guarantees for the instruction sequence under consideration.

Of course there are cases where preventing an IR optimization can trigger a more powerful one and the resulting code gets better. In general, however, I would say it's the other way around: performing the IR superoptimization leads to better code quality.
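To make Massalin's point concrete: the classic example from his work is a branch-free signum. The sketch below shows the flavor of result (in C rather than his 68000 assembly); it assumes 32-bit int with an arithmetic right shift, which holds on mainstream targets:

    #include <stdio.h>

    /* Naive, branchy signum. */
    int sign_naive(int x) {
        if (x > 0) return 1;
        if (x < 0) return -1;
        return 0;
    }

    /* Branch-free signum of the kind Massalin's superoptimizer found.
       Assumes 32-bit int with arithmetic right shift. */
    int sign_branchless(int x) {
        return (x >> 31) | (int)((unsigned)-(unsigned)x >> 31);
    }

    int main(void) {
        int samples[] = { -5, -1, 0, 1, 42 };
        for (int i = 0; i < 5; i++)
            printf("%d -> %d %d\n", samples[i],
                   sign_naive(samples[i]), sign_branchless(samples[i]));
        return 0;
    }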


I'd guess either approach will become stuck in local minima. Ideally, you'd want to check all permutations systematically. Of course, there is probably no polynomial-time algorithm to find the optimum, but adding some basic depth bounds should give a practical implementation that can explore at least some of the space, instead of relying on gut instinct about which approach may or may not be better.
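A toy depth-bounded enumerator, to make the "basic depth bounds" idea concrete. The op set, the target specification, and the test inputs are all assumptions for illustration; a real superoptimizer would verify candidates with a solver rather than sampled inputs:

    #include <stdint.h>
    #include <stdio.h>

    typedef int32_t (*unop)(int32_t);
    static int32_t op_neg(int32_t x)  { return -x; }
    static int32_t op_not(int32_t x)  { return ~x; }
    static int32_t op_shr1(int32_t x) { return (int32_t)((uint32_t)x >> 1); }

    static unop        ops[]   = { op_neg, op_not, op_shr1 };
    static const char *names[] = { "neg",  "not",  "shr1"  };
    #define NOPS 3
    #define MAXDEPTH 3

    /* Target specification: -x - 1 (equivalent to ~x, which the search
       should discover as a length-1 program). */
    static int32_t spec(int32_t x) { return -x - 1; }

    static const int32_t tests[] = { 0, 1, -1, 7, -1000, INT32_MAX };
    #define NTESTS (sizeof tests / sizeof tests[0])

    int main(void) {
        int seq[MAXDEPTH];
        /* Try all programs of length 1..MAXDEPTH; the first hit is shortest. */
        for (int len = 1; len <= MAXDEPTH; len++) {
            long total = 1;
            for (int i = 0; i < len; i++) total *= NOPS;
            for (long code = 0; code < total; code++) {
                long c = code;
                for (int i = 0; i < len; i++) { seq[i] = (int)(c % NOPS); c /= NOPS; }
                int ok = 1;
                for (size_t t = 0; ok && t < NTESTS; t++) {
                    int32_t v = tests[t];
                    for (int i = 0; i < len; i++) v = ops[seq[i]](v);
                    ok = (v == spec(tests[t]));
                }
                if (ok) {
                    printf("found length-%d program:", len);
                    for (int i = 0; i < len; i++) printf(" %s", names[seq[i]]);
                    printf("\n");
                    return 0;
                }
            }
        }
        puts("no program within the depth bound");
        return 0;
    }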


The permutation problem is not related to IR superoptimization itself.

The goal of an optimization pass is to improve performance on average, and that's exactly what IR superoptimization does.


My intuition is that stacking general superoptimizers will be at best as effective as, and possibly less effective than, just running a machine-specific one for twice as long.


The advantage of performing superoptimization at the IR level is that you have explicit data dependencies (thanks to SSA form), so you don't have unrelated instructions in between. Figure 1 of the cited Optgen paper http://pp.info.uni-karlsruhe.de/uploads/publikationen/buchwa... shows an example that can be optimized with IR-based superoptimization but not with a machine-specific one.

In summary, neither approach subsumes the other, so it is worthwhile to have both.


I believe the intention here is to use the super-optimiser to derive specific generally-applicable optimisations that can be contributed back to LLVM as peephole optimisations, rather than running the super-optimiser as part of the usual compilation process.

That being the case, it seems like it would be advantageous to create optimisations that work at both stages of compilation - not least because the IR-level ones will be more widely applicable.
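For a taste of the kind of generally applicable rewrite such a derived peephole might implement: the bitwise identity (x & y) + (x | y) == x + y holds for all inputs (modulo 2^32), so the left-hand form can always be replaced by the cheaper right-hand one. A minimal C check of the identity; the function names are illustrative, not LLVM API:

    #include <stdint.h>
    #include <stdio.h>

    /* The pattern a superoptimizer might discover... */
    static uint32_t before(uint32_t x, uint32_t y) { return (x & y) + (x | y); }
    /* ...and the cheaper equivalent a peephole pass would rewrite it to. */
    static uint32_t after(uint32_t x, uint32_t y) { return x + y; }

    int main(void) {
        uint32_t xs[] = { 0u, 1u, 0xDEADBEEFu, 0xFFFFFFFFu };
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                printf("%u %u : %u == %u\n", xs[i], xs[j],
                       before(xs[i], xs[j]), after(xs[i], xs[j]));
        return 0;
    }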



