No. You cannot infer the layout of the cache based on size alone. You don’t know how many ways there are, how many sets there are, etc. You don’t even know the LRU scheme.
For mutating branch predictor state at will, you will need your compiler to write self modifying code. This will pollute the icache and further mutate the state based on where the cache line lies, and the current state of the icache.
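To make the point about size concrete, here is a rough sketch of the arithmetic. It assumes a 32 KiB cache and 64-byte lines; both figures and the helper name `cache_sets` are illustrative, not taken from any particular processor.

```python
def cache_sets(size_bytes, ways, line_bytes=64):
    """Number of sets for a set-associative cache of the given geometry."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("size is not a multiple of ways * line size")
    return sets


size = 32 * 1024  # assumed: 32 KiB total capacity

# Every associativity below is a valid geometry for the same capacity.
layouts = {ways: cache_sets(size, ways) for ways in (1, 2, 4, 8, 16)}

for ways, sets in layouts.items():
    print(f"{ways:2d}-way: {sets:3d} sets")
```

All five geometries have the same capacity, and the replacement policy is still unknown on top of that.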
You could maybe restrict your target to open source hardware and come up with something.
> You cannot infer the layout of the cache based on size alone.
No – but the compiler would presumably be told how the specific processor's cache works. If it's possible for a human to write cache-flushing code, it's possible to describe it to a computer. (And that was just an example.)
> For mutating branch predictor state at will, you will need your compiler to write self modifying code.
Only for arbitrary modifications. For specific modifications, it's fine to go with ordinary code – though self-modifying code isn't actually all that hard to model, if you generalise “state” to also encompass the state of the self-modifying section. (Compilers already have “all branches” type things.)
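As a toy illustration of what a "specific modification" with ordinary code could look like, here is a sketch that models one predictor entry as a 2-bit saturating counter. That model is an assumption made purely for illustration: real predictors also use branch history, tagging and aliasing, and the names `step`, `train` and `predicts_taken` are hypothetical.

```python
def step(counter, taken):
    """Advance a 2-bit saturating counter (0..3) by one branch outcome."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


def predicts_taken(counter):
    """States 2 and 3 predict taken; states 0 and 1 predict not taken."""
    return counter >= 2


def train(counter, outcomes):
    """Feed a sequence of branch outcomes through the counter."""
    for taken in outcomes:
        counter = step(counter, taken)
    return counter


# Three taken executions of an ordinary branch, from every possible start state.
final_states = {start: train(start, [True, True, True]) for start in range(4)}
print(final_states)
```

In this model, executing the same branch as taken three times leaves the entry in the "strongly taken" state whatever it held before, which is the kind of known end state a compiler could reason about. Whether real hardware behaves this simply is exactly what is in dispute here.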
> You could maybe restrict your target to open source hardware
There's no need to make this restriction. You only need sufficiently-understood hardware; you underestimate the ability of reverse-engineers.
How exactly would you modify branch predictor state without self modifying code? I am very interested in this.
Another major problem: many of the x86 instructions for manipulating the state you want to control are ring 0 (privileged) instructions.
We haven’t even discussed the somewhat non-deterministic delays caused by contention from other processes. These can change the true execution order of instructions inside the pipeline, even if you somehow manage to enforce in-order issue.