Hacker News

How is this better than applying the minimum description length principle formalization of Occam's razor? (from 1978 [1])

I admit that I didn't spend time to really delve into it, but nothing in this article strikes me as particularly groundbreaking.

[1] http://en.wikipedia.org/wiki/Minimum_description_length

edit: innovative --> groundbreaking



It's basically that, formulated into a reinforcement learning algorithm.

It's innovative because this combination, which is unique to Marcus Hutter's Ph.D. thesis, basically "solves" the problem of general AI... if given infinite computational resources. It's mathematically provable to be the best possible artificial general intelligence, with constant-time operation... it's just that the constant happens to be larger than the known universe.

That sounds useless, but it's not. What it shows is that artificial (general) intelligence is really an optimization problem; it provides a (theoretical) benchmark for comparison, and it directs current research toward more practical algorithms that approximate that ideal within realistic space and time constraints.
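To make the "Solomonoff induction plus optimization" flavor concrete, here is a toy sketch (not Hutter's actual AIXI, which is incomputable): restrict the hypothesis space to something enumerable, give each "program" a prior of 2^-length, and predict by a weighted vote of everything consistent with the data. The periodic-pattern hypothesis class here is an illustrative assumption, not anything from the thesis.

```python
# Toy Solomonoff/MDL-style predictor over a tiny, enumerable hypothesis
# class. Each hypothesis is a "program" (here: a bit pattern repeated
# forever) with description length = pattern length; its prior is
# 2**-length, so shorter explanations dominate, as Occam's razor asks.

from itertools import product

def hypotheses(max_len):
    """Enumerate every bit pattern up to max_len bits."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits), n

def predict_next(observed, max_len=8):
    """Weighted vote among all hypotheses consistent with `observed`,
    each weighted by its prior 2**-description_length."""
    weight = {"0": 0.0, "1": 0.0}
    for pattern, n in hypotheses(max_len):
        generated = (pattern * (len(observed) // n + 1))[:len(observed)]
        if generated == observed:            # consistent with the data?
            next_bit = pattern[len(observed) % n]
            weight[next_bit] += 2.0 ** -n    # shorter = more plausible
    return max(weight, key=weight.get)

print(predict_next("010101"))  # prints 0: the short program "01" dominates
```

Real AIXI does this over all computable environments and wraps an expectimax planner around the prediction, which is exactly where the incomputability comes from.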


Thanks, that sheds a bit more light on it.

To me it still sounds like "I solved the problem because I formalized it." The formalization is probably novel (since he got his PhD for it), but somehow I find it difficult to believe no one else had a similar idea before (it definitely sounds familiar to me, having learned about reinforcement learning, minimum description length, etc. before). I'll have to read up on it.


It's not constant-time. It's not even computable, even in theory, even with infinite resources.


True, but the approximations of it are computable (i.e., limiting the amount of running time each hypothesis gets and limiting the number of hypotheses you test).

This is sort of like criticizing the concept of a Turing machine because no one has built one with infinite tape or running time.
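The cutoff idea can be sketched generically (a toy illustration in the spirit of AIXItl, not Hutter's actual construction): treat each candidate hypothesis as a program that advances one step at a time, and give every program a fixed step budget, so even non-halting candidates can't stall the search.

```python
# Sketch of the time-cutoff idea: model each candidate program as a
# Python generator (one yield = one computation step) and run it under
# a step budget. Programs that exceed the budget are treated as
# non-halting and discarded, so the overall search always terminates.

def run_bounded(gen, max_steps):
    """Advance `gen` at most max_steps steps; return its last yielded
    value, or None if it didn't finish within the budget."""
    result = None
    for step, value in enumerate(gen):
        if step >= max_steps:
            return None          # budget exhausted: treat as non-halting
        result = value
    return result

def halts_quickly():             # a well-behaved candidate
    for k in range(3):
        yield k

def loops_forever():             # a candidate that would hang naive search
    while True:
        yield 0

print(run_bounded(halts_quickly(), max_steps=10))  # prints 2
print(run_bounded(loops_forever(), max_steps=10))  # prints None
```

Bounding both the step budget and the number of candidates enumerated is what turns the incomputable ideal into a (hugely impractical but finite) algorithm.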


I'm not criticising AIXI, I'm just pointing out that calling it "constant time" is about as wrong as you can get, in the world of time complexity. Especially for AIXI, but also (less so) for an approximation like AIXItl.


Further, you can show with information theory that limiting the number of hypotheses tested or the running time does not reduce generality, so long as you test up to a sufficient bound for the problem domain (e.g. hypothesis sizes up to the maximum representable if the causally observable universe were converted into computronium, and a maximum number of steps equal to the number of possible combinatorial configurations of program state).

Of course as mentioned, this gives you constant factors equal to the size of the observable universe. No one said it was practical :)


"Constant time" is still wrong even in the approximations. Constant time means O(1). But your comment already refers to a running-time complexity dependence on several parameters of the input, such as the size of the observable universe.


Size of the observable universe is constant. And maximization by exhaustive search requires touching every possible state. So yes, it is O(1). This is explained in Marcus' Ph.D. thesis, I believe.


If he says so, I should admit I'm wrong, but maybe not just yet.

This way of thinking (size of the observable universe is constant) avoids the whole question of time complexity. To actually measure the time complexity of the algorithm, you need to consider inputs of different sizes. You could run exactly the same algorithm in a different universe (that is why it's called "universal" after all) and you'd get a different running time. The time complexity is then the relationship between the two running times and the universe sizes.
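The disagreement can be made concrete with a toy example (my illustration, not anything from the thesis): exhaustive maximization over all n-bit states costs exactly 2^n evaluations. Treat n as a fixed property of one universe and you can call that "constant"; treat n as the input parameter, as complexity analysis requires, and the scaling is exponential.

```python
# Toy illustration of the O(1)-vs-O(2^n) dispute: maximization by
# exhaustive search must touch every n-bit state (no early exit is
# possible when maximizing), so the cost is exactly 2**n evaluations.
# For any one fixed n that is a single large constant; as a function
# of n it is exponential.

def exhaustive_max(n, score):
    """Return the best n-bit state under `score`, plus the number of
    states examined."""
    best, steps = None, 0
    for state in range(2 ** n):
        steps += 1
        if best is None or score(state) > score(best):
            best = state
    return best, steps

score = lambda s: -abs(s - 700)      # arbitrary objective, peak at 700
print(exhaustive_max(10, score))     # prints (700, 1024)
print(exhaustive_max(12, score))     # prints (700, 4096): 4x the "constant"
```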


This idea is a generalization of Solomonoff's work on induction from the dawn of computing. Solomonoff was one of the three independent discoverers of Kolmogorov complexity (Kolmogorov and Chaitin being the others). MDL is an attempt at a computable Kolmogorov complexity; however, creating a codebook is difficult, so MDL is no panacea. Hutter's work is broader, being interested in the intelligence of an agent, and uses ideas like MDL and compression.


...applying the minimum description length principle...

Or the Minimum Message Length principle (from 1968 [1])

[1] https://en.wikipedia.org/wiki/Minimum_message_length



