
Kolmogorov complexity/entropy is more suitable for this purpose, under the implicit assumption that password crackers don't have tailored prior knowledge and are just enumerating "simple" sequences. It only agrees with Shannon entropy on long ergodic sequences. The author basically constructed an example where the two notions don't agree.


How would you estimate the Kolmogorov complexity for the author's example?


Kolmogorov complexity is only unambiguously defined asymptotically, and "asymptotics is merely a heuristic". It is also uncomputable. So, to use entropy arguments for passwords, the only correct way I can think of is to generate long and (element-wise) random passwords.
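
A minimal sketch of what I mean by element-wise random, assuming a uniform i.i.d. draw over a fixed alphabet (the alphabet and length here are just illustrative):

    import math
    import secrets
    import string

    # Element-wise random: each character is drawn independently and
    # uniformly, so the entropy is exactly length * log2(|alphabet|).
    alphabet = string.ascii_letters + string.digits   # 62 symbols
    length = 32

    password = "".join(secrets.choice(alphabet) for _ in range(length))
    entropy_bits = length * math.log2(len(alphabet))  # ~190 bits here

    print(password, entropy_bits)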


You asserted that Kolmogorov complexity will disagree with Shannon entropy in this example, so how do you know what the Kolmogorov complexity of this example is?


It is a random variable in this setting, since it is a function of the randomly generated password. For a deterministic sequence, you can find the definition of its Kolmogorov complexity in textbooks/Wikipedia/etc. By saying the Kolmogorov complexity will disagree with Shannon entropy, I meant that the former (a random variable here) does not converge to the latter, contrary to the standard asymptotic setting, which is probably what gives people the idea of using entropy to characterize passwords (I don't know, I don't work in security).

The point of my original post is that the asymptotics break down here, and this phenomenon is well understood, at least in some other communities. It is not meant to provide an alternative that is always well-defined and useful, although, as I said in the grandparent comment, there is the useful implication that you can stay safe by sticking to the asymptotic regime.


It's easy to calculate an upper bound on the Kolmogorov complexity of a given value (you just have to exhibit a Turing machine that computes it), but very hard to prove a lower bound (you would have to prove something about all possible Turing machines).
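
To make the upper-bound direction concrete (a rough sketch, not a true Kolmogorov complexity: a general-purpose compressor is just one fixed decompression program, so compressed length plus a constant upper-bounds K):

    import os
    import zlib

    def kc_upper_bound(data: bytes) -> int:
        # The zlib decompressor together with this compressed blob is a
        # program that outputs `data`, so its length (plus an O(1) term
        # for the decompressor itself) upper-bounds K(data).
        return len(zlib.compress(data, 9))

    print(kc_upper_bound(b"ab" * 2000))      # highly regular: tiny bound
    print(kc_upper_bound(os.urandom(4000)))  # random: bound near 4000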


By giving the password to a good compressor? (and then computing the Shannon entropy of the result) Yet I'm not sure I know a good compressor for short strings... Perhaps something like gpt2tc tailored to passwords instead of English text.
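
For what it's worth, an off-the-shelf compressor like zlib is close to useless at password length, because the container overhead dominates (quick check with some well-known example passwords):

    import zlib

    for pw in [b"Tr0ub4dor&3", b"correcthorsebatterystaple", b"aaaaaaaaaaaa"]:
        print(len(pw), len(zlib.compress(pw, 9)))
    # The outputs come out roughly as long as, or longer than, the
    # inputs: header/checksum overhead swamps whatever redundancy a
    # short string has, so this doesn't separate good from bad passwords.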


The implicit assumption, however, isn't a good one. Password crackers regularly make use of prior knowledge. A password that consists of a Shakespearean sonnet, for example, has very high complexity but makes for a bad password.


Kolmogorov complexity kinda does account for "prior knowledge" (that's why it's not computable). A Shakespearean sonnet will have low Kolmogorov complexity (there's redundancy).
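
Rough illustration of that redundancy (only an upper bound via a general-purpose compressor, and a short excerpt rather than a full sonnet): English verse compresses below its raw length, while random bytes of the same length do not:

    import os
    import zlib

    sonnet = (b"Shall I compare thee to a summer's day? "
              b"Thou art more lovely and more temperate: "
              b"Rough winds do shake the darling buds of May, "
              b"And summer's lease hath all too short a date;")
    noise = os.urandom(len(sonnet))

    print(len(sonnet), len(zlib.compress(sonnet, 9)))  # compresses below its length
    print(len(noise), len(zlib.compress(noise, 9)))    # comes out slightly longer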



