
Kolmogorov complexity/entropy is more suitable for this purpose, under the implicit assumption that password crackers don't have tailored prior knowledge and are just enumerating "simple" sequences. It only agrees with Shannon entropy on long ergodic sequences. The author basically constructed an example where the two notions don't agree.


How would you estimate the Kolmogorov complexity for the author's example?


Kolmogorov complexity is only unambiguously defined asymptotically, and "asymptotics is merely a heuristic". It is also uncomputable. So, to use entropy arguments for passwords, the only correct way I can think of is to generate long and (element-wise) random passwords.
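
A minimal sketch of what I mean by element-wise random, assuming a uniform i.i.d. draw over a fixed alphabet (the alphabet and length here are just illustrative):

    import math
    import secrets
    import string

    # Element-wise random: each character is drawn independently and
    # uniformly, so the entropy is exactly length * log2(|alphabet|).
    alphabet = string.ascii_letters + string.digits   # 62 symbols
    length = 32

    password = "".join(secrets.choice(alphabet) for _ in range(length))
    entropy_bits = length * math.log2(len(alphabet))  # ~190 bits here

    print(password, entropy_bits)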


You asserted that Kolmogorov complexity will disagree with Shannon entropy in this example, so how do you know what the Kolmogorov complexity of this example is?


It is a random variable in this setting, since it is a function of the randomly generated password. For a deterministic sequence, you can find the definition of its Kolmogorov complexity in textbooks/Wikipedia/etc. By saying the Kolmogorov complexity will disagree with Shannon entropy, I meant that the former (a random variable here) does not converge to the latter, contrary to the standard asymptotic setting, which is probably what gives people the idea of using entropy to characterize passwords (I don't know, I don't work in security).

The point of my original post is that the asymptotics break down here, and this phenomenon is well understood, at least in some other communities. It is not meant to provide an alternative that is always well-defined and useful, although, as I said in the grandparent comment, there is the useful implication that you can stay safe by sticking to the asymptotic regime.


It's easy to calculate an upper bound on the Kolmogorov complexity of a given value (you just have to exhibit a Turing machine that computes it), but very hard to prove a lower bound (you would have to prove something about all possible Turing machines).
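
To make the upper-bound direction concrete (a rough sketch, not a true Kolmogorov complexity: a general-purpose compressor is just one fixed decompression program, so compressed length plus a constant upper-bounds K):

    import os
    import zlib

    def kc_upper_bound(data: bytes) -> int:
        # The zlib decompressor together with this compressed blob is a
        # program that outputs `data`, so its length (plus an O(1) term
        # for the decompressor itself) upper-bounds K(data).
        return len(zlib.compress(data, 9))

    print(kc_upper_bound(b"ab" * 2000))      # highly regular: tiny bound
    print(kc_upper_bound(os.urandom(4000)))  # random: bound near 4000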


By giving the password to a good compressor? (and then computing the Shannon entropy of the result) Yet I'm not sure I know a good compressor for short strings... Perhaps something like gpt2tc tailored to passwords instead of English text.
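
For what it's worth, an off-the-shelf compressor like zlib is close to useless at password length, because the container overhead dominates (quick check with some well-known example passwords):

    import zlib

    for pw in [b"Tr0ub4dor&3", b"correcthorsebatterystaple", b"aaaaaaaaaaaa"]:
        print(len(pw), len(zlib.compress(pw, 9)))
    # The outputs come out roughly as long as, or longer than, the
    # inputs: header/checksum overhead swamps whatever redundancy a
    # short string has, so this doesn't separate good from bad passwords.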


The implicit assumption, however, isn't a good one. Password crackers regularly make use of prior knowledge. A password that consists of a Shakespearean sonnet, for example, has very high complexity but makes for a bad password.


Kolmogorov complexity kinda does account for "prior knowledge" (that's why it's not computable). A Shakespearean sonnet will have low Kolmogorov complexity (there's redundancy).
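
Rough illustration of that redundancy (only an upper bound via a general-purpose compressor, and a short excerpt rather than a full sonnet): English verse compresses below its raw length, while random bytes of the same length do not:

    import os
    import zlib

    sonnet = (b"Shall I compare thee to a summer's day? "
              b"Thou art more lovely and more temperate: "
              b"Rough winds do shake the darling buds of May, "
              b"And summer's lease hath all too short a date;")
    noise = os.urandom(len(sonnet))

    print(len(sonnet), len(zlib.compress(sonnet, 9)))  # compresses below its length
    print(len(noise), len(zlib.compress(noise, 9)))    # comes out slightly longer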



