Yep... Checking for ID3 is not good enough for verifying a correctly decrypted MP3. Brute-forcing with only lowercase letters, you get approx. 43k possibilities with "ID3" as the first 3 letters; that is ~10% of all 26^4 possibilities. Yep, you only have to decrypt those 43k possibilities, but you have to look at the whole file.

Even if you have garbage in the file, it is not the correct file: the codec will just skip over the broken parts, and the output will be garbage.

I haven't tested how many of those 43k actually work, or at least give a partially good result.
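
For reference, a minimal sketch of the naive filter being discussed here, in Python. The decrypt() below is a hypothetical stand-in (repeating-key XOR) for whatever cipher is actually in use, and the 4-lowercase-letter key space is assumed from the numbers above:

    from itertools import product
    from string import ascii_lowercase

    def decrypt(data: bytes, key: str) -> bytes:
        # Placeholder cipher: repeating-key XOR, purely for illustration.
        k = key.encode()
        return bytes(b ^ k[i % len(k)] for i, b in enumerate(data))

    def id3_candidates(ciphertext: bytes):
        """Yield every 4-letter lowercase key whose decryption starts with b'ID3'."""
        for combo in product(ascii_lowercase, repeat=4):   # 26^4 = 456,976 keys
            key = "".join(combo)
            if decrypt(ciphertext[:3], key) == b"ID3":     # checks only 3 bytes,
                yield key                                  # so many keys survive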



On those 10%, look for another magic number, or check entropy in general: correctly decrypted data must* have (usually) significantly lower entropy/randomness than encrypted data.
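
A rough sketch of that entropy filter in Python; the 7.9 bits/byte threshold below is an assumption, not a measured value:

    import math
    from collections import Counter

    def byte_entropy(data: bytes) -> float:
        """Shannon entropy in bits per byte (8.0 for uniformly random bytes)."""
        if not data:
            return 0.0
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

    def looks_decrypted(data: bytes, threshold: float = 7.9) -> bool:
        # Properly encrypted data should sit near 8.0 bits/byte; anything
        # noticeably below the threshold is worth decoding for real.
        return byte_entropy(data) < threshold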

A highly optimized (for this _exact_ context) hash/bloom function may yield comparable results, in general.

Or you can build an efficient delineation algorithm from the format docs:

https://www.loc.gov/preservation/digital/formats/fdd/fdd0001...

If the next so-many bytes of the rolling context don't match any of those numbers, keep brute-forcing the key until you get magic numbers and output that isn't random garbage.
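
Roughly, in Python, using the MPEG audio frame header layout (11 sync bits, then version, layer, bitrate index, sampling-rate index). The window and min_headers cutoffs are arbitrary assumptions, and a stricter version would follow the frame-length chain:

    def is_frame_header(b: bytes) -> bool:
        """True if these 4 bytes look like a plausible MPEG audio frame header."""
        if len(b) < 4:
            return False
        if b[0] != 0xFF or (b[1] & 0xE0) != 0xE0:   # 11 sync bits must be set
            return False
        if (b[1] >> 3) & 0x03 == 0x01:              # reserved MPEG version
            return False
        if (b[1] >> 1) & 0x03 == 0x00:              # reserved layer
            return False
        if (b[2] >> 4) == 0x0F:                     # invalid bitrate index
            return False
        if (b[2] >> 2) & 0x03 == 0x03:              # reserved sampling rate
            return False
        return True

    def plausible_mp3(data: bytes, window: int = 4096, min_headers: int = 3) -> bool:
        # Scan a short prefix for plausible frame headers; in that little
        # mis-decrypted (random-looking) data, several hits are unlikely.
        head = data[:window]
        hits = sum(1 for i in range(len(head) - 3) if is_frame_header(head[i:i + 4]))
        return hits >= min_headers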


> correctly decrypted data must* have (usually) significantly lower entropy/randomness than encrypted data

I'd expect the average difference in entropy between random data and a compressed format to be so small as to make this a useless filter. After all, if it weren't so, it would be bad compression. Now, I don't know about MP3 specifically...
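
This is easy to check empirically; the path below is a placeholder for any real MP3 (or other compressed file):

    import math
    import os
    from collections import Counter

    def byte_entropy(data: bytes) -> float:
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

    with open("sample.mp3", "rb") as f:                      # placeholder path
        compressed = f.read()

    print("mp3:   ", byte_entropy(compressed))               # compare the two numbers
    print("random:", byte_entropy(os.urandom(len(compressed))))  # ~8.0 bits/byte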


You are correct; the format would have to be lossless for this to be maximally effective. Since MP3 was explicitly designed to encode voice-range audio efficiently, this would not be the *ideal* metric.

However, since the format is also resilient and flexible, it should have structure - even if that complexity is hidden in the codec - with an order of magnitude less entropy than the mis-decrypted candidate plaintexts.

But yes, you compress, then encrypt. If your encrypted data is compressible or otherwise distinguishable from random noise, then you have (exploitable, correlatable) bias in your encryption function, or you have inefficiency/redundancy in your compression.

I would think a very limited set of magic numbers/possible hex ranges would yield the highest performance per error.



