Demystifying DVDs (hiddenpalace.org)
224 points by boltzmann-brain 9 days ago | 22 comments




I did some research into dumping damaged DVDs lately because I had some little treasures lying around that were just unusable.

I had good success using DVD Decrypter (hard to find now; I got my copy years ago) with the setting to ignore bad sectors, Redumper [1], and, if the DVD is only badly scratched, car scratch-remover / polish chemicals with a lot of patience [2]. Some hints: toothpaste does not work unless it is whitening toothpaste; don't use power tools like a drill, which will make the problems even worse; prefer the slow process of polishing by hand.

1: https://github.com/superg/redumper

2: https://www.youtube.com/watch?v=AN8ET7axaNk


Wow. Thanks for the heads up on DVD Decrypter being harder to get these days. I had no idea. When I return home I'm going to have to sift through all my old files and see if I still have a copy somewhere. Great little piece of software.

The Wikipedia article about it has a link to an unofficial mirror that appears to be hosting a copy of the installer of the last release of the program, from 2005.

https://en.wikipedia.org/wiki/DVD_Decrypter


MakeMKV is where it's at now as far as I know.

Yes, with a LibreDrive compatible drive ideally.

I bet that somehow you can make ImgBurn (the successor) include the old features (e.g. by setting a hidden key in the environment variables or something).

If not, the code was probably patched out in the build process...


I doubt it; the author got sued and has no desire to be sued again.

I did something similar a long time ago: using raw read commands, reversing the descrambler output, and then doing statistical accumulation on the actual bitstream. By showing the output in real time on a bad sector you can actually see the signal appearing above the noise.
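
For anyone wondering what accumulation over repeated reads could look like, here is a minimal sketch. It assumes the simplest possible form, a per-bit majority vote over several raw reads of the same sector; the comment above does not say which statistic was actually used, and `majority_vote` is a name made up for illustration.

```python
from typing import Sequence


def majority_vote(reads: Sequence[bytes]) -> bytes:
    """Combine repeated reads of the same sector by per-bit majority vote.

    Illustrative assumption: every read has the same length and each bit is
    an independent vote. Ties (possible with an even number of reads)
    resolve to 0.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")

    out = bytearray(length)
    for i in range(length):
        byte = 0
        for bit in range(8):
            ones = sum((r[i] >> bit) & 1 for r in reads)
            # Keep a 1 only if more than half of the reads agree on it.
            if ones * 2 > len(reads):
                byte |= 1 << bit
        out[i] = byte
    return bytes(out)
```

With an odd number of reads there are no ties, so feeding it 3, 5, 7, ... reads of the same bad sector is the natural way to use it.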

It's strange to see no mention of cleaning the drives themselves, although maybe it was implicit. If you have a pile of old drives sitting around, chances are they're not going to be perfectly clean. A tiny bit of dirt on the lens can have a huge effect on the read signal, especially on a marginal disc.

Related article from 18 years ago: https://news.ycombinator.com/item?id=21242273


So I've recovered a lot of damaged DVDs, and I think my research showed that DVDs also do ECC across a unit larger than the 2048-byte data blocks (maybe 16 of them?).

So when I used ddrescue, I would read in that block size (instead of just 2048 bytes), hoping that I would get lucky and get a good read (or enough signal that ECC could repair the large block).

This was very effective at recovering DVDs with repeated reads. When I had previously done it with 2048-byte reads only, I would end up with good 2048-byte sectors scattered all over (which, if ECC is done on a 16 x 2 KB = 32 KB block, means there was a lot of data I was leaving on the floor that should have been recovered on those reads).
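
The aligned-read idea can be sketched roughly like this. It assumes ECC blocks start at LBAs that are multiples of 16, and `read_fn` is a hypothetical callable standing in for whatever actually talks to the drive; neither detail comes from ddrescue itself.

```python
from typing import Callable, Optional

SECTOR_SIZE = 2048          # user-data bytes per DVD sector
SECTORS_PER_ECC_BLOCK = 16  # sectors assumed to share one ECC block


def ecc_block_start(lba: int) -> int:
    """Return the first LBA of the 16-sector block containing `lba`."""
    return lba - (lba % SECTORS_PER_ECC_BLOCK)


def read_ecc_block(
    read_fn: Callable[[int, int], bytes], lba: int, retries: int = 5
) -> Optional[bytes]:
    """Read the whole aligned block around `lba`, retrying on errors.

    `read_fn(first_lba, sector_count)` is a hypothetical helper that returns
    the data or raises OSError (for example a thin wrapper around a device
    file). Returns 32 KiB of data, or None if every attempt failed.
    """
    start = ecc_block_start(lba)
    for _ in range(retries):
        try:
            data = read_fn(start, SECTORS_PER_ECC_BLOCK)
        except OSError:
            continue  # bad read, try the same aligned block again
        if len(data) == SECTOR_SIZE * SECTORS_PER_ECC_BLOCK:
            return data
    return None
```

The point of the alignment is that one successful attempt yields all 16 sectors at once, instead of each 2048-byte sector needing its own lucky read.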

ddrescue was also good for this in the sense that if I was trying to recover a DVD (video) from multiple damaged copies, as long as they were not damaged in the same location, I was able to fill in the blanks.
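
ddrescue tracks unread areas in its mapfile and does the filling in itself, but the underlying idea is simple enough to sketch. The function below is a toy illustration, not how ddrescue is implemented; `merge_images` and the explicit list of bad sector numbers are assumptions for the example.

```python
from typing import Iterable

SECTOR_SIZE = 2048  # user-data bytes per DVD sector


def merge_images(primary: bytes, secondary: bytes, bad_primary: Iterable[int]) -> bytes:
    """Fill the sectors listed in `bad_primary` with data from `secondary`.

    Assumes both images come from copies of the same disc (same length, same
    layout) and that the listed sectors read correctly on the second copy.
    """
    if len(primary) != len(secondary):
        raise ValueError("images must have the same length")
    merged = bytearray(primary)
    for sector in bad_primary:
        start = sector * SECTOR_SIZE
        end = start + SECTOR_SIZE
        # Overwrite the unreadable sector with the other copy's data.
        merged[start:end] = secondary[start:end]
    return bytes(merged)
```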

Perhaps you can correct me about the 16-block mechanism. Maybe it was just random that it worked and my understanding at the time was wrong.


You are both correct and the article discusses it accurately:

> Then you have 2048 bytes of user data, scrambled for the reasons mentioned before. The best way to look at the sector as a whole is to think of each sector as 12 “rows” consisting of 172 bytes each. After each 172-byte row is 10 bytes of ECC data called Parity Inner (PI), which is based on Reed-Solomon and applied to both the header and scrambled user data per row within the sector itself. Then, after the user data and parity inner data, is the 4-byte EDC, which is calculated over the unscrambled user data only. Then, finally, Parity Outer (PO) is another form of ECC that is applied by “column” that spans over an entire block of multiple sectors stacked horizontally, or in other words, a group of 16 sectors. Altogether, this adds up to 2366 bytes of recorded sector data.
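
The 2366 figure can be checked with a bit of arithmetic. The row and parity sizes below are the ones in the quote; the one added assumption is that an ECC block carries 16 PO rows (one per sector), which is my reading of the DVD format and is not spelled out in the quoted paragraph.

```python
ROWS_PER_SECTOR = 12     # from the quote: 12 rows per sector
ROW_BYTES = 172          # from the quote: 172 bytes per row
PI_BYTES_PER_ROW = 10    # from the quote: 10 bytes of Parity Inner per row
SECTORS_PER_BLOCK = 16   # from the quote: PO spans a group of 16 sectors
PO_ROWS_PER_BLOCK = 16   # assumption: 16 Parity Outer rows per ECC block
USER_DATA = 2048
EDC_BYTES = 4


def recorded_sector_bytes() -> dict:
    """Break down the recorded size of one DVD sector."""
    data_frame = ROWS_PER_SECTOR * ROW_BYTES            # header + data + EDC
    header = data_frame - USER_DATA - EDC_BYTES         # what is left over
    pi = ROWS_PER_SECTOR * PI_BYTES_PER_ROW
    # Each PO row is as wide as a data row plus its PI bytes; the block's
    # PO rows are shared evenly between its sectors.
    po_share = PO_ROWS_PER_BLOCK * (ROW_BYTES + PI_BYTES_PER_ROW) // SECTORS_PER_BLOCK
    return {
        "data_frame": data_frame,
        "header": header,
        "pi": pi,
        "po_share": po_share,
        "total": data_frame + pi + po_share,
    }
```

Under those assumptions the pieces are 2064 + 120 + 182 bytes, which matches the 2366 bytes per recorded sector in the article.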


I see that now, though I wonder whether my assumption about reading 32KB-aligned blocks at a time really improves things or not.

PO works on the 32KB block (after PI fixes what it can of the 2KB blocks).

So if PO works, it means that it was able to correct the errors in any of the blocks in the 32KB block, but it doesn't mean it will be able to do it every time. My assumption is that if I read 32KB aligned, the hardware operates on the 32KB block once.

But suppose the hardware only operates on 2KB blocks, so a 32KB read is internally treated as 16 2KB reads: if a 2KB read fails even with PI, it will try to read 32KB and correct with PO, but then forget everything it just did if it succeeded. Then my assumption of how to do it better fails, as each 2KB block (even within a 32KB-aligned read) would still need to be lucky, vs. just needing to get lucky once for each 32KB-aligned block.

The reason I'm wondering is that the "raw bytes" cache the author demonstrates the drive as having is only 2+KB in size (based on what they are reading), and that makes me question my assumptions.


Extremely interesting read. I need to go back over it again in detail on my computer not just my phone while holding my baby.

A key theme in a future fiction I am writing (slowly) is that all digital data has been lost and the time we are in now is known as a digital dark age where little is known about our society and culture. Resurrecting an archeologically discovered DVD is a key plot point I am working through. It will be the first insight into our time in over a millennium. Other conflicting interests will be finally succeeding at re-introducing corn at commercial scale after all hope had been lost and past attempts at re-germinating from the frozen seed bank had failed for hundreds of years. It's a work in progress.


Which drives and parameters for the READ BUFFER SCSI command yielded the expected 2366 bytes per sector? I imagine that it was combined with seeks to each sector before reading from the buffer (as it would be harder to isolate multiple sectors' data in the cache?).

It seems like it was a follow-up from previous bruteforce efforts, which include a spreadsheet with various results, but it would help to have some conclusions on which were best: http://forum.redump.org/topic/51851/dumping-dvds-raw-an-ongo...

Also, couldn't find any source/download for DiscImageMender.


Someday I hope we see the equivalent of something like Greaseweazle for optical media, where someone can collect an image of the disc itself that can later be postprocessed in software to extract the data.

The whole article is about the heroic efforts to dump a DVD that has bad sectors by using a combination of different methods that ultimately yielded a fully read disc.

I have a burned CD with some bad sectors that I'd like to recover (it's not a game, it's a rendered animation from 1999… and unfortunately, it's zipped to save 0.1% or so); this makes me want to go hunting for a couple hundred CD-ROM drives to save it…

Use ddrescue to image the CD. Mount the image. Copy off the files (bad sectors will all be 0). Unzip and see what you got. If it's a small amount, I'd bet you would have something usable, if not perfect.

Slight edit: unzip with the option to keep broken files, in the hope (I'm unsure if this is true) that unzip can be resilient in the face of errors and keep on unzipping. (I know from experience that rar is.)
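
If the command-line tool gives up, a per-member salvage loop is easy to sketch with Python's standard `zipfile` module. This assumes the central directory at the end of the archive survived; `salvage_zip` is a made-up helper name, and a member whose data is damaged is reported as failed rather than partially recovered.

```python
import zipfile
import zlib


def salvage_zip(path_or_file):
    """Extract every member that still decompresses and passes its CRC check.

    Returns (recovered, failed): a dict mapping member name to bytes, and a
    list of member names that could not be read.
    """
    recovered, failed = {}, []
    with zipfile.ZipFile(path_or_file) as archive:
        for info in archive.infolist():
            try:
                recovered[info.filename] = archive.read(info)
            except (zipfile.BadZipFile, zlib.error, EOFError):
                # Damaged data: note it and keep going with the next member.
                failed.append(info.filename)
    return recovered, failed
```

For an archive holding a single large file this only tells you whether that one member is intact, so it helps most when the zip contains many files.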

If it's just "some" bad sectors, maybe you can just brute-force guess what was in those sectors.

With a sector being 2048 bytes (16384 bits), it would seem optimistic to try 2^16384 different possibilities in case some of them contain what looks like good data.
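
To put a number on it: a 2048-byte sector holds 16384 bits, so exhaustive guessing means 2^16384 candidates. A quick sketch of how large that is, using logarithms rather than printing the number itself (`search_space_digits` is just an illustrative name):

```python
import math


def search_space_digits(sector_bytes: int) -> int:
    """Number of decimal digits in 2**(8 * sector_bytes)."""
    bits = sector_bytes * 8
    # digits(2**n) = floor(n * log10(2)) + 1
    return math.floor(bits * math.log10(2)) + 1
```

For a single 2048-byte sector that comes to a candidate count with several thousand decimal digits, so guessing is only plausible if almost all of the sector is already known.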

As someone interested in preservation, the question that comes to mind is what modern/maintained tools are able to use all this information to make good dumps.

I was going to post: Isn't this information about 15 years too late? But I can see by the comments, it's not.


