
I researched various archiving alternatives for something I needed recently. I subscribe to a paid Substack for an educational course that will end mid-year, and I want to archive the course posts before it ends (the course provider has even recommended people end their Substack subscription after it ends).

For this purpose, I found the SingleFile browser extension to be the best fit. It's a browser extension, so paywall cookies are already present, and I just manually archive the previous week's content, after the discussion phase has concluded. It creates a single self-contained file with all images and comments, etc., but all non-page-local links still resolve externally (which is as-desired, for my use case). It can be configured to auto-generate a convenient filename, and to use self-extracting compression.

I preferred this to an automated process based on, e.g., RSS, because I can ensure the archive is made after all the useful back-and-forth in the course comments has concluded, and it's trivial to set up and use.



SingleFile is amazing. I also recommend ArchiveWeb.page / browsertrix. Both projects truly do more to solve the hard problems of internet archiving than ArchiveBox (which is just a wrapper + admin UI for a collection of tools).

ArchiveBox actually uses SingleFile internally as one of our methods to save every page (among others), and we try to send a portion of our donations periodically to @gildas-lormeau to support his awesome work on it!


I also use some browser extensions to save a replica of certain pages: SingleFile, FireShot, and/or GoFullPage (I use the paid option on both FireShot and GoFullPage). I like the SingleFile extension because it can be configured to save pages automatically. Videos are recordable with Camtasia (paid), but there are free options ...


SingleFile is so good. I am upset that Firefox can't screenshot correctly by itself, again. I used to run a URL-to-image service for both archival and sharing that was dead simple: just fetch the page with headless Firefox and take a screenshot. The floating footers on a lot of sites, as well as some adware, interfere with Firefox screenshots now, so I just stopped backing up pages. SingleFile is getting a lot of use since I found out about it.

My primary concern about ArchiveBox (and the WARC stuff) is the terabytes of existing archival material I already have.
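
For reference, the URL-to-image step described above can be sketched in Python. This is only a minimal sketch, not the service the comment refers to: it assumes a `firefox` binary is on the PATH and relies on Firefox's `--headless` and `--screenshot` command-line options. The function names `screenshot_command` and `take_screenshot` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def screenshot_command(url, out_path, firefox="firefox"):
    """Build the argument list for a headless Firefox screenshot (does not run it)."""
    return [firefox, "--headless", "--screenshot", str(out_path), url]


def take_screenshot(url, out_path, timeout=60):
    """Run headless Firefox and report whether the image file exists afterwards."""
    # Assumption: the browser is installed and reachable as "firefox".
    if shutil.which("firefox") is None:
        raise RuntimeError("firefox not found on PATH")
    subprocess.run(screenshot_command(url, out_path), check=True, timeout=timeout)
    return Path(out_path).exists()
```

Floating footers and overlays will still end up in the image, which is the limitation the comment complains about; this sketch does nothing to remove them.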


That is a great solution for local copies. ArchiveBox runs on a web server to make the archives available to anyone on the internet.


I serve the output of SingleFile on my home network. It generates HTML, so I just push it to my file store. That said, my use case (archiving a paid Substack course that is well worth paying for) is definitely only for personal use.
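
One way to serve a folder of saved HTML files on a home network is Python's built-in `http.server`. This is a minimal sketch and not necessarily how the setup above works: the helper name `make_archive_server`, the default port, and the folder name are assumptions for illustration.

```python
import functools
import http.server


def make_archive_server(directory, host="0.0.0.0", port=8000):
    """Create an HTTP server that serves static files from `directory`."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword (Python 3.7+).
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # The threading variant handles several clients on the LAN at once.
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling `make_archive_server("archive").serve_forever()` would serve a hypothetical `archive` folder until interrupted. There is no authentication here, so it only suits a trusted local network and personal use.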



