I researched various archiving alternatives for something I needed recently. I subscribe to a paid Substack for an educational course that will end mid-year, and I want to archive the course posts before it ends (the course provider has even recommended people end their Substack subscription after it ends).
For this purpose, I found the SingleFile browser extension to be the best fit. Because it runs in the browser, the paywall cookies are already present, and I just manually archive the previous week's content after the discussion phase has concluded. It creates a single self-contained file with all images, comments, etc., while all non-page-local links still resolve externally (which is what I want for my use case). It can be configured to auto-generate a convenient filename and to use self-extracting compression.
I preferred this to an automated process based on, e.g., RSS, because I can ensure the archive is taken after all the useful back-and-forth in the course comments has concluded, and it's trivial to set up and use.
SingleFile is amazing. I also recommend ArchiveWeb.page / browsertrix. Both projects truly do more to solve the hard problems of internet archiving than ArchiveBox (which is just a wrapper + admin UI for a collection of tools).
ArchiveBox actually uses SingleFile internally as one of our methods to save every page (among others), and we try to periodically send a portion of our donations to @gildas-lormeau to support his awesome work on it!
I also use browser extensions to save a replica of certain pages: SingleFile, plus FireShot and/or GoFullPage (I use the paid option on both of those). I like the SingleFile extension because it can be configured to save pages automatically. Videos are recordable with Camtasia (paid), but there are free options ...
SingleFile is so good; I am upset that Firefox can't screenshot correctly by itself, again. I used to run a URL-to-image service for both archival and sharing that was dead simple: just fetch the page with headless Firefox and take a screenshot. The floating footers on a lot of sites, as well as some adware, interfere with Firefox screenshots now, so I just stopped backing up pages. SingleFile has been getting a lot of use since I found out about it.
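For anyone curious what that "dead simple" approach looks like, here is a rough Python sketch, not the service described above. It assumes a `firefox` binary on the PATH and relies on Firefox's `--headless` and `--screenshot` command-line options; the function names and the 60-second timeout are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_screenshot_command(url, out_path, firefox="firefox"):
    """Return the argv list for a headless Firefox screenshot of `url`."""
    # `firefox` is an assumed binary name; adjust it for your system.
    return [firefox, "--headless", "--screenshot", str(out_path), url]


def take_screenshot(url, out_path, firefox="firefox", timeout=60):
    """Run headless Firefox and return the path of the image it wrote."""
    if shutil.which(firefox) is None:
        raise FileNotFoundError(f"{firefox!r} was not found on PATH")
    # check=True raises CalledProcessError if Firefox exits with an error.
    subprocess.run(
        build_screenshot_command(url, out_path, firefox),
        check=True,
        timeout=timeout,
    )
    return Path(out_path)
```

Calling `take_screenshot("https://example.com", "shot.png")` should write `shot.png` into the current directory. Nothing here deals with floating footers or ad overlays, which is exactly the limitation mentioned above.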
My primary concern about ArchiveBox (and the WARC stuff) is the TB of archival stuff I already have.
I serve the output of SingleFile on my home network. It generates HTML, so I just push it to my file store. That said, my use case (archiving a paid Substack course that is well worth paying for) is definitely only for personal use.
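In case it helps anyone doing something similar, here is a minimal Python sketch of one way to make a folder of saved pages browsable. It is not my actual setup: the `build_index` helper and the flat directory of `.html` files are assumptions for illustration.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(archive_dir):
    """Write index.html linking every saved .html page in archive_dir."""
    root = Path(archive_dir)
    # Sort by filename; skip the index itself so reruns stay clean.
    pages = sorted(p for p in root.glob("*.html") if p.name != "index.html")
    items = "\n".join(
        # Percent-encode the link target, HTML-escape the visible title.
        f'<li><a href="{quote(p.name)}">{html.escape(p.stem)}</a></li>'
        for p in pages
    )
    index = root / "index.html"
    index.write_text(
        f"<!doctype html>\n<title>Archive</title>\n<ul>\n{items}\n</ul>\n",
        encoding="utf-8",
    )
    return index
```

After running `build_index` on the folder, something like `python -m http.server 8000 --directory <that folder>` is enough to browse the archive from other machines on the LAN. That server has no authentication, so it only makes sense for a private, personal-use network.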