It's not quite the same, but in the past I've written scrapers (in Python) that run off of the cache. E.g. one would extract recipes from web pages that I had visited. The script would walk through the cache and run an appropriate scraper based on the URL. I think I also looked for JSON-LD and microdata.
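A rough sketch of that kind of URL-based dispatch with a JSON-LD fallback, not the original script. It assumes the cached pages have already been read into `(url, html)` pairs, since how you get them out depends on the cache format. `SCRAPERS`, `scrape_example_site`, and the `recipes.example.com` host are made-up names for illustration, and the JSON-LD handling is simplified (it ignores `@graph` wrappers and list-valued `@type`).

```python
import json
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class JsonLdParser(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def extract_jsonld_recipes(html):
    """Generic fallback: return JSON-LD objects whose @type is "Recipe"."""
    parser = JsonLdParser()
    parser.feed(html)
    recipes = []
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than abort the run
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                recipes.append(item)
    return recipes


def scrape_example_site(html):
    """Hypothetical site-specific scraper: just takes the page title."""
    match = re.search(r"<title>(.*?)</title>", html, re.S)
    return [{"name": match.group(1).strip()}] if match else []


# Hypothetical mapping from hostname to a site-specific scraper.
SCRAPERS = {"recipes.example.com": scrape_example_site}


def scrape_cached_pages(pages):
    """Run the scraper matching each URL's host, else the JSON-LD fallback."""
    results = []
    for url, html in pages:
        host = urlparse(url).hostname or ""
        scraper = SCRAPERS.get(host, extract_jsonld_recipes)
        results.extend(scraper(html))
    return results
```

Keeping the cache reader separate from the scrapers means a change in the cache layout only touches the code that produces the `(url, html)` pairs.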
The downsides were that it only worked with cached data, and I had to tweak it a couple of times because they changed the format of the cache keys.