Hacker News | espeed's comments

Rather than develop its own AI (https://news.ycombinator.com/item?id=45926779), Firefox should develop a system that pipes your rendered-HTML browsing history in real time so external local services can process it (https://connect.mozilla.org/t5/ideas/archive-your-browser-hi...). See https://news.ycombinator.com/item?id=45743918

Firefox probably won't suddenly have the best AI, but it could be the only browser that does this. Previous: https://news.ycombinator.com/item?id=46018789


Someone needs to convince Firefox that, rather than developing its own AI (https://news.ycombinator.com/item?id=45926779), it should develop a system that pipes your rendered-HTML browsing history in real time so external local services can process it (https://connect.mozilla.org/t5/ideas/archive-your-browser-hi...). See https://news.ycombinator.com/item?id=45743918

Firefox probably won't suddenly have the best AI, but they could have the only browser that does this.


You can already do what you're looking for by reading the browser cache as new data is cached. This would allow you to see the site as it was loaded originally, instead of simply fetching an updated view from a URL. The data layout for the cache in Firefox and Chrome is available online.


Does the cache store the rendered DOM?


They'd probably reject that idea with some bullshit privacy or security excuse, using Wayland-like reasoning. That's also why we don't have XUL extensions anymore and why they'll eventually copy Chrome on that manifest crap.


I paid for Gemini Pro. Am I getting Gemini 3 Pro (https://gemini.google.com)? "To be precise: You are currently interacting with Gemini 1.5 Pro." https://x.com/espeed/status/1991333475098718601



Knowing this is the direction things were headed, I have been trying to get Firefox and Google to create a feature that archives your browser history and pipes a stream of it in real time so that open-source personal AI engines can ingest it and index it.

https://connect.mozilla.org/t5/ideas/archive-your-browser-hi...


AFAICS this has nothing to do with "open-source personal AI engines".

The recorded history is stored in a SQLite database and is quite trivial to examine[0][1]. A simple script could extract the information and feed it to your indexer of choice. Developing such a script isn't a task for an internet browser engineering team.

The question remains whether the indexer would really benefit from real-time ingestion while browsing.

[0] Firefox: https://www.foxtonforensics.com/browser-history-examiner/fir...

[1] Chrome: https://www.foxtonforensics.com/browser-history-examiner/chr...
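
For illustration, the kind of "simple script" meant above could look roughly like this. It's a minimal sketch, not a definitive implementation, and it assumes the Firefox layout: a `places.sqlite` file with a `moz_places` table whose `last_visit_date` column holds microseconds since the Unix epoch. The function name `read_firefox_history` and the `limit` parameter are made up for the example. Run it against a copy of the file, since the live database is usually locked while the browser is open.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def read_firefox_history(db_path, limit=100):
    """Return (url, title, last_visit) tuples, most recent first.

    Hypothetical helper for illustration. Assumes a Firefox-style
    places.sqlite with a moz_places table.
    """
    # Build a file: URI so the database can be opened read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT url, title, last_visit_date FROM moz_places "
            "WHERE last_visit_date IS NOT NULL "
            "ORDER BY last_visit_date DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()

    # Assumption: last_visit_date is microseconds since the Unix epoch.
    return [
        (url, title, datetime.fromtimestamp(ts / 1_000_000, tz=timezone.utc))
        for url, title, ts in rows
    ]
```

Each tuple could then be handed to whatever indexer you like. Note this only yields URLs, titles, and timestamps, not the page content as it was rendered.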


Due to the dynamic nature of the Web, URLs don't map to what you've seen. The content I see when I visit a URL at a certain time can differ from the content you see, or from what I see if I visit the same URL later. For example, if we want to know that the tweets I'm seeing are the same as the tweets you're seeing and haven't been subtly modified by an AI, how do we do that? In the age of AI programming people, this will be important.


I'm confused, do you want more than the browser history then? ...something like Microsoft's Recall? Browsers currently don't store what they've seen and for good reasons. I was with you for a sec, but good luck convincing Mozilla to propagate rendered pages to other processes then!


Being able to index and own your data changes the model of the Web.


So you're one of those people trying to attach history to everything!

Yeah I am sure lots of people want their pornhub history integrated into AI...

If that is the "future" (gag), we better be able to opt out


It's your personal AI running locally on your machine, you can opt out of what you index. You own your data.


Why not Chrome Devtools MCP?


As I understand GP, they want to browse normally and have that session's history feed into another indexing process via some IPC like D-Bus. It's meant to receive human events from the browser.

Chrome Devtools MCP on the other hand is a browser automation tool. Its purpose is to make it trivial to send programmed events/event-flows to a browser session.


The universities need to get together and develop their own open-source search engine as part of an ongoing research project. It should be hosted in a distributed fashion from the universities themselves. They have the expertise and the resources these days to do it. And much of the high quality content on the public web originates from the universities anyway. It will be like the Library of Alexandria and not subject to censorship.


There needs to be a browser that archives your browser history and pipes a stream of it in real time so that open-source personal AI engines can ingest it and index it. The future of the Web will be built on this. Google may not do it. Firefox could.


Knuth's ChatGPT Experiment Insights https://gemini.google.com/share/3768c883b67c



Depending on the precise launch time (4:36/4:37 PM CST) "Ship exploded at ≈T+00:08:26": https://en.wikipedia.org/wiki/Starship_flight_test_7


Is CNN the only network that provides an archive of transcripts online?

