I like this idea! Have the usual official results... then have an option to go to level 2, level 3, level 4, etc. (lvl 1 is not included in lvl 2).
You can have really biased, technically terrible filters that, for example, put a site on level 4 because it is too new, too small, or for any number of other dumb SEO-nonsense reasons. (The topic was not in the URL! There was a poor choice of text color!)
I think Wikipedia has a lot of research to offer on what to do, but also on what not to do. Try getting to tier-2 edits on a popular article: it would take days to sort out the edits and construct a tier-2 article by hand.
Per your 2nd para, Google used to have some options to tailor the results more, like allinurl or inurl or intitle or link (IIRC the word had to be in a link pointing to that page) or whatever.
I expected that to evolve toward more specificity, but things went completely the other way, and with Google now we can't even reliably require that a term appears on a page.
Similarly, I was all in on XHTML and semantics (like microformats), where you'd be able to search for "address: high street AND item:beer with price:<2" to find a cheap drink.
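Just to make that concrete, here's a rough Python sketch of matching such a structured query against microformat-style records; the field names ("address", "item", "price") and the data are made up for illustration.

```python
# Rough sketch: filtering microformat-style records with a structured query.
# The record fields ("address", "item", "price") are made up for illustration.

records = [
    {"address": "12 High Street", "item": "beer", "price": 1.80},
    {"address": "3 Market Square", "item": "beer", "price": 4.50},
    {"address": "12 High Street", "item": "cider", "price": 1.50},
]

def matches(record, address_contains=None, item=None, price_lt=None):
    """True if the record satisfies every constraint that was given."""
    if address_contains and address_contains.lower() not in record["address"].lower():
        return False
    if item and record["item"] != item:
        return False
    if price_lt is not None and not record["price"] < price_lt:
        return False
    return True

# "address: high street AND item:beer with price:<2"
cheap_beer = [r for r in records if matches(r, "high street", "beer", 2)]
print(cheap_beer)  # [{'address': '12 High Street', 'item': 'beer', 'price': 1.8}]
```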
I imagine for a FOSS solution we would have to make every separable ranking algorithm configurable, with the option to toggle them in groups, and build CLI-like queries around them (with a GUI on top).
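Something like this toy sketch, maybe, where each ranking signal is a module the user can switch on or off per query; every module name, weight, and document field here is invented:

```python
# Toy sketch of toggling separable ranking modules, roughly CLI-style:
#   search "cheap beer" --enable text_match,freshness --disable popularity
# All module names, weights, and document fields are invented.

RANKING_MODULES = {
    "text_match": lambda doc, q: sum(t in doc["text"].lower() for t in q.split()),
    "freshness":  lambda doc, q: 1.0 / (1 + doc["age_days"]),
    "popularity": lambda doc, q: doc["inbound_links"] * 0.01,
}

def score(doc, query, enabled):
    """Sum only the ranking signals the user switched on."""
    return sum(RANKING_MODULES[name](doc, query) for name in enabled)

docs = [
    {"text": "Cheap beer deals", "age_days": 2,   "inbound_links": 3},
    {"text": "Beer history",     "age_days": 900, "inbound_links": 5000},
]
enabled = ["text_match", "freshness"]          # popularity toggled off
ranked = sorted(docs, key=lambda d: score(d, "cheap beer", enabled), reverse=True)
print([d["text"] for d in ranked])  # ['Cheap beer deals', 'Beer history']
```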
I'm starting to see a picture now. Instead of wondering how to build a search engine, we should just build things that are compatible. A bit like "the output of your database is the input of my filter."
Take site search: it is easy to write specs for, with tons of optional features, and it can easily outperform any crawler. Meta site search can produce similar output, and distributed DIY crawlers can provide similar data.
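As a sketch of what "compatible" could mean in practice, here's a minimal shared result format in Python where one tool's output is literally another tool's input; the field names are assumptions, not any existing spec:

```python
# Minimal sketch of a shared result format so one tool's output can be
# another tool's input. These field names are assumptions, not a spec.

import json

def site_search(query):
    """Stand-in for a site's own search endpoint (or a DIY crawler's index)."""
    return [
        {"url": "https://example.org/post/42", "title": "Homebrew on a budget",
         "snippet": "Cheap beer, brewed at home...", "source": "example.org"},
    ]

def domain_filter(results, allowed_domains):
    """A downstream filter that only needs the shared format, not the engine."""
    return [r for r in results if any(d in r["url"] for d in allowed_domains)]

print(json.dumps(domain_filter(site_search("cheap beer"), ["example.org"]), indent=2))
```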
Arguably, top websites should not be indexed at all; they should provide their own search API.
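If a big site did expose its own search API, the meta layer would mostly just need an adapter mapping whatever that API returns onto the shared shape; everything below (endpoint, parameters, field names) is hypothetical:

```python
# Sketch: adapting a (hypothetical) site-owned search API response to the
# shared record format. Endpoint, parameters, and field names are invented.

def normalize(site, raw_response):
    """Map a site's own response fields onto the common record shape."""
    return [
        {"url": hit["link"], "title": hit["heading"],
         "snippet": hit.get("summary", ""), "source": site}
        for hit in raw_response.get("hits", [])
    ]

# Pretend this came back from GET https://bigsite.example/search?q=cheap+beer
raw = {"hits": [{"link": "https://bigsite.example/offers/1",
                 "heading": "Beer offers", "summary": "Under 2 this week"}]}
print(normalize("bigsite.example", raw))
```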
The end user puts in a query and gets a bunch of results. They go into a table with a column for each unique property. The properties show up in the sidebar to refine results (sorted by how many results have each property). Clicking on one, filling out the field, or setting a min/max filters the displayed results and sends out a new, more specific query looking for those properties. New properties are obtained that way.
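A first pass at that sidebar could simply count which properties appear across the current result set, most common first, roughly like this (property names invented):

```python
# Sketch of the sidebar: count which properties appear across the current
# results, most common first, so the user can pick one to refine the query.

from collections import Counter

results = [
    {"url": "a", "price": 1.8, "item": "beer"},
    {"url": "b", "price": 4.5, "item": "beer", "rating": 4},
    {"url": "c", "item": "cider"},
]

def facet_counts(results):
    """How many results carry each property (other than the URL itself)."""
    counts = Counter()
    for r in results:
        counts.update(k for k in r if k != "url")
    return counts.most_common()

print(facet_counts(results))
# [('item', 3), ('price', 2), ('rating', 1)]
```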
Yes, I was thinking along similar lines, IIUC: a sort of federated search using common DB schemas and search APIs, so that I could crawl pages and they could be dragged into your SERP by a meta-search engine. I think the main thing you lose is popularity and ranking from other people's past searches. That could be built in, but it relies on trusting the individual indices, which would be distributed and so could be modified to fake popularity or return results that were not wanted (though then one could just cut off that part of one's search network).
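One crude way to handle that trust problem could be to weight each index's claimed popularity by how much you trust it, and zero out the indices you've cut off; all the index names and numbers below are invented:

```python
# Crude sketch: merging results from several distributed indices while
# discounting popularity claims from indices we trust less. All index
# names and numbers are invented.

trust = {"my_index": 1.0, "friend_index": 0.8, "sketchy_index": 0.0}  # 0.0 = cut off

results = [
    {"url": "https://a.example", "claimed_popularity": 10,    "from": "my_index"},
    {"url": "https://b.example", "claimed_popularity": 12,    "from": "friend_index"},
    {"url": "https://c.example", "claimed_popularity": 99999, "from": "sketchy_index"},
]

def merged_ranking(results, trust):
    """Rank by claimed popularity scaled by how much we trust the reporting index."""
    return sorted(results,
                  key=lambda r: r["claimed_popularity"] * trust.get(r["from"], 0.0),
                  reverse=True)

for r in merged_ranking(results, trust):
    print(r["url"], r["from"])
# The sketchy index's inflated claim sinks to the bottom.
```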