
> How is this common sense?

Because it's in practice impossible to index every page. Index selection has always been a core quality feature in search engines, both in terms of which pages get included at all and, in multi-layered index schemes, which layer of the index they land in.
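
To make that concrete, here's a minimal sketch of what a tiered index-selection policy could look like. The signal names, thresholds, and tier labels are all invented for illustration; this is not how Google actually decides what to keep.

    # Hypothetical tiered index selection. All signals and thresholds
    # below are made up for illustration, not any engine's real policy.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PageSignals:
        url: str
        inbound_links: int          # known pages linking here
        recent_clicks: int          # traffic seen for this page lately
        duplicate_of: Optional[str] = None

    def choose_tier(page: PageSignals) -> str:
        """Assign a page to an index tier, or drop it entirely."""
        if page.duplicate_of is not None:
            return "not_indexed"    # near-duplicates aren't worth storing
        if page.inbound_links > 100 or page.recent_clicks > 1000:
            return "primary"        # consulted for virtually every query
        if page.inbound_links > 0 or page.recent_clicks > 0:
            return "secondary"      # consulted for rarer, long-tail queries
        return "not_indexed"        # crawled once, then forgotten

    print(choose_tier(PageSignals("https://example.com/old-post", 2, 0)))  # "secondary"

Under a policy like this, a page that stops earning links or clicks can slide from "secondary" to "not_indexed" without the page itself changing at all, which is consistent with old content quietly dropping out of results.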

> Yes we do know. He was using Google to find his own old stuff over the years. Some content he was referring to regularly disappeared from Google's results. These pages had previously been included in the results.

That's just a guess; it's not actually stated anywhere in Tim's article. But yes, given that he didn't say otherwise, what you propose is probably what happened: he had a couple of pages that he knew were not found on Google, and checked whether they could be found on Bing.

But my whole point is that this kind of methodology is total garbage. And then he's making pretty absolute statements, like his tweet about the post "TIL that both Bing and DuckDuckGo apparently index a lot more of the Web than Google does".



People used to say that Google would always be the best search index because it had the biggest index, and nobody could match Google there. Being more selective about what you include seems like a big change from past practice, or at least past narrative.


Yes, but what jsnell is saying is that, perhaps, if you performed the same experiment with Bing, you'd find pages it didn't index but that were present in other search engines.

You can't say A is better than B with a few data points. You can say you think B's behavior has changed compared to the past, but even that conclusion doesn't hold up on so little evidence.

It's possible the behavior was always there and you just never tripped over it, and that it's rare enough that most people don't either: maybe the web was smaller before, or your own content was smaller, or the content's recent link and access patterns changed enough to trigger the behavior.
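
As a rough illustration of why a handful of spot checks can't rank two indexes, here's a toy simulation. The coverage numbers are made up, and the two engines are identical by construction, yet tiny samples still make one of them look bigger.

    # Toy simulation: both engines index the same fraction of the web,
    # yet small samples routinely make one of them "look" bigger.
    import random

    random.seed(0)
    TRUE_COVERAGE = {"engine_a": 0.60, "engine_b": 0.60}  # identical on purpose

    def observed_coverage(engine: str, n_urls: int) -> float:
        """Fraction of a random URL sample the engine appears to index."""
        p = TRUE_COVERAGE[engine]
        return sum(random.random() < p for _ in range(n_urls)) / n_urls

    for n in (5, 1000):
        a = observed_coverage("engine_a", n)
        b = observed_coverage("engine_b", n)
        print(f"n={n:4d}  engine_a={a:.2f}  engine_b={b:.2f}")

With five URLs the two identical engines can easily disagree by twenty points or more in either direction; with a thousand the estimates converge. Concluding "X indexes more than Y" from a couple of personal pages is the n=5 case.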



