PageRank was never as important as people thought it was.
Think of it this way: a search engine needs a relevance score that connects a query to a document. If the number of documents is vast (e.g. billions and billions) a search engine also benefits from a document-dependent quality score.
The first is more important than the second. You'd rather get a poor-quality document that is relevant to the topic than a high-quality document that isn't.
It took several years before papers came out in the literature that found PageRank useful in search results. The key thing is that you need a real excess of documents. With millions of documents you are better off without it (being more effective at finding relevant documents improves performance). You really need 100 million or more to reach the point where you have so many relevant documents for typical queries that filtering on quality doesn't get in the way of relevance.
PageRank can be thought of as simulating a Markov process in which a user clicks a random link on a page most of the time but, with some probability, jumps to an entirely random page. PageRank is proportional to the probability that a user visits the page or, put another way, to how much traffic a page gets.
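To make the random-surfer picture concrete, here is a minimal sketch of that Markov process solved by power iteration. The function name `pagerank`, the toy link graph, and the damping value of 0.85 (the probability of following a link rather than jumping) are all assumptions for illustration, not anything a real search engine is known to use.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate visit probabilities for the random-surfer model.

    links: dict mapping each page to the list of pages it links to.
    damping: assumed probability of following a link instead of jumping.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Random jump: every page gets an equal share of the "teleport" mass.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Following a random link splits this page's mass among its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links: treat the next step as a random jump.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked from everywhere, "d" from nowhere.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.4f}")
```

The scores sum to 1, so each one reads directly as the long-run share of visits a page would get from such a surfer, which is why real traffic data is a plausible stand-in for it.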
Google very quickly developed a few ways to sample this directly, such as (1) making Google Analytics almost ubiquitous, (2) making Google ads almost ubiquitous, and (3) collecting analytics from the Chrome browser.
Google denies using any of the above for ranking, but it has been known to lie about its relevance factors before. Even a small sample from the three sources above could be used to calibrate models based on other info.