SEO Question: does a non-indexed page pass on PageRank?

Last week I was reading about Google’s PageRank formula as it was originally created. As you probably know, PageRank is a numeric value which Google assigns to all the pages in its index, based on the incoming links to a page, to determine the relative value of a page.

Google founders Larry Page (recognize the similarity with PageRank ;-) ) and Sergey Brin stated in their original paper that a page needs to be in Google’s index to pass on PageRank.

Want PageRank? Get indexed

That made me wonder if that’s still the case today. At first it seemed logical to me that a page needs to be known by Google to accumulate PageRank and to pass it on to other pages.

Google calculates PageRank on the total network of sites Google has in its index. From that point of view a non-indexed page can’t accumulate PageRank, because Google can’t determine the relative value of the page. Or can it?
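
To make the "network" idea a bit more concrete, here is a minimal Python sketch of a simplified, normalized variant of the published PageRank iteration. The damping factor of 0.85, the four-page example graph and the function name are assumptions of mine for illustration only; this is not a description of how Google calculates PageRank today.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    # Every page that appears as a source or a target is part of the graph.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page in the graph starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly over its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without known outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini web: A, B and C link in a circle, D only links out to C.
graph = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)

for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph nobody links to D, so D only keeps the baseline score, while C collects value from both B and D. A page that is not part of the graph at all gets no score, which is the point made above.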

What about Robots.txt and Meta robots?

But what about pages which have the meta robots tag configured to not index the page but to follow all the links on it (noindex,follow)? And what about pages which are excluded from the index with the Robots.txt file?
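
For readers who have not worked with these two mechanisms, here is a small Python sketch using only the standard library. The robots.txt rules, the example.com URLs and the HTML snippet are made up for illustration. It only shows where each instruction lives: robots.txt is consulted before a page is fetched, while the meta robots tag can only be read after the page has been fetched. It says nothing about how Google treats either one in its PageRank calculation.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def meta_robots(html):
    """Return the set of meta robots directives in an HTML document."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical robots.txt: crawlers are asked to stay out of /private/.
robots_txt = """User-agent: *
Disallow: /private/
"""
robots = RobotFileParser()
robots.parse(robots_txt.splitlines())

may_crawl_private = robots.can_fetch("Googlebot", "http://example.com/private/page.html")
may_crawl_tags = robots.can_fetch("Googlebot", "http://example.com/tags/seo.html")

# Hypothetical page that asks not to be indexed, but to have its links followed.
page_html = '<html><head><meta name="robots" content="noindex,follow"></head><body></body></html>'
directives = meta_robots(page_html)

print(may_crawl_private, may_crawl_tags, sorted(directives))
```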

I thought I might just ask my network what they thought about the issue. Among others, link building expert Wiep Knol replied to me on Twitter:

[Image: Wiep Knol’s reply]

Using LinkedIn’s group feature, I also put the question to some experts in the LinkedIn SEO group. At first people had the same reaction I did.

But I asked the same follow-up questions as above. A very sound answer followed from Marie-Claire Jenkins:

[Image: Marie-Claire Jenkins’ reply]

She summarizes it nicely:

“NoIndex is a request to not show the page in the results. PR still passes. NoFollow, PR doesn’t pass but does accumulate”

Not indexed? PageRank still passes on!

If we have to believe what people like Wiep and Marie-Claire Jenkins are saying, then the answer to our question “Does a non-indexed page pass on PageRank?” would be ‘Yes’. And I think they are right.

A page can be missing from a search engine’s index (Google’s index, in this case) for many reasons. But people can still link to that page, so it can still accumulate PageRank.

Following the links

Of course those links need to be on pages which are indexed by Google, so Google will end up at the non-indexed page. Let’s assume it stays out of the index because it’s blocked by Robots.txt.

That means that Google won’t show the page in search results retrieved from its index, but it knows the links to and from that page. Therefore it can add this page to the PageRank calculations of the entire Google network of sites.

Guideline

We actually need to rephrase the definition of being in Google’s index. I would opt for the following guideline:

“A page needs to have an incoming link visible to Google to accumulate and pass on PageRank”

In this guideline, ‘visible to Google’ means that the link isn’t nofollowed and isn’t invisible to search engines for other reasons, such as links created with JavaScript or links inside a password-protected environment.
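
As a rough illustration of this guideline, here is a short Python sketch. The `Link` class, its `nofollow` and `hidden` flags and the page names are all hypothetical; they only model the rule of thumb proposed here, not anything Google has documented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    nofollow: bool = False  # the link carries rel="nofollow"
    hidden: bool = False    # e.g. JavaScript-only, or behind a login


def is_visible(link):
    """A link counts as 'visible to Google' if it is neither nofollowed nor hidden."""
    return not link.nofollow and not link.hidden


def pages_meeting_guideline(links):
    """Pages with at least one visible incoming link, per the guideline above."""
    return {link.target for link in links if is_visible(link)}


# Hypothetical links between hypothetical pages.
links = [
    Link("indexed-page", "noindexed-page"),
    Link("forum-thread", "members-only-page", hidden=True),
    Link("blog-post", "sponsored-page", nofollow=True),
]
eligible = pages_meeting_guideline(links)
print(sorted(eligible))
```

Under the guideline only "noindexed-page" qualifies here: whether the page itself is indexed plays no role, only whether a visible link points to it.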

So, does a non-indexed page pass on PageRank? Yep, it does! (if the links to and/or from the page are visible to Google ;-) ).

36 Comments

EdWords (Eduard Blacquière), November 26th, 2008 at 08:04

What do you think: does a non-indexed page pass on PageRank? http://tinyurl.com/5qd429

Arjan Snaterse, November 26th, 2008 at 15:32

Interesting.. I was wondering the following:

Tag pages usually get high rankings, but Google says that they shouldn’t be indexed, because they won’t have search results pages in their search engine.

If you give your tag pages a noindex, follow tag, would they pass on the same value as if they were indexed?

Better said: would a not-indexed page give the same amount of value as an indexed page?

If so, we could generate lots of tag pages, so much that some people would call it spam. But can you get penalized by pages that are not being indexed? :-)

Russell Rockefeller, November 26th, 2008 at 15:32

I’ve actually done a bit of experimentation with this in the past. Google responds best to pages that are indirectly promoted through a proxy. For example if you wanted to increase page rank for site X you do SEO on site Y and link to site X.
In this example it wouldn’t matter if site Y was no-indexed but it would need to pass page rank via dofollow links.

Michel Bonvanie, November 26th, 2008 at 15:56

Nice post :)

I do think it passes PageRank. A year ago I did some research where ‘Page A’ (not indexed) links to ‘Page B’. In this case ‘Page B’ did rank in the SERP. If ‘Page A’ did not pass juice, ‘Page B’ should not be ranking that well.

I do wonder whether a page passes the same amount of juice whether it is indexed or not, because a non-indexed page should be considered less relevant or not relevant at all.

Sint Smeding, November 26th, 2008 at 15:09

This means the term ‘noindex’ is a bit confusing. Maybe ‘nolist’ would be a better word, because the result of adding a noindex meta robots tag is that a page will still get indexed but is not listed in the SERPs. I don’t believe the standard will be adapted, but it is useful to keep this in mind. So Eduard, Wiep and Marie-Claire, thank you for addressing this! :-)

Alex Chudnovsky, November 26th, 2008 at 18:22

An uncrawled URL won’t have any outgoing links from it and thus won’t pass any PR out – it can’t pass it because outgoing links are not known.

What I think people confuse here is that a “non-indexed” page in Google does not mean it was not crawled and used in the PR calculation. So a non-indexed page (not present in the full-text index for you to find) may well still pass PR. However, if it was not crawled at all (due to a robots.txt restriction or anything else), then it can’t pass PR out, simply because the algorithm won’t know where to pass it to.

Being part of Google’s huge web graph may result in passing PR – that’s the whole point of it, because they use it to decide which URLs from many candidates actually deserve to be in the index.

Bryan Phelps, November 26th, 2008 at 18:40

In Google’s view, do you think a link from a non-indexed page is as valuable as one from an indexed page? If I were Google, I would be skeptical of why the page isn’t indexed and possibly question any outgoing links.

Roy, November 26th, 2008 at 19:39

That’s where you can noindex, follow all your paginated categories. I’d rather see you putting up a test like this, instead of asking, because nobody knows and you know best from your own experience and could share it with us ;)

@arjan: Google’s guidelines state that you don’t want to create pages that are not for users. Does 150.000k tag pages sound usable for users to you :)

@sint the idea is that your page should be in the index. (which is the way Google calls your ‘list’ :) )

Btw, this is also the reason that you actually should nofollow every link to your pages that are in your robots.txt, at least if you completely go the ‘pagerank algorithm’ way. And still, if there’s enough external linklove, Google will show your page if they think it relevant enough…

Russell Rockefeller, November 26th, 2008 at 20:19

Just because a page is protected with .htaccess, nofollow, noindex, etc doesn’t mean Google won’t crawl it. The SERPS are loaded with content that is supposed to be out of bounds.

Alex Chudnovsky, November 26th, 2008 at 20:21

> The SERPS are loaded with content that is supposed to be out of bounds.

SERPs can contain URLs that can’t be crawled (for whatever reason) – those urls can be ranked on the basis of anchor text and their own PR, however if they are not crawled themselves then they can’t pass PR, this I think is the main question under discussion here :)

Wordsmith, November 26th, 2008 at 21:38

For the novice: why would you not want a page to be indexed by Google that would be interesting enough to be linked to from outside sources? Doorway pages?

Myron Tay, November 27th, 2008 at 11:07

Is there actually a point to this experiment other than proving the founders wrong?

Eduard Blacquière, November 27th, 2008 at 12:26

@Arjan
Interesting discussion indeed. Theoretically a link from a non-indexed page weighs the same as one from an indexed page.

Of course Google could have made an adjustment to the PageRank formula, but I don’t think the PageRank formula is that advanced yet.

@Russell
That sounds logical because the PageRank of page A is of course accumulated by pages linking to page A.

@Michel
Thanks :)

If in your example page B only receives 1 link (from page A), then that would be the case.

As I said before to Arjan, I do think links from indexed and non-indexed pages are treated equally.

@Sint
You’re welcome! :)

@Alex
You’re right that there’s a definition problem with “non-indexed”.

Using Robots.txt or Meta robots to exclude pages from a search engine index DOESN’T keep Google from crawling the page and taking it into account with the PageRank calculation.

Eduard Blacquière, November 27th, 2008 at 12:46

@Bryan
As discussed in my previous comment, I personally think that the PageRank algorithm weighs non-indexed pages the same as indexed pages.

This might be different or it might change, but my feeling is that the PageRank algorithm isn’t that advanced at the moment.

@Roy
Agree, but it would be even more valuable if you would run a test as well to see if we get the same results ;-)

Thanks for adding the note about nofollowing links to pages which are blocked by Robots.txt.

@Russell
We indeed need to make a clear distinction between crawling/indexing and listing a page in the search results (as the algorithm decides), because that’s not the same!

@Wordsmith
That’s a good question, I think in most cases the answer to your question is that you wouldn’t :) (except for grey/black purposes as you mention)

@Myron
The point here is to discuss this issue and learn from each other! :)

Ian Macfarlane [LBi Netrank], November 27th, 2008 at 13:16

Robots.txt is a different situation from meta robots – if the page is excluded via robots.txt, then this prevents search engines from crawling it in the first place. Search engines cannot give any values to links they have never seen.

Sint Smeding, November 27th, 2008 at 15:02

@Roy: here is where the slight difference between the words crawling and indexing comes in I think? :-)

Eduard Blacquière, November 27th, 2008 at 18:02

@Ian
There indeed is a difference in crawling between Robots.txt and Meta robots, but there isn’t a difference when it comes to accumulating PageRank.

Also see this nice comparison from Aaron Wall:

http://www.seobook.com/robots-txt-vs-rel-nofollow-vs-meta-robots-nofollow

Samuel Lavoie, November 27th, 2008 at 22:07

There is definitely a definition problem with the index vs crawl expression. I’ll point you to http://sebastians-pamphlets.com/crawling-vs-indexing/
Basically, crawling is fetching content to any form of database without processing the result and indexing is to make sense of all this content.

Eduard Blacquière, November 28th, 2008 at 09:57

@Samuel
Thanks for pointing us to that good explanation! Here’s a good extract from the article:

A crawler directive like “disallow” in robots.txt can direct crawlers, but means nothing to indexers.

An indexer directive like “noindex” in an HTTP header, an HTML document’s HEAD section, or even a robots.txt file, can direct indexers, but means nothing to crawlers, because the crawlers have to fetch the document in order to enable the indexer to obey those (inline) directives.

Sint Smeding, November 28th, 2008 at 10:41

If the conclusion is that non-indexed pages pass on PageRank, then calculating and passing PageRank would be part of the crawling process and not the indexing.
If this is correct, I think this could be the conclusion of this post!

Ian Macfarlane [LBi Netrank], November 28th, 2008 at 10:44

@Eduard – it does not pass PageRank, however, which is the question asked by the article.

Puneet, November 28th, 2008 at 11:44

nice topic. I am refreshed with your ideas.

[...] yes. Dutch Search Marketer Eduard Blacquière received the answer through the LinkedIn group LinkedSEO. It said: “NoIndex is a request to not [...]

websitejudge, December 2nd, 2008 at 11:21

I think that pages pass PageRank in the process of crawling, not in the process of indexing.

As I stated earlier on netters.nl, it is well known that the indexing of pages happens in a couple of steps.

Simply said: crawler/bot 1 takes note of the urls and pages that are there, crawler 2 visits the pages behind the url and puts them in a (shown) list. Crawler 1 may pass pagerank to pages that will not be shown in the list.

Puneet, December 2nd, 2008 at 13:30

How do I get blogs crawled by robots? I have my blog on blogspot.com.

Veille SEO and CO, December 8th, 2008 at 13:48

[...] SEO Question: does a non-indexed page pass on PageRank? [...]

pligg.com, December 8th, 2008 at 15:29

SEO Question: does a non-indexed page pass on PageRank? – Eduard Blacquière’s Search Marketing Blog…

Non-indexed pages and PageRank (PR)…

[...] attentive reader already saw this question (+ the accompanying discussion) pass by on my English-language blog last week, but I certainly did not want to [...]

[...] that for you: What should NOINDEX do? 5. At last, juice does not pass on from noindexed pages? http://eduardblacquiere.com/non-inde…pass-pagerank/ Back to SEO basics: The "noindex" robots meta tag directive is for a single page and [...]

Matt Cutts, January 15th, 2010 at 03:58

Marie-Claire Jenkins is correct

ArunSunder (Arun Sundar C), March 5th, 2010 at 09:17

RT @EdWords Eduard Blacquière’s Search Marketing Blog » SEO Question: does a non-indexed pag.. http://tinyurl.com/5qd429

[...] also the discussion about this on my English-language blog [...]
