Google Canonical problems - www versus non-www

By default your website can be accessed at both www.domain.com and domain.com. Since Google can treat the two addresses as duplicate content, you should restrict access to either www.domain.com or domain.com. This matters because some links from outside your website may point to either address, and the search engines may already have indexed your website under both.

How is your site indexed?

Is your website indexed as your-domain.com AND www.your-domain.com? How can you tell? Type site:your-domain.com in Google Search (Figure 1).

Screenshot Google Site Search.
Figure 1: Google Site Search

What you should see is something like Figure 2 with a list of all the pages Google has indexed.

Screenshot Google Site Search Results.
Figure 2: Google Site Search Results

Now type in site:www.your-domain.com/

Do the search results show the same number of pages? If not, you may have a Google canonical problem: Google may be treating your pages as duplicate content because your site is accessible under both your-domain.com and www.your-domain.com.

Create a 301 redirect forcing all HTTP requests to use either www.domain.com or domain.com

N.B.  If your site is hosted on the RootsWeb servers you will NOT have access to the .htaccess file.

Redirect domain.com to www.domain.com:

  1. Open Notepad or any plain text editor
  2. Type the following lines
    Options +FollowSymLinks
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^expression-web-tutorials\.com$ [NC]
    RewriteRule ^(.*)$ http://www.expression-web-tutorials.com/$1 [R=301,L]
  3. Save the file as .htaccess
  4. Upload the file to the root directory of your server with your FTP program, making sure to select ASCII as the transfer type.

Redirect www.domain.com to domain.com:

  1. Open Notepad or any plain text editor
  2. Type the following lines
    Options +FollowSymLinks
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^www\.expression-web-tutorials\.com$ [NC]
    RewriteRule ^(.*)$ http://expression-web-tutorials.com/$1 [R=301,L]
  3. Save the file as .htaccess
  4. Upload the file to the root directory of your server with your FTP program, making sure to select ASCII as the transfer type.

NOTE: Of course you should change the domain name used in the above examples to YOUR domain name.

This will redirect any request for your-domain.com to www.your-domain.com, or vice versa. You can check that the redirect is working by doing a header check, which will return the following (Figure 3).

Screenshot header check for 301 permanent redirect.
Figure 3: Header check for non www version of site

Figure 3 shows you that my-domain.com has been permanently moved (redirected) to www.my-domain.com.
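
If you prefer to run the header check yourself, the following is a minimal Python sketch of the same idea. It assumes the site answers plain HTTP on port 80; the function names header_check and is_permanent_redirect_to and the host my-domain.com are placeholders chosen for illustration, not part of any particular header-check tool.

```python
import http.client
from urllib.parse import urlsplit


def header_check(host, path="/"):
    """Send a HEAD request and return (status, Location header).

    http.client does not follow redirects, so a 301 is reported as-is.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect_to(status, location, expected_host):
    """True when the response is a 301 whose Location points at expected_host."""
    if status != 301 or not location:
        return False
    # urlsplit().hostname is already lowercased, so lowercase the expectation too.
    return urlsplit(location).hostname == expected_host.lower()


# Example use (needs network access; replace the placeholder domain with yours):
# status, location = header_check("my-domain.com")
# print(status, location)
# print(is_permanent_redirect_to(status, location, "www.my-domain.com"))
```

A status of 301 together with a Location header naming the www address corresponds to what Figure 3 shows. A 302 or a 200 means the redirect is not set up as a permanent one.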

Explanation of this .htaccess 301 redirect

Redirect expression-web-tutorials.com to www.expression-web-tutorials.com.

RewriteEngine On tells Apache to turn on the rewrite engine (mod_rewrite). The next line:

RewriteCond %{HTTP_HOST} !^www\.expression-web-tutorials\.com$ [NC]

specifies that the next rule fires only when the HTTP host (that means the domain of the requested URL) is not www.expression-web-tutorials.com. The "not" is specified with the "!" in front of the pattern. This is a negated variant of the condition used in the steps above, which matches only the bare domain with ^expression-web-tutorials\.com$.

The ^ and the $ anchor the pattern to the start and the end of the host name, so the pattern itself matches only www.expression-web-tutorials.com. Combined with the negating "!", the result is that every host that is not www.expression-web-tutorials.com will be redirected to this domain.

The [NC] specifies that the comparison with the HTTP host is case-insensitive. The backslash (\) escapes the "." because the dot is a special character (normally, the dot (.) matches any single character).
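
The behaviour of the host pattern can be tried outside Apache. The sketch below uses Python's re module as a stand-in for Apache's regular expressions, which is an assumption that holds for simple patterns like this one; re.IGNORECASE plays the role of [NC], and the negate parameter plays the role of the "!". The names WWW_HOST, LOOSE_HOST and condition_holds are made up for this example.

```python
import re

# Pattern with escaped dots; re.IGNORECASE stands in for the [NC] flag.
WWW_HOST = re.compile(r"^www\.expression-web-tutorials\.com$", re.IGNORECASE)

# Same pattern with the first dot left unescaped, to show what the backslash prevents.
LOOSE_HOST = re.compile(r"^www.expression-web-tutorials\.com$", re.IGNORECASE)


def condition_holds(host, negate=True):
    """Return True when the condition would let the following rule fire.

    With negate=True (the "!" form) the condition holds for every host
    that does NOT match the www pattern.
    """
    matched = WWW_HOST.search(host) is not None
    return not matched if negate else matched


for host in ("expression-web-tutorials.com",
             "www.expression-web-tutorials.com",
             "WWW.Expression-Web-Tutorials.com"):
    print(host, "-> rule fires:", condition_holds(host))

# The unescaped dot accepts any single character in that position.
print(LOOSE_HOST.search("wwwXexpression-web-tutorials.com") is not None)
print(WWW_HOST.search("wwwXexpression-web-tutorials.com") is not None)
```

Only the bare domain makes the negated condition hold; both spellings of the www host are left alone, because the match ignores case.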

The final line describes the action that should be executed:

RewriteRule ^(.*)$ http://www.expression-web-tutorials.com/$1 [R=301,L]

The ^(.*)$ is a little magic trick. Remember the meaning of the dot: it stands for any character (but only one). So .* means any number of characters, not only one. This is what we need, because ^(.*)$ then captures the whole requested URL path, without the domain.

The next part http://www.expression-web-tutorials.com/$1 describes the target of the rewrite rule. This is the domain name we finally want to use, where $1 contains the content captured by the (.*).

The next part is also important, since it does the 301 redirect for us automatically: [R=301,L]. The R=301 means that the web server returns a 301 Moved Permanently response to the requesting browser or search engine. L means this is the last rule in this run: after this rewrite the web server will return a result.
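
To see how the condition and the rule work together, here is a rough Python simulation of the non-www to www redirect from the steps earlier on this page. It is only a sketch and not how Apache is implemented: it assumes the rule lives in the .htaccess file of the root directory, so the path arrives without its leading slash, and it ignores query strings. The function name apply_redirect and the path tutorials/index.html are invented for the example.

```python
import re

# RewriteCond %{HTTP_HOST} ^expression-web-tutorials\.com$ [NC]
BARE_HOST = re.compile(r"^expression-web-tutorials\.com$", re.IGNORECASE)

# RewriteRule ^(.*)$ http://www.expression-web-tutorials.com/$1 [R=301,L]
RULE_PATTERN = re.compile(r"^(.*)$")
TARGET = "http://www.expression-web-tutorials.com/$1"


def apply_redirect(host, path):
    """Return (301, new_location) when the rule fires, otherwise None.

    path is the requested path without the leading slash, which is how
    a rule in a root-level .htaccess file sees it (an assumption here).
    """
    if BARE_HOST.search(host) is None:
        return None  # condition not met: the request is served normally
    match = RULE_PATTERN.search(path)
    if match is None:
        return None
    # $1 in the target is replaced by whatever (.*) captured.
    return 301, TARGET.replace("$1", match.group(1))


print(apply_redirect("expression-web-tutorials.com", "tutorials/index.html"))
print(apply_redirect("expression-web-tutorials.com", ""))
print(apply_redirect("www.expression-web-tutorials.com", "tutorials/index.html"))
```

The first two calls return a 301 with the same path on the www host; the third returns None, because a request that already uses the www host does not satisfy the condition.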

Pat Geary.
Updated and Revised, May 19, 2010