For SEOs, canonicalization refers to normalizing multiple URLs by redirecting them to a single dominant version.
The canonical tag is written by adding rel="canonical" to a link element in the head of the page.
What is Canonicalization?
Canonicalization can be a challenging concept to understand (and hard to pronounce - "ca-non-ick-cull-eye-zay-shun"), but it's essential to creating an optimized website. The fundamental problem stems from multiple uses of a single piece of writing - a paragraph or, more often, an entire page of content will appear in multiple locations on a website, or even on multiple websites. For search engines, this presents a conundrum - which version of this content should they show to searchers? SEOs refer to this issue as duplicate content.
To provide the best searcher experience, search engines will rarely show multiple, duplicate pieces of content and are thus forced to choose which version is most likely to be the original (or best).
SEO Best Practice
For SEOs, canonicalization refers to individual web pages that can be loaded from multiple URLs. This is a problem because when multiple pages have the same content but different URLs, links that are intended to go to the same page get split up among multiple URLs. This means that the popularity of the pages gets split up. Unfortunately for web developers, this happens far too often because the default settings for web servers create this problem. The following lists show the most common canonicalization errors that can be produced when using the default settings on the two most common web servers:
Apache web server:
- http://www.example.com/
- http://www.example.com/index.html
- http://example.com/
- http://example.com/index.html
Microsoft Internet Information Services (IIS):
- http://www.example.com/
- http://www.example.com/default.asp (or .aspx depending on the version)
- http://example.com/
- http://example.com/default.asp (or .aspx)
- or any combination with different capitalization.
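The variants listed above can all be collapsed to one form before deciding where a redirect should point. The following is a minimal sketch, not a definitive implementation: the choice of www.example.com as the preferred host, the set of default document names, and the function name canonicalize are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: the site prefers the "www" host, and these
# file names are the server's default documents.
CANONICAL_HOST = "www.example.com"
KNOWN_HOSTS = {"example.com", "www.example.com"}
DEFAULT_DOCUMENTS = {"index.html", "default.asp", "default.aspx"}


def canonicalize(url: str) -> str:
    """Map any of the listed URL variants to a single canonical URL."""
    parts = urlsplit(url)

    # Host names are case-insensitive, so compare them in lower case.
    host = parts.netloc.lower()
    if host in KNOWN_HOSTS:
        host = CANONICAL_HOST

    # An empty path is equivalent to the root path "/".
    path = parts.path or "/"

    # Strip a default document name (in any capitalization) from the end.
    directory, _, filename = path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        path = directory + "/"

    return urlunsplit((parts.scheme, host, path, parts.query, ""))


for variant in (
    "http://www.example.com/",
    "http://www.example.com/index.html",
    "http://example.com/",
    "http://example.com/Default.ASPX",
):
    print(variant, "->", canonicalize(variant))
```

Only the host and the default document name are compared case-insensitively here. Other parts of the path are left alone, because some servers treat path capitalization as significant.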
Luckily for SEOs, web developers have methods for redirection so that URLs can be changed and combined. Two primary types of server redirects exist: 301 redirects and 302 redirects:
- A 301 indicates an HTTP status code of "Moved Permanently."
- A 302 indicates a redirect that is temporary.
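To make the difference concrete, here is a small sketch of what a server sends back in each case. It uses the status constants from Python's standard library. The function name redirect_response and the tuple it returns are hypothetical, chosen only for this example.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def redirect_response(location: str, permanent: bool) -> Tuple[int, str, Dict[str, str]]:
    """Build the status code, reason phrase, and headers for a redirect.

    A permanent redirect uses 301; a temporary one uses 302.
    """
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    # The Location header tells the client which URL to request instead.
    return status.value, status.phrase, {"Location": location}


print(redirect_response("http://www.example.com/", permanent=True))
print(redirect_response("http://www.example.com/", permanent=False))
```

For canonicalization, the permanent form is the relevant one, since the duplicate URL is never meant to come back.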
Canonicalization is not limited to the inclusion of alphanumeric characters. It also dictates forward slashes in URLs. If a web surfer goes to http://www.google.com they will automatically get redirected to http://www.google.com/ (notice the trailing forward slash). This happens because technically the latter is the correct format for the URL. Although the search engines have largely solved this problem already (they know that www.google.com is intended to mean the same as www.google.com/), it is still worth noting because many servers will automatically 301 redirect from the version without the trailing slash to the correct version. As a result, a link pointing to the wrong version of the URL is estimated to lose between 1 percent and 10 percent of its worth due to the 301 redirect. The takeaway here is that, whenever possible, it is better to link internally to the version with the trailing slash.
One common mistake when implementing canonicalization fixes is to accidentally create an infinite loop between http://www.example.com and http://www.example.com/index.html. The solution to this common glitch is discussed in this post about redirecting an index file to your domain without looping.
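One way to see how such a loop can arise is to model the redirect rule in a few lines. This is an illustrative sketch under stated assumptions, not the method from the linked post: it assumes the server internally maps a directory request such as / to index.html, and every function name here is hypothetical. A rule that inspects the internally resolved path matches / as well and redirects it to itself, while a rule that inspects only the path the client asked for stops after one hop.

```python
from typing import Callable, Optional

DEFAULT_DOCUMENT = "index.html"  # assumed default document name


def resolve(path: str) -> str:
    """Mimic the server mapping a directory request to its default document."""
    return path + DEFAULT_DOCUMENT if path.endswith("/") else path


def looping_rule(requested_path: str) -> Optional[str]:
    """Buggy: tests the internally resolved path, so "/" matches as well."""
    resolved = resolve(requested_path)
    if resolved.endswith("/" + DEFAULT_DOCUMENT):
        return resolved[: -len(DEFAULT_DOCUMENT)]
    return None


def safe_rule(requested_path: str) -> Optional[str]:
    """Tests only the path the client actually asked for."""
    if requested_path.endswith("/" + DEFAULT_DOCUMENT):
        return requested_path[: -len(DEFAULT_DOCUMENT)]
    return None


def follow(path: str, rule: Callable[[str], Optional[str]], max_hops: int = 5) -> int:
    """Follow redirects and return the hop count; raise if the limit is hit."""
    hops = 0
    target = rule(path)
    while target is not None:
        hops += 1
        if hops >= max_hops:
            raise RuntimeError("redirect loop detected")
        path = target
        target = rule(path)
    return hops


print(follow("/index.html", safe_rule))  # one redirect, then the page is served
try:
    follow("/index.html", looping_rule)
except RuntimeError as error:
    print(error)
```

The hop limit plays the role of a browser giving up on a page that keeps redirecting.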
Related Resources
Related Elements
Related Tools
Open Site Explorer
Open Site Explorer is a free tool that lets webmasters see up to 10,000 links to any site or page on the web via the Linkscape web index.
Linkscape
A professional quality inlink tool that uses patent-pending SEOmoz metrics. Inlinks, anchor text distribution and more.
External Resources
HTTP Status Codes
W3C's official documentation for HTTP status codes.
SEO Advice: URL Canonicalization
Advice on canonicalization from Matt Cutts, head of the Webspam Team at Google.
Read the original post here - Canonicalization | SEOmoz