My sales and consulting staff repeatedly find themselves explaining that duplicate content can and will hurt search engine rankings, and it is heartbreaking to see clients having to rebuild rankings because of such a simple mistake. As a result, I felt it was time to write this article and hopefully dispel the misconceptions that have misled many website owners.
Why write an entire article on something as simple as duplicate content? Probably because it is not as simple as it sounds: many website owners find themselves in the grey area of duplication, where they don’t know for sure whether they are risking rankings or not.
The following is a sectional breakdown of the most common duplicate content issues we see, each framed as a question, which should make this article a little easier to read. After all, I have no illusions that reading up on duplicate content rules is exciting.
Definition: a duplicate website is a website that has many if not all of the same pages as another live website.
Note: the following questions are based on a person who owns two websites that are duplicates.
Q: “Why is a duplicate website such a bad idea?”
A: The major search engines are constantly trying to improve the quality of their search results in an effort to provide the best quality content for users. When duplicate content is indexed by search engine spiders, valuable time and processing power are wasted. As a result, search engines have blocked sites that used duplicate content from their databases, ultimately favouring the site that either had the content first or, I believe, the site that has the longer online history. In addition, the major search engines have been left with a bad taste after dealing with so much duplicate content created by spammers over the past several years. As a result, posting a duplicate website is an offense that can quite literally blacklist a domain; there are few things the search engine properties dislike more than being gamed by spammers.
Q: “What should I do with my duplicate website then? Just delete it?”
A: Deleting the site is the only option unless you want to create an entirely new website with unique content and a unique purpose. That said, even after deleting the website you can ensure the effort you put into promoting the old site does not go to waste by pointing its domain to your new website’s domain using a 301 redirect. A 301 is the HTTP status code (‘Moved Permanently’) that your server returns to Google and other search engines when they visit the old site. The response essentially says that your content from the old site can be found on the new site and that this is a permanent forwarding of all traffic. 301 redirects are by far the best way to minimize your losses from shutting down a website that just might have traffic or inbound links.
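For readers who want to see what a 301 looks like on the wire, here is a minimal sketch using Python’s standard library. It assumes the old domain is served by this small script and that every old path should map to the same path on the new domain; the address in NEW_SITE and the port number are placeholders of mine, not real values. In practice most site owners would configure the redirect in their web server or hosting control panel rather than run code like this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical address of the site you are keeping (placeholder).
NEW_SITE = "https://www.new-site.example"


def redirect_target(path: str, new_site: str = NEW_SITE) -> str:
    """Map a path requested on the old domain to the same path on the new one."""
    return new_site.rstrip("/") + "/" + path.lstrip("/")


class PermanentRedirectHandler(BaseHTTPRequestHandler):
    """Answer every request on the old domain with a 301 to the new domain."""

    def do_GET(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    # HEAD requests get the same status and headers.
    do_HEAD = do_GET


def run(port: int = 8080) -> None:
    """Serve redirects until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), PermanentRedirectHandler).serve_forever()
```

Calling run() starts the redirecting server; a request for /about.html on the old domain would then be answered with a Location header pointing at the same page on the new domain.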
Q: “Which website should I shut down? Is there anything I should consider first?”
A: Yes, it is very important that you keep the website that has the most backlinks and has been online the longest. The reason I say this is that Google tends to favour entrenched websites; they have been around a while, are well backlinked and overall appear to have a positive history.
Whatever your decision is, it is vital you understand that switching a website to a new domain is a dangerous step. This is because of Google’s famed ‘sandbox’. The ‘sandbox’ is really only an overused turn of phrase that represents a portion of the Google algorithm which considers the age of the domain as a signifier of trust. Generally, new websites will require 6 months to a year before substantial rankings are evident; this is a kind of rite of passage that Google appears to be enforcing on the average website. Sites that are obviously popular and quickly gain a load of legitimate link popularity will easily avoid the sandbox (because Google cannot afford to miss a ‘great’ website) but this is not the common scenario.
Q: “Will using a 301 redirect pass on the benefit of the deleted site’s link popularity?”
A: Link popularity is passed on to the other website when a 301 is used, but how much this pass-over will benefit the website seems to fluctuate on a case-by-case basis. Usually the fluctuation is only present when popularity from one domain is passed to another with differing content or a differing topic. In this case, since the link popularity is being redirected to an identical website, I expect the benefit to be virtually lossless.
Definition: content appearing within a website that is duplicated elsewhere on the same website or elsewhere on the Internet.
Q: “I need content for my website; can I just copy content from industry journals and benefit from that quality content?”
A: No, aside from the copyright concerns of using content that is not yours, your rankings (if they exist) would suffer because it is highly likely the major search engines would detect the duplicate content. As a result, the page that you create may get flagged as duplicated and it would be ignored at the very least. The page could even devalue your site’s overall credibility. Credibility is a critical component of Google’s algorithm so sites with less credibility tend to have a harder time staying (‘sticking’ if you will) in a particular ranking.
Q: “I use a content management system to manage my site and it uses a particular set of templates. These templates have some duplicate content within them and they are spread all throughout my website. Should I be worried?”
A: No, in most cases the amount of duplicate content used within a template in a content management system (CMS) is negligible. If, however, you have a large number of pages built from a template where 90% of the text is duplicated and only 10% is unique, you do have a reason to make some changes. In my opinion it is crucial that every page within a website be composed mostly of unique content, with the exception of catalogues and shopping carts where text simply has to be reused over and over.
Whatever your situation, make certain that your site contains a large number of pages composed of unique content that has been well optimized by yourself or your search engine optimizer (SEO).
Q: “How much of my page should be unique? Is there a standard ratio or percentage you can share?”
A: There is no industry standard formula but if I had to state a percentage I would say a minimum of 70% of the page should be completely unique to thwart any concerns of duplication. You may be able to get away with less than 70% unique content but I would suggest this is playing with fire. Either way, this statistic is moot since every page you create needs to be created with the intention to provide a powerful resource; after all search engines are only a small part of the plan – you do need visitors to like what they see and buy your product or service!
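If you want a rough way to check a page against that 70% rule of thumb, the sketch below estimates how much of a page’s wording does not appear in a set of other pages. It is only my illustration: it compares overlapping runs of words (often called shingles), the function names and the run length k are assumptions of mine, and it makes no claim to match how any search engine actually measures duplication.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word runs found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def unique_share(page: str, others: list, k: int = 3) -> float:
    """Fraction of the page's k-word runs that appear in none of the other texts."""
    page_shingles = shingles(page, k)
    if not page_shingles:
        return 0.0  # page too short to measure
    seen_elsewhere = set()
    for other in others:
        seen_elsewhere |= shingles(other, k)
    return len(page_shingles - seen_elsewhere) / len(page_shingles)


def meets_rule_of_thumb(page: str, others: list, minimum: float = 0.70) -> bool:
    """True when the estimated unique share reaches the article's suggested 70%."""
    return unique_share(page, others) >= minimum
```

You would pass the visible text of the page you are checking, plus the text of the template or of any pages you suspect it overlaps with. Treat the result as a prompt to look closer, not as a verdict.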
Q: “My blog currently has many different ways to find content and depending on the route a visitor may find the page is actually shown on a different URL (i.e. archives, search by label, etc.). In this scenario am I not in danger of a duplicate content penalty?”
A: Yes and no. Yes, this is duplicated content, but no, you are not likely to be penalized for it, simply because a majority of blogs offer these additional methods of finding content, so it would be detrimental if search engines penalized this practice right now. That said, search engines do have to have some way to handle this duplicate content. I expect that when Google (picking the most advanced search engine) finds duplicate blog postings on a website, its algorithm chooses the most popular posting as the primary page to provide in its ranking results. In other words, the posting URL that has the greatest number of inbound links or was spidered first will be the page that attains rankings.
For those unfamiliar with blogs, the following is an example of how a blog can easily have 3 duplications of a single article. In this scenario, I recently posted an article on our SEO Blog called “SEO Answers #12”. Once published, the article immediately appeared in 3 places: first on the home page (because it is the latest article), second on its own page for permanent linking purposes, and third under the label “Local Search”, a topic related to this posting.
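To make that guess concrete, here is a small sketch of the selection rule I have in mind. It is purely my assumption of how such a choice could work, not Google’s documented behaviour: the record type, its field names, and the decision to rank by inbound links first and use the earliest crawl date only as a tie-breaker are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DuplicateUrl:
    """One of several URLs showing the same posting (hypothetical record)."""
    url: str
    inbound_links: int
    first_crawled: date


def pick_primary(candidates: list) -> DuplicateUrl:
    """Prefer the URL with the most inbound links; break ties by earliest crawl."""
    return min(candidates, key=lambda c: (-c.inbound_links, c.first_crawled))
```

Given a home page copy, a permalink copy and a label page copy of the same posting, this rule would hand the ranking to whichever of the three has attracted the most links.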
1) SEO blog home URL: http://news.stepforth.com
2) Permalink URL: http://news.stepforth.com/blog/2007/01/
3) “Local Search” label URL:
In the future I expect blog systems will offer an option to add a NO INDEX tag (a robots meta tag set to noindex) to posts located within the labelled search section. After all, every additional label I added to this article created a duplicate version, which is something that I expect search engines will soon either ignore or require a NO INDEX tag for.
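As a sketch of what such an option might do, the function below returns a robots meta tag for pages that live under label or search sections and nothing for everything else. The path prefixes are hypothetical examples of mine, since every blog system lays out its URLs differently, and the function name is my own; only the meta tag itself is standard.

```python
# Hypothetical URL prefixes for the duplicate listing pages of a blog.
NOINDEX_PREFIXES = ("/labels/", "/search/")


def robots_meta(path: str, noindex_prefixes: tuple = NOINDEX_PREFIXES) -> str:
    """Return a noindex robots meta tag for listing pages, else an empty string."""
    if path.startswith(noindex_prefixes):
        # "follow" still lets spiders follow links through to the permalinks.
        return '<meta name="robots" content="noindex, follow">'
    return ""
```

A blog template would place the returned tag inside the head section of the page, so the label listing stays out of the index while the permalink version of each posting remains the one that ranks.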
I am sure I didn’t cover every question regarding duplicate content but I am fairly certain I touched on the most common questions we see at StepForth. If you would like to submit a duplicate content question or any other SEO question please go to our submission page and I will endeavour to respond as soon as possible; likely in an article format or SEO blog posting.