How to find duplicate content in a site?
Duplicate content is not good for SEO. But what causes it, and how does one find duplicate content within a site? Here’s how.
According to Google, when a site has duplicate content, it brings down the relevancy of the website. Often, though, duplicate content occurs on a website simply because the webmaster is not aware of it.
It is not easy to fully avoid duplicate content, but it’s not impossible.
If you have certain checks done periodically, you don’t have to worry about it at all. Here are a few.
What is duplicate content?
We usually assume that duplicate content is content that is copy-pasted from elsewhere. That is one kind. Internal duplicate content, however, occurs when the same content is allowed to repeat at different places on the same website.
For example, on WordPress, an article is shown in several places, such as date-based archives, tag pages, category pages and even the homepage.
Typically this is ignored by webmasters because it is the default setting in WordPress.
But any instance where an article repeats itself increases the chances of creating duplicate content unless proper rules and directions are set for search engines.
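One widely used direction of this kind is the rel="canonical" link tag, which tells search engines which URL is the original version of a page. The article does not name a specific rule, so treat this as one possible option. Below is a minimal Python sketch, using only the standard library, that checks whether a page declares a canonical URL. The page markup and the example.com address are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, so split it first.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Hypothetical tag-archive page that points back to the original article.
archive_page = """
<html><head>
<link rel="canonical" href="https://example.com/my-article/">
</head><body>Article text repeated on a tag page.</body></html>
"""

print(find_canonical(archive_page))
```

If the function returns None for an archive, tag or category page, that page is a candidate for a closer look.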
Also, some web designs have a “common area” where the same content is shown on all pages, for instance the sidebars on a blog, or the header and footer.
The larger the common area, the higher the chances of duplicate content.
Haven’t you noticed websites where the actual content is just a few paragraphs, but the “common area” holds more text than the intended content itself? That kind of layout, too, increases the chances of duplicate content on your site.
Below is an example from About.com
In the above screenshot, you’ll see that other than the highlighted portion, every other space is “common area” and is possibly duplicate.
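To get a rough feel for how large the “common area” is, you can compare the text of two pages from the same site and measure how much of one page also appears on the other. The Python sketch below is a simplified illustration: it assumes you already have each page's visible text as plain lines, and the function name and sample pages are hypothetical.

```python
def common_area_share(page_text, other_page_text):
    """Fraction of a page's words that sit on lines also found on another page."""
    other_lines = {
        line.strip() for line in other_page_text.splitlines() if line.strip()
    }
    total_words = 0
    shared_words = 0
    for line in page_text.splitlines():
        line = line.strip()
        if not line:
            continue
        word_count = len(line.split())
        total_words += word_count
        # A line that appears verbatim on the other page counts as common area.
        if line in other_lines:
            shared_words += word_count
    return shared_words / total_words if total_words else 0.0


# Made-up pages: header and footer are identical, only the middle line differs.
page_a = "Site header menu\nAn article about cats\nFooter links and credits"
page_b = "Site header menu\nA different article about dogs and birds\nFooter links and credits"

share = common_area_share(page_a, page_b)
print(f"Common area share of page A: {share:.0%}")
```

A share above one half means the page has more repeated text than unique text, which is the situation described above.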
How to find duplicate content within a website?
Go to Siteliner – this website checks internal duplicate content for you easily.
Punch in your domain name and the tool will find all the duplicate content within that domain, find the broken links and give you a neat report.
Check out the tool here – Siteliner.
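If you are curious how such a check can work in principle, here is a minimal Python sketch. It is not how Siteliner works internally (that is not documented here); it swaps in a common textbook technique, comparing pages by overlapping three-word sequences ("shingles") and Jaccard similarity. The URLs, page texts and the 0.5 threshold are all assumptions for illustration.

```python
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(pages, threshold=0.5):
    """Return (url1, url2, score) for page pairs at or above the threshold."""
    shingle_sets = {url: shingles(text) for url, text in pages.items()}
    results = []
    for (url1, set1), (url2, set2) in combinations(shingle_sets.items(), 2):
        score = jaccard(set1, set2)
        if score >= threshold:
            results.append((url1, url2, score))
    return results


# Hypothetical site: the tag page repeats the article word for word.
pages = {
    "/my-article/": "duplicate content is bad for seo and hurts relevancy",
    "/tag/seo/": "duplicate content is bad for seo and hurts relevancy",
    "/about/": "we are a small team that writes about web design",
}

for url1, url2, score in find_duplicates(pages):
    print(f"{url1} and {url2} look duplicated (similarity {score:.2f})")
```

Comparing every pair of pages is fine for a small site; for a large one, a dedicated tool is the more practical route.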