Duplicate Content has become a huge
topic of discussion lately, thanks to the new filters that search engines have
implemented. This article will help you understand why you might be caught in
the filter, and ways to avoid it. We'll also show you how you can determine if
your pages have duplicate content, and what to do to fix it.
Search engine spam is any deceitful
attempt to deliberately trick the search engine into returning inappropriate,
redundant, or poor-quality search results. This behavior often shows up as
pages that are exact replicas of other pages, created to earn better
placement in the search engine results.
Many people assume that creating multiple
or similar copies of the same page will either increase their chances of
getting listed in search engines or help them get multiple listings, due to the
presence of more keywords.
In order to make a search more
relevant to a user, search engines use a filter that removes the duplicate
content pages from the search results, and the spam along with it.
Unfortunately, good, hardworking webmasters have fallen prey to the filters
imposed by the search engines that remove duplicate content.
These
webmasters spam the search engines unknowingly, yet there are some things
they can do to avoid being filtered out. In order for you to truly understand
the concepts you can implement to avoid the duplicate content filter, you need
to know how this filter works.
First, we must understand that the
term "duplicate content penalty" is actually a misnomer. When we
refer to penalties in search engine rankings, we are actually talking about
points that are deducted from a page in order to come to an overall relevancy
score.
But in reality, duplicate content pages are not penalized. Rather, they
are simply filtered, the way you would use a sieve to remove unwanted
particles. Sometimes, "good particles" are accidentally filtered out too.
Knowing the difference between the
filter and the penalty, you can now understand how a search engine determines
what duplicate content is. There are basically four types of duplicate content
that are filtered out:
- Websites with Identical Pages - Identical pages within a site are considered duplicates, and websites that are identical to another website on the Internet are also considered to be spam. Affiliate sites with the same look and feel which contain identical content, for example, are especially vulnerable to a duplicate content filter. Another example would be a website with doorway pages. Many times, these doorways are skewed versions of landing pages, yet those landing pages are identical to other landing pages. Generally, doorway pages are intended to spam the search engines in order to manipulate search engine results.
- Scraped Content - Scraping means taking content from a web site and repackaging it to make it look different, but in essence the result is nothing more than a duplicate page. With the popularity of blogs on the internet and the syndication of those blogs, scraping is becoming more of a problem for search engines.
- E-Commerce Product Descriptions - Many eCommerce sites out there use the manufacturer's descriptions for the products, which hundreds or thousands of other eCommerce stores in the same competitive markets are using too. This duplicate content, while harder to spot, is still considered spam.
- Distribution of Articles - If you publish an article, and it gets copied and put all over the Internet, this is good, right? Not necessarily for all the sites that feature the same article. This type of duplicate content can be tricky, because even though Yahoo and MSN determine the source of the original article and deem it most relevant in search results, other search engines like Google may not, according to some experts.
So, how does a search engine's
duplicate content filter work? Essentially, when a search engine robot crawls a
website, it reads the pages, and stores the information in its database. Then,
it compares its findings to other information it has in its database.
Depending
upon a few factors, such as the overall relevancy score of a website, it then
determines which pages are duplicate content, and filters out the pages or the
websites that qualify as spam. Unfortunately, if your pages are not spam, but
have enough similar content, they may still be regarded as spam.
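The crawl, store, and compare steps described above can be illustrated with a small sketch. Search engines do not publish how their filters work, so this is only an assumption-laden illustration: it uses word shingles and Jaccard similarity, a common textbook technique for near-duplicate detection, and the names `shingles`, `jaccard`, `find_duplicates`, the `index` dictionary, and the 0.8 threshold are all hypothetical.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word sequences) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or nothing at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(new_page, index, threshold=0.8):
    """Compare a newly crawled page against stored pages.

    `index` maps a URL to the page text already stored (a stand-in for the
    search engine's database). Returns (url, score) pairs whose similarity
    meets the assumed threshold, most similar first.
    """
    new_shingles = shingles(new_page)
    hits = []
    for url, text in index.items():
        score = jaccard(new_shingles, shingles(text))
        if score >= threshold:
            hits.append((url, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In this sketch, a page that reuses most of another page's wording scores close to 1.0 and is flagged, regardless of whether the copying was intentional. That is the sense in which an honest page with "enough similar content" can be caught by the same sieve.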
There are several things you can do
to avoid the duplicate content filter. First, you must be able to check your
pages for duplicate content. Using our Similar Page Checker, you will be able to
determine similarity between two pages and make them as unique as possible. By
entering the URLs of two pages, this tool will compare those pages, and point
out how they are similar so that you can make them unique.
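To give a rough feel for this kind of two-page comparison, here is a minimal sketch built on Python's standard `difflib` module. It is not how the Similar Page Checker itself works (its method is not described here). The function name `compare_pages` and the `min_words` cutoff are assumptions, and the sketch takes page text that has already been downloaded rather than URLs.

```python
from difflib import SequenceMatcher


def compare_pages(text_a, text_b, min_words=4):
    """Return (percent_similar, shared_passages) for two page texts.

    percent_similar is difflib's similarity ratio scaled to 0-100.
    shared_passages lists word runs of at least `min_words` words that
    appear in both texts, taken from text_a in lowercase.
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    percent = round(matcher.ratio() * 100, 1)
    shared = [
        " ".join(words_a[block.a:block.a + block.size])
        for block in matcher.get_matching_blocks()
        if block.size >= min_words
    ]
    return percent, shared


percent, shared = compare_pages(
    "our blue widget is durable and easy to clean",
    "this red widget is durable and easy to clean",
)
print(percent)  # similarity as a percentage
print(shared)   # passages to consider rewriting
```

The shared passages are the parts worth rewriting first, since they are what makes the two pages look alike.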
Since you need to know which sites
might have copied your site or pages, you will need some help. We recommend
using a tool that searches for copies of your page on the Internet: www.copyscape.com.
Here, you can put in your web page URL to find replicas of your page on the
Internet. This can help you create unique content, or even address the issue of
someone "borrowing" your content without your permission.
Let's look at the issue regarding
some search engines possibly not considering the source of the original content
from distributed articles. Remember, some search engines, like Google, use link
popularity to determine the most relevant results. Continue to build your link
popularity while using tools like www.copyscape.com to find out
how many other sites have the same article. If the author allows it, you
may also be able to alter the article so as to make the content unique.
If you use distributed articles for
your content, consider how relevant the article is to your overall web page and
then to the site as a whole. Sometimes, simply adding your own commentary to
the articles can be enough to avoid the duplicate content filter; the Similar Page Checker could help you make your
content unique.
Further, the more relevant articles you can add to complement
the first article, the better. Search engines look at the entire web page and
its relationship to the whole site, so as long as you aren't exactly copying
someone's pages, you should be fine.
If you have an eCommerce site, you
should write original descriptions for your products. This can be hard to do if
you have many products, but it really is necessary if you wish to avoid the
duplicate content filter. Here's another example of why using the Similar Page Checker is a great idea. It can tell
you how you can change your descriptions so as to have unique and original
content for your site.
This works well for scraped content too. Many
scraped content sites offer news. With the Similar Page Checker, you can easily
determine where the news content is similar, and then change it to make it
unique.
Do not rely on an affiliate site
which is identical to other sites, and do not create identical doorway pages.
These types of behaviors are not only filtered out immediately as spam, but
there is generally no comparison of the page to the site as a whole once another
site or page is found to be a duplicate, which can get your entire site in trouble.
The duplicate content filter is
sometimes hard on sites that don't intend to spam the search engines. But it is
ultimately up to you to help the search engines determine that your site is as
unique as possible. By using the tools in this article to eliminate as much
duplicate content as you can, you'll help keep your site original and fresh.