Welcome to our webmaster and SEO forum
Please enjoy the forum, contribute what you can, and wind up the Moderators!
Closed Thread

Thread: Duplicating Content to Avoid Duplicate Content

  1. #1
    anhblog.net (Member)
    Join Date: Apr 2009
    Posts: 36

    Duplicating Content to Avoid Duplicate Content

    Duplicate content
    Duplicate content is loosely defined as being several copies of the same content in different parts of the web. It causes a problem when search engines spider all the copies, but only want to include a single copy in the search engine index to maintain relevancy. Generally, all other copies of this ‘duplicate’ content are ignored in the search results, bunged into the supplemental index, or perhaps not even indexed at all.

    Doing the right thing
    Fundamentally, search engines want to do the right thing and let the original author rank in the search engines for the content they wrote. We have just said that only one copy of the content can rank in the search engines, and search engines probably want this to be the original.
    When you search for something, and some Wikipedia content appears in the search results, you want to see the copy from wikipedia.org, not one of the thousands of copies of the content on various rubbish scraper sites.
    I believe that search engines want to give the search result / traffic to the original author of the content.

    Determining the original author of the content
    Consider the following 4 copies of the same article on different domains. How is Google to know which is the original copy?
    [Image: Duplicate content example]
    The above example shows 4 copies of the same content. The date indicates when the content was first indexed, and the PageRank bar indicates, um, PageRank. Let’s assume for this simplified example that PageRank is an accurate measure of link strength / domain authority / trust etc. The smaller pages pointing to each larger page represent incoming links from other websites.
    * Document 1 was first indexed a couple of weeks after the other copies, so as a search engine you might decide that this is not the original because it wasn’t published first.
    * Documents 2 and 3 have the same good PR, were first indexed at about the same time, and have the same number of incoming links.
    * Document 4 was indexed slightly after documents 2 and 3, and it also has lower PR and fewer links, so as a search engine you might conclude this is not the best copy to list either.
    As a search engine, we are stuck deciding between document 2 and document 3 as to which is the original / best copy to list. At this point, Google is likely to take its best guess and leave it at that, which will see the original author “penalised” on many occasions.
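    To make the dead end concrete, here is a minimal sketch of that elimination process. It is only an illustration of the reasoning above, not how any search engine actually works: the dates, PR values, link counts, the `Copy` class, the `shortlist` function and the 3-day window are all made-up assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Copy:
    name: str
    first_indexed: date  # date the copy was first indexed
    pagerank: int        # toolbar-style PageRank (hypothetical values)
    inbound_links: int   # incoming links from other websites


# Hypothetical figures chosen to match the four documents described above.
COPIES = [
    Copy("Document 1", date(2009, 4, 20), 4, 3),
    Copy("Document 2", date(2009, 4, 1), 5, 4),
    Copy("Document 3", date(2009, 4, 2), 5, 4),
    Copy("Document 4", date(2009, 4, 4), 1, 1),
]


def shortlist(copies, max_lag_days=3):
    """Drop copies indexed long after the earliest one, then keep the strongest."""
    earliest = min(c.first_indexed for c in copies)
    early = [c for c in copies if (c.first_indexed - earliest).days <= max_lag_days]
    best = max((c.pagerank, c.inbound_links) for c in early)
    return [c for c in early if (c.pagerank, c.inbound_links) == best]


remaining = [c.name for c in shortlist(COPIES)]
print(remaining)
```

    With these numbers, document 1 is dropped for being late and document 4 for being weak, which leaves documents 2 and 3 tied with nothing left to separate them.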

    Enter the cheesy scraper sites
    Let’s recycle that same example, but this time we are going to add an “author credit” link to the bottom of document 4. Document 4 could be considered a cheesy, low-PR, low-value scraper site, but one that was kind enough to provide a link back to the original document.
    [Image: Duplicate content example 2]
    All of a sudden, there is a crystal-clear signal to the search engines that document 2 (the page the credit link points to) is the original.
    When there is a collection of identical pages out there on the web and it’s hard to decide who is the author - it’s likely that search engines look at how those copies link between each other and use that data to determine an original.
    All other things being equal, this seems like a logical assumption to make.
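    The tie-break described above could be sketched like this. Again, this is a guess at the logic under stated assumptions, not a documented search engine behaviour: `pick_original` and the document names are hypothetical, and each credit link is counted as one "vote" for the copy it points to.

```python
from collections import Counter


def pick_original(candidates, credit_links):
    """Return the candidate most other copies link to, or None if still tied.

    candidates:   names of the copies still in contention
    credit_links: (source_copy, target_copy) pairs found among the copies
    """
    votes = Counter(
        target
        for source, target in credit_links
        if target in candidates and source != target  # ignore self-links
    )
    ranked = votes.most_common(2)
    if not ranked:
        return None  # no copy links to any candidate
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # the top two candidates have equal votes
    return ranked[0][0]


tied = ["Document 2", "Document 3"]
print(pick_original(tied, []))
print(pick_original(tied, [("Document 4", "Document 2")]))
```

    With no links between the copies the function has nothing to go on and returns None. One credit link from document 4 to document 2 is enough to break the tie.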

    Duplicating your own content
    So, if you know your content is being duplicated on scraper sites - I’m saying you can avoid being penalised by making sure some of the scrapers provide a link back to your original document.
    If none of the scrapers are polite enough to do this, then I’m suggesting you should create your own scraper site, scrape your own content, and provide a link back to yourself.
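    If you want to check whether a given scraper is one of the polite ones, a rough sketch using only the Python standard library might look like the following. The `links_back` helper, the sample HTML and the example.com URLs are hypothetical, and the URL matching is deliberately crude (it ignores the scheme, a leading "www." and a trailing slash).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def _normalise(url):
    """Reduce a URL to (host without www., path without trailing slash)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path.rstrip("/")


def links_back(copy_html, original_url):
    """True if the copied page contains a link to the original document."""
    collector = LinkCollector()
    collector.feed(copy_html)
    target = _normalise(original_url)
    return any(_normalise(href) == target for href in collector.hrefs)


scraped = '<p>Article text</p><a href="http://www.example.com/original-article/">Source</a>'
print(links_back(scraped, "http://example.com/original-article"))
```

    Fetching the scraper pages is left out on purpose. You would feed in whatever HTML you have already downloaded.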
    As RSS feeds become more popular, and content is recycled all over the web, this problem is only going to get worse.
    Disclaimer: I have yet to back this up with any real testing, so don’t blame me if you duplicate your own website and find yourself having duplicate content problems. I wouldn’t even consider this tactic unless you were having problems with high PR sites scraping your RSS feed.

    Full guide to Google optimization (on-page SEO, incoming links, successful ranking in Google)
    SEO Tips - Make a Blog - Blog Installation

  2. #2
    rajupatel544 (Junior Member)
    Join Date: Apr 2009
    Posts: 14

    Hello, thanks for sharing these good tips.

  3. #3
    stanneon (Senior Member)
    Join Date: Sep 2008
    Posts: 106

    I have read about that, and hey! I just found a tool that lets me check whether content is duplicate content: The Plagiarism Checker

  4. #4
    bobwilson37 (Member)
    Join Date: Jan 2009
    Posts: 78

    Thank you very much for your post. It is very informative.
    Keep sharing.
    Niche For SEO : SEO services London

  5. #5
    Robdale (Senior Member)
    Join Date: Jan 2009
    Posts: 384

    Impressive article. Thanks for writing such a wonderful piece.

  6. #6
    YooHee (Senior Member)
    Join Date: Sep 2008
    Posts: 320

    I think using this new code from Google can really help to stop duplicate content on a site.

  7. #7
    surreypcsupport (Senior Member)
    Join Date: Nov 2008
    Location: surrey
    Posts: 569

    Quote: Originally posted by anhblog.net
    Duplicate content
    Duplicate content is loosely defined as being several copies of the same content in different parts of the web. [...]

    Talking of duplicate content I just found this exact same information in the following locations:

    Duplicating content to avoid duplicate content | RagePank SEO

    Duplicating Content to Avoid Duplicate Content | FedEx's Blog

    Blogger and Tips Site Details & Statistics - Blog Top Sites

    Maybe you should practice what you preach!

  8. #8
    anhblog.net (Member)
    Join Date: Apr 2009
    Posts: 36

    Quote: Originally posted by YooHee
    I think using this new code from Google can really help to stop duplicate content on a site.

    Which one is that? Can you explain?
    SEO Tips - Make a Blog - Blog Installation

  9. #9
    surreypcsupport (Senior Member)
    Join Date: Nov 2008
    Location: surrey
    Posts: 569

    Quote: Originally posted by anhblog.net
    Duplicate content
    Duplicate content is loosely defined as being several copies of the same content in different parts of the web. [...]

    Talking of duplicate content I just found this exact same information in the following locations:

    http://www.ragepank.com/articles/137/duplicating-content-to-avoid-duplicate-content/

    http://anhblog.net/search-engine-optimatization/duplicating-content-to-avoid-duplicate-content/

    http://www.blogtopsites.com/sitedetails_14019.html

    Maybe you should practice what you preach!

  10. #10
    surreypcsupport (Senior Member)
    Join Date: Nov 2008
    Location: surrey
    Posts: 569

    Quote: Originally posted by anhblog.net
    Duplicate content
    Duplicate content is loosely defined as being several copies of the same content in different parts of the web. It causes a problem when search engines spider all the copies, but only want to include a single copy in the search engine index to maintain relevancy. Generally, all other copies of this ‘duplicate’ content are ignored in the search results, bunged into the supplemental index, or perhaps not even indexed at all.

    Doing the right thing
    Fundamentally, search engines want to do the right thing and let the original author rank in the search engines for the content they wrote. We have just said that only one copy of the content can rank in the search engines, and search engines probably want this to be the original.
    When you search for something, and some Wikipedia content appears in the search results, you want to see the copy from wikipedia.org, not one of the thousands of copies of the content on various rubbish scraper sites.....

    Talking of duplicate content I just found this exact same information in the following locations:

    http://www.ragepank.com/articles/137/duplicating-content-to-avoid-duplicate-content/

    http://anhblog.net/search-engine-optimatization/duplicating-content-to-avoid-duplicate-content/

    http://www.blogtopsites.com/sitedetails_14019.html

    Maybe you should practice what you preach!
