I’m learning about Copyscape and I am confused. I have many questions, but are they the right ones?

Copyscape is a web service that detects plagiarism on the net. It also reports on whether submitted content is unique. First things first.

Let’s say you have a web page that’s been online for a while. You want to know if people are copying its content. You enter your URL into Copyscape’s interface and, if you are unlucky, it returns a list of pages that are plagiarizing your writing.

Copyscape’s Premium service goes further. Submit content into its search field and it will tell you if that writing already exists on web pages and to what degree. It’s a good checker for a teacher validating the originality of an essay, or for a website builder checking whether a freelancer has provided their own work. Using methods I don’t completely understand, it will return a percentage rating. “Your content is 32% unique.” Or 7%. Or whatever. To back up that rating, Copyscape will show you the pages where the copied work appears and highlight the exact words and phrases it has problems with.

What’s my problem? Let me give you a typical example, and, again, I am only slowly understanding this technology.

If you are a dentist in Sacramento, California, you probably already have a website with the usual pages. You have an “About” page, a “Home” page, a “FAQ” and so on. To make your site more appealing to the search engines, you might have some pages on the general practice of dentistry, original content, written for your site to add value to your readers. Now we get tricky.

Your practice has grown and you are expanding to three more cities. Naturally you’d like to port the content that you’ve paid for to the new websites you’re building for each new office. Apparently that won’t make the search engines happy. They don’t like to see copied text and will rank the new sites much lower than they should be. Google will run a Copyscape-like search across the web, see what it thinks is plagiarized or copied content, and whoosh, down goes your ranking.

Despite Google’s supposed super-sophistication, it can’t see that your websites are all run by the same group. Or perhaps it sees that they are but still insists that each page be individual, what it considers, as does Copyscape, “unique.” What, then, is “unique”? Is 60% unique good enough for Google? Or does it have to be 90% unique? Does one have to rewrite every single duplicate page for every website? Can I just rearrange the sentence structure or do I have to build each page anew? Good questions. All I can find out is that it all depends. Good grief.
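To get a feel for what a percentage like that might mean, here is a minimal sketch of one common way to measure text overlap, called word shingling. To be clear, this is an assumption for illustration only: I don’t know how Copyscape or Google actually compute their numbers, and the function names and sample sentences below are made up. The idea is to break each text into overlapping three-word windows and count how many of the new page’s windows appear nowhere in the existing pages.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def uniqueness_percent(submitted, known_pages, k=3):
    """Percentage of the submitted text's shingles found in none of the known pages.

    Illustrative only; not Copyscape's or Google's actual method.
    """
    own = shingles(submitted, k)
    if not own:
        # Too short to form a single shingle, so there is nothing to match.
        return 100.0
    seen = set()
    for page in known_pages:
        seen |= shingles(page, k)
    return 100.0 * len(own - seen) / len(own)


# Hypothetical example: the same two sentences with their order swapped.
original = ("Regular brushing and flossing keep your teeth healthy. "
            "Visit our office today.")
reordered = ("Visit our office today. "
             "Regular brushing and flossing keep your teeth healthy.")

print(uniqueness_percent(reordered, [original]))
```

On this toy pair, swapping the two sentences scores only 20% unique under this measure, because the only windows that change are the ones spanning the boundary between the sentences. That at least hints at why simple rearranging may not get you very far.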

I’ll have more on this in future posts as I struggle to understand it more. It appears that rephrasing and rewording are not good enough. In those cases you are not adding anything to something that has already been written. You are not bringing anything new. And, no, it can’t be just fluff or padding. One forum had this quote, which is pointing me in the direction I will continue investigating. “There is no content on the web, not even peer reviewed articles, that are 100% unique. The uniqueness or the originality of content lies in your ability to add some information or value to what others have done.”


An excellent page on search and the quality of unique:

Update: April 6, 2015. Does the word unique mean the same thing to Google and Copyscape? I don’t know. I will be looking into that.

