Sunday, August 17, 2008

Stolen Content

By Leigh

This post has absolutely nothing to do with weaving, spinning, dyeing, or knitting. But it is the reason why I haven't gotten much accomplished in the past several days.

The other day I checked my free StatCounter account, and was browsing visitor paths. This is often interesting as well as useful information. Imagine my surprise when I clicked on one referring link, only to see this (click on the picture for a closer look) -

Yup. It's my entire Computer Hex Code Dyeing 2: Wrestling With Recipes on another website, stolen along with about 4 or 5 other of my posts.

I took a look around this site, and quickly figured out that almost all of its content was stolen from other blogs. In other words, it is a "splog" (spam blog). A number of the stolen posts showed copyright information and symbols, but these were stolen along with the text and content.

The question is, what to do? Evidently this is a fast-growing problem, much to the frustration of folks like us, who respect one another's intellectual property and just want to have ours respected in return. Is it possible to fight back? Well, here's what I've been doing about it:

First I did a whois search at Network Solutions, to try to find someone to complain to....

This gave me the administrative and technical contacts, whom I emailed, asking them to remove my content. I received no response, and my content was not removed.
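For readers who would rather script the lookup than use a web form, here is a minimal Python sketch of the same idea. It assumes the standard WHOIS protocol (a plain-text query sent over TCP port 43) and uses `whois.verisign-grs.com`, the registry server for .com and .net names, as the example server; other domain endings use other servers. The function names and the sample record are made up for illustration, and many registrations hide contact emails behind privacy services, so real results may be sparse.

```python
import re
import socket

# Loose pattern for email addresses; good enough for skimming a WHOIS reply.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def whois_query(domain, server="whois.verisign-grs.com", port=43, timeout=10):
    """Send a WHOIS query and return the raw text reply (needs network access)."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_emails(whois_text):
    """Return the unique email addresses in a WHOIS reply, in order of appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(whois_text):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen


# Hypothetical record, only to show the parsing step without going online.
SAMPLE_RECORD = """\
Domain Name: EXAMPLE-SPLOG.COM
Administrative Contact Email: admin@example-splog.com
Technical Contact Email: tech@example-host.net
Registrar Abuse Contact Email: abuse@example-registrar.com
Admin Email: ADMIN@example-splog.com
"""

if __name__ == "__main__":
    print(extract_emails(SAMPLE_RECORD))
```

To try it on a live site, pass the text returned by `whois_query("some-domain.com")` to `extract_emails`. The abuse address, when one is listed, is usually the most useful contact.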

My next step was to contact the blog host to report the theft. Toward the bottom of the whois information is an IP address link -

From there, I found this -
This tells me who is hosting the offending website, and how to contact them with abuse problems. After emailing their abuse department, I received a quick response with instructions on what to document and where to send it. By law, they will have to remove it, unless the offender offers a counter claim. Then it's left up to the two parties to duke it out.

In the meantime, I have discovered Who Is Hosting This? You can type any URL into the search box and are immediately taken to that website's host. If the offender doesn't remove the stolen content upon your request, the host is the next one to contact.

Another possibility is to contact Google's AdSense, if that's what the splog was set up to take advantage of. Complete instructions on how to do that can be found in "What to do if you're getting as sick as I am of having your blog copied" by Ian in Hamburg.

In researching all this, I learned quite a bit. Some folks honestly don't understand copyright. Others, like sploggers, do it intentionally, without regard to copyright. Sploggers often use hacking tools such as 1-More Scanner or Site Import for Dreamweaver, which find and copy content from all over the Internet. It appears that in my case, the scraping (content stealing) software searched the Internet for text with certain keywords. The keyword in my posts was "recipe" (as in dye recipes), which is the category my posts were filed under on that website.

The only reason I found out about it is that someone clicked on one of the links in that post, which brought them back to me. I have been purposely including links to my own blog in every post I publish, for that very reason. Thieves and their hacking software usually just copy and paste, which leaves links intact (though I have learned that some hacking software can ignore the links).
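If you have saved a copy of a suspicious page, a few lines of Python can confirm whether your links survived the copying. This is only a sketch using Python's standard library; the class and function names are made up for illustration, and the sample page is invented.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_back_to(html_text, my_host):
    """Return the links in html_text whose host name is my_host."""
    collector = LinkCollector()
    collector.feed(html_text)
    return [link for link in collector.links
            if urlparse(link).hostname == my_host]


# Invented example of a scraped page that kept one link back to this blog.
SCRAPED_PAGE = """
<h1>Dye Recipes</h1>
<p>By <a href="http://leighsfiberjournal.blogspot.com/">Leigh</a></p>
<p>See also <a href="http://example.com/other">this page</a>.</p>
"""

if __name__ == "__main__":
    print(links_back_to(SCRAPED_PAGE, "leighsfiberjournal.blogspot.com"))
```

An empty result on a page that clearly contains your text suggests the scraper stripped the links, which is when a plain-text URL in the post itself earns its keep.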

While links back to yourself might alert a reader that the content has been stolen, they don't necessarily let you know about it. Another helpful online tool is Copyscape, which searches for duplicate pages on the Internet. They allow a couple of free searches a day, or unlimited searches with a subscription.
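For the curious, here is a rough sketch of how duplicate text can be detected in general. It is not a description of how Copyscape works internally, which I don't know. It uses a common textbook approach: break each text into overlapping runs of five words ("shingles") and measure how many runs the two texts share. The function names and the five-word size are arbitrary choices for illustration.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=5):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    original = ("The other day I checked my free StatCounter account "
                "and was browsing visitor paths")
    copied = "Great recipes here " + original
    unrelated = ("Knitting socks on double pointed needles "
                 "takes patience and good light")
    print(similarity(original, copied))     # high: mostly the same word runs
    print(similarity(original, unrelated))  # no shared five-word runs
```

A score near 1.0 means the two texts are nearly identical, while unrelated writing scores at or near zero. Anything in between needs a human look.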

What else am I doing? Since most theft takes place from feed readers, there are two options.
  1. Change the settings so that only a summary of the post is seen in the reader. This may be a minor nuisance for folks who like to read entire posts in their reader of choice, but it allows only the first paragraph or so of a post to be scraped. For Blogger, click on "Settings" then "Site Feed." Look for "Allow Blog Feeds" and choose "short". Then save settings.
  2. Leave the blog feed setting at "full" but put a copyright notice in the post feed footer at the bottom of the site feed settings page. This notice will show up at the end of each post in the feed reader, which means it will be scraped too.
For the moment, I've chosen to do #2, being sure to include my blog URL, http://leighsfiberjournal.blogspot.com/ (in addition to my blog name, Leigh's Fiber Journal) in the feed footer. This is in case the scraper copies the text without its links.

You also probably noticed that I added "By Leigh" at the top of the post, and a posted date and "by" at the bottom. Both link back to my blog.

While there is no way to stop content theft, we can all take measures to protect our digital property. Here are some articles that have been helpful to me, so I'm passing them on to you:

5 Content Theft Myths & Why They Are False by Jonathan Bailey
The 6 Steps to Stop Content Theft by Jonathan Bailey
Fighting Scrapers With Your Left Jab by Darren Rowse
People Stealing My Content - Blah Blah Blah by TheLostGirl
How to deter thieves from stealing your images and server bandwidth by David Airey

If you do find that some of your content has been stolen:
What Do You Do When Someone Steals Your Content by Lorelle

Since most scraping takes place pretty quickly (sometimes almost instantaneously) after publication, adding all this to back posts may or may not help. I'm still trying to decide whether or not to add a copyright or watermark to all my photos. More bother.

Anyway, that's my tale of woe. I hope all of this will be useful to you!

Posted 17 Aug 2008 by http://leighsfiberjournal.blogspot.com

Related Posts:
Update on Stolen Content (& a little more info)
A Note About Watermarks

25 comments:

  1. thank you for all the wonderful information that you have shared.

    Such a pity that your content has been stolen - it's ridiculous.
    and so sorry that you have had to go through all the hassle of trying to get it removed.
    sigh. internets.

  2. Thanks so much for this, Leigh. It has happened to me a couple of times (the label "colour" seems to attract this behaviour) but I had no idea there was anything I could do about it. I think the copyright statement is a v sensible first step.

  3. Yes, a copyright statement is good, but please include it with each post and include your blog url! Even better, have it link back to you!

  4. Thanks for mentioning my hotlinking resource, and good luck getting this sorted (I'm another who has fallen victim to many splogs).

    All the very best.

  5. RRRRRRRRRRRrr this makes me so mad! People!
    At least you know your stuff is so good others want it......

  6. It's like a new twist on identity theft and oh so frustrating! I'm glad you've chosen the second option for now since I'm a feed reader. I don't know how else to be alerted when the blogs I read are updated. Now to go read all that stuff you provided~

  7. Sorry about your blog content theft problem. In your explanation I found out why some blogs only have a summary, making a reader open the actual blog. (If I can't read them as is they usually don't get read).
    I do want to thank you for the bar code generator site. I will be using it a lot.
    Maggie

  8. Wow, what a pain in the backside to have to go thru. Good info to have, thanks for doing your usual excellent job of documenting things! :-) T.

  9. Unbelievable!!!!! Something I would never have expected, but then if it's something you would never think of doing, it doesn't enter your mind that someone else would.
    Thanks for sharing this with us and I'm so sorry this happened to you.

  10. I'm bemused by this, what an absurd thing to do, building pages out of random stolen content, why do they do it?

    The internet is fundamentally an anarchic space; that's both its greatest strength and greatest flaw.

    I'm glad you tracked down a US company that you could potentially take action against. Thanks for sharing all your research. I don't suppose any of us is safe against this.

  11. Thanks for the information, Leigh. I believe the same thing is happening to me, as some of the referral URLs I am getting have absolutely nothing to do with fibers ("credit reports" is the only one I can repeat in polite company.)

    Once I start posting patterns, I might just stick to Ravelry, hoping that has better control. Watermarking photos also seems to be in my future. Ugh.

  12. Leigh, I know there is really nothing funny about this and I too appreciate your research and will make use of your findings, still, it appeared on a FOOD recipe site! I'm sorry, but I think that is absolutely hilarious! Are we still friends.......I hope? I may have a question or two for you later as I don't know that I am totally clear on a couple of points, but then I have only read through all of this quickly. Again, thank you!

  13. Wow! I would have never guessed that someone could do all that.

  14. Though I certainly think this kind of thing is terrible, I think that the real danger comes not from sploggers but from "fellow" fiber folks who decide to copy our content and create a workshop for which they get paid or, if they are really even bolder, use it in a publication as their own work.

  15. that's annoying - having to waste time on figuring out how to protect your thoughts and ideas - instead of using that time on weaving or spinning etc.! but Peg is not alone - I too had a grin on my face when I imagined how readers of a cooking blog might react when they read about your colour experiments:)) I don't understand either where the sense is in "splogging" - if you add texts at random - who is going to read that waffle in the end? it really takes the fun out of blogging and publishing patterns for free, if you have to fight tooth and nail not to be copied willy-nilly:( I hope this doesn't annoy you so much that you stop blogging about your very interesting tests and experiments!

  16. Not only is it theft, it's dangerous to have the formula you've given for dyeing yarn on a food site!

  17. If you look around that site, it doesn't take long to realize that the posts don't fit the categories, giving the whole thing an oddly disjointed atmosphere. That alone should alert someone to the fact that something is wrong with it and to get the heck out. Evidently they do it for a couple of reasons.

    1) to make money off of Google AdSense but are too lazy to write their own content
    2) to infiltrate with links for supposed prescription drugs (which are usually not pure) or porn

    Peg is right that this is only one kind of theft. But this is what happens when folks have no respect for one another.

  18. I looked up one of the sites that filched a colour post I did a few weeks back and have started to track them down... and have also added the date, my name and URL to today's post!

  19. Great work researching this and laying it all out so clearly - and thanks for linking to me.

  20. Thank you for all the info. I check your blog for fun things to learn and this is so important to all of us who blog.

  21. I read your blog, here in Ireland, as it is so informative and interesting. Your adventures in weaving, and the careful way you document them are great to read. I think it's dreadful that there are people out there who steal your work and pass it off as their own. Congrats on trying to get to the bottom of it all, and best of luck in the future

  22. I thought your post was really interesting! I am definitely checking out some of the links you posted. I also wanted to recommend using Google Alerts: http://www.google.com/alerts

    What it does is send you an email every time a set of particular key words, that you choose, are used on the internet. I use it with my site to see if anybody has mentioned my yarn. Although, I can also see it as a helpful way to prevent theft of content.

  23. holey moley, I had NO idea this sort of thing was going on!!!

    (goes off to investigate own blog stuff....)

  24. Good grief, Leigh! What an ordeal! I hope this problem is resolved now. Thanks for the info - I hope I'll never need to know it.

  25. Hi Leigh,

    thanks for the Ideas and Tips!!

    I'll try it out. There are a lot of thieves in the Open Source Software area.

    I'm trying to find a way to keep Open Software --> Open. I will not in the long run contribute to the Open Source Software if my work is stolen and not recognized.

    over

