Content theft and other dastardly deeds
Everyone understands the idea of theft. If you take something that doesn’t belong to you, it’s stealing. This includes intellectual property, such as written content. If you didn’t write it, it doesn’t belong to you and you can’t use it without permission.
Easy, right? Well, apparently, it’s not such a clear idea any longer.
The Internet has made content theft simple and pervasive. From taking music and artwork to the wholesale heisting of entire Web sites, theft happens all the time.
One of the most common forms of content theft is the stealing of blog content through a technique known as “feed scraping.”
Every blog, including this one, publishes a “feed.” The idea behind feeds is to syndicate your content so that it gets wider circulation. You can, for example, subscribe to the Direct Creative feed here. When I publish a new item, you either get it e-mailed to you or it shows up in whatever feed-reading program you choose, including many popular browsers.
But what some people do is take this feed and republish it on their own site, usually as a fast, easy way to add content that attracts traffic for their own ads and affiliate links.
Why am I talking about this? Because my content is stolen frequently. At least one site I’ve seen is made up of nothing but my articles and a bunch of Google AdSense ads. Recently, I found a site purportedly on direct mail that had republished about a dozen of my blog posts with no permission, no byline, and no links to my original posts.
I looked up the contact information with a WHOIS search, and e-mailed the site administrator. I told him to stop stealing my content. To his credit, he responded and was nice about it, but maintained that he wasn’t stealing anything.
I’m sorry that you feel I’m stealing your content, but I’m simply using a common feed aggregator to beef up content for some of my blogs.
“Beef up.” Translated: I don’t want to write my own content. I’d rather steal it.
Why is this important for you? If you have a business blog, post good content, and have a fair amount of traffic, it’s likely that your content is being stolen. This is not only irritating, unethical, and illegal, it can actually hurt your blog. Republished copies of your articles create duplicate Web content that can siphon off traffic and lower your placement in search engine results.
And just in case you’re not convinced that this sort of thing is wrong, here’s what Jonathan Bailey from Plagiarism Today has to say about it:
… scraping is a classic example of what is not fair use. It takes the whole work, reproduces it, usually for commercial purposes and often without attribution, while offering no commentary, criticism or educational value. It also significantly damages the market for the work by creating a duplicate version of it.
Is there anything you can do to stop content theft? Yes.
- You can publish only excerpts in your feed instead of full content. This is an option in FeedBurner and other feed services. This makes your feed less useful, but knocks out most feed scrapers.
- You can add a copyright notice or Creative Commons license to your feed that specifies what rights you grant to others. Again, some feed services such as FeedBurner let you insert some kind of copyright notice automatically.
- You can install a blog plugin, such as CopyFeed for WordPress, that will insert a configurable copyright warning and set up a digital fingerprint that lets you search the Web for content theft. I’ve just installed this and will see how it works.
- You can also do manual searches for the post titles on your blog to discover who is lifting your content. Or you can set up Google Alerts to automatically search for and report on keywords, post titles, or your blog address.
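If a search turns up a suspicious page, you can get a rough idea of how much of your post it reuses. Here is a minimal Python sketch, assuming you have already saved both your post and the suspect page as plain text. The function names and the 50 percent threshold are my own illustrative choices, not how CopyFeed or any other plugin actually works. It breaks each text into overlapping five-word sequences and reports what fraction of your sequences turn up on the other page.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    # If the text has fewer than `size` words, the range is empty and so is the set.
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0  # too short to compare meaningfully
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)


def looks_scraped(original, suspect, threshold=0.5, size=5):
    """Hypothetical rule of thumb: flag pages that reuse half or more of the post."""
    return copied_fraction(original, suspect, size) >= threshold


if __name__ == "__main__":
    my_post = "The Internet has made content theft simple and pervasive and it happens all the time"
    their_page = "Welcome to my site. " + my_post + " Click my ads."
    print(copied_fraction(my_post, their_page))
    print(looks_scraped(my_post, their_page))
```

A high score only tells you the wording matches. It can’t tell you whether the other site had permission, so treat it as a reason to look closer, not as proof.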
And what do you do when you find a thief? It depends on how miffed you are.
I suggest that you contact the Web site owner and either ask him or her to remove the content or comply with whatever reprint conditions you want. I generally want people to ask me for permission, give me a byline with the article, and provide a clickable link back to my site or to the original content.
There’s a lot of debate on whether or how much duplicate content hurts you. I’d rather not have it, but my articles are all over the Web, so that cat’s out of the bag for me. I usually fare pretty well in the search rankings and my site has been around for many years, so I’m probably not penalized too much. But if your site is new, it could be a different story.
This is just another business issue you have to deal with since the Web is such an important part of marketing now and since blogs are one of the primary tools of the trade.
Have you experienced content theft? What did you do about it? I want to hear what you have to say on this issue.
Update: The guy who scraped my content and said he wasn’t stealing just e-mailed me to say that since he sees scraped content all over the place, he really didn’t know it was wrong. He gave a detailed explanation as his apology … and I believe him. He also made good by adding bylines and links on his site.
A lot of theft is dastardly, but perhaps a lot of it is due to ignorance of copyright and just not taking the time to think about it. Now I feel bad for being terse with the guy, but in the end all is well.
Comments
7 Responses to “Content theft and other dastardly deeds”
Interesting post. Sorry you had to go through that bad experience.
Cynthia:
It wasn’t really a bad experience. I find my posts stolen frequently, but at least the one guy was willing to talk about it and change the way he used feed content. Not everyone is that nice, of course. I generally let it slide if there are links back to me.
Your post opened my eyes to the inherent risks of raising our web visibility. Thank you for sharing it.
Interesting post Dean – I think I was almost sad to see that apparently no one considers my posts good enough to steal! I guess the upside here is that – you have one heckofa good blog! Thanks for sharing your story and the extra tips. John
John:
Don’t be so sure. Do a Google search for the titles of some of your posts and you just might find them posted in places you never expected.
One of the best ways to combat this is to use a little “digital judo”.
With WordPress (and a few other popular blogging platforms) you can create a specific “excerpt” for each post. For a copywriter or direct response marketer, publishing a handcrafted excerpt (ad) in the feed is a gold mine.
For each post create an excerpt that offers a little information on the post and sells whoever is reading it on coming to read the post. Then salt in a few links to the post, the tags and categories for the post, and a related post or two.
It can take some practice getting all of that into a short blurb about your post, but the payoff if you get widely syndicated, as you do, can be large.
Of course, if someone is really scraping, they are taking your content from the pages, not the feed, and this doesn’t help with them.
You are right. The unfortunate truth is that your content IS leaving your site. RSS feeds are one way that is happening.
People innocently copy and paste content as well. A product like Tynt’s Tracer can tell you what is being copied.
Dane’s suggestion is great regarding RSS feeds because it achieves that main goal and that is to get traffic to YOUR site.
Trevor
Tynt.com
Do you know what is being copied from your site?