Everyone understands the idea of theft. If you take something that doesn’t belong to you, it’s stealing. This includes intellectual property, such as written content. If you didn’t write it, it doesn’t belong to you and you can’t use it without permission.
Easy, right? Well, apparently, it’s not such a clear idea any longer.
The Internet has made content theft simple and pervasive. From taking music and artwork to the wholesale heisting of entire Web sites, theft happens all the time.
One of the most common forms of content theft is the stealing of blog content through a technique known as “feed scraping.”
Every blog, including this one, publishes a “feed.” The idea behind feeds is to syndicate your content so that it gets wider circulation. You can, for example, subscribe to the Direct Creative feed here. When I publish a new item, you either get it e-mailed to you or it shows up in whatever feedreading program you choose, including many popular browsers.
But what some people do is take this feed and republish it on their own site, usually as a fast, easy way to add content that attracts traffic for their own ads and affiliate links.
Why am I talking about this? Because my content is stolen frequently. At least one site I’ve seen is made up of nothing but my articles and a bunch of Google AdSense ads. Recently, I found a site purportedly on direct mail that had republished about a dozen of my blog posts with no permission, no byline, and no links to my original posts.
I looked up the contact information with a WHOIS search, and e-mailed the site administrator. I told him to stop stealing my content. To his credit, he responded and was nice about it, but maintained that he wasn’t stealing anything.
“I’m sorry that you feel I’m stealing your content, but I’m simply using a common feed aggregator to beef up content for some of my blogs.”
“Beef up.” Translated: I don’t want to write my own content. I’d rather steal it.
Why is this important for you? If you have a business blog, post good content, and have a fair amount of traffic, it’s likely that your content is being stolen. This is not only irritating, unethical, and illegal; it can actually hurt your blog. When someone republishes your articles, the duplicate copies can siphon off traffic and lower your placement in search engine results.
And just in case you’re not convinced that this sort of thing is wrong, here’s what Jonathan Bailey from Plagiarism Today has to say about it:
“… scraping is a classic example of what is not fair use. It takes the whole work, reproduces it, usually for commercial purposes and often without attribution, while offering no commentary, criticism or educational value. It also significantly damages the market for the work by creating a duplicate version of it.”
Is there anything you can do to stop content theft? Yes.
- You can publish only excerpts in your feed instead of full content. This is an option in FeedBurner and other feed services. This makes your feed less useful, but knocks out most feed scrapers.
- You can add a copyright notice or Creative Commons license to your feed that specifies what rights you grant to others. Again, some feed services such as FeedBurner let you insert some kind of copyright notice automatically.
- You can install a blog plugin, such as CopyFeed for WordPress, that will insert a configurable copyright warning and set up a digital fingerprint that lets you search the Web for content theft. I’ve just installed this and will see how it works.
- You can also do manual searches for the post titles on your blog to discover who is lifting your content. Or you can set up Google Alerts to automatically search for and report on key words, post titles, or your blog address.
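If you like to tinker, the idea of checking a suspect page against your own post can be sketched in a few lines of Python. This is only a rough illustration, not how CopyFeed or any other plugin actually works. The function names `shingles` and `overlap` and the five-word window are my own assumptions, and you would still need to copy the text of the suspect page yourself.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    # Lowercase and keep only word-like tokens so punctuation and
    # capitalization changes don't hide a copy.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    mine = shingles(original, size)
    if not mine:
        return 0.0
    return len(mine & shingles(suspect, size)) / len(mine)


# Hypothetical example texts, for illustration only.
my_post = "The Internet has made content theft simple and pervasive"
scraped_page = "Buy now! " + my_post + " Click our ads."
print(overlap(my_post, scraped_page))
```

A score near 1 suggests the page contains most of your post word for word, while a score near 0 suggests it doesn't. Where you draw the line in between is a judgment call.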
And what do you do when you find a thief? It depends on how miffed you are.
I suggest that you contact the Web site owner and ask him or her either to remove the content or to comply with whatever reprint conditions you want. I generally want people to ask me for permission, give me a byline with the article, and provide a clickable link back to my site or to the original content.
There’s a lot of debate on whether or how much duplicate content hurts you. I’d rather not have it, but my articles are all over the Web, so that cat’s out of the bag for me. I usually fare pretty well in the search rankings and my site has been around for many years, so I’m probably not penalized too much. But if your site is new, it could be a different story.
This is just another business issue you have to deal with since the Web is such an important part of marketing now and since blogs are one of the primary tools of the trade.
Have you experienced content theft? What did you do about it? I want to hear what you have to say on this issue.
Update: The guy who scraped my content and said he wasn’t stealing just e-mailed me to say that since he sees scraped content all over the place, he really didn’t know it was wrong. He gave a detailed explanation as his apology … and I believe him. He also made good by adding bylines and links on his site.
A lot of theft is dastardly, but perhaps a lot of it is due to ignorance of copyright and just not taking the time to think about it. Now I feel bad for being terse with the guy, but in the end all is well.