Learn how content theft can affect your business and how to prevent your website content from being stolen…
Content theft is an unfortunate reality of having an online presence.
If your website has unique and valuable content, your content may end up being “borrowed” by your competitors without giving you credit or attribution, or just plain “stolen” for various reasons.
Knowing what you can and can’t do to protect your content, and implementing effective ways to prevent content theft, is a challenge but also an important aspect of good content management.
In this lesson, we cover the following:
- How Content Theft Can Affect Your Business
- Content Scraping: What Is It?
- Protecting Your Content – Your Rights
- Preventing Content Theft – Options
How Content Theft Can Affect Your Business
Content theft is a serious problem.
If your website has original, high-quality articles, tutorials, reviews, images or well-written information describing your products or services, this not only makes your content unique and valuable to your site visitors and potential customers, it also makes it appealing to your competitors and to people with nefarious motives.
It is relatively easy to steal content from most websites. You can simply copy and paste the content, right-click to download and save images or other media, take screenshots, etc.
It can be very frustrating to invest a great deal of time, effort, and money into creating unique and valuable content, only to have it stolen by a content scraper and used on another website.
Content Scraping – What Is It?
Content scraping (also called web scraping or website scraping) is the downloading or copying of some or all of a website’s content by another party, usually with automated web scraping bots or tools and often against the website owner’s wishes. The “stolen” content is usually repurposed on another site for malicious purposes (e.g. phishing) or used as filler content on spammy sites for SEO (e.g. to attract users for AdSense clicks).
Although not all content scraping is bad (e.g. think of affiliates using your content to market your products and services), malicious web scraping is a parasitic practice that can affect your business negatively in many ways.
For example, content theft violates your copyrights, steals your organic traffic, and can take up valuable server resources and cost you money (e.g. image hotlinking, which we discuss further below).
Although anyone savvy enough can manually go through and copy and paste the entire contents of a website, website scraper bots can download all of your website’s content in seconds, even if you run a large site such as an e-commerce store with hundreds or thousands of product pages. Some bots can also access and scrape gated content by filling in and submitting forms automatically. An example of this is data or price scraping, where scraper bots target the pricing information of competing businesses so that their operators can undercut rivals and increase their own sales.
While there are tools and methods you can use to throw a few hurdles at content scrapers and scraper bots (we’ll touch on some of these later), there are also websites that teach users how to bypass these anti-scraping tools and methods.
Since anything posted publicly on the Internet can be scraped, including text, images, code, etc., there is really not that much you can do to protect publicly-posted content (other than to not make it public in the first place).
So, how do you protect your website content?
Well, first let’s take a look at what rights content owners have, and then we’ll look at some options for preventing content theft.
Protecting Your Content – Your Rights
Copyright ownership gives you the exclusive right to use your work, with some exceptions. When you create an original work, fixed in a tangible medium, you automatically own the copyright to the work.
Note: With content writing, “who owns the copyright” to content can be confusing and can lead to disagreements.
- Generally, freelance writers own the copyright to their work unless their contract or agreement stipulates that they must transfer the copyright to another person or business.
- Employees generally don’t own the copyright to the work they create as part of their job; their employers do.
- In some jurisdictions, journalists employed by a newspaper or publisher are an exception (while their employer owns the copyright in their articles as published, the journalist retains the copyright in their work for other specified purposes, such as reproduction in a book).
- Another exception is fair use (called fair dealing in some countries), where certain types of uses of your content may be considered fair (e.g. criticism, commentary, news reporting, teaching, scholarship, or research).
- Sometimes, rights to use the work are implied even if agreements to transfer the rights were not signed, except when used for purposes that the content creator did not agree or consent to.
Regarding issues of copyright, always seek advice if you are unsure.
Additionally, even though in accordance with U.S. and European Union laws the original work is automatically copyrighted from the moment of creation and you do not need to display a public copyright notice (“© All rights reserved”) on your website, it’s a good idea to include it (since laws differ in different countries) and to also consider filing a copyright registration to protect your web content if you think your content merits protection.
Registering a copyright with The United States Copyright Office creates a public record of ownership which is stored with the Library of Congress. This can be useful if you ever need to file an infringement lawsuit in court. Note, however, that new content is not automatically added to the copyright registration (a new registration must be filed to indicate that it covers the new materials).
You can file for a copyright registration for your website’s content (original text or images) with The U.S. Copyright Office eCO online system. Refer to the United States Copyright Office Copyright Basics Guide for more information.
The following additional resources will also help to protect your content and intellectual property:
Creative Commons
To understand Creative Commons, it helps to understand how copyright works.
As stated earlier, when you create something original, fixed in a tangible medium, like a photograph, a song, story, or even articles for your website, you automatically own the copyright to the product of your creativity.
This automatic copyright is known as an “All Rights Reserved” copyright. It protects your creativity against uses that you don’t consent to, such as people or other businesses taking, using, and potentially making money off your work.
In some cases, however, an “All Rights Reserved” copyright may be too restrictive. You may want others to use, share, or build upon certain aspects of your work, as this could have benefits for your business (e.g. increased exposure and promotion by allowing others to share and distribute your content) or your industry, or a particular group or community.
This is where Creative Commons licensing comes in.
“Creative Commons licenses give everyone from individual creators to large institutions a standardized way to grant the public permission to use their creative work under copyright law.”
Source: Creative Commons
So, whereas copyright is an “all rights reserved” option in which you hold all rights, a Creative Commons license offers a “some rights reserved” option, allowing for certain uses of your work to occur under specific conditions of your choice.
There are six different Creative Commons license options, ranging from the most to least permissive, where you can give the public permission to share and use your work provided they agree to your conditions. For example, you may allow companies to share your content but not sell it, or re-publish your articles for commercial purposes if they provide credit and attribution, etc.
Creative Commons, then, is not about giving up or replacing copyright (you still own your work), it’s about introducing a more flexible way to manage the rights embodied in copyright by giving you choices about what others can and can’t do with your content.
As mentioned earlier, there are pros to using Creative Commons licenses, such as increased exposure and publicity for your business through the sharing and redistribution of your content. Additionally, Creative Commons licenses are non-exclusive, so you can license the same content under different agreements.
One of the cons of using Creative Commons licenses is that, with almost all of the licenses, you can’t be sure who is using your work or making money from it, and others can use your content without compensating you.
Adding a Creative Commons license to your website is fast, easy, and free. Just visit the website, answer a few questions (e.g. will you allow your work to be used commercially or modified?), and the right type of license will be issued, along with all the elements you need to display it on your site.
Visit the website: Creative Commons
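To make the “answer a few questions” step more concrete, here is a minimal sketch of how answers could map to one of the six licenses. The function name and its parameters are hypothetical and are not part of any Creative Commons tool; the license short names (e.g. CC BY-NC-SA) are the standard ones.

```python
def choose_cc_license(allow_commercial: bool,
                      allow_adaptations: bool,
                      require_share_alike: bool = False) -> str:
    """Return the short name of the Creative Commons license matching the answers.

    All six licenses require attribution (BY). The answers only add
    NC (non-commercial), ND (no derivatives), or SA (share-alike).
    """
    parts = ["BY"]
    if not allow_commercial:
        parts.append("NC")
    if not allow_adaptations:
        parts.append("ND")  # ND and SA never appear together
    elif require_share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)


# Example: allow sharing and remixing, but not commercial use.
print(choose_cc_license(allow_commercial=False, allow_adaptations=True))
```

The official license chooser on the Creative Commons website remains the place to generate the actual license text and badge for your site.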
Now, all this is well and good if you are OK with others using your content and they respect your intellectual property rights.
But, what if others are using your content without permission or attribution? What can you do?
Well, the first step is to find out who is using your content.
Content Detection Tools
The tools below can help you identify sites that are using your content with or without your permission.
Google Alerts
Google Alerts is a free service from Google that lets you keep up-to-date with the latest news about all kinds of topics, stay informed about people and companies, and track what other people are publishing about you and your business online.
This is a useful tool if you have content with unique brand names or keywords. Whereas a Google search can help you uncover sites already using your unique words in their content, Google Alerts keeps monitoring for new content and emails you when it finds new results containing those words.
If you need help setting up Google Alerts, see this tutorial: How To Set Up Google Alerts
Copyscape
Copyscape helps to protect your website, online publications, blog, marketing materials, or any other online content against plagiarism. Simply enter the URL of a page on your site and the free plagiarism checker will look for any copies of your web pages online.
The premium service automatically scans and monitors the Web for copies of your content and notifies you if it detects any plagiarized content. This is also a useful tool if you plan to outsource your content writing or purchase content from freelance writers, as it allows you to check if the content has been previously sold to others or published elsewhere on the web.
Visit the website: Copyscape
Unicheck
Unicheck is plagiarism-checking software. While it is primarily used to detect plagiarism in academic writing, it can help you find instances where others have used your content without attribution.
Visit the website: Unicheck
Grammarly Plagiarism Checker
Grammarly Plagiarism Checker is another useful plagiarism checking tool.
Use this tool not only to search for content that may have been copied from your website but also to ensure that your content writers are creating fresh and original content that is also free of plagiarism.
Visit the site: Grammarly Plagiarism Checker
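To give a rough sense of how text-matching tools can flag copied passages, here is a minimal sketch that compares two texts using overlapping word sequences (“shingles”) and Jaccard similarity. This is a common textbook technique, not a description of how any of the tools above actually work, and the function names and the five-word shingle size are assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(original: str, suspect: str, size: int = 5) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(original, size), shingles(suspect, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


mine = "Content theft is an unfortunate reality of having an online presence"
theirs = mine + " for most businesses"
print(round(similarity(mine, theirs), 2))
```

A score close to 1.0 suggests the suspect text reuses most of your wording, while a score near 0.0 suggests little or no shared phrasing. What counts as “too similar” is a judgment call you would need to tune for your own content.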
TinEye
TinEye is a reverse image search engine that checks if anyone else is using your images (even modified versions).
TinEye also offers a premium service that automatically checks all the locations where your images appear online.
Visit the website: TinEye
DMCA
DMCA is an online service that assists you with getting stolen or plagiarized content removed from sites infringing on your copyright in accordance with the Digital Millennium Copyright Act (DMCA), a U.S. law that, among other things, sets out a notice-and-takedown process for infringing content hosted online.
This service assists you with filing DMCA takedown requests if you find content that violates your rights. Under the DMCA, hosting providers must promptly remove infringing content once properly notified in order to keep their own protection from liability.
Filing a DMCA complaint should be used only as a last resort in situations where the violation can seriously impact your business, as the process of filing a complaint can involve a significant amount of time and effort. For this reason, if you find that your content has been stolen and published on a scammy or spammy site that isn’t ranking high on the search engines, it will probably have very little effect on your website and you can probably just ignore it (or use one of these hotlinking prevention methods if the stolen content includes media files like images, videos, or downloadable files hosted on your site or remotely).
If, however, the violation is significant and/or has taken place on a site that has high visibility, authority, or ranking, then follow the steps below to file a take-down notice.
Filing a DMCA Complaint After Content Theft
Many sites that allow users to upload their own content and web hosts have a DMCA form that you can fill in if you believe there has been a copyright infringement of your content.
For example, Automattic (the company behind WordPress) has a DMCA form that you can fill out and submit on their website.
Many web hosting companies also provide their own forms and processes for submitting claims, as they need to remove content that infringes on intellectual property rights when served with a valid DMCA notice in order to keep their own protection from liability.
As stated earlier, filing a DMCA complaint can be a fairly lengthy and laborious process, so only do this if you feel it’s absolutely necessary to do so.
Before filing a complaint, make sure that you:
- Have sent the website owners or webmaster a polite message asking them to remove your copyrighted material.
- Understand fair use copyright to avoid filing wrongful DMCA claims.
- Have records of instances of copyright infringement and proof of content theft (e.g. screenshots).
- Can provide all the information required to avoid delays.
Once you have done the above, follow the steps below to file a DMCA claim:
- Who is hosting the plagiarized content? Locate and record the IP address of the offending website using these domain tools and then input the IP address into the American Registry for Internet Numbers to find the hosting company.
- Create your DMCA complaint. Your complaint must include information like identification of your original copyrighted work, identification of the infringing material, your contact details, and a physical or electronic signature. If you need proof of infringement, use the Internet Archive to show that the content was published on your site prior to appearing on the offender’s site. If your site is built using WordPress, you can use a plugin like WordProof, which uses blockchain-registered timestamps, to show that you published your content first.
- Submit the DMCA complaint to the offending site’s host or use a DMCA Designated Agent.
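The first two steps above can be sketched in code. This is an illustration under stated assumptions rather than a definitive workflow: the function names are hypothetical, the lookup uses ARIN’s public RDAP address format as an assumed way to query the registry, and the notice wording is only a starting draft (a host’s own form may ask for different fields). Note that if the offending site sits behind a content delivery network, the IP address may point to that network rather than the actual host.

```python
import socket
from urllib.parse import urlparse


def hostname_of(url: str) -> str:
    """Extract the host name from a page URL (the scheme is optional)."""
    parsed = urlparse(url if "://" in url else "//" + url)
    return parsed.hostname or ""


def arin_lookup_url(ip_address: str) -> str:
    """Build an ARIN RDAP query URL for an IP address (assumed URL format)."""
    return f"https://rdap.arin.net/registry/ip/{ip_address}"


def find_host_lookup_url(page_url: str) -> str:
    """Resolve the site's IP address (needs network access) and return the query URL."""
    ip_address = socket.gethostbyname(hostname_of(page_url))
    return arin_lookup_url(ip_address)


def draft_takedown_notice(original_url: str, infringing_url: str,
                          full_name: str, email: str) -> str:
    """Assemble a plain-text draft with the elements a DMCA complaint usually needs."""
    return "\n".join([
        "To the Designated DMCA Agent:",
        "",
        f"My original copyrighted work is published at: {original_url}",
        f"The infringing copy appears at: {infringing_url}",
        "",
        "I have a good faith belief that this use is not authorized by the "
        "copyright owner, its agent, or the law.",
        "The information in this notice is accurate and, under penalty of perjury, "
        "I am the copyright owner or am authorized to act on the owner's behalf.",
        "",
        f"Contact: {full_name} <{email}>",
        f"Electronic signature: /{full_name}/",
    ])
```

Opening the URL returned by `find_host_lookup_url` in a browser should show which organization the IP address is registered to, which is usually the hosting company you need to contact.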
If your claim is successful, you can expect the offending website’s host to remove your stolen content unless the site owner feels that your complaint is unwarranted and issues a counter-notice.
If for any reason, you’re still experiencing issues or problems, use this form to notify Google and request that they remove the infringing content. If your claim is verified, Google will delist the content from its search results (note: you will have to repeat this process for every page of content stolen from your website).
Visit the website: DMCA
Reporting A Violation Or Infringement On Social Media
Content theft and violations or infringements of your copyright aren’t only restricted to content on your website.
See the links below to report violations or infringements of your rights on social media and other platforms:
- Facebook – Use this tool to report violations of copyrights or trademarks on Facebook.
- Amazon – Amazon Brand Registry (ABR) allows you to maintain control over your products and intellectual property on Amazon.
- YouTube – You can report channels on YouTube with content designed to impersonate your business or any other kinds of violations.
- Instagram – Use this form to report accounts on Instagram pretending to be your business.
Preventing Content Theft – Options
So, you have done everything possible to protect your website content, and it’s still not enough. What more can you do to prevent content theft from happening on your website?
Let’s look at some options:
Make Your Pages/Posts Private
Does your content need to be public? If not, consider making the page or post that includes this content “private.” This way, only those who you give access to your content will be able to view it.
If your website was built using WordPress, you can easily make posts or pages private. See this tutorial for more details: How to Protect your Content In WordPress Posts & Pages
Protect Your Content In A Membership Site
A membership site lets you make all of your content or only specific sections of it private. Only registered members can access your valuable content.
If your site is built using WordPress, refer to these tutorials:
Configure Your RSS Feed To Display Post Summaries Only
Scraping software can obtain content from your website’s RSS feed without your permission if the feed displays full articles.
If your site allows you to display post summaries only, we recommend choosing this option, as it will limit the amount of content that can be stolen from your site to the post excerpt or post summary only.
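If your platform has no built-in summary setting, the idea can be sketched as follows: strip the markup from each post and publish only the first few dozen words in the feed. This is a minimal illustration, not how any particular platform implements it. The function name is hypothetical, and the 55-word default is an assumption you can change.

```python
import html
import re


def feed_summary(post_html: str, max_words: int = 55) -> str:
    """Strip markup from a post and return only its first max_words words."""
    text = re.sub(r"<[^>]+>", " ", post_html)  # crude tag removal, fine for a sketch
    words = html.unescape(text).split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " [...]"


post = "<p>Content theft is a <strong>serious</strong> problem for site owners.</p>"
print(feed_summary(post, max_words=5))
```

Anyone scraping a feed built this way gets only the excerpt, and readers still have a reason to click through to the full article on your site.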
Disable Right-Clicking On Your Website
Depending on what kind of platform your website was built with, you may be able to configure your site’s settings to disable right-clicking on users’ web browsers.
For example, if your site runs on WordPress, you can use a plugin like WP Content Copy Protector to disable right-clicking on your website and prevent users from copying your content.
Protect Your Images
Consider adding a watermark to your images. While this may not prevent your images from being stolen, it will let online users know that you are the rightful owner of those images.
Alternatively, consider using these hotlinking prevention methods.
Additional Content Scraping Prevention Solutions
There are sophisticated solutions you can use to detect malicious scraper bots and block them from accessing your website’s content (e.g. DataDome), or to combat scammers, fakes, and frauds copying your brand and stealing your revenue (e.g. RedPoints), but in many cases and for many businesses or websites, these may be overly expensive or unnecessary.
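For a sense of the simplest kind of signal such solutions might look at, here is a minimal sketch of a sliding-window request counter that flags clients requesting pages faster than a human visitor plausibly would. It is not how DataDome or any other product works. The class name, method name, and thresholds are all assumptions, and real bots can evade a check this simple by rotating IP addresses.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flag clients that make more than max_requests requests within window_seconds."""

    def __init__(self, max_requests: int = 60, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def looks_like_scraper(self, client_id: str, now: float) -> bool:
        """Record a request at time `now` (in seconds) and report whether the client is over the limit."""
        hits = self._hits[client_id]
        hits.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests


monitor = RequestRateMonitor(max_requests=3, window_seconds=10)
for second in (0, 1, 2, 3):
    print(second, monitor.looks_like_scraper("203.0.113.5", now=second))
```

In practice you would call a check like this from your web server or application for every request, passing the visitor’s IP address and the current time, and then slow down or block clients that keep getting flagged.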
Prevent Content Theft In WordPress
For useful content protection and content theft-prevention plugins for WordPress users, go here: How To Prevent Content Theft In WordPress
Prevent Image Hotlinking
Many content thieves engage in the practice of hotlinking images, videos, and downloadable files and this can cost you serious money.
Find out what hotlinking is and how to prevent it here: How to Prevent Content Hotlinking
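As a rough illustration of one common approach, hotlink protection often checks the `Referer` header that browsers send with a media request and only serves the file if the request comes from your own pages. The sketch below shows that decision in isolation. The function name and the example host names are hypothetical, and on a real site this rule usually lives in the web server or CDN configuration rather than in application code.

```python
from urllib.parse import urlparse


def allow_media_request(referer: str, allowed_hosts: set) -> bool:
    """Decide whether to serve a media file based on the request's Referer header."""
    if not referer:
        return True  # direct visits and privacy tools often send no Referer
    host = (urlparse(referer).hostname or "").lower()
    return host in allowed_hosts


my_hosts = {"mysite.example", "www.mysite.example"}
print(allow_media_request("https://www.mysite.example/blog/post", my_hosts))
print(allow_media_request("https://spammy.example/copied-post", my_hosts))
```

Allowing requests with no `Referer` is a design choice that avoids blocking legitimate visitors, at the cost of letting some hotlinked requests through.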
Content theft is a serious and growing problem on the web and an unfortunate reality of having an online presence.
While you may not be able to stop someone who is hell-bent on stealing your content from infringing on your intellectual property rights, there are things you can do to prevent those who are less savvy from attempting it. This lesson provides a number of methods to protect your content from being stolen.
If you publish valuable content on your site, make sure to implement as many of the methods covered in this lesson as you can to protect it.
- The United States Copyright Office – The official website of the U.S. Copyright Office. See this site for information on U.S. copyright laws and expert advice on copyright law and policy.
- CreativeCommons.org – This site provides Creative Commons licenses and public domain tools that give you a free, simple, and standardized way to grant copyright permissions for allowing others to copy, distribute, and make use of your works.
- DMCA – Provides content protection and copyright enforcement services worldwide.
- How To Prevent Content Theft In WordPress – Useful content protection and content theft-prevention plugins for WordPress users.
- What Is Content Scraping?
- What Is Fair Use?
- Digital Millennium Copyright Act (DMCA)
- Easy Steps To Protect Your Website From Being Copied
- Proven Methods For Protecting Content From Being Copied
- How To Report Copyright Infringement On Facebook
- How To Protect Your Brand On Amazon
- How To Report A YouTube Channel For Impersonation
- How To Report An Impersonation Account On Instagram
Image: Content Security