In the age of generative AI, publishers and content creators face an unprecedented challenge: your work can be scraped, indexed, and used to train large language models (LLMs) without your consent or compensation. This guide offers a quick overview of your rights, your options, and your tools to take back control.

1. Understanding the Risk

AI models are often trained on publicly accessible content scraped from the web. This includes:

  • Articles, blogs, and editorials
  • Reviews, recipes, and guides
  • Metadata (headings, categories, tags)
  • Images and alt-text

The impact? Your words may be reproduced without context or attribution, and without any traffic returning to your site.

2. What Does the Law Say?

EU Directive on Copyright (DSM Directive)

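  • Article 3 allows text and data mining (TDM) for scientific research by research organisations and cultural heritage institutions.
  • Article 4 extends TDM to any purpose, including commercial use, for content that is lawfully accessible.
  • But: You can opt out of the Article 4 exception by expressly reserving your rights. For content published online, this should be done by machine-readable means such as robots.txt, ideally backed by a clear statement in your Terms of Use.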

Key Legal Concepts:

  • Moral Rights: Your right to attribution and integrity of your work.
  • Economic Rights: Control over reproduction and distribution.
  • Database Protection: In the EU, structured collections of articles may be protected under the sui generis database right (Directive 96/9/EC).

3. How to Opt Out of AI Crawling

A. Update Your robots.txt

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
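
Before publishing rules like these, it can help to check that they do what you expect. The snippet below is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the example.com URLs are placeholders chosen for illustration, not part of any official tool. Keep in mind that robots.txt is a voluntary convention: it only stops crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# The opt-out rules from this section, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of downloading them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Google-Extended", "Googlebot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/article")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With these rules, GPTBot and Google-Extended are reported as blocked, while Googlebot, which has no matching rule, remains allowed.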
  

B. Add a Legal Disclaimer

Include this in your footer or legal notice:

“Content on this site may not be used for the development or training of AI systems, machine learning models, or any automated systems without explicit permission.”

4. Tools and Actions You Can Take

  • Monitor your server logs for unusual crawler behavior.
  • Use services like Cloudflare or server firewalls to block abusive bots.
  • Join industry groups (like EATW!) to push for collective enforcement.
  • Report misuse to data protection authorities or copyright bodies.
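
As a starting point for log monitoring, the sketch below counts requests from known AI crawlers. It assumes your server writes the common "combined" access log format, where the user agent is the last quoted field on each line. The token list, the function name count_ai_crawler_hits, and the sample log lines are all illustrative assumptions; adapt them to your own setup. Note that Google-Extended is a robots.txt control token and, as far as Google's documentation describes it, does not appear as a separate user agent in logs.

```python
import re
from collections import Counter

# Assumption: user-agent substrings to watch for. GPTBot comes from the
# robots.txt example in this guide; CCBot (Common Crawl) is added as a
# further illustration. Extend the tuple as needed.
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot")

# In the combined log format, the user agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

# Made-up sample lines for illustration only (documentation IP ranges).
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /guide HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '203.0.113.5 - - [10/Oct/2024:13:55:40 +0000] "GET /recipes HTTP/1.1" 200 1822 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '198.51.100.7 - - [10/Oct/2024:14:02:11 +0000] "GET / HTTP/1.1" 200 5120 "-" "CCBot/2.0 (https://commoncrawl.org/faq/)"',
    '192.0.2.44 - - [10/Oct/2024:14:05:00 +0000] "GET /about HTTP/1.1" 200 900 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]


def count_ai_crawler_hits(log_lines):
    """Count requests per watched crawler token in an iterable of log lines."""
    hits = Counter()
    for line in log_lines:
        match = USER_AGENT_RE.search(line)
        if not match:
            continue  # Line does not end with a quoted user-agent field.
        user_agent = match.group(1).lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in user_agent:
                hits[token] += 1
    return hits


if __name__ == "__main__":
    for token, count in count_ai_crawler_hits(SAMPLE_LOG).items():
        print(f"{token}: {count} request(s)")
```

To run it against a real log, pass an open file object instead of SAMPLE_LOG. User-agent strings can be spoofed, so treat the counts as a signal to investigate rather than as proof.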

5. What About Search Engines?

Crawlers like Googlebot are still essential for visibility. Blocking everything is not recommended. Use selective rules instead:

User-agent: Googlebot
Disallow: /private-directory/
Allow: /
  

Use robots.txt and sitemaps strategically to allow good bots and block exploitative ones.
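
One way to keep these rules consistent is to generate robots.txt from a short list of decisions. The sketch below is only an illustration: the function build_robots_txt, its parameters, and the example.com sitemap URL are hypothetical. It also applies the selective rules to all remaining crawlers ("User-agent: *") instead of naming Googlebot alone, which is a design choice you may want to change.

```python
def build_robots_txt(blocked_agents, private_paths, sitemap_url):
    """Build robots.txt text that blocks some bots and guides the rest.

    Hypothetical helper for illustration:
    - blocked_agents: crawler names that are denied the whole site
    - private_paths: paths hidden from every other crawler
    - sitemap_url: absolute URL of the sitemap to advertise
    """
    lines = []
    for agent in blocked_agents:
        # One group per blocked crawler, closed by a blank line.
        lines += [f"User-agent: {agent}", "Disallow: /", ""]

    # Default group for every crawler not named above.
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in private_paths]
    lines += ["Allow: /", "", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        build_robots_txt(
            blocked_agents=["GPTBot", "Google-Extended"],
            private_paths=["/private-directory/"],
            sitemap_url="https://example.com/sitemap.xml",
        )
    )
```

Generating the file this way keeps the blocklist in one place, so adding a new crawler name is a one-line change.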

6. Moving Forward as a Community

As independent publishers, we have strength in numbers. Share resources, support fair licensing, and educate your peers. The web may be open, but your content isn’t up for grabs.

Contact

European Association of Travel Writers (EATW)
www.eatw.org
For questions or to join our mailing list, contact us via the website above.

Volunteer Your Legal Expertise

Are you a legal thinker, policy strategist, or digital rights advocate who believes authorship still matters? Whether you're fluent in IP law, AI ethics, EU copyright directives, or licensing frameworks, we’d love your help.

Contribute your skills to help EATW develop real-world tools, shape forward-thinking policy, and defend creative freedom across Europe.

Become a Legal Advisor