How to Block GPTBot from Scraping Your WordPress Site
OpenAI's GPTBot is actively crawling the web to train future language models. If you want to protect your WordPress content from being used without permission, here's how to block it effectively.
Why Block GPTBot?
GPTBot collects content from websites to train AI models. While this helps improve AI capabilities, you may want to:
Method 1: Using robots.txt
The simplest way is to add a rule for GPTBot to your robots.txt file:
User-agent: GPTBot
Disallow: /

However, this relies on the crawler respecting robots.txt, which isn't always guaranteed.
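Before relying on the rule, it can help to confirm that it parses the way you expect. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `is_allowed` helper and the `example.com` URL are illustrative assumptions, not part of any plugin or of WordPress itself. It only checks how a compliant parser reads the rules and says nothing about whether a given crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# The same two lines shown above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/sample-post/"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", page))
```

With these rules, GPTBot is denied everywhere while crawlers that have no matching entry, such as search engines, remain unaffected.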
Method 2: Using AI Crawler Guard Plugin
For WordPress sites, the AI Crawler Guard plugin provides:
Method 3: Server-Level Blocking
If you have server access, you can block GPTBot at the web server level, using .htaccess (Apache) or the nginx configuration to reject requests whose User-Agent header identifies GPTBot.
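As a rough sketch of the Apache variant, the fragment below could be placed in the site's .htaccess file. It assumes `mod_rewrite` is enabled (which is typical for WordPress installs that use pretty permalinks) and that matching the substring `GPTBot` in the User-Agent header is sufficient for your purposes; adjust the pattern if your logs show otherwise.

```apache
# Sketch only: return 403 Forbidden to any request whose
# User-Agent header contains "GPTBot" (case-insensitive).
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} GPTBot [NC]
RewriteRule .* - [F,L]
</IfModule>
```

Unlike robots.txt, this rule is enforced by the server rather than left to the crawler, but it still depends on the crawler sending an honest User-Agent header. On nginx, a comparable rule would test `$http_user_agent` against the same pattern and return 403.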
Best Practices
1. **Monitor First**: Before blocking, monitor what GPTBot is accessing
2. **Test Social Previews**: Ensure blocking doesn't break Facebook/Twitter previews
3. **Check Analytics**: Verify legitimate traffic isn't affected
4. **Document Your Choice**: Keep records of your blocking decisions
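For the "Monitor First" step, one option is to count GPTBot requests per path in your access log. The sketch below assumes the common Apache/nginx "combined" log format; the `gptbot_paths` function and the sample log lines are invented for illustration, so adapt the pattern to whatever format your server actually writes.

```python
import re
from collections import Counter

# Assumed combined log format:
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def gptbot_paths(lines):
    """Count requests per path whose user agent mentions GPTBot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "gptbot" in match.group("agent").lower():
            counts[match.group("path")] += 1
    return counts


# Made-up sample lines; in practice, iterate over your real access log file.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /blog/post-1/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /blog/post-1/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /about/ HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "GET /feed/ HTTP/1.1" '
    '200 900 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
]

if __name__ == "__main__":
    for path, hits in gptbot_paths(SAMPLE_LOG).most_common():
        print(hits, path)
```

Running this against a few days of real logs gives a quick picture of which sections of the site the crawler visits most before you decide how broadly to block it.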
Conclusion
Blocking GPTBot is straightforward with the right tools. Choose the method that fits your technical expertise and site requirements.
Written by
AI Crawler Guard Team