You can prevent OpenAI's crawlers from accessing your website and using your content as training data for ChatGPT. This article explains how.
What is GPTBot?
GPTBot is OpenAI’s web crawler, specifically created to collect data from various websites. It accesses and reads the content on these websites, and the gathered information is utilized for training OpenAI’s GPT models.

Therefore, unless you take action, GPTBot can access your website and use its content for training purposes, which effectively copies your content into the training data.
How to Restrict GPTBot From Accessing Your Website?
By preventing GPTBot from accessing your website, you also keep your content from being collected to train OpenAI’s GPT models, including those behind ChatGPT.
To do this, you need to modify the robots.txt file on your website.
GPTBot can be identified by the following user agent token and full user-agent string.
User agent token | Full user-agent string |
---|---|
GPTBot | Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot) |
To completely prevent GPTBot from accessing your website, you can modify your site’s robots.txt file with the following:
User-agent: GPTBot
Disallow: /
Restrict Access Partially
To restrict access only partially, specify in the robots.txt file which directories GPTBot may and may not crawl:
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
When you set up the robots.txt file on your website in this way, you instruct OpenAI’s GPTBot crawler not to crawl the disallowed content on your site. This helps you protect the content on your website.
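Before deploying rules like these, it can help to check how a parser reads them. The following is a minimal sketch using Python's standard `urllib.robotparser`; the domain `example.com` and the helper name `gptbot_can_fetch` are placeholders chosen for illustration, and the result only shows how this parser interprets the file, not how GPTBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The partial-access rules from this article (example.com is a placeholder domain).
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""


def gptbot_can_fetch(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets the GPTBot token fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", url)


if __name__ == "__main__":
    for path in ("/directory-1/page.html", "/directory-2/page.html", "/about.html"):
        print(path, gptbot_can_fetch("https://example.com" + path))
```

Note that a path not mentioned in any rule, such as `/about.html` here, stays crawlable: robots.txt permits everything that is not explicitly disallowed.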
IP Addresses That OpenAI Uses When Accessing Websites
OpenAI’s calls to websites will be made from the following IP address ranges:
IPv4 Prefix |
---|
20.15.240.64/28 |
20.15.240.80/28 |
20.15.240.96/28 |
20.15.240.176/28 |
20.15.241.0/28 |
20.15.242.128/28 |
20.15.242.144/28 |
20.15.242.192/28 |
40.83.2.64/28 |
20.9.164.0/24 |
52.230.152.0/24 |
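If you want to check whether a request in your server logs came from one of these ranges, a sketch along the following lines may help. It uses Python's standard `ipaddress` module; the function name `is_listed_gptbot_ip` is hypothetical, and the prefix list is copied from the table above, so it is only as current as this article.

```python
import ipaddress

# IPv4 prefixes copied from the table in this article; the list may change over time.
GPTBOT_PREFIXES = [
    "20.15.240.64/28", "20.15.240.80/28", "20.15.240.96/28",
    "20.15.240.176/28", "20.15.241.0/28", "20.15.242.128/28",
    "20.15.242.144/28", "20.15.242.192/28", "40.83.2.64/28",
    "20.9.164.0/24", "52.230.152.0/24",
]
GPTBOT_NETWORKS = [ipaddress.ip_network(prefix) for prefix in GPTBOT_PREFIXES]


def is_listed_gptbot_ip(address: str) -> bool:
    """Return True if address falls inside one of the listed prefixes."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in GPTBOT_NETWORKS)


if __name__ == "__main__":
    for candidate in ("20.15.240.70", "52.230.152.200", "8.8.8.8"):
        print(candidate, is_listed_gptbot_ip(candidate))
```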
ChatGPT-User
ChatGPT doesn’t typically have direct internet access, but with ChatGPT Plugins it can call third-party applications, and these applications do have internet access.
The user-agent ChatGPT-User is specifically assigned to the plugins within ChatGPT. It is used only to carry out direct actions on behalf of ChatGPT users and is not used for automated web crawling.
User agent token | Full user-agent string |
---|---|
ChatGPT-User | Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot |
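To see which of the two OpenAI user agents appears in your access logs, you could look for the tokens in the `User-Agent` header. The sketch below is one simple way to do that; the function name `openai_agent_token` is hypothetical, and since a user-agent string can be spoofed, a match is a hint rather than proof of origin.

```python
from typing import Optional

# The two user agent tokens described in this article.
OPENAI_TOKENS = ("GPTBot", "ChatGPT-User")


def openai_agent_token(user_agent: str) -> Optional[str]:
    """Return the OpenAI token found in a User-Agent header, or None."""
    lowered = user_agent.lower()
    for token in OPENAI_TOKENS:
        if token.lower() in lowered:
            return token
    return None


if __name__ == "__main__":
    sample = (
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
        "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
    )
    print(openai_agent_token(sample))
```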
How to Prevent ChatGPT From Accessing Your Website
If you want to prevent ChatGPT from accessing your site and copying your content, you can apply the same kind of changes to the robots.txt file for ChatGPT’s own user agent, just as for the GPTBot crawler.
To prevent plugins from accessing your site, you can explicitly add ChatGPT-User to your site’s robots.txt file:
User-agent: ChatGPT-User
Disallow: /
To allow plugins to access only specific parts of your site, you can add ChatGPT-User to your site’s robots.txt as follows:
User-agent: ChatGPT-User
Disallow: /
Allow: /directory-1/
Allow: /directory-2/
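A group that blocks everything except a few directories relies on `Disallow: /` combined with more specific `Allow` lines; a `Disallow:` line with an empty value blocks nothing. The sketch below shows how such a group is evaluated under the longest-match rule described in RFC 9309. It assumes the crawler follows that rule, ignores wildcards, and uses a hypothetical helper `is_allowed` rather than any real library API.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, path prefix), directive is "allow" or "disallow"


def is_allowed(path: str, rules: List[Rule]) -> bool:
    """Longest-match evaluation of one robots.txt group (no wildcard support)."""
    best_length = -1
    allowed = True  # anything no rule matches is allowed by default
    for directive, prefix in rules:
        if not prefix:
            continue  # an empty value matches nothing
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # The longest matching prefix wins; on a tie, allow wins.
        if len(prefix) > best_length or (len(prefix) == best_length and is_allow):
            best_length = len(prefix)
            allowed = is_allow
    return allowed


# Block everything, then open up two directories.
RESTRICTED = [("disallow", "/"), ("allow", "/directory-1/"), ("allow", "/directory-2/")]
# Same Allow lines, but with an empty Disallow value.
EMPTY_DISALLOW = [("disallow", ""), ("allow", "/directory-1/"), ("allow", "/directory-2/")]

if __name__ == "__main__":
    for path in ("/directory-1/page.html", "/private/page.html"):
        print(path, is_allowed(path, RESTRICTED), is_allowed(path, EMPTY_DISALLOW))
```

With `Disallow: /`, only the two listed directories remain reachable; with the empty value, `/private/page.html` is reachable as well.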
Calls to websites from ChatGPT-User will be made from the 23.98.142.176/28 IP address block.
OpenAI has two different user agents: GPTBot and ChatGPT-User. They are identified by separate user agent tokens, so a robots.txt rule written for one does not apply to the other. To restrict both, add a separate group for each user agent, as shown in the sections above.