Generative AI such as Google’s Bard and ChatGPT doesn’t create content from scratch. It repurposes material from original sources.
Valid reasons could prompt a website to prevent AI bots from using its content, including:
- Protect intellectual property. Blocking generative AI could protect unique content, ideas, or products from being copied or reused.
- Misrepresentation. AI answers could misinterpret or misuse content.
- Limited usefulness. AI answers generate little (or no) traffic to the publishers whose content they draw on.
- User control. Blocking AI bots allows creators more control over how and where their content appears online.
Bard represents an added concern: What if Google’s Search Generative Experience uses content for an answer without citing the source, such as your company?
There’s no good solution. I know of no method to prevent Bard from using your content without jeopardizing organic search performance.
Nonetheless, Google recommends two ways to block or control Bard.
Bard uses the same user agent as Google Search when collecting data, so blocking that user agent also stops Googlebot from crawling your site and collecting relevancy signals.
Google-Extended is the company’s solution. It blocks Bard without affecting Google’s index and ranking algorithm.
To use it, add a disallow directive to your site’s robots.txt file, as follows:
User-agent: Google-Extended
Disallow: /
The directive does not stop Google from showing your content in SGE’s answers, with or without citations. Its purpose, per Google, is to prevent SGE from learning from it.
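The robots.txt rule above can be sanity-checked locally before deploying it. The sketch below uses Python's standard urllib.robotparser to confirm that the rule blocks the Google-Extended token while leaving Googlebot unaffected. The helper name can_fetch and the example.com URL are placeholders for illustration, and Python's parser only approximates how Google itself interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The directive from the article: block Google-Extended site-wide.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it relies on Python's own
    robots.txt matching, which may differ from Google's in edge cases.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-page"  # placeholder URL
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, "allowed:", can_fetch(ROBOTS_TXT, agent, page))
```

Running the same check against your live file is a quick way to catch a typo in the user-agent name, which would otherwise leave the rule silently ineffective.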
Use Nosnippet, Max-snippet, or Data-nosnippet
SGE will follow Google-approved meta tags and attributes:
- nosnippet robots meta tag prevents AI answers from showing any part of your content.
- max-snippet robots meta tag allows creators to set the maximum number of characters AI can include from your content.
- data-nosnippet HTML attribute enables creators to designate any text from an HTML page to be excluded from a search snippet.
Images, which in my testing often appear in SGE’s results, cannot be blocked, according to Google.
Blocking Other AI Bots
New generative AI platforms show up seemingly monthly. Blocking all of them will be difficult. We can block or control three non-Google bots:
- Use both nocache and noarchive meta tags to control Bingbot. The tags will not impact Bing’s organic search rankings, although they will disable Google’s cache and prevent Wayback Machine from archiving pages.