Robots.txt Generator
Build & validate a perfect robots.txt for your website — WordPress / Joomla / Shopify presets, AI crawler blocking (GPTBot, Claude, Gemini), URL tester, syntax validator, multiple user-agent blocks & sitemap. All client-side.
How to Use the Robots.txt Generator
- Pick a preset (WordPress, Shopify, "block AI crawlers", etc.) — or start blank.
- Add one or more user-agent blocks. Each block targets a specific crawler (or * for all).
- Inside each block, add Allow and Disallow rules. Use quick rules for common patterns.
- Paste your sitemap URL(s) — typically at the bottom of robots.txt.
- Use the URL Tester to confirm specific URLs are blocked or allowed for a chosen bot.
- Use the Validator to catch syntax errors before going live.
- Download the file, then upload it to the root of your domain at https://example.com/robots.txt.
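The steps above can be sketched in a few lines of Python. This is only an illustration of how user-agent blocks and sitemap URLs might be rendered into a file, not the generator's actual code; the names AgentBlock and build_robots_txt are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentBlock:
    """One user-agent group: a crawler name plus its Allow/Disallow rules."""
    user_agent: str = "*"
    allow: list[str] = field(default_factory=list)
    disallow: list[str] = field(default_factory=list)


def build_robots_txt(blocks: list[AgentBlock], sitemaps: list[str]) -> str:
    """Render the groups first, then the Sitemap lines at the bottom."""
    lines: list[str] = []
    for block in blocks:
        lines.append(f"User-agent: {block.user_agent}")
        lines.extend(f"Allow: {path}" for path in block.allow)
        lines.extend(f"Disallow: {path}" for path in block.disallow)
        lines.append("")  # a blank line separates groups
    lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines).rstrip("\n") + "\n"


# Hypothetical configuration: a WordPress-style default group plus one AI crawler.
robots = build_robots_txt(
    [
        AgentBlock("*", allow=["/wp-admin/admin-ajax.php"], disallow=["/wp-admin/"]),
        AgentBlock("GPTBot", disallow=["/"]),
    ],
    ["https://example.com/sitemap.xml"],
)
print(robots)
```

Save the printed text as robots.txt and upload it to the site root.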
What is robots.txt?
robots.txt is a plain-text file that lives at the root of your website (e.g., https://example.com/robots.txt) and tells web crawlers (Googlebot, Bingbot, GPTBot, ClaudeBot, etc.) which parts of your site they can access. It's part of the Robots Exclusion Protocol, a de facto internet standard since 1994 that was formalized as RFC 9309 in 2022.
It's the first file most crawlers fetch when visiting a domain. Compliant bots (Google, Bing, Yandex, DuckDuckGo) honor it strictly. Non-compliant bots (scrapers, some AI crawlers) may ignore it — for true blocking you'll need server-side controls.
This generator builds standards-compliant robots.txt with smart defaults, AI-crawler blocking (GPTBot, Claude-Web, Google-Extended, PerplexityBot, etc.), CMS-specific templates, a built-in URL tester & full syntax validator. Output is ready to upload to your site root.
Common Use Cases
/admin/, /checkout/, member-only pages.
Why Choose Our Robots.txt Generator?
Frequently Asked Questions
Where do I put the robots.txt file?
It MUST be at the root of your domain — https://example.com/robots.txt (not in a subfolder). On WordPress, many SEO plugins (Yoast, Rank Math) let you edit it from the dashboard. On Apache/Nginx, upload it via FTP to your web root. On Shopify/Squarespace, use their dedicated robots.txt editor.
What's the difference between Disallow and noindex?
Disallow in robots.txt tells bots not to crawl a URL, but if the URL is already indexed or linked elsewhere, it may still appear in search results without a snippet. noindex (a robots meta tag) tells search engines not to include the page in results, but it only works if the page is crawlable, because a crawler has to fetch the page to see the tag. For full removal, use the noindex meta tag and do NOT Disallow the URL.
Will blocking AI crawlers actually stop my content from being used?
Only for compliant bots. GPTBot, ClaudeBot, Google-Extended, PerplexityBot honor robots.txt. However: (1) content already trained on is already in the model, (2) AI scrapers using fake/no user-agents bypass robots.txt entirely, (3) data is also collected via Common Crawl & other datasets. For true protection, also use server-side blocking (Cloudflare, Cloudflare AI Crawl Block, or WAF rules).
Should I block /wp-admin/ in WordPress?
Yes, but ALLOW /wp-admin/admin-ajax.php because many plugins (WooCommerce, contact forms) need it for AJAX calls. Our WordPress preset handles this correctly. Don't block /wp-content/ or /wp-includes/: since 2014, Google has needed to crawl your CSS/JS to render pages properly.
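Why the Allow line still takes effect under a broader Disallow can be shown with a small sketch. It assumes the longest-match precedence described in RFC 9309 and in Google's documentation (the most specific rule wins, and Allow wins a tie) and uses plain prefix matching with no wildcards. The function is_allowed is a hypothetical helper, not part of any library.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply the longest matching rule; with no match, crawling is allowed."""
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # The longer prefix wins; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


wordpress_rules = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/wp-content/themes/site.css"):
    print(path, "allowed" if is_allowed(wordpress_rules, path) else "blocked")
```

Crawlers that apply rules in file order instead of by longest match may treat the same two lines differently, so listing the Allow line first is a common precaution.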
What's the wildcard syntax?
* matches any sequence of characters. $ matches end of URL. Examples: Disallow: /*? blocks all URLs with query strings. Disallow: /*.pdf$ blocks all PDFs. Allow: /public/*.css$ allows all CSS files in /public/. Both Google & Bing support these wildcards.
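One way to see how these patterns behave is to translate them into regular expressions. The sketch below is a simplified illustration, not how any particular crawler implements matching; pattern_to_regex and matches are hypothetical helper names.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything, then turn each escaped '*' back into '.*'.
    body = re.escape(pattern).replace(r"\*", ".*")
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern: str, url_path: str) -> bool:
    # re.match anchors at the start, mirroring robots.txt prefix matching.
    return pattern_to_regex(pattern).match(url_path) is not None


print(matches("/*?", "/shop?page=2"))                  # query string present
print(matches("/*?", "/shop"))                         # no query string
print(matches("/*.pdf$", "/docs/guide.pdf"))           # ends in .pdf
print(matches("/*.pdf$", "/docs/guide.pdf?download=1"))  # does not end in .pdf
print(matches("/public/*.css$", "/public/theme/site.css"))
```

Note that /*.pdf$ does not match a PDF URL carrying a query string, because $ requires the URL to end right after .pdf.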
How big can robots.txt be?
Google enforces a size limit of 500 KiB and ignores any content beyond it. Realistically, keep the file under 5–10 KB; if you need more, restructure your URL patterns instead.
What's Crawl-delay and does Google support it?
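Google ignores Crawl-delay. The crawl rate setting that used to be available in Google Search Console was retired in early 2024, and Googlebot now adjusts its crawl rate automatically based on how your server responds. Bing and Yahoo honor Crawl-delay (value in seconds between requests). Yandex's documentation states that it stopped using the directive in 2018, so check each engine's current documentation before relying on it.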
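To check what delay a given bot would read from a file, Python's standard urllib.robotparser module exposes a crawl_delay method. The robots.txt content below is a made-up example, and "SomeBot" is a placeholder crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: only the bingbot group declares a Crawl-delay.
robots_txt = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

bing_delay = parser.crawl_delay("bingbot")     # delay declared in the bingbot group
default_delay = parser.crawl_delay("SomeBot")  # falls back to the * group, which sets none

print(bing_delay, default_delay)
```

A result of None means the matching group declares no Crawl-delay, so the crawler falls back to its own default pacing.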
Can I have different rules for different bots?
Yes — that's the point of user-agent blocks. Each User-agent: starts a new block applying only to that bot. Example: allow Googlebot everything, block Bingbot from /private/, block all AI crawlers entirely. Our builder makes this trivial.
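The example in this answer can be sketched as a tiny parser that splits a file into groups and picks the one a bot would follow. This is a simplified illustration using exact name matching; real crawlers use their own matching logic, and parse_groups and rules_for are hypothetical helpers.

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercase user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current_agents: list[str] = []
    reading_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if reading_rules:  # a User-agent line after rules starts a new group
                current_agents, reading_rules = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif key in ("allow", "disallow") and current_agents:
            reading_rules = True
            for agent in current_agents:
                groups[agent].append((key, value))
    return groups


def rules_for(groups: dict[str, list[tuple[str, str]]], bot_name: str) -> list[tuple[str, str]]:
    """Return the bot's own group if present, otherwise the * group."""
    return groups.get(bot_name.lower(), groups.get("*", []))


example = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Disallow: /private/

User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

groups = parse_groups(example)
for bot in ("Googlebot", "Bingbot", "ClaudeBot", "DuckDuckBot"):
    print(bot, rules_for(groups, bot))
```

A bot that has its own group follows only that group and does not also inherit the * rules, so repeat any shared rules in each specific block.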
Is my data sent anywhere?
No — robots.txt construction is 100% local. The only optional network call is "Fetch from URL" which goes through a public CORS proxy.
