Robots.txt Generator – Free Online robots.txt Builder with AI Crawler Blocking

Robots.txt Generator

Build & validate a perfect robots.txt for your website — WordPress / Joomla / Shopify presets, AI crawler blocking (GPTBot, Claude, Gemini), URL tester, syntax validator, multiple user-agent blocks & sitemap. All client-side.

How to Use the Robots.txt Generator

  1. Pick a preset (WordPress, Shopify, "block AI crawlers", etc.) — or start blank.
  2. Add one or more user-agent blocks. Each block targets a specific crawler (or * for all).
  3. Inside each block, add Allow and Disallow rules. Use quick rules for common patterns.
  4. Paste your sitemap URL(s) — typically at the bottom of robots.txt.
  5. Use the URL Tester to confirm specific URLs are blocked or allowed for a chosen bot.
  6. Use the Validator to catch syntax errors before going live.
  7. Download the file → upload to the root of your domain at https://example.com/robots.txt.
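As a rough illustration of what the steps above might produce, here is a minimal file with one catch-all block and a sitemap line. The /private/ path and the sitemap URL are placeholders chosen for this sketch, not output copied from the tool.

```text
# Applies to every crawler that has no more specific block
User-agent: *
Disallow: /private/

# Sitemap lines stand apart from the user-agent blocks
Sitemap: https://example.com/sitemap.xml
```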

What is robots.txt?

robots.txt is a plain-text file that lives at the root of your website (e.g., https://example.com/robots.txt) and tells web crawlers such as Googlebot, Bingbot, GPTBot, and ClaudeBot which parts of your site they may crawl. It's part of the Robots Exclusion Protocol, a de facto standard since 1994 that was formalized as RFC 9309 in 2022.

It's the first file most crawlers fetch when visiting a domain. Compliant bots (Google, Bing, Yandex, DuckDuckGo) honor it strictly. Non-compliant bots (scrapers, some AI crawlers) may ignore it — for true blocking you'll need server-side controls.

This generator builds standards-compliant robots.txt with smart defaults, AI-crawler blocking (GPTBot, ClaudeBot, Google-Extended, PerplexityBot, etc.), CMS-specific templates, a built-in URL tester, and a full syntax validator. Output is ready to upload to your site root.

Common Use Cases

🔒 Block private areas: Stop crawlers from fetching /admin/, /checkout/, and member-only pages.
🤖 Block AI crawlers: Ask GPTBot, ClaudeBot, Google-Extended & others not to use your content for training.
📋 WordPress sites: Block /wp-admin/ but allow admin-ajax.php for plugins.
🛍 E-commerce: Keep crawlers out of cart, checkout, and account pages.
🧪 Staging sites: Keep crawlers out of entire dev/staging environments.
🗺 Sitemap declaration: Tell search engines where your XML sitemap(s) live so new URLs are discovered faster.
⏱ Crawl rate control: Use Crawl-delay to slow aggressive bots that honor it (Google does not).
🕷 Block scrapers: Block AhrefsBot, SemrushBot, MJ12bot & other SEO/data crawlers.
🔍 Faceted nav cleanup: Block URL patterns with query strings so crawlers don't fetch endless duplicate pages.
📰 Feed control: Keep crawlers away from RSS feed URLs while leaving the feeds accessible to readers.

Why Choose Our Robots.txt Generator?

⚡ 15+ smart presets: WordPress, Shopify, Magento, AI blockers & more.
🤖 AI crawler ready: Block GPTBot, ClaudeBot, Google-Extended, PerplexityBot, etc.
🧪 URL tester: Verify exactly which URLs each bot can access.
✅ Syntax validator: Catch errors before deploying to production.
📥 Parse existing: Paste your live robots.txt to edit in the builder.
🌐 Fetch from URL: Pull any site's robots.txt to learn from or audit.
👥 Multiple user-agents: Different rules for different bots in one file.
🗺 Multi-sitemap: Declare multiple sitemap URLs cleanly.
⏱ Crawl-delay support: Per-bot delay rules for server load management.
💬 Helpful comments: Auto-add comment lines so your robots.txt is self-documenting.
🔒 100% private: Runs entirely client-side. Nothing uploaded.
📜 History: Last 10 generated files saved locally.
🌙 Dark mode: Easy on the eyes.
🖥 Fullscreen: Distraction-free editing.
🔗 Share link: URL that restores your configuration.
⬇ Direct download: One click → robots.txt ready to upload.
⌨️ Keyboard shortcuts: Quick navigation throughout.
📊 Live preview: See output update as you edit.
♾ Free & unlimited: No signup. No quotas. No ads in workflow.

Frequently Asked Questions

Where do I put the robots.txt file?

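It MUST be at the root of the host it applies to: https://example.com/robots.txt, not a subfolder. Each subdomain needs its own file. On WordPress, many SEO plugins (Yoast, Rank Math) let you edit it from the dashboard. On Apache/Nginx, upload it via FTP/SFTP to your web root. On Shopify, customize it through the robots.txt.liquid theme template. Some hosted builders, such as Squarespace, generate the file themselves and offer little or no direct editing.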

What's the difference between Disallow and noindex?

Disallow in robots.txt tells bots not to crawl a URL, but if the URL is already indexed or linked from elsewhere, it may still appear in search results without a snippet. noindex (a robots meta tag or an X-Robots-Tag HTTP header) tells search engines not to include the page in results, but it only works if the page is crawlable, because the crawler has to fetch the page to see the directive. To remove a page from results, use noindex and do NOT Disallow it.

Will blocking AI crawlers actually stop my content from being used?

Only for compliant bots. GPTBot, ClaudeBot, and PerplexityBot are documented by their operators as honoring robots.txt, and Google-Extended is a control token that Google's crawlers respect. However: (1) content a model was already trained on stays in that model, (2) scrapers that send a fake or empty user-agent bypass robots.txt entirely, (3) your content may also reach models through Common Crawl & other datasets. For stronger protection, add server-side blocking, such as Cloudflare's AI crawler blocking or your own WAF rules.
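A minimal sketch of such a block, using only the tokens named above. Several User-agent lines may share one rule set; the token list is an assumption that goes stale as vendors add or rename crawlers, so check each operator's documentation.

```text
# One group, four user-agent tokens, one shared rule
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /
```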

Should I block /wp-admin/ in WordPress?

Yes, but allow /wp-admin/admin-ajax.php, because many plugins (WooCommerce, contact forms) need it for AJAX calls. Our WordPress preset handles this correctly. Don't block /wp-content/ or /wp-includes/: since 2014 Google has rendered pages with their CSS and JavaScript, so it needs to crawl those files.
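The rule pair described here might look like the following. This is a sketch of the pattern, not necessarily the preset's exact output.

```text
User-agent: *
Disallow: /wp-admin/
# The longer, more specific path wins over the shorter Disallow
Allow: /wp-admin/admin-ajax.php
```

Under the longest-match rule in RFC 9309, the Allow line takes precedence for admin-ajax.php regardless of the order in which the two lines appear.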

What's the wildcard syntax?

* matches any sequence of characters. $ matches end of URL. Examples: Disallow: /*? blocks all URLs with query strings. Disallow: /*.pdf$ blocks all PDFs. Allow: /public/*.css$ allows all CSS files in /public/. Both Google & Bing support these wildcards.
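The three examples above, written out as they would appear in a file (the /public/ directory is the illustrative path from the answer, not a required name):

```text
User-agent: *
# Any URL that contains a query string
Disallow: /*?
# Any URL that ends in .pdf
Disallow: /*.pdf$
# CSS files under /public/
Allow: /public/*.css$
```

Note that /report.pdf?download=1 does not match /*.pdf$ because the URL no longer ends in .pdf; it is still blocked here, but by the /*? rule.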

How big can robots.txt be?

Google enforces a 500 KiB size limit and ignores anything past it. Realistically, keep the file under 5–10 KB; if you need more, restructure your URL patterns instead.

What's Crawl-delay and does Google support it?

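Google ignores Crawl-delay. Search Console used to offer a crawl-rate setting, but Google retired it in early 2024; to slow Googlebot temporarily, Google recommends returning 500, 503, or 429 responses. Bing honors Crawl-delay (value in seconds between requests). Support elsewhere varies, so check each crawler's documentation before relying on it.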
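For a crawler that does honor the directive, such as Bingbot, a per-bot delay could be written as below. The value 10 is an arbitrary placeholder; Crawl-delay is an extension that RFC 9309 does not define, so its exact interpretation is up to each crawler.

```text
User-agent: Bingbot
# Ask for roughly 10 seconds between requests
Crawl-delay: 10
```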

Can I have different rules for different bots?

Yes — that's the point of user-agent blocks. Each User-agent: starts a new block applying only to that bot. Example: allow Googlebot everything, block Bingbot from /private/, block all AI crawlers entirely. Our builder makes this trivial.
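The example in this answer could be sketched as follows. GPTBot stands in for "all AI crawlers", and the final catch-all block with /tmp/ is a hypothetical addition to show how the groups interact.

```text
# Googlebot: an empty Disallow allows everything
User-agent: Googlebot
Disallow:

# Bingbot: everything except /private/
User-agent: Bingbot
Disallow: /private/

# AI crawler: nothing at all
User-agent: GPTBot
Disallow: /

# Every other crawler
User-agent: *
Disallow: /tmp/
```

A crawler follows only the most specific group that matches its name; rules are not merged across groups. Googlebot above is therefore not bound by the /tmp/ rule in the catch-all block.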

Is my data sent anywhere?

No — robots.txt construction is 100% local. The only optional network call is "Fetch from URL" which goes through a public CORS proxy.
