How to Create a robots.txt File

What a robots.txt file does

A robots.txt file is a plain text file that sits at the root of your domain and tells search engine crawlers and other bots which parts of your site they're allowed to fetch. Want to keep your admin area, internal search results, or checkout flow out of the crawl queue? This is where you say so. It's also the standard place to point crawlers at your sitemap and, increasingly, to decide whether AI training bots like GPTBot or ClaudeBot can read your content.

You'd want one whenever you launch a site, migrate a platform, or notice search engines wasting time on pages that don't matter. The catch: a single typo can accidentally block your entire site from Google, so it's worth building it carefully and testing before you ship. Below, you'll build a valid file from scratch using the robots.txt generator — it runs entirely in your browser, so nothing you type is uploaded and there's no sign-up.

How to create a robots.txt file

Open the generator and pick a starting point. Go to the robots.txt generator. Under Quick Presets in the Visual Builder tab, choose the template closest to your setup — Standard Website, WordPress, Shopify, Next.js, Laravel, or E-commerce. These pre-fill sensible disallow paths (like /wp-admin/ or /cart/) so you're not starting with a blank file. If you just want everything crawlable, pick Allow All.
Set your User-agent. Each rule block targets a bot. The default User-agent: * applies to all crawlers, which is what most sites need. To write a rule for one specific bot, use the User Agent dropdown to select Googlebot, Bingbot, or any other — or type a custom agent string.
Add your Disallow paths. In the rule card, type a path into the Disallow Paths field (for example /admin/ or /search) and press Enter, or click one of the common-path chips below the input. Remember paths are case-sensitive and should start with a /. Anything you list here is the crawler's "please don't fetch this."
Add Allow exceptions if needed. If you blocked a folder but want one subpath inside it crawled, add it under Allow Paths. When a URL matches both, Google and other crawlers that follow the current standard apply the most specific (longest) rule, so Allow: /wp-content/uploads/ can carve an opening inside a broader Disallow: /wp-content/.
Point to your sitemap. Open Global Settings and paste your full sitemap URL (for example https://example.com/sitemap.xml) into the Sitemap URLs field. Use the absolute URL, not a relative path. Don't have a sitemap yet? Build one with the XML sitemap generator first, then come back and add the link here.
Optionally control AI bots. Switch to the AI Bot Control tab to toggle individual AI crawlers on or off, or check Block All AI Bots to block them all in one click. Blocking these doesn't affect normal search indexing; it only asks AI crawlers not to pull your content for training and scraping, and only bots that honor robots.txt will comply.
Test a few URLs before you trust it. Open the URL Tester tab, enter a path like /admin/settings or /blog/my-post, pick a user agent, and hit Test URL. You'll see ALLOWED or BLOCKED plus the exact rule that matched. Use Bulk URL Test to check a whole list at once. This is the step that catches accidental sitewide blocks.
Copy or download the file. Go to the Output tab. Review the syntax-highlighted preview, then use Copy or Download to save it as robots.txt. Upload that file to the root of your domain so it's reachable at https://yourdomain.com/robots.txt. That exact location matters — crawlers only look there.
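
For reference, a file produced by the steps above might look something like the string in the sketch below. The paths and the example.com URLs are illustrative assumptions, and the generator's exact output may differ. The check uses Python's standard-library urllib.robotparser, which applies rules in file order (first match) rather than the longest-match rule, so the Allow line is listed first to make both readings agree.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file contents; the paths and example.com URLs are assumptions.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    return parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    for path in ("/blog/my-post", "/wp-admin/options.php",
                 "/wp-content/uploads/logo.png", "/search"):
        print(path, "ALLOWED" if is_allowed(path) else "BLOCKED")
    print("Sitemaps:", parser.site_maps())
```

With this file, /blog/my-post and /wp-content/uploads/logo.png come back allowed, while /wp-admin/options.php and /search come back blocked. This is a local sanity check only; it doesn't replace testing the deployed file.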

That's it. Once it's live, you can paste your published URL into the standalone robots.txt tester to confirm the real, deployed version behaves the way you expect.
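
The precedence rule from the Allow step (when a URL matches both an Allow and a Disallow path, the longest one wins) can be sketched in a few lines of Python. This is a simplified illustration, not the generator's or any crawler's actual code: the function name longest_match_allows is hypothetical, and the sketch handles plain path prefixes only, with no * or $ wildcards.

```python
from typing import List, Tuple

# Each rule is (directive, path), e.g. ("disallow", "/wp-content/").
Rule = Tuple[str, str]


def longest_match_allows(rules: List[Rule], url_path: str) -> bool:
    """Apply longest-match precedence to plain path prefixes (no wildcards)."""
    best_len = -1
    allowed = True  # no matching rule means the URL is crawlable
    for directive, path in rules:
        if not path or not url_path.startswith(path):
            continue  # empty paths match nothing; skip non-matching prefixes
        is_allow = directive.lower() == "allow"
        # A longer path wins; on a tie, prefer Allow (the least restrictive rule).
        if len(path) > best_len or (len(path) == best_len and is_allow):
            best_len = len(path)
            allowed = is_allow
    return allowed


rules = [
    ("disallow", "/wp-content/"),
    ("allow", "/wp-content/uploads/"),
]

if __name__ == "__main__":
    print(longest_match_allows(rules, "/wp-content/uploads/logo.png"))  # True
    print(longest_match_allows(rules, "/wp-content/plugins/a.js"))      # False
```

Because the comparison is on path length rather than file order, the Allow exception works no matter where it appears in the rule block.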

Tips and common problems

Place it at the root, nowhere else. A robots.txt in a subfolder is ignored. It must live at the domain root, and a file on a subdomain only governs that subdomain.
It's a request, not a lock. Well-behaved crawlers obey it, but it does not hide or secure anything. Never use it to protect private data; that needs real authentication. A noindex directive on the page only keeps it out of search results and doesn't restrict access either.
Don't block your CSS and JS. If you disallow the assets Google needs to render a page, it may judge your layout as broken. Keep stylesheets and scripts crawlable.
Blocking a page doesn't always remove it from results. If other sites link to a blocked URL, it can still appear in search listings (without a description). To truly keep a page out of the index, allow crawling and add a noindex meta tag instead — then confirm with the Google index checker.
Watch the validation panel. The generator flags conflicting rules, paths missing a leading /, invalid sitemap URLs, and a sitewide Disallow: / so you don't ship a mistake.
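
To make a few of those checks concrete, here is a rough sketch of how a linter could flag paths missing a leading /, a non-absolute sitemap URL, and a sitewide Disallow: / under User-agent: *. It is not the generator's actual validation logic; the function name lint_robots and the warning wording are assumptions made for illustration, and conflicting-rule detection is left out.

```python
from typing import List
from urllib.parse import urlparse


def lint_robots(text: str) -> List[str]:
    """Return warnings for a few common robots.txt mistakes."""
    warnings: List[str] = []
    agents: List[str] = []  # user agents of the group being read
    in_rules = False        # True once the current group has a rule line
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                warnings.append(f"line {number}: Disallow: / blocks the whole site")
            elif value and not value.startswith(("/", "*")):
                warnings.append(f"line {number}: path should start with /")
        elif field == "sitemap":
            parsed = urlparse(value)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                warnings.append(f"line {number}: sitemap must be an absolute URL")
    return warnings


# A deliberately broken file: sitewide block, bad path, relative sitemap.
BAD_FILE = "User-agent: *\nDisallow: /\nDisallow: admin/\nSitemap: /sitemap.xml\n"

if __name__ == "__main__":
    for warning in lint_robots(BAD_FILE):
        print(warning)
```

The sitewide check only fires inside a User-agent: * group, so deliberately blocking a single bot with Disallow: / is not reported as a mistake.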

FAQ

Do I even need a robots.txt file? Not strictly — if it's missing, crawlers assume they can fetch everything. But having one lets you steer crawl budget away from low-value pages, declare your sitemap, and manage AI bots, so most sites benefit from one.

What's the difference between Disallow and noindex? Disallow tells a crawler not to fetch a URL; it doesn't guarantee the page stays out of search results. noindex (a meta tag or HTTP header on the page) tells search engines not to list it. For reliable removal from results, allow crawling and use noindex.

Why is my whole site blocked after publishing? Almost always a stray Disallow: / under User-agent: *. Re-open your file in the generator, check the URL Tester against a real page, and remove or narrow that rule. An empty Disallow: line (with nothing after it) allows everything.
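
The difference between the two lines is easy to verify locally. The sketch below uses Python's standard-library urllib.robotparser; the helper name can_crawl and the example.com host are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Parse `robots_txt` and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "https://example.com" + path)


BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"  # the stray sitewide rule
ALLOW_EVERYTHING = "User-agent: *\nDisallow:\n"    # empty value: nothing blocked

if __name__ == "__main__":
    print(can_crawl(BLOCK_EVERYTHING, "/blog/my-post"))  # False
    print(can_crawl(ALLOW_EVERYTHING, "/blog/my-post"))  # True
```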

Does blocking AI bots hurt my Google rankings? No. AI crawler tokens like GPTBot, CCBot, and Google-Extended are separate from the user agents that search engines use for indexing (such as Googlebot and Bingbot), so blocking them leaves your normal search visibility untouched.
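
As a rough illustration, a file that blocks these AI tokens while leaving search crawlers alone might look like the string below, checked here with Python's standard-library urllib.robotparser. The /admin/ rule and the example.com host are assumptions, and the generator's actual output may list more bots or use separate blocks.

```python
from urllib.robotparser import RobotFileParser

# Illustrative contents: one group for AI tokens, one for everyone else.
AI_BLOCK_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /

User-agent: *
Disallow: /admin/
"""

ai_parser = RobotFileParser()
ai_parser.parse(AI_BLOCK_ROBOTS_TXT.splitlines())


def may_fetch(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under AI_BLOCK_ROBOTS_TXT."""
    return ai_parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "Googlebot", "Bingbot"):
        print(agent, may_fetch(agent, "/blog/my-post"))
```

GPTBot and CCBot are refused everywhere, while Googlebot and Bingbot fall through to the User-agent: * group and can still fetch /blog/my-post.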

For the bigger picture on how crawl rules fit together, see the technical SEO guide, and pair this with the sitemap generator and robots.txt tester to round out your crawl setup.
