The robots.txt is a plain text file located in the root directory of a website. It regulates how crawlers / bots may behave on the site.
Notice: The robots.txt is only a recommendation for crawlers. Its rules cannot protect directories from unwanted access; malicious crawlers can read the content without any problem despite the robots.txt.
A standard robots.txt file in WordPress looks like this:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
User-agent:
- defines which crawlers are addressed. With a * all crawlers are addressed. For example, if you only want to address Google's crawler, you can do this with User-agent: Googlebot.
This addresses Google's bots that identify themselves with the Googlebot user agent token.
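For illustration, here is a minimal sketch of how a specific group and a catch-all group can be combined; the path /internal-search/ is only a placeholder and not part of the WordPress example above:

User-agent: Googlebot
# placeholder path, only blocked for Google's crawler
Disallow: /internal-search/

User-agent: *
# all other crawlers are blocked from the entire site
Disallow: /

A crawler follows the most specific group that matches its user agent token, so Googlebot would only skip /internal-search/, while all other crawlers would be blocked from the whole site.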
Disallow:
- defines which directories may not be called or crawled. This automatically includes all subdirectories and files below that path. If you want to release a subdirectory for the bot again, this can be done with Allow.
- In our experience, it can happen after a CMS change that Google still tries to crawl subdirectories of the old CMS. Since these no longer exist, the bot can simply be blocked from them.
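A minimal sketch for this case could look as follows; the directory name /old-cms/ is only an assumed example for paths of the previous system:

User-agent: *
# hypothetical directory of the old CMS that no longer exists
Disallow: /old-cms/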
Allow:
- In the example, admin-ajax.php in the "wp-admin" directory is released again, because the complete directory with all its files and subdirectories was blocked before.
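The same pattern can be applied to other directories; as an illustrative sketch (the paths are only examples, not a recommendation), a blocked /wp-content/ directory could be partly released again like this:

User-agent: *
Disallow: /wp-content/
# the longer, more specific Allow rule wins over the shorter Disallow rule
Allow: /wp-content/uploads/

Because the Allow rule is more specific than the Disallow rule, crawlers may access /wp-content/uploads/ even though the rest of /wp-content/ remains blocked.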