Robots.txt in ASP.NET Core
In this article, I would like to show you how to create a robots.txt file in ASP.NET Core. robots.txt is a file that should be stored in the root directory of every website. The main purpose of this file is to restrict search engines from crawling some or all of the content on your website; it simply tells search robots which pages you would like them not to visit. robots.txt is a plain text file that follows the Robots Exclusion Standard.
A robots.txt file is not mandatory. You only need it when you do not want some of your website content to be indexed by search engines.
The location of robots.txt is very important. It must be in the root directory, otherwise user agents/search engines will not be able to find it. The route path must be
http(s)://www.example.com/robots.txt
How It Works
Let’s say a robot or search engine wants to visit a URL on a website, for example http(s)://example.com/contact.html. Before it does so, it first checks the http(s)://example.com/robots.txt file. If the search engine finds a robots.txt file, it checks whether the content is allowed for indexing. If it finds
User-agent: *
Disallow: /
The “User-agent: *” means this section applies to all robots. The “Disallow: /” tells the robot that it should not visit/index any pages on the site.
If user agents don’t find a robots.txt file there, they simply assume that the site doesn’t have one and therefore index everything they find along the way.
Here are some common robots.txt setups:
Allowing all web crawlers access to all content
User-agent: *
Allow: /
Blocking all web crawlers from all content/entire website.
User-agent: *
Disallow: /
Disallow crawling of a single webpage by listing the page after the slash:
User-agent: *
Disallow: /example-subfolder/blocked-page.html
In this article we will deal only with how to create and use robots.txt in ASP.NET Core. Here is what we will do in the project.
ASP.NET Core Project Configuration
In this project we are going to create a middleware that checks whether the request URL is /robots.txt. If it is, the middleware reads the robots.txt file from the project root; if the file is not found, it returns a default text.
Now create an ASP.NET Core MVC or API project. Project structure:
Configure Middleware
Let us now go to the Configure() method. We add a middleware to the application pipeline; it can either pass the request to the next delegate or end the request (short-circuit the request pipeline). In this scenario, if the URL matches /robots.txt, the middleware ends the request. Here's what the Configure() method may look like:
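Below is a minimal sketch of such a middleware, assuming ASP.NET Core 3.x or later (where Configure() receives an IWebHostEnvironment), a file named robots.txt in the content root, and a default text of your choosing; adjust these to your own project.

// Startup.cs – only the relevant parts are shown
using System.IO;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;

public class Startup
{
    public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
    {
        app.Use(async (context, next) =>
        {
            if (context.Request.Path.StartsWithSegments("/robots.txt"))
            {
                // Assumption: robots.txt lives in the project/content root
                var robotsPath = Path.Combine(env.ContentRootPath, "robots.txt");

                // Default text returned when the file is not found
                var output = "User-agent: *\nDisallow: /";
                if (File.Exists(robotsPath))
                {
                    output = await File.ReadAllTextAsync(robotsPath);
                }

                context.Response.ContentType = "text/plain";
                await context.Response.WriteAsync(output);
                return; // short-circuit: the request ends here
            }

            await next();
        });

        // ... the rest of your pipeline (UseRouting(), UseEndpoints(), etc.)
    }
}

Because this middleware is registered before the rest of the pipeline, a request for /robots.txt never reaches routing or MVC at all.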
Now run your project and navigate to https://hostname/robots.txt. It will show the default text that we set in the middleware:
User-agent: *
Disallow: /
Creating a robots.txt file in the document root
Now go to your project folder and create a text file named robots.txt in the project root. Details in the image:
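One caveat worth mentioning (this depends on your SDK and deployment setup, and is an assumption rather than part of the original steps): a plain .txt file in the project root is not necessarily copied to the publish output by default, so you may need to include it in your .csproj, for example:

<ItemGroup>
  <Content Include="robots.txt">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>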
Now run your project and navigate to https://hostname/robots.txt. We have allowed search engines to index all content, so the output will be:
User-agent: *
Allow: /
References
support.google.com | Robots Exclusion Standard
Conclusion
Thanks a lot for reading. I hope you will love this article. Please share your valuable suggestions and feedback, and write in the comment box in case you have any questions. Have a good day! You can download the source code from my GitHub.
I have also published this tutorial on my personal blog, www.codingwithesty.com. You can follow me on LinkedIn.