Barry Schwartz of Search Engine Land wrote a post covering John Mueller's advice against serving a dynamically generated robots.txt file, noting that updating a static file by hand is the better approach.
Mueller wrote in October 2015:
“Making the robots.txt file dynamic (for the same host! Doing this for separate hosts is essentially just a normal robots.txt file for each of them.) would likely cause problems: it’s not crawled every time a URL is crawled from the site, so it can happen that the “wrong” version is cached. For example, if you make your robots.txt file block crawling during business hours, it’s possible that it’s cached then and followed for a day — meaning nothing gets crawled (or, alternately, cached when crawling is allowed). Google crawls the robots.txt file about once a day for most sites, for example.”
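To make the scenario concrete, here is a minimal sketch, in Python, of the kind of time-based dynamic robots.txt Mueller describes. The handler name, port, and business hours are illustrative assumptions, not details from the original post; the point is that Googlebot fetches robots.txt only about once a day, so whichever version it happens to cache governs crawling for the whole period.

# Anti-pattern sketch: robots.txt content changes with the server clock.
# Googlebot may cache either version for roughly a day, so crawling
# behaviour becomes unpredictable. Names, port, and hours are illustrative.
from datetime import datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # served during business hours
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # served outside business hours

class RobotsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/robots.txt":
            # Time-of-day switch: this is the part Mueller warns against.
            body = BLOCK_ALL if 9 <= datetime.now().hour < 17 else ALLOW_ALL
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(body.encode())
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("", 8000), RobotsHandler).serve_forever()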
The takeaway
The lesson is that changing your robots.txt dynamically throughout the day confuses Google's crawlers: Google may end up crawling media files, URLs, or sensitive documents you wanted blocked, or be prevented from crawling content you need indexed. Google's developer advocate, Martin Splitt, recently provided excellent advice on the best ways to block Googlebot from crawling your site and on one common mistake every site owner should avoid. A static example of the recommended approach follows below.
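For reference, the static approach Mueller recommends is simply a hand-edited robots.txt file at the site root that always returns the same directives. The directory names below are placeholders, not rules from the original post.

User-agent: *
Disallow: /private/
Disallow: /tmp/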
The old advice against dynamically generated robots.txt files remains relevant.