Robots.txt files are primarily used to avoid overloading a website with crawler requests.
To keep their crawlers effective and efficient at finding and updating information about websites, search engines establish “crawl budgets”.
These budgets ensure crawlers don’t use up too many site resources during their requests, which could negatively impact a website’s functionality.
These budgets also demand that websites make a crawler’s job as easy as possible.
Duplicate content, load errors, error pages, low-quality (spammy) content and other variables make indexing more challenging and lead to less efficient crawling.
Google frowns upon this inefficiency and counts these “errors” as marks against the website. Such marks lower ranking potential in SERPs and should be avoided wherever possible.
The goal for webmasters is to maximize the efficiency of search engine crawlers by making their websites easy to index with quality content and detailed instructions.
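To make the idea of "detailed instructions" concrete, here is a minimal sketch of how a well-behaved crawler might read a robots.txt file, using Python's standard `urllib.robotparser` module. The rules, the `ExampleBot` user agent, and the `example.com` URLs are assumptions chosen for illustration, not recommendations for any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are assumptions for illustration.
# It steers all crawlers away from internal search results and cart pages,
# which tend to produce many near-duplicate URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /cart/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without a network call."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler that honors robots.txt checks each URL before requesting it.
blocked = parser.can_fetch("ExampleBot", "https://www.example.com/search/?q=shoes")
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/products/shoes")

print("search page crawlable:", blocked)
print("product page crawlable:", allowed)
```

Rules like these only tell compliant crawlers where not to spend requests. They do not decide what appears in search results.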
Robots.txt files should not be used to keep pages out of Google’s search results.
If a webmaster wants to keep Google from indexing a page, they should instead use noindex directives or implement password protection.
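A noindex directive can be delivered either as a meta tag in the page's HTML or as an `X-Robots-Tag` HTTP response header. The sketch below shows both forms in Python. The helper name `with_noindex_header` and the sample headers are hypothetical, and how the header actually gets attached depends on the web server or framework in use.

```python
# Meta tag form: placed inside the <head> of the page to be kept out of the index.
NOINDEX_META_TAG = '<meta name="robots" content="noindex">'


def with_noindex_header(headers):
    """Return a copy of the response headers with a noindex directive added.

    Hypothetical helper for illustration; a real site would set this header
    through its web server or framework configuration.
    """
    updated = dict(headers)
    updated["X-Robots-Tag"] = "noindex"
    return updated


# Example: headers for a page that should stay out of search results.
response_headers = with_noindex_header({"Content-Type": "text/html; charset=utf-8"})

print(NOINDEX_META_TAG)
print(response_headers)
```

For either form to work, the crawler must be able to fetch the page. If robots.txt blocks the URL, the crawler never sees the noindex directive.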