urllib.robotparser

Let's explore urllib.robotparser and its use.

Overview

The urllib.robotparser module consists of a single class, RobotFileParser. This class answers questions about whether a specific user agent may fetch a URL on a site that has published a robots.txt file. The robots.txt file tells a web scraper or robot which parts of the server should not be accessed. Let’s take a look at a simple example using Ars Technica’s website:
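The lesson's original example is not reproduced here, so what follows is only a minimal sketch of how RobotFileParser can be queried. To keep it runnable without network access, it parses a small hypothetical robots.txt string instead of downloading the real Ars Technica file. The rules, the example.com URLs, and the build_parser helper are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# This is NOT the real Ars Technica file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Return a RobotFileParser loaded with the given robots.txt text.

    build_parser is a hypothetical helper, not part of the standard library.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no HTTP request is made.
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# can_fetch(useragent, url) reports whether the rules permit the request.
allowed = parser.can_fetch("*", "https://example.com/index.html")
blocked = parser.can_fetch("*", "https://example.com/private/data.html")

print("index.html allowed:", allowed)
print("private/data.html allowed:", blocked)
```

Against a live site, the same questions could be asked by calling set_url() with the address of the site's robots.txt file and then read() to download it before calling can_fetch(). The answers then depend on whatever rules the site publishes at that moment.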
