Scrapy

Learn how Scrapy extracts data from HTML pages and how it gives scraping code a modular structure.

Scrapy is another tool we can use to extract data from websites. It imposes a more organized project structure and processes pages systematically. We won’t go into too much detail here. Instead, we will work through a real-world example of scraping data with Scrapy to understand how it works.

Exercise

Task

We will visit a site that displays questions asked by people all over the world. A screenshot of the site is shown below.

We will scrape the top ten questions listed at the moment our code requests the site. For each question, we will collect four fields:

  1. Summary
  2. Votes
  3. Views
  4. Number of answers
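
One way to keep these four fields together is a small record type. The sketch below is only an illustration: the class name `QuestionItem`, its field names, the helper `top_n`, and the sample values are assumptions rather than part of the lesson. Scrapy accepts dataclass instances as items, so a spider could yield objects like this one.

```python
from dataclasses import dataclass, asdict


@dataclass
class QuestionItem:
    """One scraped question (field names are illustrative assumptions)."""
    summary: str   # the question's title text
    votes: int     # vote count shown next to the question
    views: int     # number of times the question was viewed
    answers: int   # number of answers posted


def top_n(items, n=10):
    """Keep only the first n items, preserving page order."""
    return list(items)[:n]


# Made-up sample values, only to show the shape of one record.
example = QuestionItem(summary="How do I center a div?", votes=5, views=120, answers=2)
print(asdict(example))
```

Using typed fields makes it obvious when a value such as the vote count was left as a string instead of being converted to a number.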

Single question’s HTML

Go to the site and pick one question.

This question’s HTML looks like the markup below, which gives us an idea of the general structure of any single question on the page.
