Web Scraping Using Selenium in Python

In this project, we’ll scrape Wikipedia using tools provided by the Selenium library in Python. We’ll practice fetching HTML elements with several Selenium commands. Lastly, we’ll learn to automate events on a web page.

You will learn to:

Understand the fundamentals of Selenium methods.

Automate events on a web page using Selenium.

Use regex for text cleaning.

Create Python dictionaries from scraped data.

Skills

Web Scraping

Python Programming

HTML Elements

Prerequisites

Basic understanding of the Python language

Basic understanding of the Selenium library

Basic understanding of the Python regex library

Basic understanding of Python dictionaries

Technologies

CSS

HTML

Python

Selenium

Project Description

In this project, you’ll use the Selenium library in Python to scrape data from Wikipedia, a free online encyclopedia.

Throughout this project, you’ll use multiple Selenium commands to fetch HTML elements. You’ll locate elements by the following:

  • CSS class names
  • CSS IDs
  • HTML tag names
  • Link texts
  • Texts
  • Nested CSS selectors
  • Attributes
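
As a rough illustration of these locator types, the sketch below pairs each one with a Selenium locator strategy. The selector values (such as `mw-body` or `firstHeading`) and the helper name `fetch_all` are assumptions chosen for illustration, not taken from the project. The strategy strings are the values behind Selenium's `By` constants (for example, `By.ID` is the string `"id"`), and fetching by text is shown with an XPath expression, which is one common way to do it.

```python
# Each entry pairs a locator strategy with a value. The strategy strings are
# the values of Selenium's By constants (By.ID == "id", and so on).
# All selector values below are illustrative assumptions about the page.
LOCATORS = {
    "class_name": ("class name", "mw-body"),
    "id": ("id", "firstHeading"),
    "tag_name": ("tag name", "p"),
    "link_text": ("link text", "Contents"),
    "text": ("xpath", "//*[text()='Wikipedia']"),
    "nested_css": ("css selector", "div#bodyContent p a"),
    "attribute": ("css selector", "a[title='Main Page']"),
}


def fetch_all(driver, locators=LOCATORS):
    """Return {name: list of matching elements} for every locator."""
    return {
        name: driver.find_elements(by, value)
        for name, (by, value) in locators.items()
    }
```

With a live browser session (for example, one created with `webdriver.Chrome()` and pointed at a page with `driver.get(...)`), `fetch_all(driver)` returns one list of elements per locator. An empty list means nothing on the page matched that locator.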

Furthermore, you’ll use Selenium events to automate interactions with the website. Finally, you’ll use Python’s regex library to clean the text data.
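
The text-cleaning step could look something like the sketch below. It assumes the scraped text contains bracketed citation markers such as `[1]` and stray whitespace, which is typical of encyclopedia articles but is not specified by the project. The function name `clean_text` is hypothetical.

```python
import re


def clean_text(raw):
    """Strip citation markers and collapse whitespace in scraped text."""
    # Drop bracketed numeric markers such as [1] or [23].
    no_refs = re.sub(r"\[\d+\]", "", raw)
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    return re.sub(r"\s+", " ", no_refs).strip()
```

For example, `clean_text("Python[1] is a\n  language.[23] ")` returns `"Python is a language."`.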

Project Tasks

1. Initial Setup

Task 1: Get Started

Task 2: Navigate to the Web Page

2. Scrape the Data

Task 3: Fetch an Element Using ID

Task 4: Fetch an Element Using Its Class Name

Task 5: Switch to the New Window

Task 6: Fetch an Element Using Link Text

Task 7: Fetch Elements by Tag Name

Task 8: Extract the Text from Elements

Task 9: Remove Stop Words from the Text

Task 10: Fetch Nested Elements
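
The later tasks (extracting text, removing stop words, and building dictionaries from the results) might be sketched as follows. The project does not say which stop-word list it uses or how its dictionaries are shaped, so the tiny `STOP_WORDS` set, the heading-to-words layout, and both function names are assumptions made for illustration.

```python
import re

# A tiny illustrative stop-word set; a real project would likely use a fuller list.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the words of `text`, lowercased, with stop words dropped."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in stop_words]


def to_dictionary(headings, paragraphs):
    """Map each heading to the filtered words of its paragraph."""
    return {h: remove_stop_words(p) for h, p in zip(headings, paragraphs)}
```

Here `headings` and `paragraphs` stand in for lists of strings already extracted from elements, for instance via each element's `.text` property.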

Congratulations!