How to parse a website with R

Parsing, or web scraping, refers to extracting the required data from websites. The rvest library in R provides this functionality.

Steps to parse a webpage

We can parse a webpage with R in three steps: fetch the page's HTML, select the element we want with a CSS selector, and convert the selected element to text.

Here is R code that scrapes data from a Wiki page.
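A minimal sketch of such a script, laid out so that the statements fall on lines 1, 4, 7, and 10. The URL is an assumed example, the variable name title_text is hypothetical, and html_nodes() and html_text() are the rvest functions assumed here for selecting elements and extracting their text. Running it needs network access.

```r
library(rvest)

# Assumed example URL; any Wikipedia article should work the same way
webpage <- read_html("https://en.wikipedia.org/wiki/Web_scraping")

# Select the title element by its CSS class
data <- html_nodes(webpage, ".mw-page-title-main")

# Convert the selected node(s) to plain text
title_text <- html_text(data)
print(title_text)
```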

Line 1: We import the rvest library.
Line 4: We use the read_html() function to download and parse the HTML of the Wiki URL passed as a parameter, and store the result in webpage.
Line 7: We scrape the page's title from the HTML stored in webpage and save it in data. In this case, the CSS selector for the title is .mw-page-title-main.
Line 10: We convert the value stored in data to a readable form, i.e., text.

Try changing the CSS selector on line 7 to 'p'. This will scrape all the paragraph elements on the page.
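To see what the 'p' selector returns without depending on a live page, here is a small sketch that parses an inline HTML string. The HTML content and the variable names are hypothetical, chosen only for illustration.

```r
library(rvest)

# Inline HTML standing in for a downloaded page (hypothetical content)
html <- "<html><body>
  <h1><span class='mw-page-title-main'>Example title</span></h1>
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
</body></html>"

# read_html() also accepts a string of HTML instead of a URL
webpage <- read_html(html)

# 'p' matches every <p> element, so two nodes are selected here
paragraphs <- html_nodes(webpage, "p")

# html_text() returns one string per selected node
paragraph_text <- html_text(paragraphs)
print(paragraph_text)
```

Because html_nodes() returns every match, the result is a character vector with one entry per paragraph rather than a single string.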

Note: If a CSS selector given here doesn't work, inspect the element in the browser and verify its class or tag name.
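One way to notice a selector that no longer matches is to check how many nodes it returned before extracting text. The sketch below assumes a hypothetical inline page and a deliberately wrong class name.

```r
library(rvest)

# Hypothetical page with a single titled heading
webpage <- read_html("<html><body><h1 class='title'>Example</h1></body></html>")

# A selector that does not exist on the page yields an empty node set
matches <- html_nodes(webpage, ".no-such-class")

if (length(matches) == 0) {
  message("Selector matched nothing; inspect the element and check its class name.")
}

# The correct selector finds exactly one node
found <- html_nodes(webpage, ".title")
print(length(found))
```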