Data APIs and Web Scraping

Explore common interview questions on APIs and web scraping in Python.

APIs and web scraping are essential techniques for gathering external data. Whether you’re building a data pipeline or enriching a training dataset, you’ll need to call REST endpoints or parse HTML content. In this lesson, we’ll implement three common tasks: retrieving API data, handling pagination, and scraping product information from a website. Let’s get started.

Retrieving data from a RESTful API

An interviewer may ask, “How would you retrieve data from a RESTful API? Walk me through the steps and demonstrate with sample Python code.”

This question is frequently asked in entry-level data science interviews to check basic API literacy.

Python - RESTFUL API
import requests

def get_weather_data(city: str, api_key: str) -> dict:
    # TODO: your implementation
    pass

Sample answer

There are several ways to approach this. For our solution, let’s implement a simple Python function that uses the requests library to fetch weather data from a public API (here, wttr.in) that does not require an API key, so the api_key parameter from the starter signature is dropped. To adapt this snippet for an API that requires authentication, add an api_key parameter and send the key with each request, typically as a query parameter or a request header, depending on what the API expects.
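For an API that does require a key, the change might look like the sketch below. The endpoint https://api.example.com/v1/weather, the city query parameter, and the Bearer header scheme are all hypothetical placeholders, not part of wttr.in; a real API documents its own URL, parameter names, and authentication method. The build_weather_request helper is likewise an illustrative name, split out so the request can be inspected without sending it.

```python
import requests

# Hypothetical endpoint, used only for illustration; substitute the real API's URL.
BASE_URL = "https://api.example.com/v1/weather"


def build_weather_request(city: str, api_key: str) -> requests.PreparedRequest:
    """Prepare (but do not send) an authenticated GET request."""
    request = requests.Request(
        method="GET",
        url=BASE_URL,
        params={"city": city},  # assumed query parameter name
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    return request.prepare()


def get_weather_data(city: str, api_key: str) -> dict:
    """Send the prepared request and return parsed JSON or an error message."""
    prepared = build_weather_request(city, api_key)
    try:
        with requests.Session() as session:
            response = session.send(prepared, timeout=10)
            response.raise_for_status()  # Raise an exception for bad responses
            return response.json()
    except requests.RequestException as e:
        return {"error": f"Unable to fetch data: {e}"}
```

In practice, the key would usually be read from an environment variable or a secrets manager rather than hard-coded in the script.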

Here’s a sample implementation:

Python - RESTFUL API
import requests

def get_weather_data(city: str) -> dict:
    """Fetch weather data for a given city using a public weather API.
    Args:
        city (str): Name of the city to get weather data for
    Returns:
        dict: Weather data or error message
    """
    url = f"https://wttr.in/{city}?format=j1"
    try:
        response = requests.get(url, timeout=10)  # Time out instead of hanging indefinitely
        response.raise_for_status()  # Raise an exception for bad responses
        # Parse and return weather data
        weather_data = response.json()
        return weather_data
    except requests.RequestException as e:
        return {"error": f"Unable to fetch data: {e}"}

def main():
    cities = ["New York", "London", "Tokyo", "Sydney"]
    for city in cities:
        print(f"\nWeather for {city}:")
        weather_data = get_weather_data(city)
        # Check for an error before indexing into the response
        if "error" in weather_data:
            print(weather_data["error"])
            continue
        current = weather_data["current_condition"][0]
        print(f"Temperature: {current['temp_F']}°F")
        print(f"Feels like: {current['FeelsLikeF']}°F")
        print(f"Humidity: {current['humidity']}%")
        print(f"Wind speed: {current['windspeedMiles']} mph")

if __name__ == "__main__":
    main()

Let’s break down the key parts of the code:

  • Line 1: We import the requests library to make HTTP calls to a public weather API.

  • Line 3: We define the get_weather_data function, which takes a city name and returns a dictionary.

  • Line 10: We construct a URL using the wttr.in API, requesting weather in JSON format for ...