Coding Exercise: Analyze Clickstream with RDDs
Practice using PySpark RDDs to load and summarize simplified clickstream data.
We'll cover the following...
Scenario
You’re working as a junior data engineer at a growing e-commerce company. Every day, the platform collects millions of clickstream logs—records of users interacting with the website. For now, you’ve been given a small sample to practice with.
Dataset
Here’s a small sample ...