Solution: User Defined Function
Explore how to build and use User Defined Functions in PySpark to calculate median review metrics for specific years and top products. Understand filtering, grouping, and percentile-based selections to compare aggregated results for robust data analysis.
We'll cover the following...
Task
Calculate the median review for a selected year (for example, 2016) and compare it with the median review of the top product for that year. The top product is selected based on the total number of reviews for a particular ...
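To make the task concrete, here is a minimal plain-Python sketch of the logic, not the lesson's own solution. It assumes each review is a (product_id, review_year, star_rating) tuple and that "top product" means the product with the most reviews in the selected year; the sample data and all names are hypothetical. In PySpark, a helper like median_rating is the kind of function that could be wrapped with pyspark.sql.functions.udf and applied to a collected list of ratings.

```python
from collections import Counter
from statistics import median

# Hypothetical sample data: (product_id, review_year, star_rating).
reviews = [
    ("A", 2016, 5), ("A", 2016, 4), ("A", 2016, 3),
    ("B", 2016, 1), ("B", 2016, 2),
    ("A", 2015, 1), ("C", 2016, 5),
]

def median_rating(ratings):
    """Return the median of a list of ratings as a float, or None if empty."""
    return float(median(ratings)) if ratings else None

def compare_medians(rows, year):
    """Return (top_product, overall_median, top_product_median) for a year."""
    # Filter: keep only reviews written in the selected year.
    in_year = [(product, rating) for product, y, rating in rows if y == year]
    if not in_year:
        return None, None, None
    # Group: count reviews per product and pick the most-reviewed one
    # (assumed definition of "top product").
    counts = Counter(product for product, _ in in_year)
    top_product = counts.most_common(1)[0][0]
    # Aggregate: median over all reviews vs. median over the top product only.
    overall = median_rating([rating for _, rating in in_year])
    top = median_rating([r for product, r in in_year if product == top_product])
    return top_product, overall, top

top_product, overall_median, top_median = compare_medians(reviews, 2016)
```

The same three steps (filter by year, group by product to find the top one, take the median of each set of ratings) carry over to a DataFrame pipeline; only the mechanics of filtering and grouping change.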