Problem Statement and Metrics

Explore how to frame the problem of rental search ranking by predicting bookings using a classification model. Understand the importance of choosing appropriate offline metrics such as normalized discounted cumulative gain (nDCG) and online metrics like conversion rate. Learn techniques to handle imbalanced data and design training, validation, and inference strategies for a practical ranking system.

Airbnb rental search ranking

1. Problem statement

Airbnb users often search for homes in a specific location. The platform must return relevant stays—but more than that, it must return homes users are likely to book. So the goal of the ranking system is simple:

Rank the homes such that those most likely to be booked appear higher in the search results.

A naive method might use keyword matching or hand-crafted scoring—like sorting based on similarity between the query and listing descriptions. But this fails in practice. Text similarity might show results that “sound good,” but don’t necessarily lead to bookings.

Instead, we want a data-driven approach. If we could estimate the likelihood of a user booking a given listing, we could rank by that likelihood. That brings us to the core idea:

Train a supervised machine learning model that learns from historical user sessions and predicts whether a listing will be booked. This becomes a binary classification task: booked vs. not booked.

Why binary classification?

Our outcome is binary (booked or not).

It allows flexibility in evaluating ranking quality, analyzing user behavior, and optimizing for downstream metrics like revenue.

Alternative methods like regression (predicting booking probability directly) could work, but classification gives more control over the trade-off between false positives and false negatives, which is important in high-stakes ranking.
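To make the framing in this section concrete, here is a minimal sketch of ranking by predicted booking probability. It is not the production approach: the feature names, the weights in `WEIGHTS`, and the `BIAS` value are hypothetical stand-ins for what a classifier would learn from historical sessions labeled booked (1) or not booked (0), and a simple logistic scorer is assumed in place of whatever model is actually used.

```python
import math

# Hypothetical weights for illustration only. A real model would learn these
# from historical search sessions labeled booked (1) vs. not booked (0).
WEIGHTS = {"price_norm": -1.2, "review_norm": 2.0, "distance_norm": -0.8}
BIAS = -1.0


def booking_probability(features):
    """Logistic score: sigmoid of a weighted sum of the listing's features."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_listings(listings):
    """Return listings sorted by predicted booking probability, highest first."""
    return sorted(
        listings,
        key=lambda listing: booking_probability(listing["features"]),
        reverse=True,
    )


# Three made-up listings with features already scaled to [0, 1].
listings = [
    {"id": "A", "features": {"price_norm": 0.9, "review_norm": 0.5, "distance_norm": 0.7}},
    {"id": "B", "features": {"price_norm": 0.4, "review_norm": 0.9, "distance_norm": 0.2}},
    {"id": "C", "features": {"price_norm": 0.2, "review_norm": 0.3, "distance_norm": 0.5}},
]

ranked = rank_listings(listings)
print([listing["id"] for listing in ranked])  # ['B', 'C', 'A']
```

The point of the sketch is the separation of concerns: the classifier only has to produce a per-listing probability, and the ranking itself is a sort on that score.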

2. Metrics design and requirements

Metrics

Designing the right metrics is just as important as choosing the algorithm. The wrong metric can optimize the wrong behavior.

We break down metrics into two buckets: offline metrics (evaluated on historical data during model development) and online metrics (measured in production).

Offline metrics

Normalized discounted cumulative gain (nDCG): nDCG is a standard metric in ranking problems where position matters. It gives higher weight to correct predictions near the top of the list, which is exactly what we want in search ranking.

Why nDCG?

Users rarely scroll through all results. A relevant result at position 2 is more valuable than at position 10.

It accounts for both relevance and position, unlike basic accuracy or AUC.

It reflects user satisfaction better than simple classification metrics like precision or recall.

DCG_{p} = \sum_{i=1}^{p} \frac{rel_{i}}{\log_{2}(i+1)}
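As a rough illustration of the formula, the sketch below computes DCG and then normalizes it. Here rel_i is taken to be 1 if the listing at position i was booked and 0 otherwise, which is an assumption for this example. The normalization step (dividing by the DCG of the ideal ordering) is the standard definition of nDCG rather than something spelled out above, and the function names `dcg_at_p` and `ndcg_at_p` are hypothetical.

```python
import math


def dcg_at_p(relevances, p):
    """DCG_p: sum of rel_i / log2(i + 1) over positions i = 1..p."""
    return sum(
        rel / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)
    )


def ndcg_at_p(relevances, p):
    """nDCG_p: DCG_p divided by the DCG_p of the ideal (sorted) ordering."""
    ideal = dcg_at_p(sorted(relevances, reverse=True), p)
    # If nothing in the list is relevant, define nDCG as 0 to avoid dividing by 0.
    return dcg_at_p(relevances, p) / ideal if ideal > 0 else 0.0


# One booked listing (rel = 1) among ten results, at position 2 vs. position 10.
booked_at_2 = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
booked_at_10 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(round(ndcg_at_p(booked_at_2, 10), 3))   # 0.631
print(round(ndcg_at_p(booked_at_10, 10), 3))  # 0.289
```

The same booked listing scores roughly twice as high at position 2 as at position 10, which is the position sensitivity that accuracy and AUC do not capture.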
