Training Data Generation
Explore how training data for dynamic pricing engines is generated from logged pricing decisions, contextual factors, and observed outcomes. Understand the complexities of selection bias, counterfactual gaps, and delayed signals. This lesson helps you grasp how models learn from decision-conditioned data rather than traditional supervised labels, preparing you to design and evaluate pricing systems effectively.
We'll cover the following...
- Pricing as a logged decision-making process
- Pointwise training data for pricing
- The counterfactual gap in pricing data
- Selection bias and historical pricing policies
- Exploration and experimental data
- Synthetic data and cold-start products
- Data validation and business sanity checks
- Interview questions and answers
Dynamic pricing systems don’t learn prices in isolation; they learn from past decisions made under real-world constraints and observe the resulting outcomes. Each training row encodes a historical pricing decision influenced by inventory, promotions, competitor behavior, regional rules, and risk tolerance. Training data is thus an active record of business logic, not just raw numbers.
Unlike classical supervised learning, pricing outcomes are contextual and conditional. A purchase at $20 doesn’t mean $20 was “correct”; a non-purchase doesn’t automatically mean the price was too high. Models must interpret each outcome in light of timing, user intent, stock levels, and the policies that were in effect when the price was set.
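To make this concrete, here is a minimal sketch (all field names and values hypothetical) of two logged rows that share the same price but carry opposite outcome signals. Read jointly with context, neither row says whether $20 was “correct”; each only says what $20 achieved in that specific situation:

```python
# Two hypothetical logged pricing rows: same price, different outcomes.
# Neither outcome is a ground-truth label for the price; each must be
# interpreted jointly with the context in which the price was shown.
rows = [
    {"price": 20.0, "purchased": True,
     "context": {"stock": 5, "promo_active": True, "hour": 20}},
    {"price": 20.0, "purchased": False,
     "context": {"stock": 500, "promo_active": False, "hour": 3}},
]

for r in rows:
    # A naive labeler would call 20.0 "good" in row 0 and "bad" in row 1.
    # A decision-conditioned view instead asks: given this context,
    # what did choosing 20.0 produce?
    print(r["price"], r["purchased"], r["context"])
```

The point of the sketch is that the “label” (purchased or not) is meaningless without the context columns sitting next to it in the same row.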
Fun fact: Some large e-commerce platforms spend more engineering effort on pricing data logging and validation than on model development itself, because once bad pricing data is learned, models can amplify errors at scale.
Historical prices often reflect human decisions, rules, or earlier models, creating selection bias. Without careful handling, models simply replicate past policies rather than discovering optimal pricing. Strong candidates in interviews highlight the importance of understanding how pricing data is generated, what constraints it encodes, and why naïve assumptions are dangerous.
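The replication problem is easy to demonstrate in a toy simulation. In the sketch below (the policy, thresholds, and prices are all hypothetical), every logged price comes from a deterministic old policy, so the logs contain no counterfactual prices for any context, and anything trained on them can only echo the policy:

```python
import random

random.seed(0)

# Hypothetical old pricing policy: price depends only on inventory level.
def old_policy(inventory):
    return 25.0 if inventory < 50 else 15.0

# Build the logged training data. Because the old policy is the *only*
# source of prices, price and inventory are perfectly confounded.
logs = []
for _ in range(1000):
    inv = random.randint(0, 100)
    logs.append((inv, old_policy(inv)))

# Any model "predicting price from context" on these logs can only
# reproduce the old policy: for low-inventory contexts it has never
# observed any price other than 25.0.
low_inv_prices = {price for inv, price in logs if inv < 50}
print(low_inv_prices)  # only {25.0}: no counterfactual prices were logged
```

Nothing in this dataset can tell a model whether 22.0 or 28.0 would have sold better at low inventory; that information was never generated, which is exactly the selection bias the text describes.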
Pricing as a logged decision-making process
At its core, dynamic pricing is not a prediction problem; it is a decision-learning problem. Every row of training data exists because someone or something chose a price. Unlike traditional supervised learning, where labels exist independently of the model, pricing labels are created by decisions. This makes pricing fundamentally different from tasks like image classification or spam detection.
Each training example corresponds to a logged pricing decision made by a human operator, a rules engine, or a previous model. The data is not a passive observation of reality; it is the record of an action taken under uncertainty. This is why pricing data must always be interpreted as decision-conditioned evidence, not ground truth.
Fun fact: Many real-world pricing models are trained using supervised learning, but their data structure is identical to reinforcement learning logs: (state, action, reward), even if teams don’t explicitly call it that.
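A minimal sketch of that log structure might look like the following (the schema and field names are illustrative, not from any specific system). Each row is a (state, action, reward) triple, even if a team stores it as an ordinary training table:

```python
from dataclasses import dataclass

# Hypothetical schema for one logged pricing decision.
@dataclass
class PricingLog:
    context: dict   # state: what was observable when the price was set
    price: float    # action: the single price actually shown
    outcome: float  # reward: e.g., realized revenue (0.0 if no purchase)

row = PricingLog(
    context={"inventory": 42, "hour": 20,
             "competitor_price": 21.99, "promo_active": False},
    price=19.99,
    outcome=19.99,  # one unit sold at the shown price
)
print((row.price, row.outcome))
```

Note that the row records only the one price that was shown; the outcomes of every alternative price are simply absent, which is the counterfactual gap discussed below.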
Every pricing decision can be decomposed into three essential components:
Context represents the observable state of the world at the moment the price was set. This includes factors such as inventory levels, time of day, seasonality, competitor prices, user segment, device type, and active promotions. Context defines what information was available at the time the decision was made. If relevant context is missing or logged incorrectly, the model will infer spurious relationships.
Action is the price that was actually chosen. Importantly, this is just one option among many possible prices. The model does not observe alternative actions that could have been taken. This single-action logging is the root cause of counterfactual uncertainty in pricing systems.
Outcome ...