Solution: In No Particular Order
Explore various approaches to selecting random rows in SQL without causing costly table scans or sorts. Understand the benefits and limitations of methods like choosing random keys, using offsets, and leveraging database-specific sampling features to improve query efficiency.
The sort-by-random technique is an example of a query that’s bound to perform a table scan and an expensive manual sort. When we design solutions in SQL, we should be on the lookout for inefficient queries like this. Instead of searching fruitlessly for a way to optimize an unoptimizable query, we need to rethink our approach. We can use the alternative techniques shown in the following sections to query a random row from a query result set. In different circumstances, each of these solutions can produce the same result with greater efficiency than sort-by-random.
Choose a random key-value between 1 and MAX
One technique that avoids sorting the table is to choose a random value between 1 and the greatest primary key value.
This solution assumes that:
- the primary key values start at 1
- the primary key values are contiguous, that is, there are no values unused between 1 and the greatest value. If there are gaps, a randomly chosen value