Trusted answers to developer questions

Ani Tumanyan

Given a singly linked list of unknown length, return a random node from the list by only one traversal. Each node must have the same probability of being chosen.

This problem is a popular interview question. The approach discussed below is a useful technique in data streaming applications when data is too large to fit in the memory.

We will use the reservoir sampling technique with a reservoir size equal to 1.

The algorithm consists of the following steps:

- Initialize the result node with the head node of the list.
- Track number of the processed elements. After the first step
$i = 1$ , the head node is processed. - Iterate over the elements in the list:
- Generate uniformly distributed random number
$r \in [0, i];$ - If the generated number
$r$ equals$0$ , then select the current node as a result node. - Increment the number of processed elements:
$i = i + 1;$ - Move to the next element of the list.

After these steps, the resulting node will be a random node chosen with equal probability from a given list. See the proof of correctness** **below:

Let

- Time complexity is
$O(n)$ as we iterate over the whole list.

- The Space complexity is
$O(1)$ as constant auxiliary space is used.

Let's look at the code below:

#include <random> //get random number in a given range int getRandomNumber(int left, int right) { //C++11's random number generation facilities std::uniform_int_distribution<> distrib(left, right); std::mt19937 randomGen; return distrib(randomGen); } //get random node from a given linked list in a single pass ListNode* getRandomNode(ListNode* head) { ListNode* resultNode = head; int i = 1; ListNode* currNode = head->next; while(currNode != nullptr) { int randomInd = getRandom(0, i); if(randomInd == 0) resultNode = currNode; ++i; currNode = currNode->next; } return resultNode; }

Let

- During iteration over the
$i$ -th node, there are already$i-1$ processed elements. We get a random number from$0$ to$i - 1$ , so there are$i$ values to be chosen with equal probability.

- The probability that generated random number equals
$0$ is$1/i$ , so the$i$ -th node is selected with probability$1/i$ .

- At the end of the whole process, the
$i$ -th node will remain selected if it is not replaced by further nodes during iteration, which means further nodes haven't been selected.

- For
$(i+1)$ -th node, the probability of not being selected is$1 - 1/(i+1)$ , for$(i+2)$ -th node, it is,$(i+2) n$ - th node$- \; 1 - 1/n$ . The probability that$i$ -th node remains selected after further iteration is a product of these probabilities:$(1 - \frac{1}{i+1})(1 - \frac{1}{i+2}) \dots (1 - \frac{1}{n}) = \frac{i}{n}.$

- So, the probability for
$i$ -th node to be selected as a result node is:

This proves that each node is returned with equal probability** **

RELATED TAGS

algorithms

coding interview

c++

data structures

CONTRIBUTOR

Ani Tumanyan

RELATED COURSES

View all Courses

Keep Exploring

Related Courses