What is reservoir sampling?
Reservoir sampling is a randomized algorithm that is used to select out of samples; is usually very large or unknown. For example, reservoir sampling can be used to obtain a sample of size from a population of people with brown hair. This algorithm takes to select elements with uniform probability.
Algorithm
-
Copy the first elements from the input array to the output array.
-
Iterate from to (both inclusive). In each iteration :
2.1 Generate a random number from 0 to .
2.2 If is less than , replace the element at index in the output array with the item at index in the input array.
Code
// Including dependancies#include <iostream>#include <stdlib.h>#include <time.h>using namespace std;int main() {// Defining the parametersint k = 4;int n = 8;// The array to be sampledint input[] = {1, 7, 4, 8, 2, 6, 5, 9};int output[k];// Getting a random seed everytimesrand (time(NULL));int i;// Initializing the output array to the first k elements// of the input array.for(i = 0; i < k; i++){output[i] = input[i];}int j;// Iterating over k to n-1for(j = i; j < n; j++){// Generating a random number from 0 to jint index = rand() % (j + 1);// Replacing an element in the output with an element// in the input if the randomly generated number is less// than k.if(index < k){output[index] = input[j];}}cout<<"Input array:"<<endl;for( i = 0; i < n; i++){cout<<input[i]<<" ";}cout<<endl;cout<<"Output array:"<<endl;for(i = 0; i < k; i++){cout<<output[i]<<" ";}cout<<endl;return 0;}
Free Resources
Copyright ©2025 Educative, Inc. All rights reserved