Reservoir sampling is a family of randomized algorithms for randomly choosing a
sample of size **k** from a set **S** that contains **n** items, where **n** is
either very large or unknown. The answer for question 2-43 of the second edition
of The Algorithm Design Manual by Steve S. Skiena is a reservoir sampling
algorithm. The problem statement is as follows:

You are given a set S of n numbers. You must pick a subset S’ of k numbers from S such that the probability of each element of S occurring in S’ is equal (i.e., each is selected with probability k/n). You may make only one pass over the numbers. What if n is unknown?

This is my implementation (using Python 3) of an algorithm that would solve this problem in Θ(n) time.

```
def random_sample(iterator, k):
sample = []
elements_so_far = 0
for item in iterator:
elements_so_far += 1
if len(sample) < k:
sample.append(item)
else:
index = int(random.random() * elements_so_far)
if index < k:
sample[index] = item
return sample
```

- Wikipedia
- The Algorithm Design Manual (2nd edition)