# algorithm - two - in an array 1 100 numbers are stored one number is missing

## Easy interview question got harder: given numbers 1..100, find the missing number(s) (20)

As @j_random_hacker pointed out, this is quite similar to Finding duplicates in O(n) time and O(1) space, and an adaptation of my answer there works here too.

Assuming that the "bag" is represented by a 1-based array A[] of size N - k, we can solve Qk in O(N) time and O(k) additional space.

First, we extend our array A[] by k elements, so that it is now of size N. This is the O(k) additional space. We then run the following pseudo-code algorithm:

for i := n - k + 1 to n
A[i] := A[1]
end for

for i := 1 to n - k
while A[A[i]] != A[i]
swap(A[i], A[A[i]])
end while
end for

for i := 1 to n
if A[i] != i then
print i
end if
end for


The first loop initialises the k extra entries to the same as the first entry in the array (this is just a convenient value that we know is already present in the array - after this step, any entries that were missing in the initial array of size N-k are still missing in the extended array).

The second loop permutes the extended array so that if element x is present at least once, then one of those entries will be at position A[x].

Note that although it has a nested loop, it still runs in O(N) time - a swap only occurs if there is an i such that A[i] != i, and each swap sets at least one element such that A[i] == i, where that wasn't true before. This means that the total number of swaps (and thus the total number of executions of the while loop body) is at most N-1.

The third loop prints those indexes of the array i that are not occupied by the value i - this means that i must have been missing.

I had an interesting job interview experience a while back. The question started really easy:

Q1: We have a bag containing numbers 1, 2, 3, …, 100. Each number appears exactly once, so there are 100 numbers. Now one number is randomly picked out of the bag. Find the missing number.

I've heard this interview question before, of course, so I very quickly answered along the lines of:

A1: Well, the sum of the numbers 1 + 2 + 3 + … + N is (N+1)(N/2) (see Wikipedia: sum of arithmetic series). For N = 100, the sum is 5050.

Thus, if all numbers are present in the bag, the sum will be exactly 5050. Since one number is missing, the sum will be less than this, and the difference is that number. So we can find that missing number in O(N) time and O(1) space.

At this point I thought I had done well, but all of a sudden the question took an unexpected turn:

Q2: That is correct, but now how would you do this if TWO numbers are missing?

I had never seen/heard/considered this variation before, so I panicked and couldn't answer the question. The interviewer insisted on knowing my thought process, so I mentioned that perhaps we can get more information by comparing against the expected product, or perhaps doing a second pass after having gathered some information from the first pass, etc, but I really was just shooting in the dark rather than actually having a clear path to the solution.

The interviewer did try to encourage me by saying that having a second equation is indeed one way to solve the problem. At this point I was kind of upset (for not knowing the answer before hand), and asked if this is a general (read: "useful") programming technique, or if it's just a trick/gotcha answer.

The interviewer's answer surprised me: you can generalize the technique to find 3 missing numbers. In fact, you can generalize it to find k missing numbers.

Qk: If exactly k numbers are missing from the bag, how would you find it efficiently?

This was a few months ago, and I still couldn't figure out what this technique is. Obviously there's a Ω(N) time lower bound since we must scan all the numbers at least once, but the interviewer insisted that the TIME and SPACE complexity of the solving technique (minus the O(N) time input scan) is defined in k not N.

So the question here is simple:

• How would you solve Q2?
• How would you solve Q3?
• How would you solve Qk?

### Clarifications

• Generally there are N numbers from 1..N, not just 1..100.
• I'm not looking for the obvious set-based solution, e.g. using a bit set, encoding the presence/absence each number by the value of a designated bit, therefore using O(N) bits in additional space. We can't afford any additional space proportional to N.
• I'm also not looking for the obvious sort-first approach. This and the set-based approach are worth mentioning in an interview (they are easy to implement, and depending on N, can be very practical). I'm looking for the Holy Grail solution (which may or may not be practical to implement, but has the desired asymptotic characteristics nevertheless).

So again, of course you must scan the input in O(N), but you can only capture small amount of information (defined in terms of k not N), and must then find the k missing numbers somehow.

Can you check if every number exists? If yes you may try this:

S = sum of all numbers in the bag (S < 5050)
Z = sum of the missing numbers 5050 - S

if the missing numbers are x and y then:

x = Z - y and
max(x) = Z - 1

So you check the range from 1 to max(x) and find the number

Here's a solution that uses k bits of extra storage, without any clever tricks and just straightforward. Execution time O (n), extra space O (k). Just to prove that this can be solved without reading up on the solution first or being a genius:

void puzzle (int* data, int n, bool* extra, int k)
{
// data contains n distinct numbers from 1 to n + k, extra provides
// space for k extra bits.

// Rearrange the array so there are (even) even numbers at the start
// and (odd) odd numbers at the end.
int even = 0, odd = 0;
while (even + odd < n)
{
if (data [even] % 2 == 0) ++even;
else if (data [n - 1 - odd] % 2 == 1) ++odd;
else { int tmp = data [even]; data [even] = data [n - 1 - odd];
data [n - 1 - odd] = tmp; ++even; ++odd; }
}

// Erase the lowest bits of all numbers and set the extra bits to 0.
for (int i = even; i < n; ++i) data [i] -= 1;
for (int i = 0; i < k; ++i) extra [i] = false;

// Set a bit for every number that is present
for (int i = 0; i < n; ++i)
{
int tmp = data [i];
tmp -= (tmp % 2);
if (i >= odd) ++tmp;
if (tmp <= n) data [tmp - 1] += 1; else extra [tmp - n - 1] = true;
}

// Print out the missing ones
for (int i = 1; i <= n; ++i)
if (data [i - 1] % 2 == 0) printf ("Number %d is missing\n", i);
for (int i = n + 1; i <= n + k; ++i)
if (! extra [i - n - 1]) printf ("Number %d is missing\n", i);

// Restore the lowest bits again.
for (int i = even; i < n; ++i) data [i] += 1;
}


Here's a summary of Dimitris Andreou's link.

Remember sum of i-th powers, where i=1,2,..,k. This reduces the problem to solving the system of equations

a1 + a2 + ... + ak = b1

a12 + a22 + ... + ak2 = b2

...

a1k + a2k + ... + akk = bk

Using Newton's identities, knowing bi allows to compute

c1 = a1 + a2 + ... ak

c2 = a1a2 + a1a3 + ... + ak-1ak

...

ck = a1a2 ... ak

If you expand the polynomial (x-a1)...(x-ak) the coefficients will be exactly c1, ..., ck - see Viète's formulas. Since every polynomial factors uniquely (ring of polynomials is an Euclidean domain), this means ai are uniquely determined, up to permutation.

This ends a proof that remembering powers is enough to recover the numbers. For constant k, this is a good approach.

However, when k is varying, the direct approach of computing c1,...,ck is prohibitely expensive, since e.g. ck is the product of all missing numbers, magnitude n!/(n-k)!. To overcome this, perform computations in Zq field, where q is a prime such that n <= q < 2n - it exists by Bertrand's postulate. The proof doesn't need to be changed, since the formulas still hold, and factorization of polynomials is still unique. You also need an algorithm for factorization over finite fields, for example the one by Berlekamp or Cantor-Zassenhaus.

High level pseudocode for constant k:

• Compute i-th powers of given numbers
• Subtract to get sums of i-th powers of unknown numbers. Call the sums bi.
• Use Newton's identities to compute coefficients from bi; call them ci. Basically, c1 = b1; c2 = (c1b1 - b2)/2; see Wikipedia for exact formulas
• Factor the polynomial xk-c1xk-1 + ... + ck.
• The roots of the polynomial are the needed numbers a1, ..., ak.

For varying k, find a prime n <= q < 2n using e.g. Miller-Rabin, and perform the steps with all numbers reduced modulo q.

As Heinrich Apfelmus commented, instead of a prime q you can use q=2⌈log n⌉ and perform arithmetic in finite field.

I believe I have a O(k) time and O(log(k)) space algorithm, given that you have the floor(x) and log2(x) functions for arbitrarily big integers available:

You have an k-bit long integer (hence the log8(k) space) where you add the x^2, where x is the next number you find in the bag: s=1^2+2^2+... This takes O(N) time (which is not a problem for the interviewer). At the end you get j=floor(log2(s)) which is the biggest number you're looking for. Then s=s-j and you do again the above:

for (i = 0 ; i < k ; i++)
{
j = floor(log2(s));
missing[i] = j;
s -= j;
}


Now, you usually don't have floor and log2 functions for 2756-bit integers but instead for doubles. So? Simply, for each 2 bytes (or 1, or 3, or 4) you can use these functions to get the desired numbers, but this adds an O(N) factor to time complexity

I don't know whether this is efficient or not but I would like to suggest this solution.

1. Compute xor of the 100 elements
2. Compute xor of the 98 elements (after the 2 elements are removed)
3. Now (result of 1) XOR (result of 2) gives you the xor of the two missing nos i..e a XOR b if a and b are the missing elements
4.Get the sum of the missing Nos with your usual approach of the sum formula diff and lets say the diff is d.

Now run a loop to get the possible pairs (p,q) both of which lies in [1 , 100] and sum to d.

When a pair is obtained check whether (result of 3) XOR p = q and if yes we are done.

Please correct me if I am wrong and also comment on time complexity if this is correct

I haven't checked the maths, but I suspect that computing Σ(n^2) in the same pass as we compute Σ(n) would provide enough info to get two missing numbers, Do Σ(n^3) as well if there are three, and so on.

I think this can be done without any complex mathematical equations and theories. Below is a proposal for an in place and O(2n) time complexity solution:

Input form assumptions :

# of numbers in bag = n

# of missing numbers = k

The numbers in the bag are represented by an array of length n

Length of input array for the algo = n

Missing entries in the array (numbers taken out of the bag) are replaced by the value of the first element in the array.

Eg. Initially bag looks like [2,9,3,7,8,6,4,5,1,10]. If 4 is taken out, value of 4 will become 2 (the first element of the array). Therefore after taking 4 out the bag will look like [2,9,3,7,8,6,2,5,1,10]

The key to this solution is to tag the INDEX of a visited number by negating the value at that INDEX as the array is traversed.

    IEnumerable<int> GetMissingNumbers(int[] arrayOfNumbers)
{
List<int> missingNumbers = new List<int>();
int arrayLength = arrayOfNumbers.Length;

//First Pass
for (int i = 0; i < arrayLength; i++)
{
int index = Math.Abs(arrayOfNumbers[i]) - 1;
if (index > -1)
{
arrayOfNumbers[index] = Math.Abs(arrayOfNumbers[index]) * -1; //Marking the visited indexes
}
}

//Second Pass to get missing numbers
for (int i = 0; i < arrayLength; i++)
{
//If this index is unvisited, means this is a missing number
if (arrayOfNumbers[i] > 0)
{
missingNumbers.Add(i + 1);
}
}

return missingNumbers;
}


I'd take a different approach to that question and probe the interviewer for more details about the larger problem he's trying to solve. Depending on the problem and the requirements surrounding it, the obvious set-based solution might be the right thing and the generate-a-list-and-pick-through-it-afterward approach might not.

For example, it might be that the interviewer is going to dispatch n messages and needs to know the k that didn't result in a reply and needs to know it in as little wall clock time as possible after the n-kth reply arrives. Let's also say that the message channel's nature is such that even running at full bore, there's enough time to do some processing between messages without having any impact on how long it takes to produce the end result after the last reply arrives. That time can be put to use inserting some identifying facet of each sent message into a set and deleting it as each corresponding reply arrives. Once the last reply has arrived, the only thing to be done is to remove its identifier from the set, which in typical implementations takes O(log k+1). After that, the set contains the list of k missing elements and there's no additional processing to be done.

This certainly isn't the fastest approach for batch processing pre-generated bags of numbers because the whole thing runs O((log 1 + log 2 + ... + log n) + (log n + log n-1 + ... + log k)). But it does work for any value of k (even if it's not known ahead of time) and in the example above it was applied in a way that minimizes the most critical interval.

May be this algorithm can work for question 1:

1. Precompute xor of first 100 integers(val=1^2^3^4....100)
2. xor the elements as they keep coming from input stream ( val1=val1^next_input)
3. final answer=val^val1

Or even better:

def GetValue(A)
for i=1 to 100
do
val=val^i
done
for value in A:
do
val=val^value
done
return val


This algorithm can in fact be expanded for two missing numbers. The first step remains the same. When we call GetValue with two missing numbers the result will be a a1^a2 are the two missing numbers. Lets say

val = a1^a2

Now to sieve out a1 and a2 from val we take any set bit in val. Lets say the ith bit is set in val. That means that a1 and a2 have different parity at ith bit position. Now we do another iteration on the original array and keep two xor values. One for the numbers which have the ith bit set and other which doesn't have the ith bit set. We now have two buckets of numbers, and its guranteed that a1 and a2 will lie in different buckets. Now repeat the same what we did for finding one missing element on each of the bucket.

The problem with solutions based on sums of numbers is they don't take into account the cost of storing and working with numbers with large exponents... in practice, for it to work for very large n, a big numbers library would be used. We can analyse the space utilisation for these algorithms.

We can analyse the time and space complexity of sdcvvc and Dimitris Andreou's algorithms.

Storage:

l_j = ceil (log_2 (sum_{i=1}^n i^j))
l_j > log_2 n^j  (assuming n >= 0, k >= 0)
l_j > j log_2 n \in \Omega(j log n)

l_j < log_2 ((sum_{i=1}^n i)^j) + 1
l_j < j log_2 (n) + j log_2 (n + 1) - j log_2 (2) + 1
l_j < j log_2 n + j + c \in O(j log n)


So l_j \in \Theta(j log n)

Total storage used: \sum_{j=1}^k l_j \in \Theta(k^2 log n)

Space used: assuming that computing a^j takes ceil(log_2 j) time, total time:

t = k ceil(\sum_i=1^n log_2 (i)) = k ceil(log_2 (\prod_i=1^n (i)))
t > k log_2 (n^n + O(n^(n-1)))
t > k log_2 (n^n) = kn log_2 (n)  \in \Omega(kn log n)
t < k log_2 (\prod_i=1^n i^i) + 1
t < kn log_2 (n) + 1 \in O(kn log n)


Total time used: \Theta(kn log n)

If this time and space is satisfactory, you can use a simple recursive algorithm. Let b!i be the ith entry in the bag, n the number of numbers before removals, and k the number of removals. In Haskell syntax...

let
-- O(1)
isInRange low high v = (v >= low) && (v <= high)
-- O(n - k)
countInRange low high = sum $map (fromEnum . isInRange low high . (!)b) [1..(n-k)] findMissing l low high krange -- O(1) if there is nothing to find. | krange=0 = l -- O(1) if there is only one possibility. | low=high = low:l -- Otherwise total of O(knlog(n)) time | otherwise = let mid = (low + high) div 2 klow = countInRange low mid khigh = krange - klow in findMissing (findMissing low mid klow) (mid + 1) high khigh in findMising 1 (n - k) k  Storage used: O(k) for list, O(log(n)) for stack: O(k + log(n)) This algorithm is more intuitive, has the same time complexity, and uses less space. There is a general way to generalize streaming algorithms like this. The idea is to use a bit of randomization to hopefully 'spread' the k elements into independent sub problems, where our original algorithm solves the problem for us. This technique is used in sparse signal reconstruction, among other things. • Make an array, a, of size u = k^2. • Pick any universal hash function, h : {1,...,n} -> {1,...,u}. (Like multiply-shift) • For each i in 1, ..., n increase a[h(i)] += i • For each number x in the input stream, decrement a[h(x)] -= x. If all of the missing numbers have been hashed to different buckets, the non-zero elements of the array will now contain the missing numbers. The probability that a particular pair is sent to the same bucket, is less than 1/u by definition of a universal hash function. Since there are about k^2/2 pairs, we have that the error probability is at most k^2/2/u=1/2. That is, we succeed with probability at least 50%, and if we increase u we increase our chances. Notice that this algorithm takes k^2 logn bits of space (We need logn bits per array bucket.) This matches the space required by @Dimitris Andreou's answer (In particular the space requirement of polynomial factorization, which happens to also be randomized.) This algorithm also has constant time per update, rather than time k in the case of power-sums. In fact, we can be even more efficient than the power sum method by using the trick described in the comments. To solve the 2 (and 3) missing numbers question, you can modify quickselect, which on average runs in O(n) and uses constant memory if partitioning is done in-place. 1. Partition the set with respect to a random pivot p into partitions l, which contain numbers smaller than the pivot, and r, which contain numbers greater than the pivot. 2. Determine which partitions the 2 missing numbers are in by comparing the pivot value to the size of each partition (p - 1 - count(l) = count of missing numbers in l and n - count(r) - p = count of missing numbers in r) 3. a) If each partition is missing one number, then use the difference of sums approach to find each missing number. (1 + 2 + ... + (p-1)) - sum(l) = missing #1 and ((p+1) + (p+2) ... + n) - sum(r) = missing #2 b) If one partition is missing both numbers and the partition is empty, then the missing numbers are either (p-1,p-2) or (p+1,p+2) depending on which partition is missing the numbers. If one partition is missing 2 numbers but is not empty, then recurse onto that partiton. With only 2 missing numbers, this algorithm always discards at least one partition, so it retains O(n) average time complexity of quickselect. Similarly, with 3 missing numbers this algorithm also discards at least one partition with each pass (because as with 2 missing numbers, at most only 1 partition will contain multiple missing numbers). However, I'm not sure how much the performance decreases when more missing numbers are added. Here's an implementation that does not use in-place partitioning, so this example does not meet the space requirement but it does illustrate the steps of the algorithm: <?php$list = range(1,100);
unset($list[3]); unset($list[31]);

findMissing($list,1,100); function findMissing($list, $min,$max) {
if(empty($list)) { print_r(range($min, $max)); return; }$l = $r = [];$pivot = array_pop($list); foreach($list as $number) { if($number < $pivot) {$l[] = $number; } else {$r[] = $number; } } if(count($l) == $pivot -$min - 1) {
// only 1 missing number use difference of sums
print array_sum(range($min,$pivot-1)) - array_sum($l) . "\n"; } else if(count($l) < $pivot -$min) {
// more than 1 missing number, recurse
findMissing($l,$min, $pivot-1); } if(count($r) == $max -$pivot - 1) {
// only 1 missing number use difference of sums
print array_sum(range($pivot + 1,$max)) - array_sum($r) . "\n"; } else if(count($r) < $max -$pivot) {
// mroe than 1 missing number recurse
findMissing($r,$pivot+1, \$max);
}
}


Demo

Try to find the product of numbers from 1 to 50:

Let product, P1 = 1 x 2 x 3 x ............. 50

When you take out numbers one by one, multiply them so that you get the product P2. But two numbers are missing here, hence P2 < P1.

The product of the two mising terms, a x b = P1 - P2.

You already know the sum, a + b = S1.

From the above two equations, solve for a and b through a quadratic equation. a and b are your missing numbers.

Wait a minute. As the question is stated, there are 100 numbers in the bag. No matter how big k is, the problem can be solved in constant time because you can use a set and remove numbers from the set in at most 100 - k iterations of a loop. 100 is constant. The set of remaining numbers is your answer.

If we generalise the solution to the numbers from 1 to N, nothing changes except N is not a constant, so we are in O(N - k) = O(N) time. For instance, if we use a bit set, we set the bits to 1 in O(N) time, iterate through the numbers, setting the bits to 0 as we go (O(N-k) = O(N)) and then we have the answer.

It seems to me that the interviewer was asking you how to print out the contents of the final set in O(k) time rather than O(N) time. Clearly, with a bit set, you have to iterate through all N bits to determine whether you should print the number or not. However, if you change the way the set is implemented you can print out the numbers in k iterations. This is done by putting the numbers into an object to be stored in both a hash set and a doubly linked list. When you remove an object from the hash set, you also remove it from the list. The answers will be left in the list which is now of length k.

We can do the Q1 and Q2 in O(log n) most of the time.

Suppose our memory chip consists of array of n number of test tubes. And a number x in the the test tube is represented by x milliliter of chemical-liquid.

Suppose our processor is a laser light. When we light up the laser it traverses all the tubes perpendicularly to it's length. Every-time it passes through the chemical liquid, the luminosity is reduced by 1. And passing the light at certain milliliter mark is an operation of O(1).

Now if we light our laser at the middle of the test-tube and get the output of luminosity

• equals to a pre-calculated value(calculated when no numbers were missing), then the missing numbers are greater than n/2 .
• If our output is smaller, then there is at least one missing number that is smaller than n/2 . We can also check if the luminosity is reduced by 1 or 2. if it is reduced by 1 then one missing number is smaller than n/2 and other is bigger than n/2 . If it is reduced by 2 then both numbers are smaller than n/2 .

We can repeat the above process again and again narrowing down our problem domain. In each step, we make the domain smaller by half. And finally we can get to our result.

Parallel algorithms that are worth mentioning(because they are interesting),

• sorting by some parallel algorithm, for example, parallel merge can be done in O(log^3 n) time. And then the missing number can be found by binary search in O(log n) time.
• Theoretically, if we have n processors then each process can check one of the inputs and set some flag that identifies the number(conveniently in an array). And in the next step each process can check each flag and finally output the number that is not flagged. The whole process will take O(1) time. It has additional O(n) space/memory requirement.

Note, that the two parallel algorithms provided above may need additional space as mentioned in the comment.

Yet another way is using residual graph filtering.

Suppose we have numbers 1 to 4 and 3 is missing. The binary representation is the following,

1 = 001b, 2 = 010b, 3 = 011b, 4 = 100b

And I can create a flow-graph like the following.

                   1
1 -------------> 1
|                |
2      |     1          |
0 ---------> 1 ----------> 0  |
|                          |  |
|     1            1       |  |
0 ---------> 0 ----------> 0  |
|                |
1      |      1         |
1 ---------> 0 -------------> 1


Note that the flow graph contains x nodes, while x being the number of bits. And the maximum number of edges are (2*x)-2 .

So for 32 bit integer it will take O(32) space or O(1) space.

Now if I remove capacity for each number starting from 1,2,4 then I am left with a residual graph.

0 ----------> 1 ---------> 1


Finally I shall run a loop like the following,

 result = []
for x in range(1,n):
exists_path_in_residual_graph(x)
result.append(x)


Now the result is in result contains numbers that are not missing as well(false positive). But the k <= (size of the result) <= n when there are k missing elements.

I shall go through the given list one last time to mark the result missing or not.

So the time complexity will be O(n) .

Finally, it is possible to reduce the number of false positive(and the space required) by taking nodes 00,01,11,10 instead of just 0 and 1.

You can motivate the solution by thinking about it in terms of symmetries (groups, in math language). No matter the order of the set of numbers, the answer should be the same. If you're going to use k functions to help determine the missing elements, you should be thinking about what functions have that property: symmetric. The function s_1(x) = x_1 + x_2 + ... + x_n is an example of a symmetric function, but there are others of higher degree. In particular, consider the elementary symmetric functions. The elementary symmetric function of degree 2 is s_2(x) = x_1 x_2 + x_1 x_3 + ... + x_1 x_n + x_2 x_3 + ... + x_(n-1) x_n, the sum of all products of two elements. Similarly for the elementary symmetric functions of degree 3 and higher. They are obviously symmetric. Furthermore, it turns out they are the building blocks for all symmetric functions.

You can build the elementary symmetric functions as you go by noting that s_2(x,x_(n+1)) = s_2(x) + s_1(x)(x_(n+1)). Further thought should convince you that s_3(x,x_(n+1)) = s_3(x) + s_2(x)(x_(n+1)) and so on, so they can be computed in one pass.

How do we tell which items were missing from the array? Think about the polynomial (z-x_1)(z-x_2)...(z-x_n). It evaluates to 0 if you put in any of the numbers x_i. Expanding the polynomial, you get z^n-s_1(x)z^(n-1)+ ... + (-1)^n s_n. The elementary symmetric functions appear here too, which is really no surprise, since the polynomial should stay the same if we apply any permutation to the roots.

So we can build the polynomial and try to factor it to figure out which numbers are not in the set, as others have mentioned.

Finally, if we are concerned about overflowing memory with large numbers (the nth symmetric polynomial will be of the order 100!), we can do these calculations mod p where p is a prime bigger than 100. In that case we evaluate the polynomial mod p and find that it again evaluates to 0 when the input is a number in the set, and it evaluates to a non-zero value when the input is a number not in the set. However, as others have pointed out, to get the values out of the polynomial in time that depends on k, not N, we have to factor the polynomial mod p`.

You could try using a Bloom Filter. Insert each number in the bag into the bloom, then iterate over the complete 1-k set until reporting each one not found. This may not find the answer in all scenarios, but might be a good enough solution.

You will find it by reading the couple of pages of Muthukrishnan - Data Stream Algorithms: Puzzle 1: Finding Missing Numbers. It shows exactly the generalization you are looking for. Probably this is what your interviewer read and why he posed these questions.

Now, if only people would start deleting the answers that are subsumed or superseded by Muthukrishnan's treatment, and make this text easier to find. :)

Also see sdcvvc's directly related answer, which also includes pseudocode (hurray! no need to read those tricky math formulations :)) (thanks, great work!).