Thursday, July 31, 2008

Uniform random numbers

Random numbers are used extensively in modern computing, for example they are used in testing programs, simulations, cryptology and in randomized algorithms like randomized primality testing.

True randomness on a computing machine is impossible, because the numbers are generated by an algorithm, but it is sufficient to produce numbers that appear to be random, these are called pseudo random numbers, they satisfy many of the properties of true random numbers. Algorithms used to generate random numbers are called pseudo random number generators (PRNG). To be sure that the numbers are "random enough", they must pass some statistical tests.

Uniformly distributed random numbers
A uniformly distributed random sequence A1, A2, ..., Am generated from X1, X2, ..., Xn satisfies:
  1. Ai is equally likely to be X1, X2, ..., Xn, i = 1, 2, ..., m.
  2. The average of the generated numbers is (∑i = 1 Ai)/ m.
After generating n numbers, the previous two properties hold, more stronger properties do not:
  1. The sum of any two consecutive numbers is equally likely to be even or odd.
  2. When n numbers are generated, there should be duplicates in the sequence.
Unfortunately, the number of times the sum of two consecutive integers is even is always more than the number of times the sum is odd, and the sequence is duplicate free. All pseudo random number generators fail some statistical tests.

The simplest uniform generator: Linear Congruential Generator
This generator was first introduced in 1951, it is a pseudo random number generator, the generated sequence X1, X2, ..., Xn satisfies:

Xi+1 = A[Xi (mod M)]

The (i+1)th number is generated by multiplying its previous number by a constant
A and computing the remainder of the division of the result be another constant M. It is easy to see that all of the generated numbers are less than M. To start generating the numbers, we must start with an initial value called the seed, say X0, but take care that if X0 = 0 then, you have a sequence of zeroes (which is not random).

If M is prime
Notice that if M is prime, then no zeroes are generated because, primes have no non-trivial divisors, let M = 7, A = 5 and X0 = 1, the first 8 numbers are:
5, 25, 20, 30, 10, 15, 5, 25, ...

value Notice that the sequence is repeated after 6 (which is M - 1) numbers are generated, this is called the period of the sequence, the choice of A matters, not all values of A will generated a full-period sequence, but several choices of A are valid. A full-period is achieved of all the non-zero integers less than M are generated.

To generate a long sequence with a full-period, choose M to be a large prime, very popular choice of A and A are 231 - 1 and 48,271.

Overflow problem
If we use 32-bit integers in our computation, overflow will occur after a sufficient number of values are generated, so, we would no longer be sure if a full-period is achieved. A solution to this problem is to use 64-bit integers (called long in Java), but this is inefficient computationally, a slight change in the main equation helps us overcome the overflow problem:

Xi+1 = A[Xi (mod ℑ)] - ℜ⌊Xi / ℑ⌋ + Mδ(Xi)

Where ℑ and ℜ are the quotient and the remainder of M / A.
  • δ(Xi) = 0, if A[Xi (mod ℑ)] - ℜ⌊Xi/ℑ⌋ ≥ 0
  • δ(Xi) = 0, if A[Xi (mod ℑ)] - ℜ⌊Xi/ℑ⌋ ≤ 0
When the difference between the first two terms is less than zero, then δ(Xi) is defined to be 1, and when the difference is greater than 0, δ(Xi) is defined to be 0.

Pseudo code
M ← 19
A ← 5
ℑ ← ⌊M/A⌋
ℜ ← M (mod A)
for i = 1 to i = n step 1
begin
  • Xi ← A×[Xi-1 (mod ℑ)] - ℜ×⌊Xi-1/ℑ⌋
  • if Xi ≤ 0 then Xi ← Xi + M
end

References
  1. Data Structures & Problem Solving Using Java (1998).
  2. Wikipedia.

2 comments:

Hany M. El-Hosseiny said...

Dear Math and CS,
I think the definition of uniform random sequence is wrong. You may have to write:
1- $X_1, \cdots, X_n$ take values in $\{1,\cdots, N\}$.
2- Each $X_k$ is equally likely to take any value in $\{1,\cdots, N\}$.
3- The average of $X_1,\cdots, X_n$ approaches $\frac{N+1}2$ as $n$ grows.

Hany M. El-Hosseiny said...

Also, the formula for generating random numbers ${\rm mod} M$ should be written (notice the parentheses)
$$X_{i+1}=(A X_i){\rm mod} M$$
Another mistake appears in the example with $M=7$, $A=5$ where you did not take the ${\rm mod} M$. You will actually have
$$ 5, 4, 6, 2, 3, 1, 5, 4, \cdots$$