'Normal' distributions...how to generate
Author Message
'Normal' distributions...how to generate

Does anybody have an algorithm (or C-code) for generating "normal"
distributions from a random number generator?
Where, say for any time t on t0-t1, some f(t) would be produced..
so a plot of all f(t)'s from t0-t1 produces a bell-shaped
normal distribution?

--
/**********************************************************

**********************************************************/

Mon, 08 Sep 1997 06:59:40 GMT
'Normal' distributions...how to generate

Quote:

> Does anybody have an algorithm (or C-code) for generating "normal"
> distributions from a random number generator?

Of all the questions that I've been too lazy or too ignorant or
too scatterbrained to add to the comp.lang.c FAQ list over the
years, this question ranks among the most pressing.  Finally (as
part of the preparation of the book-length version), I've pulled
together the three answers that are always suggested, whenever
this question comes up, and composed the following.  (Disclaimer:
I'm not a numerical analyst; part of my reason for posting this
here now is to let people poke holes in this code.)

Q: How can I generate random numbers with a normal or
Gaussian distribution?

A: There are a number of ways of doing this.

1.  Exploit the Central Limit Theorem ("law of large numbers")
and add up several uniformly-distributed random numbers:

#include <stdlib.h>

#define NSUM 12

double gaussrand()
{
double x = 0;
int i;
for(i = 0; i < NSUM; i++)
x += (double)rand() / RAND_MAX;

x -= NSUM / 2.;
x /= NSUM / 12.;

return x;
}

2.  Use a method described by Abramowitz and Stegun:

#include <stdlib.h>
#include <math.h>

#define PI 3.141592654

double gaussrand()
{
static double U, V;
static int phase = 0;
double Z;

if(phase == 0) {
U = (rand() + 1.) / (RAND_MAX + 2.);
V = rand() / (RAND_MAX + 1.);
Z = sqrt(-2 * log(U)) * sin(2 * PI * V);
} else
Z = sqrt(-2 * log(U)) * cos(2 * PI * V);

phase = 1 - phase;

return Z;
}

3.  Use a method described by Box and Muller, and
discussed in Knuth:

#include <stdlib.h>
#include <math.h>

double gaussrand()
{
static double V1, V2, S;
static int phase = 0;
double Z;

if(phase == 0) {
do {
double U1 = (double)rand() / RAND_MAX;
double U2 = (double)rand() / RAND_MAX;

V1 = 2 * U1 - 1;
V2 = 2 * U2 - 1;
S = V1 * V1 + V2 * V2;
} while(S >= 1);

Z = V1 * sqrt(-2 * log(S) / S);
} else
Z = V2 * sqrt(-2 * log(S) / S);

phase = 1 - phase;

return Z;
}

All three methods generate numbers with mean 0 and standard
deviation 1.  (To adjust to some other distribution, multiply by
the standard deviation and add the mean.)  Method 1 is poor "in
the tails" (especially if NSUM is small), but methods 2 and 3

References:
Knuth Sec. 3.4.1 p. 117.
G.E.P. Box, M.E. Muller, and G. Marsaglia,
Annals Math. Stat. 28 (1958), 610.
Abramowitz and Stegun, ???

Does anyone have the full citations for the Box, Muller, and
Marsaglia or Abramowitz and Stegun papers?

Steve Summit

Tue, 09 Sep 1997 01:09:11 GMT
'Normal' distributions...how to generate

Quote:
>  Does anybody have an algorithm (or C-code) for generating "normal"
>  distributions from a random number generator?
>  Where, say for any time t on t0-t1, some f(t) would be produced..
>  so a plot of all f(t)'s from t0-t1 produces a bell-shaped
>  normal distribution?
>--
>/**********************************************************

> **********************************************************/

Generate a sequence of uniform pseudo-random numbers in the range [t0,t1].
The mean of this sequence is a random variable with a normal distribution.

ie: If x1 and x2 are uniformly distributed random variables in [t0,t1], then
(x1+x2)/2 is a normally distributed random variable in [t0,t1].

Rick Castrapel

Thu, 11 Sep 1997 07:10:07 GMT
'Normal' distributions...how to generate

Quote:
Castrapel) writes:

<snip>
|> Generate a sequence of uniform pseudo-random numbers in the range [t0,t1].
|> The mean of this sequence is a random variable with a normal distribution.
|>
|> ie: If x1 and x2 are uniformly distributed random variables in [t0,t1],
then
|> (x1+x2)/2 is a normally distributed random variable in [t0,t1].

Two is the worst approximant of infinity that I have come across in a while
:-)

I think given two numbers x1 and x2 uniform [0,1], -log(x1)*sin(2*pi*x2) and
-log(x1)*cos(2*pi*x2) are two independent normal variates. (Check this: I
rederive it everytime I need it, so what I remember may not be completely
accurate).

Cheers
Tanmoy
--

Tanmoy Bhattacharya O:T-8(MS B285)LANL,NM87544-0285,USA H:#3,802,9 St,NM87545
Others see <gopher://yaleinfo.yale.edu:7700/00/Internet-People/internet-mail>,
<http://alpha.acast.nova.edu/cgi-bin/inmgq.pl>or<ftp://csd4.csd.uwm.edu/pub/
internetwork-mail-guide>. -- <http://nqcd.lanl.gov/people/tanmoy/tanmoy.html>
fax: 1 (505) 665 3003   voice: 1 (505) 665 4733    [ Home: 1 (505) 662 5596 ]

Thu, 11 Sep 1997 11:04:38 GMT
'Normal' distributions...how to generate

Quote:

>>  Does anybody have an algorithm (or C-code) for generating "normal"
>>  distributions from a random number generator?
>>  Where, say for any time t on t0-t1, some f(t) would be produced..
>>  so a plot of all f(t)'s from t0-t1 produces a bell-shaped
>>  normal distribution?

>>--
>>/**********************************************************

>> **********************************************************/

>Generate a sequence of uniform pseudo-random numbers in the range [t0,t1].
>The mean of this sequence is a random variable with a normal distribution.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Quote:

>ie: If x1 and x2 are uniformly distributed random variables in [t0,t1], then
>(x1+x2)/2 is a normally distributed random variable in [t0,t1].

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Amazing.

Since when did the Central Limit Theorem become a statement about small
sample properties? (Not to mention the remarkable circumstance of a
normal variate with a finite interval for its domain.)

The proposed algorithm (averaging sets of uniform variates) is *only* a
practical approximation.

Cheers,
ar

Thu, 11 Sep 1997 09:55:08 GMT
'Normal' distributions...how to generate
[on methods to generate sample (Standard) Normal variates]
Quote:

>    1.  Exploit the Central Limit Theorem ("law of large numbers")
>        and add up several uniformly-distributed random numbers:

>            #include <stdlib.h>

>            #define NSUM 12

>            double gaussrand()
>            {
>                    double x = 0;
>                    int i;
>                    for(i = 0; i < NSUM; i++)
>                            x += (double)rand() / RAND_MAX;

>                    x -= NSUM / 2.;
>                    x /= NSUM / 12.;

^^^^^^^^^^^^^^^
In theory, should be:  x /= sqrt( (double) NSUM / 12. ) ;

Quote:

>                    return x;
>            }

>    2.  Use a method described by Abramowitz and Stegun:

[a method based on a mathematical relation between the joint
distribution of a pair of independent uniform variates and that of a
pair of independent gaussians: don't bother to puzzle it out if the
phrase "Jacobian determinant" means nothing to you :-)]

Quote:

>    3.  Use a method described by Box and Muller, and
>        discussed in Knuth:

[a variation on Method 2 to save on the trig function calls at the
expense of possibly multiple rand() calls: the basic idea is to sample
a point in the unit circle, and use the coordinates to calculate the
sin and cos values (!) -- a cool hack.]

The CLT basically says that given _any_ random variate (e.g., not
necessarily uniform and *also* not necessarily continuously distributed)
with certain constraints on its first two moments, for a sample of
size N the statistic

x(N) = ( Sum(N) - N*Expectation ) / sqrt( N*Variance )

tends to the Standard Normal distribution as N -> oo.

So, given precautions against overflow, normalizing rand() values
before summing at each iteration isn't really necessary: the
Expectation U/2 and Variance U^2/12 of a continuous [0,U] uniform
variate can be factored in at the end.

The practical consideration is how *small* can we make N without being
dangerously off-mark? 12 is the number usually cited in books -- it
saves the sqrt() call and obviates a division :-) -- but, as noted,
this renders the method definitely inferior to the next 2 methods.

Quote:
>Does anyone have the full citations for the Box, Muller, and
>Marsaglia or Abramowitz and Stegun papers?

M. Abramowitz and I.A.Stegun (1964)
Handbook of Mathematical Functions
Applied Mathematics Series, vol. 55 (Washington: Nat. Bur. of Standards)
Reprinted, 1968, Dover Publications.

Cheers,
ar

Thu, 11 Sep 1997 13:07:20 GMT
'Normal' distributions...how to generate

Quote:

>I think given two numbers x1 and x2 uniform [0,1], -log(x1)*sin(2*pi*x2) and
>-log(x1)*cos(2*pi*x2) are two independent normal variates. (Check this: I
>rederive it everytime I need it, so what I remember may not be completely
>accurate).

The transformation is

y1 = sqrt( -2 * ln( x1 ) ) * sin( 2 * pi * x2 )
y2 = sqrt( -2 * ln( x1 ) ) * cos( 2 * pi * x2 )

with inverse

x1 = exp( -( y1^2 + y2^2 ) / 2 )
x2 = arctan( y1 / y2 ) / ( 2 * pi )

The Jacobian determinant of this transformation is the product of
gaussian density functions in y1 and y2.

Cheers,
ar

Thu, 11 Sep 1997 13:26:20 GMT
'Normal' distributions...how to generate

Quote:

>>  Does anybody have an algorithm (or C-code) for generating "normal"
>>  distributions from a random number generator?
>>  Where, say for any time t on t0-t1, some f(t) would be produced..
>>  so a plot of all f(t)'s from t0-t1 produces a bell-shaped
>>  normal distribution?
>>--
>>/**********************************************************

>> **********************************************************/
>Generate a sequence of uniform pseudo-random numbers in the range [t0,t1].
>The mean of this sequence is a random variable with a normal distribution.
>ie: If x1 and x2 are uniformly distributed random variables in [t0,t1], then
>(x1+x2)/2 is a normally distributed random variable in [t0,t1].
>Rick Castrapel

and quite correctly
called me on this one for overstating the case.

The problem is that, as the size of the uniform random sequence over which
you are taking the mean goes to infinity, the variance of that mean shrinks
to zero.  Thus you cannot truly get a normal distribution.

However, the Central Limit Theorem justifies using this approximation in
computation.  With tail between my legs, I apologize for the lack of
mathematical rigor, but I take refuge in the fact that those of us in
scientific computation must make due with approximations.

Rick Castrapel

Fri, 12 Sep 1997 05:37:05 GMT
'Normal' distributions...how to generate

Quote:
>  Does anybody have an algorithm (or C-code) for generating "normal"
>  distributions from a random number generator?

Here is my file ranstdnor.c

I generate z ~ N(0,1), if you want N(mu, sigma), use mu + z*sigma

#include <math.h>
#include <stdio.h>
#include <utils.h>
#include <distribs.h>

double ranstdn(void)            /* TOMS 712 */
{
const float s = .449871f, t = -.386595f, a = .196f, b = .25472f,
r1 = .27597f, r2 = .27846f;
float q, u, v, x, y;

do {
u = ranu(); v = ranu();
x = u - s;
v = (v - 0.5f) * 1.7156f;
y = fabs(v) - t;
q = x*x + y*(a*y - b*x);

/*  Accept P if inside inner ellipse */
if (q < r1) return v/u;
if (q > r2) continue;
if (v*v > -4*log(u)*u*u) continue;
} while (1);
return v/u; /* ratio of P's coordinates is normal deviate */

Quote:
}

/* if we are doing either of the two regression-testing procedures
we need the same fortran RNG: */
#ifdef COMPARE_TESTING
#define NEED_FORTRAN_RANU 1
#endif
#ifdef REG_TESTING
#ifndef NEED_FORTRAN_RANU
#define NEED_FORTRAN_RANU 1
#endif
#endif
/* If either COMPARE_TESTING or REG_TESTING are on, we NEED_FORTRAN_RANU */

#ifdef NEED_FORTRAN_RANU
struct {
int indx;

Quote:
} ranuc_;

#define ranuc_1 ranuc_

double ranu()
{
/* Initialized data */

static float u[279] = { (float).5139,(float).1757,(float).3087,(float)
.5345,(float).9476,(float).1717,(float).7022,(float).2264,(float)
.4948,(float).1247,(float).0839,(float).3896,(float).2772,(float)
.3681,(float).9834,(float).5354,(float).7657,(float).6465,(float)
.7671,(float).7802,(float).823,(float).1519,(float).6255,(float)
.3147,(float).3469,(float).9172,(float).5198,(float).4012,(float)
.6068,(float).7854,(float).9315,(float).8699,(float).8665,(float)
.6745,(float).7584,(float).5819,(float).3892,(float).3556,(float)
.2002,(float).8269,(float).4159,(float).4635,(float).9792,(float)
.1264,(float).2126,(float).9585,(float).7375,(float).4091,(float)
.7801,(float).7579,(float).9568,(float).0281,(float).3187,(float)
.7569,(float).243,(float).5895,(float).0434,(float).956,(float)
.3191,(float).0594,(float).4419,(float).915,(float).5722,(float)
.1188,(float).5698,(float).252,(float).4959,(float).2367,(float)
.477,(float).4061,(float).873,(float).427,(float).3582,(float)
.382,(float).0432,(float).1606,(float).5224,(float).6966,(float)
.0971,(float).4008,(float).7734,(float).2448,(float).3428,(float)
.23,(float).2979,(float).3045,(float).8872,(float).0367,(float)
.6511,(float).3986,(float).6763,(float).7326,(float).9378,(float)
.2333,(float).8385,(float).9672,(float).7786,(float).4315,(float)
.6741,(float).8094,(float).1588,(float).2799,(float).1353,(float)
.8642,(float).7502,(float).208,(float).14,(float).2946,(float)
.8028,(float).2189,(float).5631,(float).7156,(float).1975,(float)
.9898,(float).25,(float).4306,(float).7553,(float).8609,(float)
.8948,(float).9781,(float).3954,(float).4322,(float).1271,(float)
.4577,(float).2378,(float).986,(float).6528,(float).6042,(float)
.2419,(float).4549,(float).79,(float).0788,(float).4764,(float)
.1526,(float).2458,(float).945,(float).614,(float).9882,(float)
.4773,(float).7997,(float).7442,(float).3807,(float).4799,(float)
.5269,(float).0981,(float).5942,(float).3472,(float).1434,(float)
.7795,(float).711,(float).4461,(float).7046,(float).0953,(float)
.9628,(float).5513,(float).7403,(float).579,(float).6379,(float)
.7817,(float).1879,(float).3021,(float).2828,(float).684,(float)
.2929,(float).5654,(float).4184,(float).3066,(float).4445,(float)
.5657,(float).4879,(float).6066,(float).4159,(float).1304,(float)
.256,(float).0358,(float).9771,(float).1145,(float).3781,(float)
.6467,(float).3504,(float).553,(float).3584,(float).5655,(float)
.4756,(float).1637,(float).6152,(float).1722,(float).5547,(float)
.2922,(float).8722,(float).8351,(float).8449,(float).8955,(float)
.5948,(float).5406,(float).1682,(float).655,(float).6905,(float)
.2639,(float).1067,(float).8149,(float).1914,(float).4233,(float)
.3519,(float).8392,(float).1373,(float).2627,(float).1773,(float)
.4799,(float).3802,(float).5048,(float).5028,(float).3519,(float)
.5256,(float).1206,(float).5196,(float).6071,(float).7329,(float)
.5569,(float).3441,(float).802,(float).591,(float).2669,(float)
.6707,(float).5522,(float).7889,(float).8877,(float).89,(float)
.0681,(float).8006,(float).9074,(float).6441,(float).1652,(float)
.3014,(float).1663,(float).2852,(float).842,(float).5363,(float)
.0363,(float).2072,(float).0212,(float).3581,(float).6215,(float)
.52,(float).546,(float).1537,(float).8234,(float).0334,(float)
.026,(float).3781,(float).6163,(float).0204,(float).6266,(float)
.9152,(float).3748,(float).7295,(float).3958,(float).9823,(float)
.5973,(float).1123,(float).2216,(float).7992,(float).8707,(float)
.7382,(float).0136,(float).7396,(float).4184,(float).362,(float)
.2039,(float).1832,(float).0763,(float).1156,(float).1591,(float)
.7883,(float).0404,(float).7906,(float).599,(float).4026,(float)
.2291 };

/* System generated locals */
float ret_val;

/*  This is a dummy routine for generated uniform numbers in the */
/*  interval (0,1).  A list of stored numbers is sequentially */
/*  accessed and returned.  The routine is supplied to permit testing */
/*  of the subprograms RANDN() and RAND0(). */

/*  The routine cycles through 279 values stored in the data array U. */
/*  The variable INDX in the named common RANUC retains the index of the
*/
/*  number returned.  This variable can be initialized to 0 in a calling
*/
/*  routine to restart the sequence. */

++ranuc_1.indx;
if (ranuc_1.indx > 279) {
ranuc_1.indx = 1;
}
ret_val = u[ranuc_1.indx - 1];
return ret_val;

Quote:
} /* randu_ */

#endif /* NEED_FORTRAN_RANU */

#ifdef COMPARE_TESTING          /* compare against ranstdn_dumb */
double ranstdn_dumb(void)
{
static float r, u, v;

/*  This function generates a normally distributed number using the */
/*  ratio of uniforms method.  No auxiliary boundary curves are used. */
/*  Calls to the subprogram RANDU() must return independent random */
/*  numbers uniform in the interval (0,1). */

/*  Generate P = (u,v) uniform in rectangle enclosing acceptance region */

L50:
u = ranu();
v = ranu();
v = (v - .5f) * 1.7156f;
/*  Compute coordinate ratio for candidate point */
r = v/u;
/*  Reject point if outside acceptance region */
/* Computing 2nd power */
if (r*r > -4*log(u)) goto L50;

/*  Return ratio as the normal deviate */
return r;

Quote:
} /* randn0_ */

#include <stdio.h>

int main()
{
int i;
float good[100], forcheck[100];
FILE *univar;

ranuc_.indx = 0;
for (i=0; i<100; i++) good[i] = ranstdn();
printf("index at end of ranstdn is %d\n", ranuc_.indx);
ranuc_.indx = 0;
for (i=0; i<100; i++) forcheck[i] = ranstdn_dumb();
printf("index at end of dumb is %d\n", ranuc_.indx);

printf("Look at properties of discrepency:\n");
fflush(stdout);
univar = popen("univar -1", "w");
for (i=0; i<100; i++) fprintf(univar, "%g\n", forcheck[i]-good[i]);
pclose(univar); fflush(stdout);

printf("\nExpect N(0,1):\n");
fflush(stdout);
univar = popen("univar -1", "w");
for (i=0; i<100; i++) fprintf(univar, "%f\n", good[i]);
pclose(univar); fflush(stdout);
return 0;

Quote:
}

#endif /* COMPARE_TESTING */

#ifdef REG_TESTING
int main()
{
int i;
ranuc_.indx = 0;
for (i=0; i<100; i++) printf("%f\n", ranstdn());
return 0;

Quote:
}

#endif /* REG_TESTING */

/* --------------------------------------------------------------------- */

/* Now we go on to Numerical Recipes version */
double ranstdn2(void)
/* Creates 2 draws from N(0, 1) on each alternate call.
So even-numbered calls cause no calculations, a variable
"saved" is simply returned.  The variable "alternating"
tracks which'th call this is.
*/
{
static int alternating=0;
static double saved;
double fac, r, v1, v2;

if  (alternating) {
alternating = 0; return saved;
} else {
do {
v1=2.0*ranu()-1.0;
v2=2.0*ranu()-1.0;
r=v1*v1 + v2*v2;
} while (r >= 1.0 || r == 0.0);
fac = sqrt(-2.0*log(r)/r);
saved = v1*fac;
alternating = 1;
return v2*fac;
}

Quote:
}

/* Another algorithm for ranstdn():

Algorithm 488, CACM, Vol. 17, #12, p704, Dec 1974, by R. P. Brent.

Except on the first call ranstdn3 returns a pseudo-random number
having a gaussian (i.e.  normal) distribution with zero mean and unit
standard deviation.
Thus, the density is f(x) = exp(-0.5*x**2)/sqrt(2.0*pi).
The first call initializes ranstdn3 and returns zero
(no hassles -- it's a legit draw from N(0,1)).

It calls a function ranu(), and it is assumed that successive calls
to ranu(0) give independent pseudo- random numbers distributed
uniformly on (0, 1), possibly including 0 (but not 1).  the method
used was suggested by Von Neumann, and improved by Forsythe, Ahrens,
Dieter and Brent.  on the average there are 1.37746 calls of ranu() for
each call of ranstdn().

Warning - dimension and data statements below are machine-dependent.
The dimension of d must be at least the number of bits in the fraction
of a float number.  Thus, on most machines the data statement
below can be truncated.

How is d calculated:
If the integral of sqrt(2.0/pi)*exp(-0.5*x**2) from a(i) to infinity
is 2**(-i), then d(i) = a(i) - a(i-1).   Thus it seems that we could
do better than the constants in this code by calculating d[] in
double precision using a modern pnorm1. */

double ranstdn3(void)
{
/* Initialized data */
static float d[60] = { .67448975f,.47585963f,.383771164f,.328611323f,
...