A new RNG. Comments? 
Author Message
 A new RNG. Comments?

I am about to include in an article
the following C function, cmwc(),
(complimentary-multiply-with-carry)
as an example of an excellent random number
generator that uses, but does not require,
many random seeds before it is called.

Recent attention has focused on the need
for RNG's with many seed values---for example,
with {*filter*} machines, to ensure every possible
shuffle of a deck of cards you need a RNG
with >=7 seeds, and in Law, to choose
a jury panel of 1200 from a list of 500,000
eligibles in Palm Beach County you would
need a RNG with 367 32-bit seeds to conform
to the law's requirement that selection be
by lot and at random.

I invite comments from you experts,
on the RNG in the form I have it, as it
mixes long long with long operations and
requires longs of 32 and long longs
of 64 bits.  The Q array should first be
filled with 256 (or fewer) 32-bit random
seed values, with no restrictions---they
could all be 0.
------------------------------------------

static unsigned long Q[256];

unsigned long cmwc(void)
{
  unsigned long long t, a = 107982LL;
  static unsigned long c, i = 255;
  unsigned long x, b=0xFFFFFFFF, bm1=b-1;
   t = a*Q[i]+c;
   c = (t>>32); x = t+c;  if(x<c){x++;c++;}
   if(i) return(Q[i--] = bm1-x);
   i=255;  return(Q[0] = bm1-x);

Quote:
}

---------------------------------------------

This takes 36 nanosecs on my Intel 850MHz PC,
or 110 nanosecs if I get c,x by means of:
    c = t/b; x = (t-b*c);
Whatever the 256 seed values in Q[256],
the period of the generator is  > 10^2471,
and every set of 256 consecutive 32-bit
integers will appear at least 100,000
times in a full period.  (By contrast,
rand() can guarentee the appearance of
only 1/2^31 of the possible pairs i,j, only
1/2^62 of the possible triples i,j,k, etc.)
Users not wanting to expend the effort
of getting 256 random seeds can settle for
fewer. The compiler's initialization
of Q can serve to provide the default,
seedless variety.

Versions are available with 8,16,32,64,128,
256,512, and 1024 random seeds.  One need
only change the multiplier 'a' and the
range of the index 'i'. Details on request.

Which platforms are likely to be able to use
this RNG and count on a common output?

How prevalent are those for which longs
are 32 bits and long longs 64?

I take advantage of what works with
gnu gcc in DOS on my PC.  Is casting
necessary to ensure it will work on others?

George Marsaglia



Sun, 27 Jun 2004 08:32:00 GMT  
 A new RNG. Comments?


Quote:
> I am about to include in an article
> the following C function, cmwc(),
> (complimentary-multiply-with-carry)
> as an example of an excellent random number
> generator that uses, but does not require,
> many random seeds before it is called.

> Recent attention has focused on the need
> for RNG's with many seed values---for example,
> with {*filter*} machines, to ensure every possible
> shuffle of a deck of cards you need a RNG
> with >=7 seeds, and in Law, to choose
> a jury panel of 1200 from a list of 500,000
> eligibles in Palm Beach County you would
> need a RNG with 367 32-bit seeds to conform
> to the law's requirement that selection be
> by lot and at random.

> I invite comments from you experts,
> on the RNG in the form I have it, as it
> mixes long long with long operations and
> requires longs of 32 and long longs
> of 64 bits.  The Q array should first be
> filled with 256 (or fewer) 32-bit random
> seed values, with no restrictions---they
> could all be 0.

One problem with this generator is that it is not a cryptographically secure
random number generator -- an attacker with 257 successive outputs can
trivially rederive the entire RNG state, and then predict future outputs
(and, you can predict selected outputs with considerably fewer previous
outputs).  Whether this is a problem depends greatly on what you're going to
use it for, however, one use you list is "{*filter*} machines", which would
appear to require that.

- Show quoted text -

Quote:
> ------------------------------------------

> static unsigned long Q[256];

> unsigned long cmwc(void)
> {
>   unsigned long long t, a = 107982LL;
>   static unsigned long c, i = 255;
>   unsigned long x, b=0xFFFFFFFF, bm1=b-1;
>    t = a*Q[i]+c;
>    c = (t>>32); x = t+c;  if(x<c){x++;c++;}
>    if(i) return(Q[i--] = bm1-x);
>    i=255;  return(Q[0] = bm1-x);
> }
> ---------------------------------------------

> This takes 36 nanosecs on my Intel 850MHz PC,
> or 110 nanosecs if I get c,x by means of:
>     c = t/b; x = (t-b*c);
> Whatever the 256 seed values in Q[256],
> the period of the generator is  > 10^2471,
> and every set of 256 consecutive 32-bit
> integers will appear at least 100,000
> times in a full period.  (By contrast,
> rand() can guarentee the appearance of
> only 1/2^31 of the possible pairs i,j, only
> 1/2^62 of the possible triples i,j,k, etc.)

Could you point out where in the Standard where it makes this requirement on
rand()?

Quote:
> Which platforms are likely to be able to use
> this RNG and count on a common output?

Actually, with a little rewrite, namely:

/* ... */
    if(i) return(Q[i--] = (bm1-x) & b);
    i=255;  return(Q[0] = (bm1-x) & b);

Quote:
}

you can guarrantee the same well-defined output on all platforms.  In
addition, any decent compiler on a platform with 32 bit longs should
optimize out the additional operation...

- Show quoted text -

Quote:

> How prevalent are those for which longs
> are 32 bits and long longs 64?

> I take advantage of what works with
> gnu gcc in DOS on my PC.  Is casting
> necessary to ensure it will work on others?

> George Marsaglia



Sun, 27 Jun 2004 12:43:01 GMT  
 A new RNG. Comments?


Quote:

> One problem with this generator is that it is not a cryptographically secure
> random number generator -- an attacker with 257 successive outputs can
> trivially rederive the entire RNG state, and then predict future outputs
> (and, you can predict selected outputs with considerably fewer previous
> outputs).  Whether this is a problem depends greatly on what you're going to
> use it for, however, one use you list is "{*filter*} machines", which would
> appear to require that.

&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
The numbers produced by a cmwc RNG with
modulus b=2^32-1 are the base-b digits in the
expansion of a rational k/p, where p is a huge
prime, and thus could be accessible---that
is, cracked--- by means of continued fractions.
But it is not the trivial problem you suggest.

Nor is this or any other generator expressed by
a relatively simple recursion on randomly selected
seeds expected to be impregnable for security purposes.
The primary use is in simulation, to get a random
selection by means of dependent but identically
distributed uniform variates that appear to behave
as independent but uniformly distributed variates
over the space from which the seeds are randomly
selected.

One can combine simple-recursion generators in
various ways or shuffle the output to make
'cracking' very difficult;  I suggested ways
to do this 40 years ago.
But the primary input to such schemes is often
a  RNG selected for simplicity, speed and apparently
good randomness, supported by extensive tests.

As for suitability in {*filter*} machines, my RNG's
are used by makers of such machines, two with
license and others without, and they manage
to perturb the results to avoid crackers.
Of course they don't list the algorithm
on the case of the machine, as I did with a
request for comments from C linguists,
but if it is as easy as you think, take a bunch
of quarters to your nearest {*filter*}.

To the guardians of this newsgroup I apologize for
this attempt to get the  discussion back on track.

George Marsaglia



Sun, 27 Jun 2004 19:59:00 GMT  
 A new RNG. Comments?


Quote:



> > One problem with this generator is that it is not a cryptographically
secure
> > random number generator -- an attacker with 257 successive outputs can
> > trivially rederive the entire RNG state, and then predict future outputs
> > (and, you can predict selected outputs with considerably fewer previous
> > outputs).  Whether this is a problem depends greatly on what you're
going to
> > use it for, however, one use you list is "{*filter*} machines", which would
> > appear to require that.

> &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
> The numbers produced by a cmwc RNG with
> modulus b=2^32-1 are the base-b digits in the
> expansion of a rational k/p, where p is a huge
> prime, and thus could be accessible---that
> is, cracked--- by means of continued fractions.
> But it is not the trivial problem you suggest.

It is trivial.  With 256 successive outputs, you then get the entire
contents of the Q array.  Then, with the 257th output, you can compute (with
the original value of Q[i], which you now know) the value of c (both before
and after the 257th step).  That gives you the entire RNG state.

- Show quoted text -

Quote:

> Nor is this or any other generator expressed by
> a relatively simple recursion on randomly selected
> seeds expected to be impregnable for security purposes.
> The primary use is in simulation, to get a random
> selection by means of dependent but identically
> distributed uniform variates that appear to behave
> as independent but uniformly distributed variates
> over the space from which the seeds are randomly
> selected.

> One can combine simple-recursion generators in
> various ways or shuffle the output to make
> 'cracking' very difficult;  I suggested ways
> to do this 40 years ago.
> But the primary input to such schemes is often
> a  RNG selected for simplicity, speed and apparently
> good randomness, supported by extensive tests.

> As for suitability in {*filter*} machines, my RNG's
> are used by makers of such machines, two with
> license and others without, and they manage
> to perturb the results to avoid crackers.
> Of course they don't list the algorithm
> on the case of the machine, as I did with a
> request for comments from C linguists,
> but if it is as easy as you think, take a bunch
> of quarters to your nearest {*filter*}.

> To the guardians of this newsgroup I apologize for
> this attempt to get the  discussion back on track.

Actually, I did give you some C based comments -- you appear to have snipped
them.

--
poncho



Mon, 28 Jun 2004 00:29:45 GMT  
 A new RNG. Comments?
On Wednesday, in article


Quote:
>I am about to include in an article
>the following C function, cmwc(),
>(complimentary-multiply-with-carry)
>as an example of an excellent random number
>generator that uses, but does not require,
>many random seeds before it is called.

>Recent attention has focused on the need
>for RNG's with many seed values---for example,
>with {*filter*} machines, to ensure every possible
>shuffle of a deck of cards you need a RNG
>with >=7 seeds, and in Law, to choose
>a jury panel of 1200 from a list of 500,000
>eligibles in Palm Beach County you would
>need a RNG with 367 32-bit seeds to conform
>to the law's requirement that selection be
>by lot and at random.

The question then is how you generate the seeds. They must be
independent and "random" themselves. It would make sense to
use some sort of hardware "noise" based RNG for the sort of
applications you mention above so a softwar RNG isn't needed at
all.

Quote:
>I invite comments from you experts,
>on the RNG in the form I have it, as it
>mixes long long with long operations and
>requires longs of 32 and long longs
>of 64 bits.

Standard C supports longs which must be *at least* 32 bits. C99
supports long long which must be *at least* 64 bits. However C99
isn't very widely implemented yet. If you want to ensure that the
results are to exactly 32 and 64 bits then it would be best to mask
with 0xffffffff or 0xffffffffffffffff where necessary. On platforms
with exact 32 and 64 bit types any half-decent compiler optimiser can
eliminate these operations.

Quote:
> The Q array should first be
>filled with 256 (or fewer) 32-bit random
>seed values, with no restrictions---they
>could all be 0.

Have you tried it when they are all 0?

Quote:
>------------------------------------------

>static unsigned long Q[256];

>unsigned long cmwc(void)
>{
>  unsigned long long t, a = 107982LL;
>  static unsigned long c, i = 255;

I'd initialise c explicitly here to make it clear that the code relies
on this.

Quote:
>  unsigned long x, b=0xFFFFFFFF, bm1=b-1;
>   t = a*Q[i]+c;
>   c = (t>>32); x = t+c;  if(x<c){x++;c++;}
>   if(i) return(Q[i--] = bm1-x);
>   i=255;  return(Q[0] = bm1-x);

You could write
    static int i = 0;

    if (--i < 0) i = 255;

...

    return (Q[i] = bm1-x);

Or if as here the size of the array is a power of 2

    static unsigned i = 0;

    i = (i-1) & 255;

...

    return (Q[i] = bm1-x);

Is there any reason why i counts down rather than up?

Quote:
>}

...

Quote:
>Which platforms are likely to be able to use
>this RNG and count on a common output?

If you add the masks it will work on any implementatiuon that supports
a 64 bit or wider integer type. As notd in C99 that is mandatory but
many C90 compilers support a 64 bit type although it isn't always called
long long. You may want to use a typedef for your 64 bit type. For example
on MSVC that would be __int64

Quote:
>How prevalent are those for which longs
>are 32 bits and long longs 64?

Quite common but, for example, long is sometimes a 64 bit type as well
as long long not always existing. One approach you can make is to
#include <stdint.h> which is a standard C99 header that has type definitions
of things like uint32_t, uint64_t, uint_least32_t, uint_least64_t. The
first 2 are exact width types but are not guaranteed to exist. The
last to are N or more width types and are guaranteed to exist on a C99
implementation. If you don't have a C99 implementation the header
may still be provided and if not it is trivial to create with the
necessary definitions as long as an underlying 64 bit type is
available. This is a case where it may be preferable to use the ""
include format for a standard header.

Quote:
>I take advantage of what works with
>gnu gcc in DOS on my PC.  Is casting
>necessary to ensure it will work on others?

Casting is no different to implicit conversions between integer types
so, no, it isn't necessary.

--
-----------------------------------------


-----------------------------------------



Mon, 28 Jun 2004 20:44:39 GMT  
 A new RNG. Comments?
On Tuesday, in article


...

Quote:
>> Which platforms are likely to be able to use
>> this RNG and count on a common output?
>Actually, with a little rewrite, namely:

>/* ... */
>    if(i) return(Q[i--] = (bm1-x) & b);
>    i=255;  return(Q[0] = (bm1-x) & b);
>}

It is a little more complex than that because in

Quote:
>>    t = a*Q[i]+c;

any extra bits carried over in c will affect the result and in

Quote:
>>    c = (t>>32); x = t+c;  if(x<c){x++;c++;}

extra bits will affect the if condition. So a number of masks are
required.

--
-----------------------------------------


-----------------------------------------



Mon, 28 Jun 2004 21:40:53 GMT  
 A new RNG. Comments?


Quote:
> On Tuesday, in article


> ...

> >> Which platforms are likely to be able to use
> >> this RNG and count on a common output?
> >Actually, with a little rewrite, namely:

> >/* ... */
> >    if(i) return(Q[i--] = (bm1-x) & b);
> >    i=255;  return(Q[0] = (bm1-x) & b);
> >}

> It is a little more complex than that because in

> >>    t = a*Q[i]+c;

> any extra bits carried over in c will affect the result and in

> >>    c = (t>>32); x = t+c;  if(x<c){x++;c++;}

> extra bits will affect the if condition. So a number of masks are
> required.

You are correct:

    c = (t>>32); x = (t+c) & b;  if(x<c){x++;c++;}

is required as well...

- Show quoted text -

Quote:

> --
> -----------------------------------------


> -----------------------------------------



Tue, 29 Jun 2004 00:41:29 GMT  
 A new RNG. Comments?

Quote:
>   c = (t>>32); x = t+c;
[vs]
>    c = t/b; x = (t-b*c);

either detect implementations that work properly with the former or always
code the later.  (i suggest always coding the later, so that the right
thing is always done.)

Quote:
>How prevalent are those for which longs are 32 bits and long longs 64?

very common in general purpose use, i.e., that's what's used by most ia-32
compilers.  slightly less common, and getting less so all the time, for
embedded applications using dsp's.

if you need a specific or minimum number of bits use the appropriate c99
type, and provide an include file appropriate to the implementation when
one isn't already present.

--
okay, have a sig then



Tue, 29 Jun 2004 23:12:07 GMT  
 
 [ 8 post ] 

 Relevant Pages 

1. My new linked list: request for comment please

2. A new C comment style

3. My new linked list: request for comment please

4. Comments on new Kelley and Pohl /A Book on C/, other C teaching stuff

5. Need comments on commenting program

6. Change C++ comments to C-comments

7. Multi-Line Comments Nested in Single Line Comments

8. changing C++ comments // to C comments /*

9. comments on comments

10. changing C++ comments // to C comments /*

11. Converting C++ comments to C comments

12. Random-ness of RNG's

 

 
Powered by phpBB® Forum Software