random

Good morning,

For one of my classes (machine learning, actually) I'm going to do an
experiment. One of the jobs that I should tackle is segmenting a
dataset. The total set is 10000 items big. First I make a set of 1000
items, then 2000 etc etc.. Making this set is done by *randomly*
extracting items from the big set.

One of the things that I'm concerned about is this random-ness. My
teacher (actually a nice guy :) explained to us that not all
random-number-generators are "good", and that this selection process
*must* *be* *random*. So, my question: how random is the
random-number-generator that python uses?

Bas

--

Sun, 16 Nov 2003 15:59:22 GMT
random
[Bas van Gils]

Quote:
> For one of my classes (machine learning, actually) I'm going to do an
> experiment. One of the jobs that I should tackle is segmenting a
> dataset. The total set is 10000 items big. First I make a set of 1000
> items, then 2000 etc etc.. Making this set is done by *randomly*
> extracting items from the big set.

> One of the things that I'm concerned about is this random-ness. My
> teacher (actually a nice guy :) explained to us that not all
> random-number-generators are "good", and that this selection process
> *must* *be* *random*. So, my question: how random is the
> random-number-generator that python uses?

6.  And if your teacher says he doesn't know how random 6 is, call him an
idiot and walk out in a huff <wink>.

Seriously, it's impossible to answer this without getting an objective
definition of "good" from your teacher.  Python's random.random() is the std
Wichmann-Hill generator, and passes most tests for randomness.  It's
certainly better than the rand() function in the typical C library; it's
certainly worse than, e.g., the Mersenne Twister.
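
For the experiment described in the question, a minimal sketch of drawing the subsets might look like the following. It assumes the data fits in a list and that each subset is drawn independently and without replacement; `draw_subsets`, `dataset`, and the seed value are hypothetical names chosen for illustration. Note that since Python 2.3 the standard `random` module is based on the Mersenne Twister rather than Wichmann-Hill.

```python
import random

def draw_subsets(items, sizes, seed=None):
    """Return {size: subset}, each subset drawn without replacement."""
    # A private generator keeps the experiment independent of other code
    # that may use the module-level functions; the seed is optional and
    # only there to make a run repeatable.
    rng = random.Random(seed)
    return {size: rng.sample(items, size) for size in sizes}

dataset = list(range(10000))        # placeholder for the real 10000 items
sizes = range(1000, 10001, 1000)    # 1000, 2000, ..., 10000
subsets = draw_subsets(dataset, sizes, seed=2003)
print({size: len(subset) for size, subset in subsets.items()})
```

Recording the seed alongside the results makes it possible to regenerate exactly the same subsets later.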

If you want truly random bits, they're available for the asking; for
example, at

http://www.fourmilab.ch/hotbits/

Sun, 16 Nov 2003 16:15:02 GMT
random

Quote:

> 6.  And if your teacher says he doesn't know how random 6 is, call him an
> idiot and walk out in a huff <wink>.

ok, doesn't make sense to me (yet), but I'll look that one up on the web

Quote:
> Seriously, it's impossible to answer this without getting an objective
> definition of "good" from your teacher.  Python's random.random() is the std
> Wichmann-Hill generator, and passes most tests for randomness.  It's
> certainly better than the rand() function in the typical C library; it's
> certainly worse than, e.g., the Mersenne Twister.
> [snip]

I can work with that. Thanks very much for your detailed answer.

on-his-way-to-tell-the-news-ly yours

Bas

--

Sun, 16 Nov 2003 16:57:43 GMT
random

Quote:

> If you want truly random bits, they're available for the asking; for
> example, at

>     http://www.fourmilab.ch/hotbits/

And http://www.random.org/, http://lavarand.sgi.com/

Oleg.
----

Programmers don't die, they just GOSUB without RETURN.

Sun, 16 Nov 2003 16:31:44 GMT
random

Quote:

> If you want truly random bits, they're available for the asking; for
> example, at

>     http://www.fourmilab.ch/hotbits/

Tim is being unnecessarily cruel to a person who asked a perfectly valid
question. Of course a deterministic process can't produce true random
sequences, but only pseudorandom sequences, which must pass a number of
statistical randomness tests to be called "random". The more tests the
generator passes, the more random it is.
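
As an illustration of what one such statistical test can look like, here is a rough sketch of a chi-square test for uniformity. It is only one of many possible tests and says nothing about correlations between successive values; the function name, bin count, sample size, and seed are assumptions made for this example.

```python
import random

def chi_square_uniform(samples, bins=10):
    """Chi-square statistic for floats in [0, 1) against a uniform distribution."""
    counts = [0] * bins
    for x in samples:
        # min() guards against a rounding edge case at the top of the range
        counts[min(int(x * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(12345)
samples = [rng.random() for _ in range(100_000)]
stat = chi_square_uniform(samples)
print(f"chi-square statistic, 9 degrees of freedom: {stat:.2f}")
```

With 10 bins there are 9 degrees of freedom, and the 5% critical value is about 16.9, so a statistic far above that would be a reason to distrust the generator for this purpose.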

In cryptography, where there is a demand to inject entropy from a
physical source (/dev/dsp, /dev/video, head timing of a hard drive,
keystrokes, mouse movement, hardware random number generators, and
similar, which are pooled in /dev/random), it is compressed, cryptohashed,
and added to the state of the random number generator to make guessing the
internal state harder.

I don't see why a cryptohash such as md5, initialized with some
appropriate number and repeatedly fed its own output, would not do nicely
for the purposes in question.
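
The hash-chain idea can be sketched roughly as follows. This is an illustration of the suggestion only, not a vetted generator: the class name `HashChainRandom` and the way 53 bits are extracted are assumptions made here, and MD5 is no longer considered suitable for security purposes.

```python
import hashlib

class HashChainRandom:
    """Toy generator: each step hashes the previous digest (illustrative only)."""

    def __init__(self, seed: bytes):
        # "initialized with some appropriate number": hash the seed once
        self._state = hashlib.md5(seed).digest()

    def random(self) -> float:
        # feed the hash its own output to advance the state
        self._state = hashlib.md5(self._state).digest()
        # take 56 bits of the digest and keep the top 53 (a float's precision)
        value = int.from_bytes(self._state[:7], "big") >> 3
        return value / (1 << 53)

gen = HashChainRandom(b"some appropriate number")
print([round(gen.random(), 4) for _ in range(3)])
```

Because the output is derived directly from the state, anyone who sees the full digest can predict what follows, which does not matter for sampling a dataset but would matter for cryptography.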

Sun, 16 Nov 2003 17:31:37 GMT
random
[Eugene Leitl, quoting remarkably little]

Quote:
>> If you want truly random bits, they're available for the asking; for
>> example, at

>>     http://www.fourmilab.ch/hotbits/
> Tim is being unnecessarily cruel to a person who asked a perfectly valid
> question.

Sorry, but I think I gave him several perfectly valid answers, including but
not limited to the last one you quoted.

Quote:
> Of course a deterministic process can't produce true random sequences,
> but only pseudorandom sequences, which must pass a number of statistical
> randomness tests to be called "random". The more tests the generator
> passes, the more random it is.

And I explicitly said Python's Wichmann-Hill passes most tests for
randomness, while strongly implying C rand() did not (which it doesn't), and
that other generators pass more.

Quote:
> In cryptography, where there is a demand to inject entropy from a
> physical source (/dev/dsp, /dev/video, head timing of a hard drive,
> keystrokes, mouse movement, hardware random number generators, and
> similar, which are pooled in /dev/random), it is compressed,
> cryptohashed, and added to the state of the random number generator
> to make guessing the internal state harder.

> I don't see why a cryptohash such as md5, initialized with
> some appropriate number and repeatedly fed its own output, would not do
> nicely for the purposes in question.

And this is newbie-friendly <wink>?  I don't see any reason why
random.random() would not do for the purposes in question.

it's-6-after-all<wink>-ly y'rs  - tim

Sun, 16 Nov 2003 17:40:02 GMT
random

Quote:

> > 6.  And if your teacher says he doesn't know how random 6 is, call him an
> > idiot and walk out in a huff <wink>.

> ok, doesn't make sense to me (yet), but I'll look that one up on the web

Tim was joking. There isn't a simple scale to determine the randomness of
pseudo-random numbers. But, then again, if you do what Tim proposes with
enough confidence and arrogance, your teacher *might* believe it :)

--

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

Sun, 16 Nov 2003 21:47:43 GMT
random

Quote:
> For one of my classes (machine learning, actually) I'm going to do an
> experiment. One of the jobs that I should tackle is segmenting a
> dataset. The total set is 10000 items big. First I make a set of 1000
> items, then 2000 etc etc.. Making this set is done by *randomly*
> extracting items from the big set.

> One of the things that I'm concerned about is this random-ness. My
> teacher (actually a nice guy :) explained to us that not all
> random-number-generators are "good", and that this selection process
> *must* *be* *random*. So, my question: how random is the
> random-number-generator that python uses?

Not random. It's a pseudo random generator, like every other algorithm.

Also, there are 10000!/(1000!*9000!) =

8733076486718372118789840064671528935886804436516964720931852721979835392
8604543969730929016092646949025469441725201048958717336354774276856409133
6649839591235147954583600092105757504564562285345336202364055060390334879
7142311194553023355486359368482309963681453547171336786843298882726543351
5298927108271757609230194725964024517320724075417558991723097208965562040
2106864734806814039974193085331747744501835237067009770447657674817841981
6998503384278064982279324951039159060522602587159786303404218743214397422
6573699418831585343319586988696816327827274983509176278108506746098330784
2005832790299870952673137455809222687983426841466535112391891043971243786
3278071949944036145996141393912412876447156772135509155765076575948744455
8910110504244242182642412842480989493248008355423802252914855706479786904
3524965620254917035006808707869148486352610563529387439793130566023693255
2844960841048371207450105776961149843718970112151679022546107396000204370
0376232898735778263504754434991257226115631551442989664230053273403237191
0233504667136228590559429716296795474897198327305875382495595631923029865
0684341823707537650292197226377403665809742324671940764823772514432194866
5414556158103888132584145454295200725107457192126984703264310606564380942
4631333133422141082736813963651133816767720354813233929755508327565907571
1111515881345343827005433863608545339155486771010323143976006662955036912
84757816488002960318400

different ways to select a set of 1000 items from a set of 10000. Since
there is no pseudorandom generator that I know of with a period that huge,
there will always be *many* combinations that can't be produced. Python's
random generator has a period of only 6,953,607,871,644, so it can produce at
most that many different combinations. This second number is smaller than the
first one ;-).
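
The comparison can be checked with a few lines of Python. This sketch assumes Python 3.8 or later for `math.comb`; the Wichmann-Hill period is the figure quoted above, and the Mersenne Twister (MT19937) period of 2**19937 - 1 is added for comparison since that generator comes up elsewhere in this thread.

```python
import math

n_subsets = math.comb(10000, 1000)   # ways to choose 1000 items out of 10000
wh_period = 6_953_607_871_644        # Wichmann-Hill period quoted above
mt_period = 2**19937 - 1             # Mersenne Twister (MT19937) period

print("decimal digits in the subset count:", len(str(n_subsets)))
print("Wichmann-Hill period >= subset count:", wh_period >= n_subsets)
print("MT19937 period > subset count:", mt_period > n_subsets)
```

A period larger than the number of subsets is necessary for every subset to be reachable, but it is not by itself proof that a particular sampling routine can reach them all.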

If you have a Linux machine, the /dev/random device gives real random data
generated from times between interrupts, key presses, mouse movements, that
sort of thing. It blocks whenever there is not enough randomness left. It
shouldn't be too hard to make a function that selects numbers from the list
based on a few bytes from /dev/random, maybe two bytes per selected item...
then wiggle your mouse a bit or type some random stuff and it will be random.

Be careful when writing such a function. It's quite easy to introduce a
bias...
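
A rough sketch of such a function, with the bias handled by rejection sampling, is given below. It swaps in `os.urandom` for reading /dev/random directly (an assumption made here so the example does not block and runs on other platforms), and the names `unbiased_index` and `pick_items` are hypothetical. Taking "two bytes modulo n" would slightly favour small indices whenever 65536 is not a multiple of n, so values in the leftover tail are thrown away and redrawn.

```python
import os

def unbiased_index(n: int) -> int:
    """Pick an index in range(n) from two OS random bytes, without modulo bias."""
    if not 0 < n <= 65536:
        raise ValueError("n must be between 1 and 65536 for two-byte draws")
    limit = 65536 - (65536 % n)   # largest multiple of n that fits in two bytes
    while True:
        value = int.from_bytes(os.urandom(2), "big")
        if value < limit:         # reject the tail that would favour small indices
            return value % n

def pick_items(items, k):
    """Select k distinct items by repeatedly removing a random one from a copy."""
    pool = list(items)
    chosen = []
    for _ in range(k):
        chosen.append(pool.pop(unbiased_index(len(pool))))
    return chosen

picked = pick_items(range(10000), 1000)
print(len(picked), "items selected, all distinct:", len(set(picked)) == 1000)
```

The standard library's `random.SystemRandom().sample(items, k)` draws on the same operating-system source and is likely the simpler choice in practice.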

--
Remco Gerlich

Sun, 16 Nov 2003 23:29:36 GMT
random

[snip]

Quote:
> Seriously, it's impossible to answer this without getting an objective
> definition of "good" from your teacher.  Python's random.random() is the std
> Wichmann-Hill generator, and passes most tests for randomness.  It's
> certainly better than the rand() function in the typical C library; it's
> certainly worse than, e.g., the Mersenne Twister.

[snip]

A hint might be crng:
"Random-number generators (RNGs) implemented as
Python extension types coded in C."

One of the generators is a 623 dimensional Mersenne Twister.
Faster and "more random" than random.random(). The name is MT19937

crng can be found at http://www.sbc.su.se/~per/crng/
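
For readers on a newer Python: since version 2.3 the standard `random` module itself uses MT19937, so an extension module is not needed just to get a Mersenne Twister. A small sketch that inspects the generator state and shows that reseeding reproduces the same sequence (the seed value is arbitrary):

```python
import random

rng = random.Random(19937)

# getstate() returns (version, internal_state, gauss_next); in CPython the
# internal state holds the 624 32-bit words of MT19937 plus a position index.
version, internal_state, gauss_next = rng.getstate()
print(len(internal_state))   # 625

first_run = [rng.random() for _ in range(3)]
rng.seed(19937)              # same seed, same sequence
second_run = [rng.random() for _ in range(3)]
print(first_run == second_run)
```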

/Fredrik

Sun, 16 Nov 2003 23:49:42 GMT
random

Quote:

> If you want truly random bits, they're available for the asking; for
> example, at

>     http://www.fourmilab.ch/hotbits/

or -- connect any bipolar small-signal transistor to the microphone part of
your sound card. We needed external power, but these days I think that you
can get that already.  You will then need to write a device driver. Which

Laura

Sun, 16 Nov 2003 23:18:09 GMT
random

Quote:

> can get that already.  You will then need to write a device driver. Which

You don't need anything, just jacking up the amplification adds sufficient
noise already. If you have a noisy system fan, you can just hang a cheap
mike behind the computer. As to drivers, catting from /dev/dsp should not
be that hard.

But /dev/random under Linux already uses an entropy pool internally, as
opposed to /dev/urandom, which is just a plain random number generator.

Sun, 16 Nov 2003 23:31:24 GMT
random

Quote:
> One of the things that I'm concerned about is this random-ness. My
> teacher (actually a nice guy :) explained to us that not all
> random-number-generators are "good", and that this selection process
> *must* *be* *random*. So, my question: how random is the
> random-number-generator that python uses?

"Anyone who considers arithmetical methods of producing random numbers
is, of course, in a state of sin." -John Von Neumann

Unless you have random-number-generating hardware, you don't really
have truly random numbers.

--
Mark Hughes
"I will tell you things that will make you laugh and uncomfortable and really
{*filter*}ing angry and that no one else is telling you.  What I won't do is bullshit
you.  I'm here for the same thing you are.  The Truth." -Transmetropolitan #39

Mon, 17 Nov 2003 09:43:23 GMT
random

Quote:

> One of the things that I'm concerned about is this random-ness. My
> teacher (actually a nice guy :) explained to us that not all
> random-number-generators are "good", and that this selection process
> *must* *be* *random*. So, my question: how random is the
> random-number-generator that python uses?

> "Anyone who considers arithmetical methods of producing random
> numbers is, of course, in a state of sin." -John Von Neumann

> Unless you have random-number-generating hardware, you don't really
> have truly random numbers.

"Gott werft nicht!" ("God does not throw dice!") -- Albert Einstein

Sorry, I could not resist this urge. (What is it about random numbers that
provokes so much discourse?!)

Mon, 17 Nov 2003 20:17:18 GMT
random

[...]

Quote:
> > Unless you have random-number-generating hardware, you don't really
> > have truly random numbers.

> "Gott werft nicht!" ("God does not throw dice!") -- Albert Einstein

Correct German is: "Gott würfelt nicht!"

Quote:
> Sorry, I could not resist this urge. (What is it about random numbers that
> provokes so much discourse?!)

Thx
Siggy

--

****** ceterum censeo javascriptum esse restrictam *******

Mon, 17 Nov 2003 20:34:32 GMT
random
On Thu, 31 May 2001 08:17:18 -0400, "Bill Bell"

Quote:

>> One of the things that I'm concerned about is this random-ness. My
>> teacher (actually a nice guy :) explained to us that not all
>> random-number-generators are "good", and that this selection process
>> *must* *be* *random*. So, my question: how random is the
>> random-number-generator that python uses?

>> "Anyone who considers arithmetical methods of producing random
>> numbers is, of course, in a state of sin." -John Von Neumann

>> Unless you have random-number-generating hardware, you don't really
>> have truly random numbers.

>"Gott werft nicht!" ("God does not throw dice!") -- Albert Einstein

>Sorry, I could not resist this urge. (What is it about random numbers that
>provokes so much discourse?!)

Whoever said they're too important to be left to chance got it right.
(Which is more than we can say for Albert here...)

David C. Ullrich
*********************
"Sometimes you can have access violations all the
time and the program still works." (Michael Caracena,
comp.lang.Pascal.delphi.misc 5/1/01)

Mon, 17 Nov 2003 22:07:48 GMT
