Rehash of previous exposition in more detail. Again, skip if you know

why radix sort does not obey the n log n bound.

Quote:

> >> Sorry to inject some mathematics into this lovely flamefest, but...

> >> The bound on sorting is O(N log N) comparisons because you must be able to

> >> generate all of the N! possible permutations of the N inputs.

Under the banner of ``mathematics'' that I wish you would stop flying,

you claim here that all (general) sorting methods take time at least cN

log N to sort N numbers, where c is some positive number depending on

the method. If this is an inaccurate translation of your quoted

statement, please let me know.

That claim is entirely false. You have not justified it. It is

contradicted by all variants of MSD radix sort, which are linear in the

number of bits in the inputs; the number of bits can be asymptotically

less than cN log N for all c. Please stop insulting mathematics.
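
To make the point concrete, here is a minimal sketch of MSD radix sort on byte strings (my own illustration, not code from any quoted post). Each byte of each key is inspected a bounded number of times, so the total work is proportional to the total number of input bytes, not to N log N:

```python
# Sketch of MSD (most-significant-digit-first) radix sort on byte strings.
# Each recursion level distributes keys into 256 buckets by one byte, so
# every byte of every key is touched O(1) times: total work is linear in
# the number of bytes of input.

def msd_radix_sort(keys, depth=0):
    if len(keys) <= 1:
        return list(keys)
    # Bucket 0 holds keys already exhausted at this depth; buckets 1..256
    # correspond to the 256 possible byte values.
    buckets = [[] for _ in range(257)]
    for k in keys:
        if depth >= len(k):
            buckets[0].append(k)
        else:
            buckets[k[depth] + 1].append(k)
    out = list(buckets[0])          # exhausted keys sort first
    for b in buckets[1:]:
        out.extend(msd_radix_sort(b, depth + 1))
    return out
```
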

Quote:

> >That bound is only valid if the only permitted operation is comparison.

> If you define "pick one of N buckets" as one operation, sorting is indeed

> linear. I'll maintain that your operation takes O(log N) time to perform.

I'm sorry I didn't dismiss this strongly enough in my first response. It

is absolutely irrelevant how long it takes to pick one of N buckets. All

radix sort uses is picking one of a constant number (you said K at

first) of buckets. For definiteness, let's settle upon K as being 256

for all time. It takes constant time, in theory and in practice, to

select one of 256 buckets.
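
In code, "picking one of 256 buckets" is nothing more than an array subscript by the byte's value; a sketch (mine, assuming fixed-position byte keys) of the distribution pass:

```python
# Picking one of K = 256 buckets is a single array subscript by the byte
# value -- constant time per key, not a 256-way comparison.

def count_by_byte(keys, pos):
    counts = [0] * 256
    for k in keys:
        counts[k[pos]] += 1    # one O(1) subscript per key
    return counts
```
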

Quote:

> >> N! grows approximately as N**N (Stirling's approximation, I believe it's

> >> called).

> >No, n! does not grow as n^n. Stirling's (first) approximation is that n!

> >is about n^n/e^n times the square root of 2 pi n.

> Irrelevant, invalid, etc. (substitute your favorite dismissal word here.)

> log(N!) still grows as N log N.

You said you were introducing some mathematics into this discussion. I

was correcting your errors, one by one. Here you began a supposed proof

that log(n!) grows as n log n; but your first statement, namely that n!

grows as n^n, was false. The conclusion is true but you still haven't

proven it. (Given that n! is larger than n^n/e^n times the square root

of 2 pi n [see Abramowitz & Stegun], log n! is larger than n times the

log of n/e. When n is greater than e^2, i.e., n at least 8, n/e is

greater than the square root of n, so log n/e is greater than half log

n. Hence log n! is at least half n log n for large n.)
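
That bound is easy to check numerically (my own sketch, using the identity log(n!) = lgamma(n + 1)), for n from 8 upward:

```python
import math

# Numeric check of the parenthetical bound: log(n!) >= (1/2) n log n,
# using log(n!) = lgamma(n + 1).

def bound_holds(n):
    return math.lgamma(n + 1) >= 0.5 * n * math.log(n)
```
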

Quote:

> >> Radix or bucket sorts may appear to be linear, but in fact they are also

> >> N log N in comparisons. (Picking one of K buckets requires a K-way

> >> comparison.)

> >Picking one of K buckets does not require a K-way comparison. Even if it

> >did, that would not be relevant. Radix sort is linear in the number of

> >bytes, which can be o(n log n) if there are many duplicate keys.

> I was talking about sorting N items, not N bytes.

Yes, and I was following your terminology. As I said before, the number

of bytes can be o(n log n). If you attempt to prove your statement that

radix sort uses at least cn log n comparisons, you will find an

uncrossable gap at this point.
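
A quick arithmetic illustration of the gap (my own numbers, assuming fixed-width 8-byte keys): the total byte count grows as n, while n log n grows strictly faster, so for large n the byte count is asymptotically smaller.

```python
import math

# With fixed-width keys (say w = 8 bytes each), radix sort touches
# n * w bytes in total, while any comparison sort performs on the
# order of n * log2(n) comparisons.  The ratio w / log2(n) -> 0.

n, w = 1_000_000, 8
total_bytes = n * w             # 8,000,000 bytes
comparisons = n * math.log2(n)  # roughly 19,900,000 comparisons
```
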

Quote:

> Picking one of K buckets is indeed equivalent to a K-way comparison.

> That it appears to be "one" operation is merely an illusion, created

> mainly by fixed address lengths of real machines and parallel addressing

> hardware. This is probably your big mental block, Dan - you don't realize

> what is really going on here.

Fine, believe what you want. Even if it took K operations to pick one of

K buckets, that would be absolutely irrelevant: K is a *constant*. Make

it 2 if you want.

Quote:

> It would be equally valid to assert that sorting takes one "sort" unit,

> so that sorting can be done in constant time.

I was sticking to your terminology. You claim that radix sort is

Omega(n log n) where n is the number of keys. (Here Omega is the inverse

of big O: a lower bound of cn log n operations.) That claim is simply false.

Quote:

> And we're talking real asymptotic here, not "real world"

> asymptotic or whatever your concept is.

I am talking about real asymptotics on a large class of (realistic)

computational models. Once you see Knuth make a simply false statement

about ``real-world'' asymptotics, you learn to pay attention to those

log factors. In this problem there's no log factor.

Quote:

> BTW, what does duplicate keys have to do with it? If all the keys are

> duplicate, then sorting is truly constant time.

What are you talking about? By definition, a general sorting method

cannot know such information about the keys beforehand. Furthermore,

nobody is talking about all keys being the same.

Quote:

> >The scale factor between sorting 100 of Ritchie's boxcars and just 1 of

> >them is almost exactly 100---internal sorting by far dominates any

> >merging.

> Just as an exercise, why don't you try calculating the merge time?

As a matter of fact, I did calculate it, and I presented the results in

my followup to his article. Just as an exercise, why don't you go back

and read my followup before you assign any further exercises?

Quote:

> (but then again, I wouldn't expect you to do something so rigorous -

> it's easier to wave your hands and say "merging is linear!").

Why the ad hominem attacks? I haven't seen you show any respect for

mathematical rigor. Merging is not linear, but it is, even for such

ridiculously large problems as Ritchie's trainload of tapes, dominated

by sorting. (David Chase has pointed out in e-mail that a distributed

solution might tip the scale the other way, but I don't believe any

distributed solution will be cost-effective.)

---Dan