Decorrelated Fast Cipher 
Author Message
 Decorrelated Fast Cipher

[ This is from a thread on sci.crypt.  One of the AES candidate
  algorithms that is particularly promising has fast implementations on
  some chips but not on Pentium Pro, which is the reference machine
  suggested by NIST.  I'm posting this here in the hope the some x86
  coders will take up the challenge!

  I put a functional (but very slow) implementation with a little test
  harness that interested people can start from, at:

    http://www.*-*-*.com/ ~harley/tmp/dfc32.new.c

  Skip straight to section 2 ("High end") below for the relevant text.

  -- Rob.
]

Quote:

>People says DFC has not enough rounds, which is why it is faster. Then
>increase it from 8 to 12 rounds, and it will be as fast as the popular
>ones!


Quote:
>i missed something... a dfc with 12 rounds is not faster on 64b cpu and
>looses its speed advantage.

I think Serge's remark was somewhat tongue-in-cheek!  Twelve rounds
would probably be a bit excessive for DFC.

Presumably we want a code which is:

  1. Possible on the low end.

  2. Fast on the high end.

  3. SECURE!!!

1.  Low end.

Everybody seems to agree that on the low end are smart cards for which
we need something that can get by with at most a few hundred bytes of RAM.

Just a few messages ago, Paul Crowley ruled out DFC and others
claiming that they required "at least 195 bytes", whereas the DFC Web
page links to a several-month-old paper describing an implementation
in 100 bytes.  Being too hasty and getting facts wrong sure won't help
us make a good choice!

Note also that Gemplus (a major smart-card company) is involved in DFC
so it appears that those who know the low-end well think highly of
this particular candidate.

2.  High end.

Here there seems to be some disagreement.  Should the high end be the
latest x86 chip?  Or the latest Alpha?  Or some unknown processor that
will be popular in 2010?  The conclusions are quite different
depending on which you choose.

Many people have x86 machines, obviously, and NIST suggested testing
on a 200 Mhz Pentium Pro.  However some appear far too willing to
dismiss out-of-hand all but the very fastest on that particular chip
with the particular implementations they have at hand.

My machine is an Alpha (the outgoing 21164 generation) so for
me DFC is the fastest candidate.  Amongst the serious contenders it
is fastest by quite a bit.

My personal opinion on this is that more effort should go into
implementations on diverse types of processor.  Bruce has done a very
careful implementation on Pentium Pro, and even posted here recently
to announce shaving 2 cycles off per round, but what about Alpha?
Perhaps Twofish could be sped up there.  Likewise for other codes.

I've implemented DFC myself on Alpha but am incapable of doing a good
job on x86.  Some guys at ENS did an implementation that takes about
750 cycles.  I did one on ARM that takes about 750 cycles too, yet ARM
is a very simple in-order single-issue RISC chip.  Surely a Pentium
Pro could beat it easily!  I wouldn't be surprised if 20 or 200 cycles
could be knocked off by a top-notch x86 programmer.  Where's Terje
Mathisen when you need him?  =:-)

3.  Security.

Don't overlook that security is more important than anything else!
Imagine if some variant of DES had been chosen that was faster on the
chip-du-jour in 1975 but got broken in 1985?  The whole AES landscape
would be very different.

A few candidates are easily ruled out but it would be a mistake to
assume that the rest are all equal.

For now, the various AES teams are involved in guerilla warfare,
trying every avenue to criticize the others.  Read carefully and
separate the wheat from the chaff.  When Bruce Schneier and colleagues
comment on 32-bit performance they criticize DFC as "The modular
multiply does a nice job of diffusing bits, but hurts performance
significantly."  They're praising it with faint criticism!

At heart I'm a bit of a speed-demon but mostly a mathematician.  There
are some fast candidates, but that's not enough.  There's plenty of
advocacy going on, but much of it doesn't really matter.  There are
guys with good reputations involved.  I respect that.  But the
clincher is:

  WHAT CAN YOU PROVE?  WHAT IS HARD, COLD FACT?

  And what's just heuristics, conjectures and such like?

Heuristics and conjectures are important, especially in cryptography,
but pale in comparison to the others.  Take a step back from just DES
and look at what lies behind cryptographic methods that have stood out
since the seventies: you will surely have in mind things like
factoring, discrete logarithms, exponentiation, finite fields and so
on.  When mathematics is brought to bear it really does a good job.

I would argue that DES was the end of an era: an era when cryptography
meant pushing bits around in some complicated fashion, getting smart
people to try to crack the resulting code and declaring it safe when
they fail.  This process produced DES and DES succeeded.  But I think
there was a large amount of luck involved.

The filtering process is of course necessary but it is not sufficient.
Twenty to twenty-five years ago, cryptography came out into the open
and nothing has been done the same since, save doomed attempts like
Clipper, until now that is.  The AES process seems to be swamped by
candidates going back to the old way, submitted by people apparently
blinded by DES's lucky success.

If that's what is wanted then go with triple-DES!

Take a look at the serious candidates from a mathematical point of
view.  There are a lot of similarities: Feistel this, XOR that...
But one stands out.  One says: hey guys here's a proof that
such-and-such attack can't work, nor this other attack nor some entire
classes of attacks.  And they're not "straw-man" attacks that are
being held at bay either, they're genuinely useful against some other
codes.  You may have missed the crucial word so I'll repeat it: proof.

Now you may be wondering why I'm laying down a strong argument which
happens to favour DFC.  Maybe the designers are friends of mine?
No: I just met them for the first time a few days ago.  Maybe its some
primitive nationalism, after all I'm posting from France.  No: I'm Irish.

I'm defending DFC because it melds the best of the old DES era and the
new mathematical one; because it has gone through the AES process to
date with as much distinction as the other good candidates, withstood
attempts to break it, and because on top of that it is backed up by
guarantees, when all the others have to offer is rhetoric.

So forget all the business with "candidate A is bla% slower/faster than
candidate B on processor X, but on Y mumble mumble".

None are perfect.  But one comes with proof.  The others don't.

  WHAT ARE YOU PEOPLE THINKING?

  =:-)

Bye,
  Rob.



Sat, 14 Jul 2001 03:00:00 GMT  
 Decorrelated Fast Cipher

Quote:

> [ This is from a thread on sci.crypt.  One of the AES candidate
>   algorithms that is particularly promising has fast implementations on
>   some chips but not on Pentium Pro, which is the reference machine
>   suggested by NIST.  I'm posting this here in the hope the some x86
>   coders will take up the challenge!

>   I put a functional (but very slow) implementation with a little test
>   harness that interested people can start from, at:

>     http://pauillac.inria.fr/~harley/tmp/dfc32.new.c

>   Skip straight to section 2 ("High end") below for the relevant text.
[...]
> I've implemented DFC myself on Alpha but am incapable of doing a good
> job on x86.  Some guys at ENS did an implementation that takes about
> 750 cycles.  I did one on ARM that takes about 750 cycles too, yet ARM
> is a very simple in-order single-issue RISC chip.  Surely a Pentium
> Pro could beat it easily!  I wouldn't be surprised if 20 or 200 cycles
> could be knocked off by a top-notch x86 programmer.  Where's Terje
> Mathisen when you need him?  =:-)

I'm here, and I've been following the AES debate on sci. crypt, without
noticing your original message (actually it seems like it never made to
my server).

I've taken a quick look at your code, and it seems like the 64-bit
mul-by-13 is the worst speed-bump, it can definitely be improved with
some asm.

I've dl'd the code and will take a look at it later.

Terje

--

Using self-discipline, see http://www.eiffel.com/discipline
"almost all programming can be viewed as an exercise in caching"



Sat, 14 Jul 2001 03:00:00 GMT  
 Decorrelated Fast Cipher

Quote:


> > [...]  Where's Terje Mathisen when you need him?  =:-)

> I'm here, and I've been following the AES debate on sci.crypt [...]

Thanks Terje, and thanks to the others who have sent implementations.
Keep them coming!  You guys have already sped past the 550 cycles that
were my most optimistic hope and got down to 505 (so far)!

See the update below...

Thank you all,
  Rob.

------------------------------------------------------------------------------
Subject: Re: Who will win in AES contest ??

Quote:

> Found your message entertaining (and much agrees with my thoughts about the
> current stage of AES...).  One small comment;

Ah, so there is hope yet!
Actually, several people have expressed hearty agreement in email.

Quote:
>>None are perfect.  But one comes with proof.  The others don't.

>There is a "proof" of DFC's security?

There are proofs (no quotes are necessary) of DFC's security against
various classes of attacks.

Consider differential attacks.  These are pretty serious and several
codes which otherwise appear secure can be broken by them.

Of course most AES teams have taken them into account and can say "we
think our code is resistant".  The DFC team doesn't say "we think".
It says "Our code is resistant.  Here's the proof.  End of story.
That's that.  Any questions?"

The same holds for linear attacks.

The same holds for some iterated attacks.

Does this not strike anybody else as fundamentally different, better
and phenomenally useful?

A little-known fact about the decorrelation method is that it could be
used with other AES candidates.  Essentially you start by designing an
algorithm with heuristic security, doing your best and ending up with
something like Mars or RC6.  Then in a perfectly modular fashion, you
append a pass that guarantees, with proof, all sorts of security that
was previously relegated to conjecture (at best)!

If some algorithm other than DFC wins the AES process, then surely we
would want to apply this strengthening to make it even better than it
already was.

How could anyone not want this?

The thing is, the resulting algorithm would look to a large extent
just like DFC!

The only counter-argument that has been mustered against this that the
decorrelation module takes time and slows the code down somewhat.

If we take as granted that we want the security that the latest and
greatest research can give us, then how fast can we go?  Probably not
a whole lot faster than DFC, but maybe a bit.  This might be a
fruitful avenue of research.

However the criticism on grounds of speed is a red herring as is
becoming clearer by the day.

Several other candidates are significantly faster than DFC on Pentium Pro
and similar chips.  That is a fact and DFC is not about to become the
fastest there.  So should we hastily rule it out as being too slow?

DFC was not particularly fast on Alpha initially.  Its designers are
in some ivory tower, talking amongst themselves about beautiful,
powerful theory and barely even pausing to program their ideas
properly.

Today DFC is the fastest candidate on several chips including Alpha
and is the fastest in absolute terms.  On my own two-year-old machine
it chomps through 200 Mbps.  No other AES candidate comes close with
currently available implementations.  So should we hastily rule out
the others as being too slow?  ;-)

NO!  We should all do more implementations, taking advantage of
whatever coding tricks are necessary to go as fast as possible, before
making any definitive judgement based on speed.

I posted my previous message a few evenings ago, optimistically
suggesting that the implementation of DFC on Pentium Pro might be sped
up by 20 or even 200 cycles.  The next morning when I arrived at work
my mailbox contained an implementation by a young Polish programmer,
Dominik Behr, saying "I think this should be 200 cycles faster but I
don't have a Pentium Pro to test it".  Lo and behold, testing revealed
it to be... 200 cycles faster.

It would be a reasonable consensus to agree that a given candidate
algorithm is reasonably fast on a given chip if it is within a factor
of two of the fastest of all candidates on that chip.  Several people
have now sent implementations of DFC on Pentium Pro, the fastest of
which is just outside the factor of two on Pentium Pro.  Two
programmers have said, in essence, "we can knock off many more cycles,
just give us a few days".

So it is reasonable to expect that soon the only real criticism of DFC
will have fallen by the way-side.  That's when I expect to start
seeing F.U.D. claiming that DFC is "just too damn good to be true"
and that "there must be a catch somewhere".

And I'll answer:

  OK, so what's the catch?

  SHOW ME!

But that's another post for another day.
  Rob.
------------------------------------------------------------------------------



Mon, 16 Jul 2001 03:00:00 GMT  
 
 [ 3 post ] 

 Relevant Pages 

1. fast, fast, fast sin, cos, sqrt - OSI open source code

2. Rijndael Cipher

3. Cipher book for ruby

4. ANN : New optimized version of the Serpent cipher

5. Optimising core transformation in a block cipher

6. Crypt XOR or any other cipher !

7. Selecting cipher suites with socket.ssl

8. caesar cipher....oy

9. TLS query: use a particular cipher with http::geturl

10. Cipher algorithm

11. New Python block cipher API, comments wanted

12. Could I have a virus? 0 cipher bits

 

 
Powered by phpBB® Forum Software