Benchmarking Lazy Functional Languages 
 Benchmarking Lazy Functional Languages

Hi there!
  I'd like to poll the net opinion on the relative speeds of current
lazy functional languages with respect to imperative languages (such
as C). Do people think that there is an order of magnitude difference?
Or two or more?
  Are there any benchmark papers out there comparing lazy functional
languages with each other (I have a couple) or with imperative
languages?

Thanks in advance!

-- K
--

Grad Student, Dept of Computer Sciences, University of Texas at Austin
"Money means nothing 'cause hacks are free..." (Apologies to MK)



Thu, 02 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages


Quote:


>>  Are there any benchmark papers out there comparing lazy functional
>>languages with each other (I have a couple) or with imperative
>>languages.

>maybe the following is of interest for you:
>    Richard P. Gabriel, Performance and Evaluation of Lisp Systems,
>    Cambridge MA, MIT Press, 1985

Unfortunately, this book isn't about lazy languages, and many of the
benchmarks in the suite are not functional, i.e., they use side
effects.  It's about Lisp, not functional programming.

My impression is that there is essentially zero reliable quantitative
data comparing the efficiency of lazy functional languages to eager
imperative (dysfunctional?) languages.  (Tiny benchmarks don't count---I'm
talking about real applications, and source code sizes in at least the
hundreds of lines, preferably many thousands.)

The only remotely "real" study I know of is one comparing languages for
rapid prototyping.  It'd be interesting to see one for generating approximately
production-quality code, i.e., you get it to work and also spend a few days
tuning it.

My impression from anecdotal reports is that programs in lazy functional
languages tend to be a factor of five or ten slower than imperative
programs in (say) C or Modula doing the same job.  (There are exceptions
where the functional program may run as fast or faster, but on average
they're "several" times slower.)

I'd be interested in knowledgeable people's opinions on this.  In particular,
I'd be interested in whether it's true that lazy languages tend to be almost
an order of magnitude slower than imperative ones---even with the best
compilers---and if so, why?  I myself don't know much about the deep
issues in using and compiling functional languages, so comments are very
welcome.

Some hypotheses (NONE of which I'm defending!) for discussion:

   0. You have to have side effects to be able to write a lot of algorithms
      efficiently.  Most of the costs of lazy functional languages are due
      to excessive copying or expensive data structure lookups, where
      side effects would allow you to solve the problems with lower
      computational complexity, e.g., fast constant time vs. log time
      with an obnoxious constant.  (A minimal sketch of this trade-off,
      in Haskell, appears after this list.)

   1. Lazy languages are intrinsically slow, because no feasible compiler
      can eliminate the overheads of laziness.  The analyses are just
      too hard to be automated, and we're in trouble here.

      I'm not sure if this is true, or why.  Does laziness defeat conventional
      optimizations because control flow is too unpredictable?  Would some
      slightly clever compilers be able to fix the problems, by being able
      to guess the likely control flow? (Based on static heuristics, profiling
      and recompilation, dynamic feedback guided incremental recompilation...)

      (In some recent discussions it sounded like people didn't really use
      laziness that much, in the sense that they usually knew which parts
      of their applications relied on laziness and could make that explicit
      without too much trouble.  This would reduce overheads immediately;
      I wonder if it would also have a big effect on code quality once
      other aspects of compilers improve significantly---is laziness a
      bottleneck to achieving performance that will become progressively
      more important?)

   2. Compilers for lazy languages are still not mature, because people
      have spent too much time on fancy features and not enough on basic
      back-end optimization.  (e.g., doing closure elimination and lifetime
      analysis and type inference and so on, but not getting around to
      bread and butter optimizations like data flow and control flow
      analysis, register allocation, instruction scheduling, and so on.)
      The extreme version of this would be that compiler writers need
      to work on their back ends and everything will be fine.

      The less extreme version is that the standard compiler stuff is a
      little harder to do in this context, but with some work it'll
      get integrated with the wonderful front-end parts of the compilers
      and functional languages will be competitive.

      A variant is that there just aren't enough people working on the
      compilers, so the population isn't big enough that you can pick
      the best compiler from a very variable bunch and have it be very good.

      Another variant is that the appropriate compiler techniques are
      different from the traditional ones, and more general (e.g.,
      abstract interpretation, partial evaluation).  In the short term,
      the intractability of such general techniques is a problem, but
      once they're tuned with the right heuristics, etc., functional
      programming languages will speed up and eventually surpass imperative
      languages, because their general niceness provides more opportunities
      for automatic analysis and optimization.

   3. People haven't learned to program the right ways in functional
      languages, because it's really hard.  You can code almost any
      program efficiently in a functional language, but you have to
      know avant-garde tricks to get it to really work well.  E.g.,
      the judicious use of monads, etc., and it's really hard to know
      what the appropriate techniques are.

      A more optimistic view is that people are getting to have a better
      understanding of these things, and the techniques will soon be
      pretty well understood, so that normal programmers will be able
      to write efficient programs easily Real Soon Now.

   4. Interactions between 3 and compiler issues are the problem---the
      idioms people use in functional programming are not very standardized,
      and the compilers are not well tuned for the ways people actually
      write programs---they solve the "general" problems that can be
      tractably solved, but they don't have a well-developed set of
      heuristics for solving the particular problems that real programs
      present a compiler.  (E.g., they don't know anything much about
      monads, and the usual compiler optimizations don't get as
      much performance out of monadic code as a compiler would if it
      exploited monadic idioms---kind of the way that Fortran compilers
      are really good at big, dumb loops.)

   5. It's a little of each of these.  

      Pessimistic version:  using and compiling functional programming
      languages (for efficiency) is just too hard, and functional code
      will always be behind the curve performance-wise, because the
      software engineering (of the code and especially of the compilers)
      is enormously difficult.

      Optimistic version:  good progress is being made on all of these
      fronts and good performance will come in due course.
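
[A minimal sketch of the trade-off in hypothesis 0, in Haskell-style
notation; the code is illustrative, not from any poster.  A purely
functional array can be coded as a binary trie: get and set then cost
O(log n), and every old version stays valid, whereas a mutable array
does both in O(1) by overwriting.]

    -- Purely functional array-as-trie: O(log n) get/set, fully persistent.
    -- Assumes a complete trie of depth d (2^d slots) and in-range indices;
    -- a sketch only, not a production structure.
    data Trie a = Tip a | Bin (Trie a) (Trie a)

    build :: Int -> a -> Trie a            -- depth d, shared spine
    build 0 x = Tip x
    build d x = let t = build (d - 1) x in Bin t t

    get :: Int -> Trie a -> a              -- follow the low bits of the index
    get _ (Tip x) = x
    get i (Bin l r)
      | even i    = get (i `div` 2) l
      | otherwise = get (i `div` 2) r

    set :: Int -> a -> Trie a -> Trie a    -- copy one path, share the rest
    set _ x (Tip _) = Tip x
    set i x (Bin l r)
      | even i    = Bin (set (i `div` 2) x l) r
      | otherwise = Bin l (set (i `div` 2) x r)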

Anybody have well-worked out views about this, or failing that, some
reasonable intuitions based on some knowledge and experience?  Where
will the big wins be from here?  How long will it take?

   Paul
--

| Papers on memory allocators, garbage collection, memory hierarchies,
| persistence and  Scheme interpreters and compilers available via ftp from
| ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)      



Thu, 09 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages

Quote:

> A side note about the laziness issue: I tend to think that it is
> underappreciated, especially with regard to the expressiveness that it
> lends to programs, by the general programming populace.  I don't think
> you can fully appreciate its nuances until you have programmed one or
> more large applications in a lazy language.  The pedagogical examples
> presented in most programming courses (e.g. streams) give only the
> faintest glimmer of the possibilities, and even then the explicit
> manipulation required negates many of these programming benefits.
> Current popular VHLLs owe much of their popularity to the fact that
> there is not as great a conceptual leap between imperative programming
> styles in different languages.

We're all ears.  How about some concrete examples??  If you've got a catalog
of really good examples of why laziness is important, then this should be
worth a good conference/journal article.

Thanks in advance.

--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html



Sun, 12 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages


   >0. You have to have side effects to be able to write a lot of algorithms
   >efficiently.

   Are you sure about this?

There are some problems for which the best asymptotic complexity is
achieved by algorithms that seem to inherently rely on side-effects.
Union-find with path compression comes to mind.  However, there are
some impressive results in the algorithms world that achieve very good
results and sometimes even lower bounds using purely functional
approaches.
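
[A minimal sketch, assuming Haskell's ST monad; illustrative code, not
from the post.  Union-find with path compression transcribes directly
into ST, where the side effects on the parent array are real but
encapsulated, so the overall computation stays observably pure.]

    import Control.Monad.ST (ST)
    import Data.Array.ST (STUArray, newListArray, readArray, writeArray)

    type UF s = STUArray s Int Int                   -- parent pointers

    newUF :: Int -> ST s (UF s)
    newUF n = newListArray (0, n - 1) [0 .. n - 1]   -- each node its own root

    find :: UF s -> Int -> ST s Int
    find uf i = do
      p <- readArray uf i
      if p == i
        then return i
        else do r <- find uf p
                writeArray uf i r                    -- path compression
                return r

    union :: UF s -> Int -> Int -> ST s ()
    union uf i j = do
      ri <- find uf i
      rj <- find uf j
      writeArray uf ri rj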

--
-Matthias



Fri, 24 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages

Quote:

>0. You have to have side effects to be able to write a lot of algorithms
>efficiently.

Are you sure about this? From a parallelising point of view side effects are
a pain in the neck as they interfere with dataflow ordering. Also side effects
may be necessary to get fast code, but that does not mean the languages we
program in need to contain side effect semantics. Take a look at the work the
Sisal group has done. Sisal is a pure applicative language (absolutely no side
effects, not even I/O). Sisal is of course compiled into a language which
contains side effects (C or assembler), but that's a job for the compiler, not
the programmer.

graham
--
      ----------------------------------------------------------
      England's not the mythical land of madame george and roses
        It's the home of police who kill black boys on mopeds
      ----------------------------------------------------------



Fri, 24 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages

It seems some of the posters in this discussion tend to see
optimization as the production of the shortest sequence of machine
instructions.

Let me add that the speed of today's RISC processors depends on
keeping their pipelines optimally filled, and that the performance
gap between the cache and main memory keeps growing.

I think some of these points can be turned to the advantage of
functional languages. To make full use of superscalar processors, one
needs a compiler with a good interface for describing which code
sequences will run fast on a given processor.

IMHO, it might be easier to write a good multi-target compiler in a
functional language under these circumstances. An abstract interface
to provide the necessary information for a given CPU type seems to be
easier to implement.

Look at the GNU C compiler. Its optimization in "classical" terms of
eliminating dead code and generating short sequences is good. But
its performance on modern superscalar processors falls behind
the compilers provided by CPU makers. gcc's interface for describing
optimal instruction sequences isn't really an elegant and powerful
one. Of course, there are few - if any - multi-platform *and*
superscalar-optimizing compilers for C or C++. Functional languages
may be able to provide such a compiler with much less effort.

The other issue, the growing importance of the CPU cache, may be
used to the benefit of languages that can generate code on the fly. A
compiled C program can detect that it is thrashing the cache, but can't
really do anything about it (beyond maybe switching to an alternate
precompiled piece of code). Languages with on-the-fly code generation
may be able to react to such a situation.

Upcoming OS interfaces like "OpenImplementation" from PARC and the GNU
Hurd will provide more feedback on what is going on with the
hardware. You can tell them how you expect to use the memory you're
about to allocate, you can tune the buffer cache to your needs,
etc. You can use these interfaces from languages like C or
C++, but your program cannot in turn react to what is available at a
given time.

I'm not an implementor; this is just my 2 cents. I thought the
discussion was focused too much on "classic" optimization, and maybe
chances are being missed.

Martin
--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

"As far as I'm concerned,  if something is so complicated that you can't ex-"
"plain it in 10 seconds, then it's probably not worth knowing anyway"- Calvin



Mon, 27 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages

Quote:

> You may well be able to do without arrays and destructive updating.
> Certainly (unless we have some assumptions about the implementation),
> we cannot in pure functional languages simulate array access/update
> in linear time.
> P.S.
> In the worst case, functional languages have a O(log(N)) penalty over
> imperative languages, as running a RAM can be simulated in O(N*log(N))
> time, where N is the number of operations done by the RAM.

Bzzzzt!  Wrong!

If the array in the functional language is being used in an 'essentially'
imperative manner, then the implementation runs at the _same_ speed (O(1))
as the imperative language, and without _any_ fancy compiler optimizations.
This is a different approach from the uniqueness types approach, and has
been used in some Prolog implementations.

See

ftp://ftp.netcom.com/pub/hb/hbaker/ShallowArrays.html  (also .ps.Z)
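
[A rough reconstruction of the idea in Haskell, not Baker's actual
code -- see the paper above for the real scheme.  The newest version of
the array owns the flat cells; older versions hold a "diff" pointing
toward it.  Used single-threadedly, every operation re-roots in O(1);
old versions stay valid, just slower to reach.]

    import Data.IORef
    import Data.Array.IO (IOArray, newArray, readArray, writeArray)

    newtype FArr a = FArr (IORef (Node a))
    data Node a = Root (IOArray Int a)        -- this version owns the cells
                | Diff Int a (FArr a)         -- differs from a newer version

    newFArr :: Int -> a -> IO (FArr a)
    newFArr n x = FArr <$> (newIORef . Root =<< newArray (0, n - 1) x)

    -- Make this version the Root, reversing the diffs along the path.
    reroot :: FArr a -> IO (IOArray Int a)
    reroot (FArr ref) = do
      node <- readIORef ref
      case node of
        Root arr -> return arr
        Diff i x (FArr pref) -> do
          arr <- reroot (FArr pref)
          old <- readArray arr i
          writeArray arr i x
          writeIORef pref (Diff i old (FArr ref))
          writeIORef ref (Root arr)
          return arr

    lookupF :: FArr a -> Int -> IO a
    lookupF fa i = reroot fa >>= \arr -> readArray arr i

    updateF :: FArr a -> Int -> a -> IO (FArr a)
    updateF fa@(FArr ref) i x = do
      arr <- reroot fa
      old <- readArray arr i
      writeArray arr i x
      new <- FArr <$> newIORef (Root arr)
      writeIORef ref (Diff i old new)
      return new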

--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html



Mon, 27 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages

Quote:
Graham Matthews <graham> writes:

>>0. You have to have side effects to be able to write a lot of algorithms
>>efficiently.

>Are you sure about this? From a parallelising point of view side effects are
>a pain in the neck as they interfere with dataflow ordering. Also side effects
>may be necessary to get fast code, but that does not mean the languages we
>program in need to contain side effect semantics. Take a look at the work the
>Sisal group has done. Sisal is a pure applicative language (absolutely no side
>effects, not even I/O). Sisal is of course compiled into a language which
>contains side effects (C or assembler), but that's a job for the compiler, not
>the programmer.

Some data structures (e.g. arrays and hash tables) are generally too
expensive to use if you can't be sure that imperative update will be
used on them (copying entire arrays all over the place would be really
bad). Languages like Clean and Mercury allow the programmer to specify,
as part of the type (or, in the case of Mercury, part of the mode), that
a structure is unique and may be imperatively updated.

The uniqueness information does not compromise the declarative semantics,
but ensures that all the uses of a structure conform to the rules that
enable imperative  update to be used. The fact that the compiler introduces
side effects in the generated code has no effect on the declarative semantics
since the compiler will only do so where it can prove it to be safe.
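
[Clean's uniqueness attributes have no direct Haskell counterpart, but
the guarantee described above -- destructive update hidden behind a
pure interface -- can be sketched with Haskell's ST encapsulation;
illustrative code, not from the post:]

    import Control.Monad (forM_)
    import Data.Array.ST (runSTUArray, newArray, readArray, writeArray)
    import Data.Array.Unboxed (UArray)

    -- Build a histogram with genuine in-place writes.  runSTUArray makes
    -- the intermediate states unobservable, which is the property Clean's
    -- uniqueness checking (or Mercury's modes) establishes statically.
    -- Assumes every x lies in [0, buckets - 1].
    histogram :: Int -> [Int] -> UArray Int Int
    histogram buckets xs = runSTUArray $ do
      counts <- newArray (0, buckets - 1) 0
      forM_ xs $ \x -> do
        n <- readArray counts x
        writeArray counts x (n + 1)
      return counts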

Thomas
--

AD DEUM ET VINUM                     obscure incantation for the day: pr#6



Mon, 27 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages


   >>>0. You have to have side effects to be able to write a lot of algorithms
   >>>efficiently.
   >>
   >>Are you sure about this? From a parallelising point of view side effects are
   >>a pain in the neck as they interfere with dataflow ordering. Also side effects
   >>may be necessary to get fast code, but that does not mean the languages we
   >>program in need to contain side effect semantics.

   >Some data structures (eg arrays and hash tables) are generally too
   >expensive to use if you can't be sure that imperative update will be
   >used on them (copying entire arrays all over the place would be really
   >bad).

   You may well be able to do without arrays and destructive updating.
   Certainly (unless we have some assumptions about the implementation),
   we cannot in pure functional languages simulate array access/update
   in linear time. But who says we have to? It is quite conceivable that
   most problems can be handled purely functionally with the same
   complexity as in imperative languages. Here I look at the complexity
   including reading the input and printing the output as text. Hence, I
   rule out algorithms that, for a problem of (so-called) size N,
   require the input to be given as an already instantiated NxN matrix,
   e.g. specifying an N-vertex graph by a neighbour matrix.

You can do array and hash table operations quite nicely in functional
languages.

You do need built-in arrays, and operations on them to do it
effectively (such as O(1) access), but creating a new array each time
is not that bad.  You can have a special heap of array memory (using a
buddy system perhaps) and most machines these days can do block-copies
pretty rapidly, so copying is not much of an issue.

Also, with some analysis techniques it is possible to do update
in-place on the array to avoid the unnecessary copying.  To make it
simpler, you can limit this to just one cell at a time.

By having this functionality you can do arrays nearly as effectively
as imperative languages.  Note that in serial, imperative languages
arrays and hash tables are only ever accessed or updated one cell at a
time.  This means that functional versions of these algorithms using
the above primitives need not be so inefficient.
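
[For concreteness -- an illustrative Haskell fragment, not from the
post: the single-cell interface described above is essentially the
standard Data.Array update operator (//).  Semantically each update
copies the array; the analyses mentioned aim to compile it as an
in-place write whenever the old version is provably dead.]

    import Data.Array

    -- Swap adjacent cells if they are out of order: two single-cell
    -- updates.  Assumes i and i+1 are within the array's bounds.
    bubble :: Array Int Int -> Int -> Array Int Int
    bubble a i
      | a ! i > a ! (i + 1) = a // [(i, a ! (i + 1)), (i + 1, a ! i)]
      | otherwise           = a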

I looked at all this for my PhD a few years ago.  The current problem
is the lag in getting this stuff into implementations, but I think most
recent Haskell systems have this, or are pretty close.  (Someone will
correct me if I'm wrong.)

stuart



Tue, 28 Apr 1998 03:00:00 GMT  
 Benchmarking Lazy Functional Languages


:> A side note about the laziness issue: I tend to think that it is
:> underappreciated, especially with regard to the expressiveness that it
:> lends to programs, by the general programming populace.  I don't think
:> you can fully appreciate its nuances until you have programmed one or
:> more large applications in a lazy language.  The pedagogical examples
:> presented in most programming courses (e.g. streams) give only the
:> faintest glimmer of the possibilities, and even then the explicit
:> manipulation required negates many of these programming benefits.
:> Current popular VHLLs owe much of their popularity to the fact that
:> there is not as great a conceptual leap between imperative programming
:> styles in different languages.

:We're all ears.  How about some concrete examples??  If you've got a catalog
:of really good examples of why laziness is important, then this should be
:worth a good conference/journal article.

:Thanks in advance.

[Sorry for the late reply--I don't keep up with this group daily]

OK.  Here are a few examples of small things that I have noticed after
coding in a lazy language for a while.  My guess is that most readers
of these groups will think: "Oh, I can do that in <insert favorite
strict symbolic language> by using <insert explicit delayed evaluation
trick>."  Of course you can; I'm not arguing that.  The crux of the point
I was trying to make is that when you combine these and other artifacts
that arise easily and naturally under a lazy language, you end up with a
programming style that is distinctly different in character, sometimes
subtly, but definitely noticeably.  I agree that this would make an
interesting journal article.  Perhaps if I had a few more examples...

A similar argument can be made for other paradigms; e.g. a distinct
programming style develops using a truly object-oriented language as
opposed to coding in an OO style in a non-OO language.  However, I don't
know enough about these to make a good analogy here.

Caveat: these examples arise from programming in Daisy, a sort of lazy
counterpart to Scheme.  Some of these examples may not apply to other
lazy languages that have quite different attributes (e.g. strong typing).
I have tried to pick only things that are primarily linked to laziness
and not just a functional style.  This list is hardly comprehensive;
if you have another good example please jump in and tack it on.

[1] It is routine to de-structure a function argument before knowing
    the actual structure passed, because the binding is delayed.
    This occurs frequently in mapping functions where in strict languages
    you'd test for a nil argument right away before doing anything else.
    This results in multiple conditionals, separated by _lets_, where in
    a lazy language you generally end up with fewer conditionals--
    often one local binding followed by one conditional.

    in Scheme
    --------------------------
    (define foo (lambda (l)
       (if (null? l) ...
           (let ((x1 (caar l)) (y1 (cadar l))
                 (x2 (caadr l)) (y2 (cadadr l)))  ; did I get these right?
                (if (< x1 0) ...
                   (let ...

    in Daisy
    --------------------------
    foo = \l.
        let:[ [[x1 y1] [x2 y2]] l
             if:[ nil?:l  ...
                  neg?:x1 ...
                ]
            ]

    [ In a pattern-matching language it's even shorter: ]
    foo  [] = ...
    foo  [[x1 y1] [x2 y2]] = ...

[2] I often create local bindings to large or (potentially) erroneous
    computations before I decide whether to use them.  For example, I
    might create a binding to the recursion of a function over the tail
    of its argument so that I can conveniently refer to it later in
    the body in multiple places with a single identifier, rather than
    explicitly expressing the same recursion over and over.

[3] Similar idea, different application: "data continuations".
    I frequently pass data continuations around between functions--
    the data continuation is the result of the rest of the computation.
    This allows me to avoid costly functional appends and instead
    just cons data onto the front of the result.

[4] Prevalent use of streams.  Essentially the inverse of [3]--one can
    create a binding in the tail of a list to a future computation
    which makes expressing streams trivial.  Combined with transparent
    stream coercion, streams are commonplace in lazy languages and rarely
    used in others, except for special "I/O streams".

Note that these are all just subtle variations on the tacit assumption
of laziness.
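
[For readers who don't know Daisy, here is a rough Haskell rendering of
[3] and [4] -- illustrative code, not from the post.  Passing the rest
of the result as a lazily built argument removes the appends from a
tree flatten, and a stream is just a list whose tail is a future
computation.]

    data Tree a = Leaf a | Node (Tree a) (Tree a)

    -- [3] "rest" is a data continuation: cons onto the future result,
    -- so no append is ever performed.
    flatten :: Tree a -> [a]
    flatten t = go t []
      where
        go (Leaf x)   rest = x : rest
        go (Node l r) rest = go l (go r rest)

    -- [4] a stream defined in terms of its own future:
    fibs :: [Integer]
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)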

To respond to Baker's original question, I think that a Scheme that
provides explicit delays and _implicit_ forces provides much of the power
needed to express these kinds of things semi-conveniently in Scheme.

For the curious, what was the motivation for the original question?
How many working Schemes actually provide implicit "force"s?

--
Eric Jeschke                      |          Indiana University



Tue, 28 Apr 1998 03:00:00 GMT  
 


 

 