(debunk-myth '(debunk-myth '(<-efficient gc malloc/free))) 
Author Message
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

   Interesting.  Are you willing to give this code away for use as a benchmark?
   (That is, if I examine this code and decide that it demonstrates interesting
   stuff, is it okay if put it on my ftp site?  We need better benchmarks than
   the Gabriel suite.)

Yes, please do. That is why I posted the examples. I have distilled this
example (and a few other approximately 1000 line examples) precisely because I
use them as benchmarks to help debug and tune Stalin, my Scheme compiler.

I've been comming to the conclusion that there are qualitatively three kinds
of garbage, at least in the programs that I write:

1. *Extremely* short-lived objects that result from using a function style.
   Something like the matrix passed between M* and M+ in (M+ (M* A X) Y).
2. Objects created for use of a particular phase of a program. These objects
   predictably become garbage en-masse at very precise points during execution.
3. Objects is a global data base that get asserted and retracted in a
   data-dependent fashion as a result of large-time-scale interaction with a
   program.

Generational collection was invented primarily to deal with (1). But I think
that it might not be the best tool for that problem. I think that
compile-time life-time analysis is a better tool for dealing with both (1) and
(2). The argument is the same as the original motivation behind eliminating
hardware to do pipeline interlocking and run-time type checking/dispatching
and using a load/store architecture instead of a stack-machine. All of these
expend run-time resources (e.g. to do instruction scheduling, type inference,
and register allocation) to do something that can be completely determined
statically at compile-time. The same is true of GC in cases (1) and (2). It is
only case (3) that cannot be determined statically and still requires GC.

If this is true in general about real programs then this says that effort in
GC research might need to be redirected. GC research relies on assumptions
about the life-time distributions of objects. An exponential distribution, or
a bimodal distribution, as is typically assumed, attempts to characterize the
distinction between cases (1) and (2) on one hand and case (3) on the other.
But if (1) and (2) are handled by static analysis instead of GC then the
distributional assumptions might be mismatches to using GC primarly for (3). I
would guess that in a persistent-store, of the type used to build something
like an airline reservation system or the like, that object lifetime
distribution would either be flat or normally distributed. (This is only a
hunch, I haven't measured this. But I think that it is an important thing to
measure.) Such a distribution would lead one to build a very different GC than
current generational ones.

    Jeff (home page http://www.*-*-*.com/ ~qobi)
--

    Jeff (home page http://www.*-*-*.com/ ~qobi)



Thu, 09 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

   We need better benchmarks than the Gabriel suite.

BTW, the Gabriel suite is not very good at stressing and measuring GC
performance. The Gabriel benchmarks do not cons enough during a single
execution. If you do the simple thing of running each benchmark 1000 times in
the same address space, by replacing (RUN) in each benchmark with

(DO ((I 0 (+ I 1))) ((= I 1000)) (RUN))

then static analysis can flush the heap between each pass and the next. Stalin
currently can do this for all of the Gabriel benchmarks except boyer, ctak,
and puzzle and run them without any GC in a very small footprint. (In case you
were wondering, an earlier version of Stalin could also do this for the boyer
benchmark but the static analysis in Stalin suffers from feature bloat and it
can no longer process the large constant list structures in boyer.  Stalin
can't handle ctak and puzzle because of a sound but suboptimal technique for
determining the lifetimes of continuations. Stalin thinks that the
continuations created in ctak and puzzle are upward and thus converts them to
CPS. Since static storage reclaimation is done in Stalin upon procedure exit,
CPS code foils this strategy. But this could be solved in a number of ways:
(a) more sophisticated analysis could determine that the continuations in ctak
and puzzle are downward and thus conversion to CPS is not needed.  (b) even if
(a) isn't done a better CPS converter could figure out how to stop the
conversion at the top-level `RUN' iteration still allowing between-pass
heap flushes, or (c) a CPS-aware static allocation strategy could be
employed.)

    Jeff (home page http://www.cs.toronto.edu/~qobi)
--

    Jeff (home page http://www.cs.toronto.edu/~qobi)



Thu, 09 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

   I'm also not sure whether Scheme->C has the best GC.  When you
   compile to C, you often take a performance hit because the C compiler
   doesn't
   cooperate with the GC.  A good native-code compiler for a GC'd language
   can support GC techniques that are more efficient and/or robust than what
   you can do when you compile to C.

   My understanding is that Scheme->C uses Joel Bartlett's mostly-copying
   collector.  I don't know if its the best GC in the world (though it's
   a very nice effort for free, extremely portable software), especially
   in its handling of large, nonpointer arrays.  (Most state-of-the-art
   GC's treat large non-pointer-containing arrays specially, to avoid
   scanning or copying them.  Does Bartlett's collector do this?)

This is an important point. Until Hans Boehm's plug-compatible malloc
replacement, GC was something that was inextricably tied to the low-level
architecture of a particular language implementation. Users couldn't choose GC
stretegy independently from language implementation. And most GC strategies
in most Lisp/Scheme implementations lag far behind current state-of-the-art GC
research. (This might even be true of most other languages as well---even
SML/NJ---though I am only intimately familiar with Lisp and Scheme.) So one
must make a distinction between the relative GC-vs-manual-storage-management
costs that are attainable in theory vs the relative costs faced by users of
existing language implementations. (Now only if you could get Allegro or
Harlequin to incorporate some of your ideas in their products ...)

   I think that you may have misinterpreted some of my postings (and Hans
   Boehm's and Henry Baker's).

I was only tweaking you(pl) with my choice of subject line. But the main point
of my posting still stands.

   True---or at least, the programmer can't afford to be entirely unaware
   of what the underlying storage manager does, for resource-critical code.
   For example, does your GC treat large nonpointer objects specially,
   reusing the memory in place most of the time, or does it blindly copy
   them from one area of memory to another?

Until very recently, these parameters were not part of the published spec of
most language implementations, let alone being user settable.

   I'd be the first to admit that this is difficult.  It's not hard to develop
   a GC that uses heuristics to get this right most of the time, but sometimes
   the programmer will have to profile and tune the code for maximum
   performance.

Agreed. But again, while GC researchers no doubt have metering tools to help
profile and tune code, most language users do not have access to such tools,
and the ones that are available do not give easy-to-interpret results.

    Jeff (home page http://www.cs.toronto.edu/~qobi)
--

    Jeff (home page http://www.cs.toronto.edu/~qobi)



Thu, 09 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

Quote:

> <snip>
> >I have enclosed two versions of a program that I use heavily in my current
> >work. The programs are plug compatible. One is written in a functional APL
> >style and one in a fortran style. The programs use the EM algorithm to fit a
> >mixture of multivariate normal distributions to a set of data points and then
> >cluster the data points.
> <snip>
> >Using Scheme->C on a SPARCclassic the functional
> >version takes nearly three times the CPU time as the Fortran-style. While, I
> >can't be sure, I would bet that a large part of that difference is due to GC.
> <snip>

One thing you might try doing if you have the time and availability to
the NAG F90 compiler is rewrite the code into F90 in a way that requires
garbage collection.  NAG's version implements garbage collection.  This
will presumably be more interesting a year or two from now when NAG
may have an optimizing compiler.


Thu, 09 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))
   >2. Objects created for use of a particular phase of a program.
   >   These objects
   >   predictably become garbage en-masse at very precise points during
   >   execution.

   These seem to be the sorts of things where people are right to say
   malloc/free
   is better. Build a list to pass results from one stage to the next, and
   then process and deallocate it at the same time, for example.

   Do you know of any way of reliably spotting such objects, and separating
   them from class 3 ?    

There was a paper in OOPSLA'91 about identifying phases in the
computation and triggering GC at the right time: Barry Hayes, Using
Key Object Opportunism to Collect Old Objects.

Barklund & Millroth used a similar observation in their
<a href="ftp://ftp.csd.uu.se/pub/papers/reports/0038.ps.gz">garbage
cut</a> paper: in Prolog, collect garbage when a cut has occured.
(Uses the program rather than the data to decide when things die, though.)

Also, The POPL'94 paper by Tofte/Talpin can approximate (de-)allocation
times by inferring allocation regions. Apparently, there is an
SML implementation that sometimes beats SML/NJ with a generational GC
(and sometimes is beaten by it). Aiken et al followed up on the region
analysis in PLDI'95.

Finally, it is worth noting that the imperative compilers one get to hear about
(gcc, lcc and the Massively Scalar one (<- I think)) allocate the data
for a given pass on an 'arena' or 'obstack', which is essentially a
heap as we know it. Except that there is no garbage collection; once the
pass is done, the result is migrated and the area is emptied.

                        Thomas
--
Thomas Lindgren, Uppsala University      

http://www.csd.uu.se/~thomasl/            

Copyright Thomas Lindgren, 1995. Distribution on Microsoft Network prohibited.



Mon, 13 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

Quote:
Simon Kinahan <simonk> writes:

>>1. *Extremely* short-lived objects that result from using a function style.
>>   Something like the matrix passed between M* and M+ in (M+ (M* A X) Y).
>I agree with you about these. It should be exceptionally easy to spot them
>at compile time, because they usually have not even one reference at
>their own level in the program.

I note that
    There is a language called Clean; version 1.0 is available for the Mac.
    It is a very nice lazy functional language similar to Haskell and Gofer
    BUT they have "uniqueness types" and an I/O system built on top of that
    concept.  If for example I write
        :: Vec :== {int}
        f :: Vec -> Vec
    I thereby declare that f is a function which takes (one of possibly many
    references to) an array of integers as argument and returns (one of
    possibly many references to) an array of integers as result, whereas
    if I write
        :: UVec :== *{int}
        f :: Uvec -> Uvec
    I thereby declare that f is a function which takes the last surviving
    reference to an array of integers as argument and returns the only
    surviving reference to an array of integers as result.

    The use of uniqueness types permits descructive update, and in particular
    there is a unique "world" which I/O updates.  Also, when a reference to a
    unique value dies, the value can be reclaimed.

I further note that
    There is a language called Mercury (available for FTP, see notice in
    comp.compilers and comp.lang.prolog) which is a *pure* logic programming
    language with strict static polymorphic type checking *and* strict static
    mode analysis (vastly more flexible than PDC Prolog).  Mercury modes
    include the notion of uniqueness, and once again, there is a unique
    "world" which is threaded through pure declarations (normally using
    grammar notation, but you don't have to) to do I/O.  This uniqueness
    information can also be used for compile time garbage collection, but
    I don't think that has been released yet.

Quote:
>>2. Objects created for use of a particular phase of a program. These objects
>>   predictably become garbage en-masse at very precise points during execution.
>These seem to be the sorts of things where people are right to say malloc/free
>is better. Build a list to pass results from one stage to the next, and
>then process and deallocate it at the same time, for example.
>Do you know of any way of reliably spotting such objects, and separating
>them from class 3 ?    

Uniqueness types (Clean) or uniqueness modes (Mercury) can do the job.
The Clean group in Mijmegen have done some work on uniqueness inference
(by casting it as a version of type inference).  The Mercury group have
a fondness for programmer-provided annotations.

Quote:
>>3. Objects is a global data base that get asserted and retracted in a
>>   data-dependent fashion as a result of large-time-scale interaction with a
>>   program.

If your global data base is only accessible to one thread of control at a time
uniqueness types/modes can also help; the BIG problem is that if the data base
is big enough to worry about you probably want to share at least read access
and you may want to share it with other languages.

--
"conventional orthography is ... a near optimal system for the
 lexical representation of English words." Chomsky & Halle, S.P.E.
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.



Mon, 13 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

Quote:

> Uniqueness types (Clean) or uniqueness modes (Mercury) can do the job.
> The Clean group in Mijmegen have done some work on uniqueness inference
> (by casting it as a version of type inference).  The Mercury group have
> a fondness for programmer-provided annotations.

Another reference for 'Use-Once Variables':

ftp://ftp.netcom.com/pub/hb/hbaker/Use1Var.html  (also .ps.Z)

This paper has pointers to other papers regarding experiments in use-once
variables.

--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html



Tue, 14 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))
At what point does a GC system start Garbage collecting? with Virtual memory
is this only when you run out of swap space? If the application never
fills the VM space then all its memory is returned when it exits so there is
never a need to Garbage Collect as such. Do you just GC every N alloc calls?

On limited memory systems I can see how GC works but with VM systems it is
not so obvious.

Pete.



Tue, 14 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

    pjc> At what point does a GC system start Garbage collecting? with Virtual memory
    pjc> is this only when you run out of swap space?

This would be optimal if `Copying GC dogma' held.  But it's
unacceptable under usual TSS environment.

    pjc>  If the application never
    pjc> fills the VM space then all its memory is returned when it exits so there is
    pjc> never a need to Garbage Collect as such.

Jon L. White pursued similar idea in his paper "Address/Memory Management
For a Gigantic Lisp Environment, or, GC Considered Harmful" in 1980
Lisp Conference (pp.119-127).  He proposed that, with tertiary storage
there would be no need to invoke GC in host machines' lifetime.

    pjc> Do you just GC every N alloc calls?

This scheme is used in Emacs Lisp.  User can control N by setting
variable gc-cons-threshold.

    pjc> On limited memory systems I can see how GC works but with VM systems it is
    pjc> not so obvious.

Popularly used technique is to gradually increase heap size when
garbage collector finds growth of live objects, so that heap
size/total live object size ratio is kept constant.  This guarantees
reasonably modest behavior in TSS environment.

Things may be even more complicated in generational garbage
collectors.  Which is better, to invoke GC on long-lived objects area
(and move survived objects to next generation), or to increase size of
the area for that generation?  Is there any clear criteria to make
such decision?

;;;  Keio University
;;;    Faculty of Science and Technology
;;;      Department of Math
;;;             MAEDA Atusi (In Japan we write our family names first.)



Fri, 17 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

Quote:



> >     There is a language called Clean; version 1.0 is available for the Mac.
> >     It is a very nice lazy functional language similar to Haskell and Gofer
> >     BUT they have "uniqueness types" and an I/O system built on top of that
> >     concept.  If for example I write
> [SNIP]
> >     The use of uniqueness types permits descructive update, and in
particular
> >     there is a unique "world" which I/O updates.  Also, when a
reference to a
> >     unique value dies, the value can be reclaimed.

> There is also a fledgeling language for the Mac (ReWrite) that _only_ has
> unique types. This means that no GC is required at all. I am biased (being
> the author) but have found that performance is not too bad (considering
> that the current version is purely dynamically typed) _except_ that
> I have to be careful not to have more than one reference to a big data
> structure unless I want multiple copies (Later versions may allow
> 'non-unique' types but I would prefer to avoid GC for as long as possible).

You might also look at the following paper on linear/unique types:

ftp://ftp.netcom.com/pub/hb/hbaker/Use1Var.html (also .ps.Z)

This paper references a number of other papers on the Web on the same subject.

--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html



Wed, 22 Apr 1998 03:00:00 GMT  
 (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

Quote:

>There is also a fledgeling language for the Mac (ReWrite) that _only_ has
>unique types.

Exactly what is a "unique type"?

graham

--
      ----------------------------------------------------------
      England's not the mythical land of madame george and roses
        It's the home of police who kill black boys on mopeds
      ----------------------------------------------------------



Thu, 30 Apr 1998 03:00:00 GMT  
 
 [ 17 post ]  Go to page: [1] [2]

 Relevant Pages 

1. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

2. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

3. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

4. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

5. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

6. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

7. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

8. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

9. (debunk-myth '(debunk-myth '(<-efficient gc malloc/free)))

10. (debunk-myth '(debunk-myth '(<-efficien

11. (debunk-myth '(debunk-myth '(<-efficien

12. (debunk-myth '(debunk-myth '(<-efficien

 

 
Powered by phpBB® Forum Software