When are values recomputed?
Author Message
When are values recomputed?

expression as an example

let nums = replicate 10 5  -- a list of 10 5s
in (foldl (+) 0 nums) + (foldl (-) 0 nums)

My question is: does the list "nums" get generated once or twice?  Or
is this just undefined and up to the implementation?

I realize that in a side-effect free language like Haskell it doesn't
make any difference semantically.  However if it's expensive to
generate the list then it will matter for performance.  On the flip
side, if the list stays around after it's no longer needed, then it's
using up memory.

An example where this question makes more of a difference is when
searching a tree.  Following the example of John Hughes in his paper
Why Functional Programming Matters, I defined a depth-first traversal
of a tree as follows:

-------- stack_overflow.hs --------
import System(getArgs)

-- a generic tree datatype
data Tree a = Node a [Tree a] deriving(Show)

-- genTree takes a successors function, and root node value,
-- and generates a tree
genTree :: (a -> [a]) -> a -> Tree a
genTree genSuccs x = Node x [genTree genSuccs succ | succ <- genSuccs x]

-- like foldl for lists, but for a Tree datatype instead.  It
-- traverses the tree in depth first order.
depthFirstFold :: (a -> b -> a) -> a -> Tree b -> a
depthFirstFold accumF accum (Node x subs) =
let accum' = foldl (depthFirstFold accumF) accum subs
in accumF accum' x

-- counts the nodes in a tree, using depthFirstFold
numNodes :: Tree a -> Int
numNodes = depthFirstFold (\n _ -> n+1) 0

-- generates a tree of height h, where each node at level n has n children
genTreeOfHeight h = genTree (\x -> replicate x (x-1)) h

main = do args <- getArgs
let height = if args == [] then 8 else read (head args)
let n = numNodes (genTreeOfHeight height)
putStrLn (show n)

-------- end of stack_overflow.hs --------

The above program generates a tree which is not very high, but is very
wide.  It then does a depth-first traversal of this tree, using
depthFirstFold, to count the nodes in the tree.  If you run the above
program in Hugs 1.4 or ghc 4.01, with a height argument of 8 or over,
it dies with an error saying its stack space overflowed.

The reason it seems to me this shouldn't happen is that the tree only
has a height of 8, so a depth-first traversal should only require at
most 8 nodes to be in memory at once.  That is, unless Haskell is
keeping the rest of the already visited tree around in memory in case
it is visited again.

John Hughes's paper gives an example of minimax and alphabeta search
which works much like the above program, in that it depends on lazy
evaluation to only generate nodes of the tree as they're visited.  In
addition, he specifically states (page 15) that once the function
returns, the nodes below it can be reclaimed by the garbage collector.
That is, you won't end up with the whole tree resident in memory by
the time you've searched the whole tree.

Can anyone tell me what I'm not understanding here?  Or how to make
the above program work in Haskell?  Thank you.

--

Fri, 10 Aug 2001 03:00:00 GMT
When are values recomputed?

: let nums = replicate 10 5  -- a list of 10 5s
: in (foldl (+) 0 nums) + (foldl (-) 0 nums)
:
: My question is: does the list "nums" get generated once or twice?  Or
: is this just undefined and up to the implementation?

No it is not undefined or up to the implementation.
nums is a list object on the heap and as such it can only be computed
once. You can think of nums as a pointer to the object.

n.

--
--[ www.dtek.chalmers.se/~d95mback ]---[ PGP: 0x453504F1 ]---[ UIN: 4439498 ]--

Sat, 11 Aug 2001 03:00:00 GMT
When are values recomputed?

Quote:

> -- like foldl for lists, but for a Tree datatype instead.  It
> -- traverses the tree in depth first order.
> depthFirstFold :: (a -> b -> a) -> a -> Tree b -> a
> depthFirstFold accumF accum (Node x subs) =
>     let accum' = foldl (depthFirstFold accumF) accum subs
>     in accumF accum' x
> If you run the above
> program in Hugs 1.4 or ghc 4.01, with a height argument of 8 or over,
> it dies with an error saying its stack space overflowed.

In nhc13, we get a heap overflow instead.  I believe both heap
and stack overflows are due to the very commonly overlooked
laziness properties of foldl, whereby a large number of (\n _-> n+1)
closures are building up, one for each node of the tree, and only
evaluated at the end of the computation.

If we replace the call to foldl in depthFirstFold with the strict
version, foldl', then this program does indeed run in constant heap
space (of around 1kb according to nhc13).  Its time complexity is
pretty bad though, as you would expect:
tree depth 8:    5.9 seconds
tree depth 9:   51.6 seconds
tree depth 10: 515.8 seconds

Quote:
> The reason it seems to me this shouldn't happen is that the tree only
> has a height of 8, so a depth-first traversal should only require at
> most 8 nodes to be in memory at once.  That is, unless Haskell is
> keeping the rest of the already visited tree around in memory in case
> it is visited again.

It is unevaluated "increment by one" closures which are kept, not the
tree itself.

Regards,
Malcolm

Dr Malcolm Wallace (functional programming research)   +44 1904 434756
Department of Computer Science, University of York, YORK YO1 5DD, U.K.
----------------------------------- http://www.cs.york.ac.uk/~malcolm/

Sat, 11 Aug 2001 03:00:00 GMT

 Page 1 of 1 [ 3 post ]

Relevant Pages