random node in tree
Author Message random node in tree

Hi,

I am trying to select a random node in a tree (a S exprexion),
For example for this tree (* (+ 3 4) 5),
node number 5 is (5)
node number 1 is (* (+ 3 4) 5)
node number 2 is ((+ 3 4) 5)

My idea was first to compute the number of nodes in the tree like this :

(define (compute-number-nodes tree)
(cond ((null? tree) 0)
((pair? tree) (+ (compute-number-nodes (car tree))
(compute-number-nodes (cdr tree))))
(else 1)))

and then generate a random number in [1..num-of-nodes] and retraverse the
tree and return the node. Here I am stuck I can't figure out how to do it :
retraverse the tree computing the current node number and return at the same
time the given node number... (my aim is in fact to implement random tree
swapping for genetic programming, but for this I need to select a node in
one tree, a second node the segond true, and then using set-car! to sawp the
subtrees)
Can somebody help me ?
Thanks by advance....

Olivier

Sun, 02 Oct 2005 04:29:10 GMT  random node in tree
; Select a random node from a tree, without an a priori knowledge
; on the number of nodes in the tree.
;
; The algorithm is an instance of a Reservoir sampling:
; Select at random N records from a sequence -- without a
; priori knowledge of the total number of records in a sequence
; (provided that this number is greater than N). The method guarantees
; that each record is selected with a probability of N/M, where M is the
; total number of the records in the sequence.
;
; See
; Reservoir Sampling
; by Paul F. Hultquist and William R. Mahoney
; Dr. Dobbs J., January 2001, p. 189.
; The "Algorithm Alley" column.
; The algorithm was originally developed by Alan Waterman.
;

;   procedure random-node TREE -> [COUNT NODE]
; Traverse the TREE and return COUNT, the total number of nodes
; in the tree, and NODE -- one node of the tree selected with
; a probability 1/COUNT. TREE must be a pair.
;
; We consider '() to be an absence of a child. Thus
; '(a . ()) is a tree with one, left child.
; '(() . a) is a tree with one right child.
; We do not count leaves as nodes. Only internal nodes (aka, pairs)
; count. If we wish to count leaves too (which are atoms other than
; '(), replace "(if (not (pair? node)) ...)" test below with
; "(if (null? node) ...)".

; We assume that a procedure (random i j) returns a random integer k,
; i <= k <= j, that is uniformly distributed within the interval [i,j]
; (endpoints included!)

(define (random-node tree)
(let select ((node  (car tree)) (p 1) (reservoir tree)
(todo (list (cdr tree))))
(if (not (pair? node))                      ; Leaves don't count
(if (null? todo) (values p reservoir)
(select (car todo) p reservoir (cdr todo)))
(let*
((p (+ 1 p))
(k (random 1 p))
(reservoir (if (= k 1) node reservoir)))
(if (pair? node)
(select (car node) p reservoir (cons (cdr node) todo))
(if (null? todo) (values p reservoir)
(select (car todo) p reservoir (cdr todo))))))))

; Proof of the algorithm.
;
; Claim:
; At each invocation of
;       (select node p reservoir todo)
; p>=1 is the number of previously traversed nodes (up to but not including
;   'node')
; 'node' is either '() or the the (p+1)-th node of the tree
; reservoir is the node randomly selected from the traversed
;     with the probability 1/p
; todo is the stack of right brothers of 'node'

; Proof by induction:
;
; Base case: initial invocation.
; 'node' is the left child of the root or '(), todo a singleton list
; that contains its right brother, p = 1 and reservoir is the traversed
; node (which is the root).
; Claim holds.
;
; Induction hypothesis: Claim holds after q nodes are traversed,
; as we enter (select node q reservoir todo)
; If 'node' does not count (it's '() or a leaf, if we don't count leaves)
; we skip it and continue the pre-order traversal.
; If 'node' is the node that counts, we set q' to q+1, k to be a
; number uniformly distributed 1 <= k <= q'. The number k has the probability
; 1/q' = 1/(q+1) of being 1.
; We set reservoir to be 'node' with the probability 1/(q+1),
; we maintain the current value of the reservoir with the probability
; 1 - 1/(q+1) = q/(q+1). Thus reservoir is one of the q previously
; traversed nodes selected with the probability 1/q * q/(q+1) = 1/(q+1).
; If node has children, we recursively enter select, with
; the first argument being the left child of 'node' or nil,
; the second argument (q+1) -- the number of nodes traversed,--
; reservoir is the node selected uniformly at random from the traversed,
; todo is the (grown) stack of the right brothers of the first argument.
; The claim holds.
; If 'node' is not a pair but 'todo' is not empty, we re-enter
; select but the invariant holds. If we don't count leaves as nodes,
; the latter alternative does not apply.
;
; It follows from the claim that when 'select' exits,
; it returns the total number of nodes in the tree and one node
; uniformly randomly selected from the them.

; Tests

; In the following we define (random i j) to always return its left
; boundary.
(define (random i j) i)

; With such a definition of "random", (random-node tree) will return
; the last traversed node.

(define (test-random-node tree)
(display "Tree: ") (display tree) (newline)
(call-with-values
(lambda () (random-node tree))
(lambda (count rand-node)
(display "total count: ") (display count)
(display " selected node: ") (display rand-node)
(newline))))

(test-random-node '(a))
(test-random-node '(() . (b c)))
(test-random-node '(() . (b . c)))
(test-random-node '(* (+ 3 4) 5))

; Results of the test runs
; Tree: (a)
; total count: 1 selected node: (a)
; Tree: (() b c)
; total count: 3 selected node: (c)
; Tree: (() b . c)
; total count: 2 selected node: (b . c)
; Tree: (* (+ 3 4) 5)
; total count: 6 selected node: (5)

Sun, 02 Oct 2005 13:17:14 GMT

 Page 1 of 1 [ 2 post ]

Relevant Pages

Powered by phpBB® Forum Software