YAPSQ (Perl Style Question): sorting and maintaining sorted order 
Author Message
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

I have a class Foo which has a meaningful less-than relationship.  I
also have a class Bar which wants to maintain a list of Foo items in
sorted order.

What is a good idiom to do this?

Right now, there is a method Foo::compare such that $a->compare($b)
returns -1, 0, or 1 if $a is less than, equal to, or greater than $b.
(Based on Foo::compare, I have the obvious one-liners for $a->less($b)
and so on.)

And the Bar class offers an `add' method such that $bar->add($foo)
walks the internal array, comparing Foo items along the way, and
inserts the new item into the right spot in the array.  (Actually, the
Bar::add method accepts a list of items and there is an internal
helper method Bar::_add1 which does a single item.)

Specific questions:

(1) Writing $a->compare($b) or $a->less($b) strikes me as a
    non-intuitive way to compare two items.  compare($a, $b) or
    less($a, $b) would be better.  But I suppose that doesn't work
    right.  Or does it?  And Foo::compare($a, $b) doesn't play well
    with inheritance.  So?

(2) Do I want to ``use overload 'cmp' => \&Foo::compare;'' in Foo.pm?
    I'm scared of overloading.

(3) On a lower level: how do I insert an item into a sorted array?  Is
    manually walking the array the right answer?  Or maybe I just want
    to do this:


    Is this reasonably fast in the presorted case?

(4) On a still lower level: how do I insert an object into an array at
    a known index?  Right now, I do this:

(5) Can the search and insert steps be combined into one clever map or
    grep expression, perhaps?  Right now, I search then insert as
    follows:

            $i++;
        }

I'm using Perl 5.6.0 on Unix, if that matters.
kai
--
I like BOTH kinds of music.



Fri, 07 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order


: (1) Writing $a->compare($b) or $a->less($b) strikes me as a
:     non-intuitive way to compare two items. [...]

Why?  They look reasonable to me.

: (2) Do I want to ``use overload 'cmp' => \&Foo::compare;'' in Foo.pm?
:     I'm scared of overloading.

I'd probably overload the numerical operators <=>, <, >, <=, >=, and ==
instead of their stringy cousins.

: (3) On a lower level: how do I insert an item into a sorted array?  Is
:     manually walking the array the right answer?  Or maybe I just want
:     to do this:


:     Is this reasonably fast in the presorted case?



        my $a = shift;

        my $i;

            $i = 0;


                last if $_ < $elt;
                ++$i;
            }


        }
    }

Assuming O(n) time complexity for splice() this sub runs in O(n * m)
time where n and m are the sizes of the existing array and the list of
items to be inserted.

: (4) On a still lower level: how do I insert an object into an array at
:     a known index?  Right now, I do this:

Use the splice() operator as in the previous example.

: (5) Can the search and insert steps be combined into one clever map or
:     grep expression, perhaps?  Right now, I search then insert as
:     follows:

:             $i++;
:         }

I'm sure someone will come up with a way. :-)

Greg
--
A list is only as strong as its weakest link.
    -- Knuth



Fri, 07 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

>I have a class Foo which has a meaningful less-than relationship.  I
>also have a class Bar which wants to maintain a list of Foo items in
>sorted order.

>What is a good idiom to do this?

>Right now, there is a method Foo::compare such that $a->compare($b)
>returns -1, 0, or 1 if $a is less than, equal to, or greater than $b.
>(Based on Foo::compare, I have the obvious one-liners for $a->less($b)
>and so on.)

>And the Bar class offers an `add' method such that $bar->add($foo)
>walks the internal array, comparing Foo items along the way, and
>inserts the new item into the right spot in the array.  (Actually, the
>Bar::add method accepts a list of items and there is an internal
>helper method Bar::_add1 which does a single item.)

Overloading is a good technique for classes where algebraic constructs
are expected" Complex number, vectors, groups and rings, matricies,
relational calculus etc.  I'd avoid it unless I had a real good idea why I
should use it.

Consider for a moment an implementation like this:

    sub Foo::ord {
        my $self = shift;

        ... # do what is needed to compute an ordering key

        return ($ordering_key);
    }

    sub Bar::add {
        my $self = shift;
        ...

        ...
    }

I've made some assumptions about your implementation but perhaps
you can see the core idea here.  The point is that Foo::ord returns
an opaque value that can be used by the builtin cmp operator.

This is not very efficient as it stands but it opens you up to
usage of things like the Orcish Maneuver, Schwartzian Transform
and other key caching techniques.

Take a look at

    http://www.hpl.hp.com/personal/Larry_Rosler/sort/sorting.html

For some real good ideas on sorting in perl.

chris
--
    chris fedde
    303 773 9134



Fri, 07 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

> I have a class Foo which has a meaningful less-than relationship.  I
> also have a class Bar which wants to maintain a list of Foo items in
> sorted order.

> What is a good idiom to do this?

TMTOWTDI :)  Looks like you already have some good ideas.

Quote:
> Right now, there is a method Foo::compare such that $a->compare($b)
> returns -1, 0, or 1 if $a is less than, equal to, or greater than $b.
> (Based on Foo::compare, I have the obvious one-liners for $a->less($b)
> and so on.)

> And the Bar class offers an `add' method such that $bar->add($foo)
> walks the internal array, comparing Foo items along the way, and
> inserts the new item into the right spot in the array.  (Actually, the
> Bar::add method accepts a list of items and there is an internal
> helper method Bar::_add1 which does a single item.)

> Specific questions:

> (1) Writing $a->compare($b) or $a->less($b) strikes me as a
>     non-intuitive way to compare two items.  compare($a, $b) or
>     less($a, $b) would be better.  But I suppose that doesn't work
>     right.  Or does it?  And Foo::compare($a, $b) doesn't play well
>     with inheritance.  So?

Yes, it does work right, and yes, it doesn't work well with inheritance.
Unless you make Foo::compare detect that $a or $b belongs to a derived
class and call compare in that class.  But no matter what you do, there
is a can of worms there.  What if both $a and $b belong to different
subclasses with different compare methods?

Quote:
> (2) Do I want to ``use overload 'cmp' => \&Foo::compare;'' in Foo.pm?
>     I'm scared of overloading.

Conquer your fear!  What could be clearer than $a lt $b?
( ;-) If you have a fortran background anyway.)

Quote:
> (3) On a lower level: how do I insert an item into a sorted array?  Is
>     manually walking the array the right answer?  Or maybe I just want
>     to do this:


>     Is this reasonably fast in the presorted case?

If you only expect a short array (i.e. less than 10 or so) I'd just walk the
array.  Otherwise, do a binary search.  But make sure to do it right
(it's easy to mess up).

Quote:
> (4) On a still lower level: how do I insert an object into an array at
>     a known index?  Right now, I do this:



Quote:
> (5) Can the search and insert steps be combined into one clever map or
>     grep expression, perhaps?  Right now, I search then insert as
>     follows:

>             $i++;
>         }


Are you expecting long lists being added to long lists?
If not, don't worry about it.  If so, perhaps sort the ones
to be added and then merge them with the sorted list.
You can do a merge with a map, but it might be clearer with
while.  And perhaps more efficient on versions of perl <= 5.6.0.
(One-to-many mapping had an inefficiency problem.)  Or just sort
them all together.

If efficiency is really important, perhaps you would want to create
a method that returns a simple scalar that could be compared instead
of calling the compare method each time.  E.g. if your objects were
dates with y/d/m/h/m/s fields, have a Foo method that returned seconds
since an epoch and then have Bar save that for each saved Foo object
for comparing later.  Then you would want to look into how Guttman-
Rosler or Schwartzian Transforms work.



Fri, 07 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

> Sent: Monday, August 21, 2000 15:25

> Subject: Re: YAPSQ (Perl Style Question): sorting and maintaining
sorted
> order




...

Quote:
> : (4) On a still lower level: how do I insert an object into
> an array at
> :     a known index?  Right now, I do this:

> Use the splice() operator as in the previous example.

Which is much more efficient, because it avoids copying the part of the
array below the insertion, and moves pointers above the insertion,
instead of copying all the data.

Quote:
> : (5) Can the search and insert steps be combined into one clever map
or
> :     grep expression, perhaps?  Right now, I search then insert as
> :     follows:

> :             $i++;
> :         }


As the list is sorted, if it is large a binary search would be more
efficient.  And then a splice instead of a copy.

Quote:
> I'm sure someone will come up with a way. :-)

Well, that's easy.  It is slightly worse than the two-step operation
above, because some testing continues after the insertion.  But
efficiency doesn't seem to be a goal here.

#!/usr/bin/perl -wl
use strict;


my $x = 4;

unless (my $done = $x >= $a[-1])



--
Larry Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/



Fri, 07 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

> : (3) On a lower level: how do I insert an item into a sorted array?  Is
> :     manually walking the array the right answer?  Or maybe I just want
> :     to do this:

I've never used such a thing since I started using Perl. I don't know how
you intend to access your data in your application, but in almost every
case when (in other languages) you might use a sorted list, I have found
that a  Hash does the job in Perl.

You would only need a sorted array if you want to do an approximate lookup,
and the lexicographic sort provides a suitable definition of "approximate".
E.g. if you want to search for  matches like m/^foo/ (note the '^' anchor).

The classic way to do this sort of thing on large collections of data is
the B-tree. Use the module "BTree" from CPAN, or if you want off-line
storage, use DB_File to access the Berkeley (now sleepycat) DB package. If
they're not large enough to need off-line storage, brute force lookups are
probably fast enough.

What the "best" way to do all this depends heavily on how much data you
have, how frequent updates are compared with reads, what sort of
approximate lookup is actually required (e.g. you could have a Hash indexed
by Text::Soundex values).

Ian



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order



Other people have addressed most of your questions, but I wanted to
point out a style idiom.  Because of the way Perl does argument passing
(lists get concatenated and flattened), you can just do this:


--
                 Dan Schmidt | http://www.dfan.org
Honest Bob CD now available! | http://www.dfan.org/honestbob/cd.html



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order
: I have a class Foo which has a meaningful less-than relationship.  I

: (1) Writing $a->compare($b) or $a->less($b) strikes me as a

you could write that as

        compare $a $b;  # which reads nicely

and
        less $a $b      # which doesn't read so nicely

and if $b has a method called 'to' then you can even write

        compare $a to $b ;

(you get a bareword warning but the code seems to work.)



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

> I've never used such a thing since I started using Perl. I don't
> know how you intend to access your data in your application, but in
> almost every case when (in other languages) you might use a sorted
> list, I have found that a Hash does the job in Perl.

A hash.  Hm.  Hmmm...  Maybe you're right.  I must investigate this.

Simplified description of my application follows: I have something
similar to a database system.  The basic operators can do table lookup
based on simple conditions, such as `ATTRIBUTE OPERATOR VALUE', for a
few predefined operators.  The result of such a simple condition is a
list of row identifiers.  On top of this, I implement union and
intersection (and other operators, such as my equivalent of the join
operator).

Since I expect the tables to have a few million rows, the input for
the union and intersection operators might be on the order of 100,000
elements.

(Of course, the current implementation operates on toy data which has
more like ten tuples.  But someday, it will be finished and used on
real data...)

kai
--
I like BOTH kinds of music.



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order



:
: : (1) Writing $a->compare($b) or $a->less($b) strikes me as a
:
: you could write that as
:
:       compare $a $b;  # which reads nicely
:
: and
:       less $a $b      # which doesn't read so nicely

...unless you're a Lisp hacker.

: and if $b has a method called 'to' then you can even write
:
:       compare $a to $b ;
:
: (you get a bareword warning but the code seems to work.)


Oddly, though, it has to be in the calling package and not in the
package that defines the class.  I guess that sort of syntax is
best left to MUD commands. :-)

Greg
--
The big corporations are suddenly taking notice of the web, and their
reactions have been slow.  Even the computer industry failed to see the
importance of the Internet, but that's not saying much.  Let's face it; the
computer industry failed to see that the century would end.



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order
: Simplified description of my application follows: I have something
: similar to a database system.  The basic operators can do table lookup
: based on simple conditions, such as `ATTRIBUTE OPERATOR VALUE', for a
: few predefined operators.  The result of such a simple condition is a
: list of row identifiers.  On top of this, I implement union and
: intersection (and other operators, such as my equivalent of the join
: operator).
:
: Since I expect the tables to have a few million rows, the input for
: the union and intersection operators might be on the order of 100,000
: elements.

Does Bit::Vector look like it might be helpful for those operations?



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order
On 22 Aug 2000 17:52:16 GMT,

Quote:

> Simplified description of my application follows: I have something
> similar to a database system.  The basic operators can do table lookup
> based on simple conditions, such as `ATTRIBUTE OPERATOR VALUE', for a
> few predefined operators.  The result of such a simple condition is a
> list of row identifiers.  On top of this, I implement union and
> intersection (and other operators, such as my equivalent of the join
> operator).

Maybe I'm missing something, but is there a reason that you are not
just using a backend like Postgresql or MySQL? Of course, all I know
of the problem is what you have told us here, but it sounds a bit like
you're doing standard relational database things.. Not that there is
anything wrong with wanting to do that, just that if you are writing
this for production, maybe a backend would be a bit faster and well..
easier on the maintenance.

Martien
--
Martien Verbruggen              | My friend has a baby. I'm writing
Interactive Media Division      | down all the noises the baby makes
Commercial Dynamics Pty. Ltd.   | so later I can ask him what he meant
NSW, Australia                  | - Steven Wright



Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

> I have a class Foo which has a meaningful less-than relationship.  I
> also have a class Bar which wants to maintain a list of Foo items in
> sorted order.

> What is a good idiom to do this?

> Right now, there is a method Foo::compare such that $a->compare($b)
> returns -1, 0, or 1 if $a is less than, equal to, or greater than $b.
> (Based on Foo::compare, I have the obvious one-liners for $a->less($b)
> and so on.)

> And the Bar class offers an `add' method such that $bar->add($foo)
> walks the internal array, comparing Foo items along the way, and
> inserts the new item into the right spot in the array.  (Actually, the
> Bar::add method accepts a list of items and there is an internal
> helper method Bar::_add1 which does a single item.)

> Specific questions:

> (1) Writing $a->compare($b) or $a->less($b) strikes me as a
>     non-intuitive way to compare two items.  compare($a, $b) or
>     less($a, $b) would be better.  But I suppose that doesn't work
>     right.  Or does it?  And Foo::compare($a, $b) doesn't play well
>     with inheritance.  So?

how about a syntactic-sugar funcion like this

sub main::compare {

  if ( UNIVERSAL::can($a,'compare') )
  {
    return $a->compare($b);
  }
  else
  {
    return $a <=> $b;
  }

Quote:
}    
> (2) Do I want to ``use overload 'cmp' => \&Foo::compare;'' in Foo.pm?
>     I'm scared of overloading.

I would avoid overloading, personally

Quote:
> (3) On a lower level: how do I insert an item into a sorted array?  Is
>     manually walking the array the right answer?  Or maybe I just want
>     to do this:


>     Is this reasonably fast in the presorted case?

very, very bad.

Quote:

> (4) On a still lower level: how do I insert an object into an array at
>     a known index?  Right now, I do this:


I would use


Quote:

> (5) Can the search and insert steps be combined into one clever map or
>     grep expression, perhaps?  Right now, I search then insert as
>     follows:

>             $i++;
>         }


how about a binary search  like

# find_a_gt_b($array_ref, $a)
#
# -- $array_ref is an array reference to a list of elements.
# -- $a is the element you want to insert

# -- this function returns the index of the element of the array
# -- ($array_ref) who's value is greater than $a

# -- this function uses a bisection method of finding
# -- the element

sub find_a_gt_b {

  my ($M, $N) = (0, $#{$ar});

  my $n   = 0;
  my $D   = '00'; # -- zero but true for first time.

  while ($D && $M<=$N)
  {
    $n     = $M + $D;
    my $p  = $a <=> $ar->[$n];

    if ($p > 0)
    {
      $M = $n;
      $n++;
    }
    else
    {
      $N = $n-1;
      last unless $p;
    }
    $D     = ($N-$M+1) >> 1;
  }

  return $n;

Quote:
}

loop {

 }

Quote:
> I'm using Perl 5.6.0 on Unix, if that matters.
> kai
> --
> I like BOTH kinds of music.

--
Doug Perham                                          o{..}o    

WorldGate Communications, Inc.                        (______)\
                                                      / \  / \  


Sat, 08 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order

Quote:

> Maybe I'm missing something, but is there a reason that you are not
> just using a backend like Postgresql or MySQL? Of course, all I know
> of the problem is what you have told us here, but it sounds a bit
> like you're doing standard relational database things..

Yes, that's what I said.  But I was simplifying things because I
assume that many people know about relational databases.

Actually, it's a system for storing XML documents, with weighting.
(This means that I'm storing words, and for each words I have a list
of occurrences.  And each occurrence comes with a weight, as well as
a pointer into the XML tree which tells me where that word occurred.
Relational algebra doesn't quite cut it for this application, but the
subject of my research is to figure out what is the XML analog of
relational algebra.)

kai
--
I like BOTH kinds of music.



Sun, 09 Feb 2003 03:00:00 GMT  
 YAPSQ (Perl Style Question): sorting and maintaining sorted order
On 23 Aug 2000 09:40:24 GMT,

Quote:

> > Maybe I'm missing something, but is there a reason that you are not
> > just using a backend like Postgresql or MySQL? Of course, all I know
> > of the problem is what you have told us here, but it sounds a bit
> > like you're doing standard relational database things..

> Yes, that's what I said.  But I was simplifying things because I
> assume that many people know about relational databases.

> Actually, it's a system for storing XML documents, with weighting.
> (This means that I'm storing words, and for each words I have a list
> of occurrences.  And each occurrence comes with a weight, as well as
> a pointer into the XML tree which tells me where that word occurred.
> Relational algebra doesn't quite cut it for this application, but the
> subject of my research is to figure out what is the XML analog of
> relational algebra.)

Ah. Structured document indexing. Relational databases would indeed not
do this well. Object databases come closer, but would probably still be
much slower than anything handrolled.

Martien
--
Martien Verbruggen              |
Interactive Media Division      | Begin at the beginning and go on
Commercial Dynamics Pty. Ltd.   | till you come to the end; then stop.
NSW, Australia                  |



Sun, 09 Feb 2003 03:00:00 GMT  
 
 [ 15 post ] 

 Relevant Pages 

1. SQL Update Help Required

2. Two questions

3. A question of Style, was: SPUG: Sort an array question

4. re-sorting a sorted list (and sorting by date)

5. Sorting question (Sort of..:-)

6. ST-a-gogo: question about sort-within-sort-....

7. Sort order in Perl

8. Perl sort different from unix sort

9. Sort Order Help Needed

10. DBD::AnyData alphabetic sort with ORDER BY

11. Changing sort order on the fly...

12. Sorting values such as 1.1, 1.1.1, 2.1 into order

 

 
Powered by phpBB® Forum Software