comparing objects for structural equivalence 
Author Message
 comparing objects for structural equivalence

How can I compare two objects for structural equivalence?

I already have a sort-of-answer.  I used Data::Dumper to dump a text
version and then compared two text versions.  The only problem is that
Data::Dumper doesn't sort hash keys, so the hash entries can come out in
different orders in the text dump.  I modified Data::Dumper to sort the
hash keys and that works (at least in the case I'm trying it in).  But,
is there any other way using the standard Perl modules and without
creating my own customized copy of Data::Dumper?  My objects only
contain hash/array references and scalars, so I'm not worried about
comparing functions and such things.

Thanks,
John Wiersba

Sent via Deja.com http://www.*-*-*.com/
Before you buy.



Mon, 24 Feb 2003 00:37:47 GMT  
 comparing objects for structural equivalence

Quote:

> How can I compare two objects for structural equivalence?
> I already have a sort-of-answer.  ... I modified Data::Dumper to sort
> the hash keys and that works (at least in the case I'm trying it in).

That seems like a nice lazy solution, if you're not in a time/memory-critical
situation. But you ought to inherit from Data::Dumper and just override the
appropriate methods.

Quote:
> But, is there any other way using the standard Perl modules and
> without creating my own customized copy of Data::Dumper?

There is a Data::Compare module that obviously intends to do what you want, but
it doesn't handle hashes correctly and is very immature. I have code from a
collection of utilities I've written that works OK, but I haven't time to
module-ize it. I'll send it to you if you email me. If you find any bugs or make
any improvements, let me know please.

--




Mon, 24 Feb 2003 05:02:18 GMT  
 comparing objects for structural equivalence


Quote:

>> How can I compare two objects for structural equivalence?
>> I already have a sort-of-answer.  ... I modified Data::Dumper to sort
>> the hash keys and that works (at least in the case I'm trying it in).

>That seems like a nice lazy solution, if you're not in a
>time/memory-critical situation. But you ought to inherit from
>Data::Dumper and just override the appropriate methods.

>> But, is there any other way using the standard Perl modules and
>> without creating my own customized copy of Data::Dumper?

>There is a Data::Compare module that obviously intends to do what you
>want, but it doesn't handle hashes correctly and is very immature. I
>have code from a collection of utilities I've written that works OK, but
>I haven't time to module-ize it. I'll send it to you if you email me. If
>you find any bugs or make any improvements, let me know please.

Maybe I don't understand the problem exactly, but pack/unpack yield/
decompose a string from/to fields (of a structure). Those strings can
be used very well for hash keys and they can be sorted very well. It'll
sure beat Data::Dumper performance ...

--
H.Merijn Brand
using perl5.005.03 and 5.6.0 on HP-UX 10.20, HP-UX 11.00, AIX 4.2, AIX 4.3,
  DEC OSF/1 4.0 and WinNT 4.0 SP-6a,  often with Tk800.022 and/or DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/
Member of Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)



Mon, 24 Feb 2003 18:03:37 GMT  
 comparing objects for structural equivalence


Quote:
> Maybe I don't understand the problem exactly, but pack/unpack yield/
> decompose a string from/to fields (of a structure). Those strings can
> be used very well for hash keys and they can be sorted very well.
It'll
> sure beat Data::Dumper performance ...

The problem:  detect if two objects (in my case hashes) are recursively
structurally equivalent.  I.e. if they contain the same data.  Since I
need this for a regression test, it's also nice to have a printed form
of the objects in case they are not identical, to help diagnose what the
problem is.

Currently, I'm doing something like this:
   use Data::Dumper;
   $Data::Dumper::Terse  = 1;
   $Data::Dumper::Indent = 1;
   my $dump1 = Dumper($object1);
   my $dump2 = Dumper($object2);
   if ($dump1 ne $dump2) {
      print "Expected\n$dump1\n\nbut got\n$dump2\n";
   }

This works great for me except for the fact that Data::Dumper() doesn't
sort hash keys before dumping them.  Since hash keys in the same chain
are not necessarily in the same order in two equivalent objects (due to
the order of insertion into the hash), the two dumps can be different
even when the two objects contain exactly the same data.  (I'm guessing
about perl's implementation of hashes.)

--
John Wiersba

Sent via Deja.com http://www.deja.com/
Before you buy.



Mon, 24 Feb 2003 23:00:00 GMT  
 comparing objects for structural equivalence

Quote:

> This works great for me except for the fact that Data::Dumper() doesn't
> sort hash keys before dumping them.  Since hash keys in the same chain
> are not necessarily in the same order in two equivalent objects (due to
> the order of insertion into the hash), the two dumps can be different
> even when the two objects contain exactly the same data.

One possible way to solve the problem above is to ensure that the
order of values in the hash chain is deterministic.  This can be done
by sorting them based on the key, or alternatively on the full hash
value and then the key.  (The latter would permit a small optimization
when searching.)  I do not think this is doable using Tie (short of
writing your own extensible hash package).

I wonder if this "deterministic hash order" is a useful enough feature
to consider for perl 6.  Compared to the current approach it would
impose a performance hit on every insert (but none on a delete).  The
benefit would be the usual one that deterministic behavior is easier
to debug, easier to test with regression tests, etc.  An option for
this becomes even more important as multi-threading becomes more
widely used in Perl programs.

--
Hopefully helpfully yours,
Steve
--

Fidelity Investments   82 Devonshire St. R24D    Boston MA 02109
There is nothing so practical as a good theory.  Comments are by me,
not Fidelity Investments, its subsidiaries or affiliates.



Tue, 25 Feb 2003 01:37:54 GMT  
 comparing objects for structural equivalence

Quote:

> One possible way to solve the problem above is to ensure that the
> order of values in the hash chain is deterministic.

Maybe each() and keys() could take an optional parameter which would
specify that hash chains be returned in some canonical order.  I'm
assuming that two hashs with the same number of keys have the same
number of hash buckets.  (Is this true?)  That way there would be no
overhead unless this feature was used.

--
John Wiersba

Sent via Deja.com http://www.deja.com/
Before you buy.



Tue, 25 Feb 2003 03:47:44 GMT  
 comparing objects for structural equivalence

Quote:


> > I modified Data::Dumper to sort
> > the hash keys and that works (at least in the case I'm trying it
> > in).

> That seems like a nice lazy solution, if you're not in a
> time/memory-critical situation. But you ought to inherit from
> Data::Dumper and just override the appropriate methods.

Unfortunately, the modification I made to Data::Dumper is in a 211-line
subroutine.  I modified sub _dump() on line 311 in my copy of
Data::Dumper version 2.101.  To see the location where I modified it,
search for "each".  There is another call to each() on line 94 in Seen()
which may or may not need to be modified, too (I'm not yet sure exactly
what Seen() does).  So, replacing a 211-line subroutine with a slightly
modified copy of it, doesn't make much sense.

Quote:
> > But, is there any other way using the standard Perl modules and
> > without creating my own customized copy of Data::Dumper?

> There is a Data::Compare module that obviously intends to do what you
> want, but it doesn't handle hashes correctly and is very immature. I
> have code from a collection of utilities I've written that works OK, but
> I haven't time to module-ize it. I'll send it to you if you email me. If
> you find any bugs or make any improvements, let me know please.

I'll send you an email to get your code.  I am also looking at these
other Data modules:
   Compare
   Dump
   DumpXML
   PropertyList
   Walker
to see if they have any simple code I could rip off.  Maybe I can fix
Data::Compare to work correctly on hashes?  Why do you say it's
comparison of hashes is broken?

I'm using my comparison function in a regression test for a module I've
written, so it's actually useful to have a text dump of the data
structures being compared.  If they're different, then I can print the
expected structure and the actual structure and compare them with an
editor to find the bug.

--
John Wiersba

Sent via Deja.com http://www.deja.com/
Before you buy.



Mon, 24 Feb 2003 23:37:27 GMT  
 comparing objects for structural equivalence

Quote:

> Maybe each() and keys() could take an optional parameter which would
> specify that hash chains be returned in some canonical order.  I'm
> assuming that two hashs with the same number of keys have the same
> number of hash buckets.  (Is this true?)

It's not even true for two hashes with the same keys.  For example:

        for (1..100) { $a{$_} = 1; }
        for (2..100) { delete $a{$_}; }

        %b = %a;

        print scalar %a, ", ", scalar %b, "\n";

reports that %a has 128 buckets; %b has only 8.

-- HansM



Tue, 25 Feb 2003 08:01:14 GMT  
 comparing objects for structural equivalence


Quote:
> Maybe each() and keys() could take an optional parameter which would
> specify that hash chains be returned in some canonical order.  I'm
> assuming that two hashs with the same number of keys have the same
> number of hash buckets.  (Is this true?)

No that isn't necessarily true - it will depend on the sequence
of key insertions and removals for the hash I believe.

Tom

--

http://www.compton.nu/
...Please update your programs.



Tue, 25 Feb 2003 05:31:37 GMT  
 comparing objects for structural equivalence
[A complimentary Cc of this posting was sent to Tom Hughes


Quote:
> > Maybe each() and keys() could take an optional parameter which would
> > specify that hash chains be returned in some canonical order.  I'm
> > assuming that two hashs with the same number of keys have the same
> > number of hash buckets.  (Is this true?)

> No that isn't necessarily true - it will depend on the sequence
> of key insertions and removals for the hash I believe.

While true, this is not relevant.  All the iteration engine should do
is traverse the buckets not in the numeric order, but in the order of
"inverted binary": 0b000, 0b100, 0b010, 0b110, 0b001, 0b101, 0b011, 0b111
for an 8-bucket hash.  [Anyone knows an efficient algorithm for finding
the next elt of this sequence?]

Sorting the chain for all the buckets (and keeping a bit in the flags
"this hash was used with each/keys/values, so needs chains sorted")
would solve the other half of the problem.

Ilya



Tue, 25 Feb 2003 14:02:56 GMT  
 comparing objects for structural equivalence

Quote:

> All the iteration engine should do
> is traverse the buckets not in the numeric order, but in the order of
> "inverted binary": 0b000, 0b100, 0b010, 0b110, 0b001, 0b101, 0b011, 0b111
> for an 8-bucket hash.  [Anyone knows an efficient algorithm for finding
> the next elt of this sequence?]

That reminds me of the Gray sequence --

        000 100 110 010 011 001 101 111

-- which is characterized by having only one bit
change from one item to the next.

--
John Porter

        We're building the house of the future together.



Tue, 25 Feb 2003 20:20:58 GMT  
 
 [ 42 post ]  Go to page: [1] [2] [3]

 Relevant Pages 

1. directory datetime

2. Other simple question II Integer Range

3. looping back to a specfic array element

4. Closures, Object-Oriented Equivalence, New version of Perl and TCL

5. Closures, Object-Oriented Equivalence, New version of Perl and TCL

6. compare object identity with "use overload".

7. newbie: comparing two arrays with multiple objects

8. newbie: comparing scalar to any object in array

9. Comparing objects

10. Delphi 2.0 Update - Problems with the new BDE that comes with the update

11. Pascal is weak!!!

12. question: how to use time?

 

 
Powered by phpBB® Forum Software