comparison of 2 SGML documents 

I've got a client who wants to compare two 70,000-line SGML
documents, with output to paper:

  common    => pass, unchanged
  deletions => strike-through
  additions => underline (or some other font)

Suggestions for modules or existing utilities?

Thanks,
Michael

--
Michael R. Wolf
    All mammals learn by playing!



Tue, 03 Aug 2004 15:36:06 GMT  

Quote:
>I've got a client that wants to compare two 70,000 line SGML
>documents with output to paper

>  common    => pass, unchanged
>  deletions => strike-through
>  additions => underline (or some other font)

Why not diff, or one of the graphical diff programs with colour
highlighting?  I think I would use diff -u and then munge the output
with Perl.  Of course diff is not SGML-aware, but it should do a
reasonable job if both SGML files are laid out consistently as text,
with line breaks in the same places.
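
As a rough illustration of the "diff -u, then munge" idea, here is a minimal sketch. It is in Python rather than Perl, and it uses the standard difflib module as a stand-in for running diff -u; the function name mark_lines and the wdiff-style [-...-] / {+...+} markers are placeholders for whatever strike-through and underline markup the print step needs.

```python
import difflib
from itertools import islice

def mark_lines(old_lines, new_lines):
    """Merge two line lists, marking deleted and added lines."""
    if old_lines == new_lines:
        return list(old_lines)  # no diff output at all in this case
    # Ask for enough context that unchanged lines are all kept.
    context = max(len(old_lines), len(new_lines))
    diff = difflib.unified_diff(old_lines, new_lines, n=context, lineterm="")
    marked = []
    for line in islice(diff, 2, None):  # skip the '---' and '+++' headers
        if line.startswith("@@"):
            continue  # hunk marker, not document content
        tag, text = line[0], line[1:]
        if tag == "-":
            marked.append("[-" + text + "-]")  # deletion: strike-through
        elif tag == "+":
            marked.append("{+" + text + "+}")  # addition: underline
        else:
            marked.append(text)  # common: pass unchanged
    return marked

old = ["<p>", "alpha", "beta", "</p>"]
new = ["<p>", "alpha", "gamma", "</p>"]
print("\n".join(mark_lines(old, new)))
```

This works on whole lines only, so a one-word change marks the entire line as deleted and re-added.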

I believe there is an Algorithm::Diff on CPAN if you insist on a
pure-Perl solution.  To do a really thorough job you could use
HTML::TokeParser to convert the SGML files to lists of tokens, use
Algorithm::Diff on those somehow (it claims to compare 'any two lists
of things') and then find some output format.  It depends on whether
the changes are mostly textual changes to element content, or mostly
structural changes adding and removing elements.
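
A minimal sketch of the token-list approach, assuming a naive regex tokenizer is good enough. It is written in Python with difflib.SequenceMatcher standing in for HTML::TokeParser plus Algorithm::Diff; the names tokenize and token_diff and the [-...-] / {+...+} markers are illustrative only. The regex does not handle '>' inside attribute values or SGML tag minimization.

```python
import re
from difflib import SequenceMatcher

# A token is either a whole tag or a run of non-space text between tags.
TOKEN = re.compile(r"<[^>]*>|[^<\s]+")

def tokenize(sgml):
    """Split markup into a flat list of tag tokens and word tokens."""
    return TOKEN.findall(sgml)

def token_diff(old, new):
    """Diff two documents token by token and mark the changes inline."""
    a, b = tokenize(old), tokenize(new)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(a[i1:i2])
        if op in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if op in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)

print(token_diff("<p>The quick fox</p>", "<p>The slow fox</p>"))
# prints: <p> The [-quick-] {+slow+} fox </p>
```

Because tags and words are separate tokens, textual changes and structural changes both show up in the same output; the original whitespace is not preserved.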

--

Finger for PGP key



Wed, 04 Aug 2004 16:49:38 GMT  

Quote:
> I've got a client that wants to compare two 70,000 line SGML
> documents with output to paper

>   common    => pass, unchanged
>   deletions => strike-through
>   additions => underline (or some other font)

> Suggestions for modules or existing utilities?

You could try parsing with XML::LibXML's HTML parser (which is in
effect an SGML parser), and push the resulting XML through
XML::SemanticDiff. Or find some other way to make it XML, and do the
same.
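
To give a feel for what a path-based comparison reports, here is a small sketch in Python. It is not XML::LibXML or XML::SemanticDiff; it only mimics the general idea with the standard html.parser module, and the class PathText and function semantic_diff are made-up names. It assumes every element has an explicit end tag and is properly nested.

```python
from html.parser import HTMLParser

class PathText(HTMLParser):
    """Record the text found at each element path, e.g. /doc[1]/p[2]."""
    def __init__(self):
        super().__init__()
        self.stack = []   # path steps of the currently open elements
        self.counts = {}  # (parent path, tag) -> siblings seen so far
        self.texts = {}   # element path -> concatenated text

    def handle_starttag(self, tag, attrs):
        key = ("/".join(self.stack), tag)
        self.counts[key] = self.counts.get(key, 0) + 1
        self.stack.append(f"{tag}[{self.counts[key]}]")
        self.texts["/" + "/".join(self.stack)] = ""

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()  # assumes well-nested, explicit end tags

    def handle_data(self, data):
        if self.stack:
            self.texts["/" + "/".join(self.stack)] += data.strip()

def paths(markup):
    parser = PathText()
    parser.feed(markup)
    parser.close()
    return parser.texts

def semantic_diff(old, new):
    """List paths that were removed, added, or whose text changed."""
    a, b = paths(old), paths(new)
    report = []
    for path in sorted(a.keys() | b.keys()):
        if path not in b:
            report.append(("removed", path))
        elif path not in a:
            report.append(("added", path))
        elif a[path] != b[path]:
            report.append(("changed", path))
    return report

print(semantic_diff("<doc><p>one</p><p>two</p></doc>",
                    "<doc><p>one</p><p>2</p><note>n</note></doc>"))
```

A report like this says where the documents differ rather than producing marked-up text, so it would still need a formatting step for paper output.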

Matt.



Fri, 06 Aug 2004 21:42:22 GMT  
Have you already looked at the `sgmldiff' utility?
It's part of the DocBook tools suite, and it compares two SGML
documents by tag structure only, disregarding differences in the text.
It's a short Perl script, so you can use it as a starting point.
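
For illustration only, a structure-only comparison can be sketched as below. This is Python, not the sgmldiff script itself, and it makes no claim about how sgmldiff works internally; tag_outline and structure_changes are hypothetical names, and lowercasing element names assumes they are case-insensitive in the documents being compared.

```python
import re
from difflib import SequenceMatcher

# Capture the optional '/' and the element name; attributes are ignored.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>")

def tag_outline(sgml):
    """Reduce a document to its sequence of start and end tag names."""
    return [slash + name.lower() for slash, name in TAG.findall(sgml)]

def structure_changes(old, new):
    """Report only the differences between the two tag sequences."""
    a, b = tag_outline(old), tag_outline(new)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [(op, a[i1:i2], b[j1:j2])
            for op, i1, i2, j1, j2 in matcher.get_opcodes()
            if op != "equal"]

old = "<sect><title>A</title><para>x</para></sect>"
new = "<sect><title>B</title><para>y</para><para>z</para></sect>"
print(structure_changes(old, new))
# prints: [('insert', [], ['para', '/para'])]
```

The changed title and paragraph text are ignored; only the extra para element is reported.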

[snip]

Quote:
> I've got a client that wants to compare two 70,000 line SGML
> documents with output to paper

>   common    => pass, unchanged
>   deletions => strike-through
>   additions => underline (or some other font)

> Suggestions for modules or existing utilities?

[snip]

HTH,
        Vassilii



Sat, 07 Aug 2004 22:44:59 GMT  
 