Two bugs in HTML-Tree modules? 

Here is the first problem, which I previously posted in March:

HTML-Tree has a problem with links of the form "<a...><h2>...</h2></a>". The
following program prints:

#<A HREF="REF"> Something else </A>#
#<A HREF="REF"> <I>blah</I> </A>#
#<A HREF="REF"> </A>#

not

#<A HREF="REF"> Something else </A>#
#<A HREF="REF"> <I>blah</I> </A>#
#<A HREF="REF"> <H2>Something</H2> </A>#

====================================

#!/usr/bin/perl -w

use strict;

my $html =<<EOT;
  <A href="REF"> Something else </A>
  <A href="REF"> <i>blah</i> </A>
  <A href="REF"> <H2>Something</H2> </A>
EOT

require HTML::TreeBuilder;
my $tree = HTML::TreeBuilder->new;
$tree->parse($html);

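# The loop header was lost from the post; extract_links returns a
# reference to an array of [link, element, ...] entries.
for my $linkpair (@{ $tree->extract_links('a') })
{
  # Extract the link from the HTML element.
  my $elem = ${$linkpair}[1];
  my $link = $elem->as_HTML();
  chomp $link;

  print "#$link#\n";
}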
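
A rough diagnostic sketch for the first problem: rather than going through as_HTML, look up the H2 directly and print the tag of whatever element ended up as its parent. This assumes an HTML-Tree release that provides look_down; the variable names are only illustrative, and no particular result is implied.

```perl
#!/usr/bin/perl -w

use strict;
use HTML::TreeBuilder;

my $html = '<A href="REF"> <H2>Something</H2> </A>';

my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;    # signal end of input so the tree is complete

# Find the first H2 anywhere in the tree (tag names are stored lowercase).
my $h2 = $tree->look_down(_tag => 'h2');

# Record the tag of the element that ended up as the H2's parent.
my $parent_tag = defined $h2 ? $h2->parent->tag : '(no H2 found)';
print "H2 parent: $parent_tag\n";

$tree->delete;    # free the tree
```

If the parent is not "a", the parser moved the H2 out of the link while building the tree, which would explain the empty third line of output.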

----------------------------------------------------------------------

The second problem is that as_HTML translates ampersands in URLs into &amp;,
which is incorrect. The following program prints:

#<A HREF="http://url/with&amp;ampersand"> Something </A>#
#<A HREF=" http://www.*-*-*.com/ ;> Something else </A>#

when it should print:

#<A HREF="http://url/with&ampersand"> Something </A>#
#<A HREF=" http://www.*-*-*.com/ ;> Something else </A>#

====================================

#!/usr/bin/perl -w

use strict;

my $html =<<EOT;
  <A href="http://url/with\&ampersand"> Something </A>
  <A href=" http://www.*-*-*.com/ ;> Something else </A>
EOT

require HTML::TreeBuilder;
my $tree = HTML::TreeBuilder->new;
$tree->parse($html);

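# The loop header was lost from the post; extract_links returns a
# reference to an array of [link, element, ...] entries.
for my $linkpair (@{ $tree->extract_links('a') })
{
  # Extract the link from the HTML element.
  my $elem = ${$linkpair}[1];
  my $link = $elem->as_HTML();
  chomp $link;

  print "#$link#\n";
}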
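
If what is needed is the URL itself rather than serialized markup, one possible workaround is to read the attribute with attr() instead of calling as_HTML. The sketch below is only an illustration: the URL is a made-up example (not the one from the program above), and @urls is a hypothetical helper variable.

```perl
#!/usr/bin/perl -w

use strict;
use HTML::TreeBuilder;

# Hypothetical URL with a bare ampersand in the query string.
my $html = '<A href="http://url/page?a=1&b=2"> Something </A>';

my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;    # signal end of input so the tree is complete

my @urls;
for my $linkpair (@{ $tree->extract_links('a') })
{
  my $elem = ${$linkpair}[1];
  # attr() returns the stored attribute value, not serialized HTML,
  # so no entity encoding is applied to it.
  push @urls, $elem->attr('href');
}

print "#$_#\n" for @urls;

$tree->delete;    # free the tree
```

The same value is also available as ${$linkpair}[0], the first entry of each item returned by extract_links.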

--
Remove "microsoft." from my address to reply via email. Sorry, but most
of my junkmail comes from addresses harvested from the newsgroups.

I'm also considerate enough to read replies on news. :)



Sat, 22 Jun 2002 03:00:00 GMT  
 