HTML Table to Array 

Does anyone know of a module that will rip apart an HTML table and return the
data in an array or a similar structure?

John




Wed, 23 Feb 2000 03:00:00 GMT  
 HTML Table to Array

Quote:

> Does anyone know of a module that will rip apart an HTML table and
> return the data in an array, etc.

I don't think such a module exists. I've searched through many web search
engines and tried various snippets over the past months, and the best I have
come up with is:
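#assume you want to parse the HTML table in $c
$c=~s/\r//g;                 #remove DOS ^M (carriage return), if any
$c=~s/\n//g;                 #remove potential misplaced newlines
$c=~s/<\/TD>/<\/TD>\n/gi;    #force a newline at each field
@r=split(/\n/,$c);           #one element per field (assumed step; not shown in the post)

for ($i=0;$i<=$#r;$i++) {
 $r[$i]=~s/<.*?>//g;         #remove the <> codes
 print "[r-$i]\t$r[$i]\n";
}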

I don't know how to generalise the operation with table fields. Some
HTML pages use lower case </td><td> pairs. Some have formatting codes in
the pairs like <td align=right><font size=-1>. I need to customise the
parser for different pages.


Fri, 25 Feb 2000 03:00:00 GMT  
 HTML Table to Array

Quote:

>#assume you want to parse the HTML table in $c
>$c=~s/^M//gi;                #remove dos ^M, if any
>$c=~s/\n//gi;                #remove potential misplaced newline

You might try a single tr/// statement instead:

        $c=~tr/\r\n//d;
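
As a quick check of the tr/// suggestion, here is a minimal sketch. The sample string in $c is made up for illustration and is not from the thread. With the d modifier, tr deletes every listed character and returns how many characters it matched:

```perl
use strict;
use warnings;

# Hypothetical sample input with DOS line endings (not from the thread).
my $c = "<TR><TD>a</TD>\r\n<TD>b</TD></TR>\r\n";

# Delete every carriage return and newline in a single pass.
# tr/// returns the number of characters it matched.
my $removed = ($c =~ tr/\r\n//d);

print "removed $removed characters\n";
print "$c\n";
```

Since tr/// works on a fixed character list rather than a regex, it needs no g or i options and is typically faster than two s/// passes.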

Quote:
>$c=~s/<\/TD>/<\/TD>\n/gi;    #force a newline at each field

...
>I don't know how to generalise the operation with table fields. Some
>HTML pages use lower case </td><td> pairs. Some have formatting codes in
>the pairs like <td align=right><font size=-1>. I need to customise the
>parser for different pages.

You can do case-insensitive searching by using the "i" option. To match td
tags with or without attributes, put the "\b" word boundary right after td:


I really ought to test this before posting... But you get the idea, I
think.

You can strip the formatting tags afterwards.
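
The idea described above (a case-insensitive match with "\b" after the tag name, then stripping the remaining formatting tags) might be sketched roughly as follows. This is not the snippet from the post, which did not survive; the helper name table_to_rows, the sample HTML, and the decision to treat th cells like td cells are all assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: returns one array ref per
# table row, each holding the text of that row's cells.
sub table_to_rows {
    my ($html) = @_;
    $html =~ tr/\r\n//d;                      # drop line breaks first
    my @rows;
    # <tr\b matches <tr>, <TR> and <tr bgcolor=...>, but not <track>
    while ($html =~ m{<tr\b[^>]*>(.*?)</tr\s*>}gis) {
        my $row = $1;
        my @cells;
        # Same idea for cells; \b keeps <th from matching <thead>.
        # Treating th like td is an assumption.
        while ($row =~ m{<t[dh]\b[^>]*>(.*?)</t[dh]\s*>}gis) {
            my $cell = $1;
            $cell =~ s/<[^>]*>//g;            # strip formatting tags afterwards
            $cell =~ s/^\s+|\s+$//g;          # trim surrounding whitespace
            push @cells, $cell;
        }
        push @rows, \@cells;
    }
    return @rows;
}

# Made-up sample mixing upper and lower case tags and formatting codes.
my $html = '<table><TR><td align=right><font size=-1>1</font></td><TD>two</TD></TR>'
         . '<tr><td>3</td><td> four </td></tr></table>';

my @rows = table_to_rows($html);
print join("\t", @$_), "\n" for @rows;
```

A regex approach like this only copes with simple, well-closed markup. Nested tables, or cells without a closing </td>, will give wrong results, so pages like that still need a real HTML parser.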

        Bart.



Fri, 25 Feb 2000 03:00:00 GMT  
 