ruby-htmltools, a tree-building HTML parser 

I have written a tree-building HTML parser that is handy for doing
analysis, repair, or transformations of HTML text.

The RAA entry is
http://www.*-*-*.com/

and you can download it at:
http://www.*-*-*.com/

It requires the html-parser library, available from
http://www.*-*-*.com/ ~nahi/Ruby/html-parser/html-parser-19990912p2.ta...

I'd like to hear your feedback. One thing that I'd like to add is some
kind of path matching like XPath or CSS selectors.

However, right now you can write:

parser = HTMLTreeParser.new()
parser.parse_file_named('whatever.html')
parser.tree.detect { |e| e.tag == 'table' }

and so on to get to individual elements. And, of course, you can use
each() to traverse the structure.
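
To make the detect/each idea concrete, here is a minimal sketch of how such a tree could behave. It does not use ruby-htmltools itself: SketchElement and sample_tree are hypothetical stand-ins invented for illustration, and the sketch assumes the tree mixes in Enumerable, which is what would let detect() work on top of each().

```ruby
# Hypothetical stand-in for a parsed HTML element.
# This is NOT the ruby-htmltools class; it only illustrates the idea.
class SketchElement
  include Enumerable # supplies detect, select, map, ... on top of each

  attr_reader :tag, :children

  def initialize(tag, children = [])
    @tag = tag
    @children = children
  end

  # Depth-first, pre-order traversal: yield this element,
  # then every descendant in document order.
  def each(&block)
    return enum_for(:each) unless block

    block.call(self)
    children.each { |child| child.each(&block) }
    self
  end
end

# Builds a small hand-made tree (assumed shape, for illustration only).
def sample_tree
  SketchElement.new('html', [
    SketchElement.new('head', [SketchElement.new('title')]),
    SketchElement.new('body', [
      SketchElement.new('p'),
      SketchElement.new('table', [
        SketchElement.new('tr', [SketchElement.new('td')])
      ])
    ])
  ])
end

tree = sample_tree

# First element whose tag is 'table', as in the snippet above.
first_table = tree.detect { |e| e.tag == 'table' }
puts first_table.tag

# Walk the whole structure with each().
tree.each { |e| puts e.tag }
```

Because everything hangs off each(), the other Enumerable methods such as select and map come along without extra code in this sketch.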

Thanks,
--
Ned Konz
http://www.*-*-*.com/
GPG key ID: BEEA7EFE



Thu, 11 Nov 2004 03:55:24 GMT  