parser problem 

I am trying to write a parser in C for HTML (XML-style markup).  First I
scan through the document to verify that all the tags are correct and
properly nested.  Then I want to store the HTML document in a tree data
structure, and that is where I am running into problems.  If anyone has
any tips or suggestions on how the data structure should look, please
let me know.  Thanks

JAC



Sun, 22 Apr 2001 03:00:00 GMT  
hi,
        if this isn't a speed-critical application, use Perl - a far nicer language
for such things (sorry if i offended anyone).

parsers, compilers, whatever you want to call them, are basically "state
machines".
        eh?  quite simple really: an "HTML" tag acts as a trigger into another
expectant state.  for example, after an open "HTML" tag (e.g. <A NAME="here">)
you would expect either another open tag or a matching closing tag (e.g. </A>).
what you need to do is keep track of nested tags.  try using a stack-like
structure (e.g. some sort of linked list); it doesn't really matter what, as
long as it can be made to operate in a "last in, first out" manner.
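
a linked list used as a "last in, first out" stack could look roughly like the
sketch below.  the names tag_node, tag_push and tag_pop are made up for
illustration, and storing a copy of each tag name is just one possible choice.

```c
#include <stdlib.h>
#include <string.h>

/* One stack entry: a private copy of a tag name plus a link to the entry
 * below it.  (Hypothetical names, chosen only for this sketch.) */
struct tag_node {
    char *name;
    struct tag_node *next;
};

/* Push a copy of name onto the stack.
 * Returns 0 on success, -1 if memory could not be allocated. */
int tag_push(struct tag_node **top, const char *name)
{
    struct tag_node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->name = malloc(strlen(name) + 1);
    if (n->name == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->name, name);
    n->next = *top;   /* new entry sits on top of the old top */
    *top = n;
    return 0;
}

/* Pop the most recently pushed name.
 * Returns NULL if the stack is empty; otherwise the caller must free()
 * the returned string. */
char *tag_pop(struct tag_node **top)
{
    struct tag_node *n = *top;
    char *name;

    if (n == NULL)
        return NULL;
    name = n->name;
    *top = n->next;
    free(n);
    return name;
}
```

start with an empty stack (struct tag_node *top = NULL;) and pass &top to both
functions.  names come back in the reverse of the order they went in.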

        as i see it, your program would "push" (i.e. put onto the stack) each
open "HTML" tag, then "pop" (i.e. take off the stack) on each close "HTML" tag.
if the two tags "match" (i.e. are an appropriate pair), continue; if they don't,
go eeek (i.e. report a nesting error).  anything still left on the stack at the
end of the document is a tag that was never closed.
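
the push/pop check could be sketched in C like this.  it is only a rough
illustration under some loud assumptions: tags_balanced, read_name, MAX_DEPTH
and MAX_NAME are invented names, a fixed-size array stands in for the stack,
tag names are compared case-insensitively, and every open tag is expected to
have a close tag (so no comments, no <BR>-style tags without a close, and no
'<' or '>' inside text or attribute values).

```c
#include <ctype.h>
#include <string.h>

#define MAX_DEPTH 64   /* assumed limit on nesting depth */
#define MAX_NAME  32   /* assumed limit on tag name length, incl. '\0' */

/* Copy the tag name starting at p into out, upper-cased and truncated to
 * MAX_NAME - 1 characters.  Returns a pointer just past the name. */
static const char *read_name(const char *p, char *out)
{
    size_t i = 0;
    while (isalnum((unsigned char)*p)) {
        if (i < MAX_NAME - 1)
            out[i++] = (char)toupper((unsigned char)*p);
        p++;
    }
    out[i] = '\0';
    return p;
}

/* Returns 1 if every open tag in doc is closed in the right order,
 * 0 on any mismatch, stray close tag, unclosed tag or malformed tag. */
int tags_balanced(const char *doc)
{
    char stack[MAX_DEPTH][MAX_NAME];
    char name[MAX_NAME];
    int depth = 0;
    const char *p = doc;

    while ((p = strchr(p, '<')) != NULL) {
        int closing;

        p++;                          /* step over '<' */
        closing = (*p == '/');
        if (closing)
            p++;
        p = read_name(p, name);
        if (name[0] == '\0')
            return 0;                 /* '<' with no tag name after it */

        if (!closing) {
            if (depth == MAX_DEPTH)
                return 0;             /* nested deeper than this sketch allows */
            strcpy(stack[depth++], name);             /* push */
        } else {
            if (depth == 0 || strcmp(stack[depth - 1], name) != 0)
                return 0;             /* close tag does not match top of stack */
            depth--;                                  /* pop */
        }

        p = strchr(p, '>');           /* skip attributes up to the end of the tag */
        if (p == NULL)
            return 0;                 /* tag never terminated */
    }
    return depth == 0;                /* leftovers are unclosed tags */
}
```

for example, tags_balanced("<HTML><A NAME=\"here\">x</A></HTML>") returns 1,
while tags_balanced("<B><I>x</B></I>") returns 0 because </B> arrives while I
is still on top of the stack.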

hope i was of some help,
        Hashi.

--- --- ---
  .~.   the way of the Sacred Penguin is the path of
  /V\   the truly righteous...
 // \\  

 ^`~'^  http://thor.prohosting.com/~hashaday



Tue, 24 Apr 2001 03:00:00 GMT  
 