HTML parser in postscript? 

Hello all,

The other day I found myself wondering if anyone has taken the time to
write an HTML parser in PostScript...something like 'lpsim'
(comp.sources.postscript v01i051), or other text-to-ps code that gets
prepended to a document and then sent to the printer.

I'm not sure that such a beast would be particularly useful (in fact,
I'm pretty sure it wouldn't be), but is there anything like that out
there?  I thought it might make an interesting text-parsing example.

-- Lars



Thu, 18 Aug 2005 05:31:03 GMT  

Quote:
> Hello all,

> The other day I found myself wondering if anyone has taken the time to
> write an HTML parser in PostScript...something like 'lpsim'
> (comp.sources.postscript v01i051), or other text-to-ps code that gets
> prepended to a document and then sent to the printer.

> I'm not sure that such a beast would be particularly useful (in fact,
> I'm pretty sure it wouldn't be), but is there anything like that out
> there?  I thought it might make an interesting text-parsing example.

Parsing HTML is rather tricky because there are a lot of special cases:
<p> and <br>, for instance, are tags that are not normally closed. Parsing
XML, though, is relatively straightforward; see
http://www.rops.org/stuff/xml.ps and xml.xml for a small exercise in
programming.
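
To make the difference concrete, here is a minimal sketch in Python (not
PostScript, and not taken from xml.ps). The function name nest, the small
VOID set, and the single "a new <p> closes an open <p>" rule are assumptions
chosen for illustration; real HTML has many more such rules.

```python
import re

# Assumption: a tiny illustrative subset of HTML's never-closed elements.
VOID = {"br", "hr", "img"}
# Groups: 1 = "/" for an end tag, 2 = tag name, 3 = "/" for a self-closing tag.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>")

def nest(markup, html_rules):
    """Return (depth, name) for every start tag, in document order.

    html_rules=False: every start tag needs an explicit end tag (XML style).
    html_rules=True:  void elements are never pushed, and an open <p> is
                      closed implicitly by a new <p> or by a parent's end tag.
    """
    stack, opened = [], []
    for m in TAG.finditer(markup):
        name = m.group(2).lower()
        if m.group(1):  # end tag
            if html_rules and name != "p" and stack and stack[-1] == "p":
                stack.pop()  # implicit </p> before the parent's end tag
            if not stack or stack[-1] != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()
            continue
        if html_rules and name == "p" and stack and stack[-1] == "p":
            stack.pop()  # implicit </p> before the next <p>
        opened.append((len(stack), name))
        if not (m.group(3) or (html_rules and name in VOID)):
            stack.append(name)
    if stack and not html_rules:
        raise ValueError(f"unclosed <{stack[-1]}>")
    return opened

print(nest("<body><p>one<br>two<p>three</body>", html_rules=True))
```

With html_rules=False the same input is rejected at </body>, because <p> and
<br> are still open; the XML-style form <p>one<br/>two</p><p>three</p> gives
the same nesting without needing any special cases.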

--
Roger


Fri, 19 Aug 2005 08:13:00 GMT  


Quote:
> see http://www.rops.org/stuff/xml.ps
> and xml.xml for a small exercise in programming.

Thanks!  I'll take a look at it.

-- Lars



Fri, 19 Aug 2005 12:09:16 GMT  

Quote:

> Hello all,

> The other day I found myself wondering if anyone has taken the time to
> write an HTML parser in PostScript...something like 'lpsim'
> (comp.sources.postscript v01i051), or other text-to-ps code that gets
> prepended to a document and then sent to the printer.

> I'm not sure that such a beast would be particularly useful (in fact,
> I'm pretty sure it wouldn't be), but is there anything like that out
> there?  I thought it might make an interesting text-parsing example.


     http://www.pugo.org/

Jim Land
     PostScript/Ghostscript Internet Resources Web Page
          http://www.geocities.com/SiliconValley/5682/postscript.html



Sun, 21 Aug 2005 10:04:34 GMT  


Quote:

>      http://www.pugo.org/

Which is an interesting project, but it doesn't actually do any HTML
parsing.  It is, however, a good example of how to combine Ghostscript
and an external program to make up for the lack of network support in
PostScript.

-- Lars



Sun, 21 Aug 2005 21:42:48 GMT  

    Roger> Parsing HTML is rather tricky since there are a lot of
    Roger> special cases, <p> and <br> for instance, where the tag is
    Roger> not normally closed.

Not to mention that most HTML documents (from web sites) are broken.
To be useful, you have to either make sure your input HTML document is
syntactically correct (e.g. by preprocessing it with HTML Tidy), or
incorporate error-correction into your parser.

    Roger> Parsing XML though is relatively straightforward -

Only if you don't care about doing error-correction.  Aborting on any
non-well-formed XML is easy.  Trying to do error-correction would be
more difficult.
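
A rough sketch of that contrast, in Python rather than PostScript. The
function name close_tags, the ("open"/"close", name) event format, and the
particular recovery guess are all assumptions made for illustration; real
error-correcting parsers use far more elaborate heuristics.

```python
def close_tags(events, recover=False):
    """Balance a list of ("open", name) / ("close", name) events.

    Strict mode (recover=False) raises on the first mismatch, the way an
    XML processor stops at a well-formedness error. Recovery mode guesses
    a repair and returns a balanced event list.
    """
    stack, out = [], []
    for kind, name in events:
        if kind == "open":
            stack.append(name)
            out.append((kind, name))
        elif stack and stack[-1] == name:
            stack.pop()
            out.append((kind, name))
        elif not recover:
            raise ValueError(f"mismatched </{name}>")
        elif name in stack:
            # Guess: close everything opened since the matching start tag.
            while stack[-1] != name:
                out.append(("close", stack.pop()))
            stack.pop()
            out.append(("close", name))
        # Otherwise: a stray end tag with no start tag; drop it.
    if stack and not recover:
        raise ValueError(f"unclosed <{stack[-1]}>")
    while stack:  # recovery: close whatever is still open at end of input
        out.append(("close", stack.pop()))
    return out

# The classic overlap <b><i></b></i>:
overlap = [("open", "b"), ("open", "i"), ("close", "b"), ("close", "i")]
print(close_tags(overlap, recover=True))
```

The strict path is a single raise; the recovery path already needs two
separate guesses (what to close early, what to throw away), and each guess
can be wrong for some documents.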

--


Home page: http://www.informatik.uni-freiburg.de/~danlee



Mon, 22 Aug 2005 00:00:10 GMT  
 