Problem: PyXML 0.7 PrettyPrinting HTML, latin-1 in comments. 

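Running this code:
    import os
    from xml import dom
    import xml.dom.ext                  # provides StripHtml and PrettyPrint
    import xml.dom.ext.reader.HtmlLib   # provides the HTML reader

    reader = dom.ext.reader.HtmlLib.Reader()
    domObject = reader.fromUri(
      os.path.expanduser('~/kode/pythonscript/index.html'))
    htmlDom = dom.ext.StripHtml(domObject)
    dom.ext.PrettyPrint(domObject)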

On this file:
<html>
<body>

<a href="http://www.*-*-*.com/">Min side på whole note</a>
<!--  Bare en test for  se om alt er som det skal -->
</body>
</html>

Works fine.

But on this file:
<html>
<body>

<a href="http://www.*-*-*.com/">Min side på whole note</a>
<!--  Bare en test for å se om alt er som det skal -->
</body>
</html>

It says:

  File "D:\devtools\Python21\_xmlplus\dom\ext\Printer.py", line 356, in visitComment
    self._write('<!--%s-->' % (node.data))
  File "D:\devtools\Python21\_xmlplus\dom\ext\Printer.py", line 146, in _write
    obj = utf8_to_code(text, self.encoding)
  File "D:\devtools\Python21\_xmlplus\dom\ext\Printer.py", line 45, in utf8_to_code
    text = unicode(text, "utf-8")
UnicodeError: UTF-8 decoding error: invalid data
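
The failure can probably be reproduced without PyXML at all. The following is only a sketch in present-day Python 3 (the traceback above comes from Python 2.1, where the equivalent call is `unicode(text, "utf-8")`). It assumes that the comment text reaches `utf8_to_code` as raw latin-1 bytes, which the traceback suggests but does not prove. The helper name `decodes_as_utf8` is made up for illustration.

```python
# Sketch only: modern Python 3, standard library, no PyXML involved.
comment_text = " Bare en test for å se om alt er som det skal "

# Assumption: the parser hands the comment over as undecoded latin-1 bytes.
latin1_bytes = comment_text.encode("latin-1")

# The comment from the first file, which contains only ASCII characters.
ascii_only = b" Bare en test for  se om alt er som det skal "


def decodes_as_utf8(raw):
    """Return True if raw is valid UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as_utf8(ascii_only))    # comment from the file that works
print(decodes_as_utf8(latin1_bytes))  # comment from the file that fails
```

In latin-1, å is the single byte 0xE5. UTF-8 treats 0xE5 as the start of a multi-byte sequence, and the space that follows it is not a valid continuation byte, so the decode fails.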

Notice that the å character that trips up the call to PrettyPrint also
occurs in the a tag without causing trouble. When printing succeeds,
the å in the a tag is replaced in the output by an &aring; entity
reference.
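
For reference, the kind of substitution seen in the a tag can be sketched with the Python 3 standard library. This is not PyXML's printer code, just an illustration of mapping a non-ASCII character to a named HTML entity; the function name `escape_non_ascii` is hypothetical.

```python
from html.entities import codepoint2name


def escape_non_ascii(text):
    """Replace each non-ASCII character with an HTML entity reference."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)                            # plain ASCII, keep
        elif code in codepoint2name:
            parts.append("&%s;" % codepoint2name[code])  # named entity
        else:
            parts.append("&#%d;" % code)                # numeric fallback
    return "".join(parts)


print(escape_non_ascii("Min side på"))  # prints: Min side p&aring;
```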

My question is: how can I use PyXML to print HTML files that are
latin-1 encoded? Is it possible?
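
One workaround that might be worth trying is to convert the page from latin-1 to UTF-8 before the printer ever sees it. The sketch below shows only the byte conversion, in present-day Python 3. It is untested against PyXML 0.7, and whether the HTML reader would then pass the comment through correctly is an assumption. The name `latin1_to_utf8` is made up for illustration.

```python
def latin1_to_utf8(raw_bytes):
    """Reinterpret latin-1 encoded bytes and return them UTF-8 encoded."""
    return raw_bytes.decode("latin-1").encode("utf-8")


# The comment line from the failing file, as latin-1 bytes.
page = "<!--  Bare en test for å se om alt er som det skal -->".encode("latin-1")

converted = latin1_to_utf8(page)
print(converted.decode("utf-8"))
```

Every byte value is a valid latin-1 character, so the decode step cannot fail. The result is only meaningful if the input really was latin-1.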

--

Best regards

Syver Enstad



Tue, 22 Jun 2004 04:39:32 GMT  
 