Unicode trouble 

Python 2.0 on Windows 2000.

How do I convert from Unicode to ANSI 8-bit characters? I keep getting a
Unicode error ("ordinal out of range") when converting from Unicode to
ordinary strings.

Quick help very much appreciated.



Mon, 19 May 2003 03:00:00 GMT  

Quote:

> Python 2.0 on Windows 2000.

> How do I convert from Unicode to ANSI 8-bit characters? I keep getting a
> Unicode error ("ordinal out of range") when converting from Unicode to
> ordinary strings.

ANSI what?

If you mean ISO-8859-1, use encode:

    u = something
    s = u.encode("iso-8859-1")

the default conversion is US ASCII -- if you try to convert
something that won't fit in a 7-bit ASCII string, you'll get a
UnicodeError exception.
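
A minimal sketch of the difference, assuming a made-up sample string
(the same calls also run on current Python, where encode returns a
bytes object rather than an ordinary string):

```python
# Hypothetical sample text: U+00E9 lies outside the 7-bit ASCII range.
u = u"caf\xe9"

# Explicit encode: each character becomes one ISO-8859-1 byte.
s = u.encode("iso-8859-1")

# Encoding the same string to ASCII raises UnicodeError.
try:
    u.encode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

# The optional errors argument substitutes "?" instead of raising.
lossy = u.encode("ascii", "replace")
```

If losing characters is acceptable, "replace" (or "ignore") avoids the
exception; otherwise pick an encoding that can hold the text.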

</F>



Mon, 19 May 2003 03:00:00 GMT  
Thanks for the swift help, that worked.

I have a couple of follow-ups to the answer.


Quote:
> If you mean ISO-8859-1, use encode:

Is ISO-8859-1 the same as the "normal" Windows char set (I guess I mean
European)?
Is Latin-1 the same as ISO-8859?

Quote:
> the default conversion is US ASCII -- if you try to convert
> something that won't fit in a 7-bit ASCII string, you'll get a
> UnicodeError exception.

Is it possible to have the default conversion be something else?


Mon, 19 May 2003 03:00:00 GMT  

Quote:

> Is ISO-8859-1 the same as the "normal" Windows
> char set (I guess I mean European)?

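The normal Windows "ANSI" char set is a superset of the printable part
of ISO-8859-1: it puts extra characters in the 0x80-0x9F range, which
ISO-8859-1 leaves to control codes.

If you really want full Windows compatibility, use "cp1252" instead
of "iso-8859-1".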
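
A small sketch of where the two codecs part ways, assuming a made-up
string containing the euro sign (one of the characters cp1252 adds):

```python
# Hypothetical sample: U+20AC EURO SIGN exists in cp1252 but not in ISO-8859-1.
price = u"5 \u20ac"

# cp1252 maps the euro sign to the single byte 0x80.
as_cp1252 = price.encode("cp1252")

# ISO-8859-1 has no slot for it, so a strict encode fails.
try:
    price.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeError:
    fits_latin1 = False
```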

:::

Quote:
> Is latin-1 the same as ISO-8859?

almost: to be precise, Latin-1 is the same thing as ISO-8859-1.
There are about 15 other 8859 variants.  Here's an overview:

    http://czyborra.com/charsets/iso8859.html

For more than you ever wanted to know about character
sets/repertoires/codes/encodings, see:

    http://www.hut.fi/u/jkorpela/chars.html

:::

Quote:
> Is it possible to have the default conversion be something else?

Not really.

(if you insist, check the site.py file in the standard library.  but
if you change the conversion, you're on your own -- and your
program will probably break in 2.1).
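
Instead of changing the default, one option is to name the encoding at
every conversion. A minimal sketch, where to_8bit is a hypothetical
helper and the cp1252 default is only an assumption about the target
system:

```python
import sys

# The interpreter-wide default; left untouched here.
default_encoding = sys.getdefaultencoding()

def to_8bit(text, encoding="cp1252", errors="strict"):
    """Hypothetical helper: encode Unicode text with an explicit codec."""
    return text.encode(encoding, errors)

encoded = to_8bit(u"\xe9t\xe9")                       # cp1252 by default
encoded_latin1 = to_8bit(u"\xe9t\xe9", "iso-8859-1")  # or name another codec
```

Keeping the encoding visible at the call site means the program does
not depend on how a particular installation is configured.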

</F>



Mon, 19 May 2003 03:00:00 GMT  

Quote:

> Is ISO-8859-1 the same as the "normal" Windows char set (I guess I mean
> European)?
> Is Latin-1 the same as ISO-8859?

Latin-1 and ISO-8859-1 are two names for the same thing.  There are
other ISO-8859 character sets, such as ISO-8859-2 for Eastern Europe,
etc.
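
As a rough illustration, written for current Python (where
codecs.lookup returns an object with a name attribute), the two names
resolve to one codec, while the same byte means something different in
ISO-8859-2:

```python
import codecs

# "latin-1" and "iso-8859-1" are aliases for the same codec.
same_codec = codecs.lookup("latin-1").name == codecs.lookup("iso-8859-1").name

# One byte, two character sets: 0xA3 is the pound sign in ISO-8859-1
# and capital L with stroke in ISO-8859-2.
byte = b"\xa3"
as_latin1 = byte.decode("iso-8859-1")
as_latin2 = byte.decode("iso-8859-2")
```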

The default Windows character set is a superset of ISO-8859-1; there
are a number of characters in the Windows set that aren't in
ISO-8859-1, most notably the "smart" quotes.

        - Ruud de Rooij.



Mon, 19 May 2003 03:00:00 GMT  

| Is ISO-8859-1 the same as the "normal" Windows char set (I guess I mean
| European)?

the "normal" windows char set used in europe is a nonstandard
superset of latin-1 called code page 1252, or cp1252 for short.
it is inadvisable to use it when dealing with data that could
conceivably find its way to the outside world.
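
a sketch of what can go wrong, written for current Python and assuming
a receiver that reads cp1252 bytes as latin-1 (the sample text is made
up):

```python
# Hypothetical sample: text with "smart" quotes, stored as cp1252 bytes.
raw = u"\u201chello\u201d".encode("cp1252")

# Decoded with the codec it was written in, the quotes come back intact.
right = raw.decode("cp1252")

# Decoded as latin-1, bytes 0x93 and 0x94 turn into invisible C1
# control characters instead of quotes; no error is raised.
wrong = raw.decode("iso-8859-1")
```

the mismatch is silent, so the damage usually shows up only when
someone looks at the text.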

| Is latin-1 the same as ISO-8859?

latin-1 is the same as iso-8859-1.

  -- erno



Tue, 20 May 2003 15:29:02 GMT  
 