Parsing CL: Unicode support? 

Hello,

Is there a Common Lisp implementation that supports non-ASCII
character sets (i.e. it uses Unicode or some other multi-byte
representation internally for CHARACTER)?  Is there a standard way for
READ to parse code that is in a non-English character set?  I've been
reading through Section 2 (Syntax) of the CLHS, and while it doesn't
seem to mandate the use of ASCII, it only specifies READ-and-family's
behavior on the 7-bit ASCII character set.

I just glanced through Section 13 (Characters), and section 13.1.2.1
(Character Scripts) seems to imply some kind of ISO support in the
form of this script subtype.  But it seems almost purposefully vague.
Can anyone shed some light on the subject?

--
"low ping bastard: n. anybody getting more frags than the person running their
client on the server." - Steve Caskey



Thu, 18 Apr 2002 02:00:00 GMT  

Quote:

> Is there a Common Lisp implementation that supports non-ASCII
> character sets (i.e. it uses Unicode or some other multi-byte
> representation internally for CHARACTER)?

The latest release of CLISP comes with Unicode support.  Its BASE-CHAR
type is Unicode.

SY, Uwe
--

http://www.ptc.spbu.ru/~uwe/            |       Ist zu Grunde gehen



Thu, 18 Apr 2002 02:00:00 GMT  


Quote:
> Is there a Common Lisp implementation that supports non-ASCII
> character sets (i.e. it uses Unicode or some other multi-byte
> representation internally for CHARACTER)?  Is there a standard way for

I think that a version of Allegro and the latest one of CLISP support
Unicode.

Paolo
--
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/



Thu, 18 Apr 2002 02:00:00 GMT  

Quote:



> > Is there a Common Lisp implementation that supports non-ASCII
> > character sets (i.e. it uses Unicode or some other multi-byte
> > representation internally for CHARACTER)?  Is there a standard way for

> I think that a version of Allegro and the latest one of CLISP support
> Unicode.

Harlequin LispWorks has had Unicode support on all platforms for several
years: http://www.harlequin.com

__Jason



Fri, 19 Apr 2002 03:00:00 GMT  

Quote:

> Is there a standard way for
> READ to parse code that is in a non-English character set?

Actually, there's no trick to it: you just call READ.  It's just that
the result isn't standardized.  Presumably, the implementation has
reasonable syntax definitions for the additional characters.  If not,
you can change their syntax types using SET-SYNTAX-FROM-CHAR.  If the
token syntax is screwed (the "constituent traits" (ANS 2.1.4.2) are
not what you want them to be), you're in trouble, because there's no
standard way of changing those.
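
As a rough illustration of the SET-SYNTAX-FROM-CHAR route, here is a minimal sketch.  It assumes the implementation has a character with code 160 and that this is NO-BREAK SPACE, as in Unicode or Latin-1 based implementations; the function name READ-WITH-NBSP-AS-SPACE is made up for the example.  It works on a copy of the standard readtable, so the global one is left alone.

```lisp
;; Sketch only: give a non-ASCII character the syntax type of #\Space.
;; Assumes code 160 is NO-BREAK SPACE (true for Unicode/Latin-1 based
;; implementations, not guaranteed by the standard).
(defun read-with-nbsp-as-space (string)
  "Read one object from STRING, treating character code 160 as whitespace."
  (let ((nbsp (and (< 160 char-code-limit) (code-char 160)))
        ;; Fresh copy of the standard readtable, bound only for this call.
        (*readtable* (copy-readtable nil)))
    (unless nbsp
      (error "This implementation has no character with code 160."))
    ;; TO-READTABLE defaults to the current readtable (our copy),
    ;; FROM-READTABLE defaults to the standard readtable.
    (set-syntax-from-char nbsp #\Space)
    (read-from-string string)))
```

With that in place, (read-with-nbsp-as-space (format nil "(a~Cb)" (code-char 160))) should read a two-element list of symbols rather than whatever the implementation does with that character by default.  Note that this only copies the syntax type; it does nothing about constituent traits.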

Quote:
> I just glanced through Section 13 (Characters), and section 13.1.2.1
> (Character Scripts) seems to imply some kind of ISO support in the
> form of this script subtype.  But it seems almost purposefully vague.

It _is_ purposefully vague.  The committee didn't feel it could
standardize this area, so they just defined the terms.  This section
says CL CHARACTERs represent ISO characters, and there's no particular
script or coding you need to support, as long as you have all the
standard characters.  Section 13.1.10 says you should document your
scripts and their syntactic properties.
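
Since the standard leaves the repertoire open, one way to see what a given implementation offers is simply to probe it.  The following is a sketch using only standard operators; the function name DESCRIBE-CHARACTER-SUPPORT and the default code 955 (GREEK SMALL LETTER LAMDA, assuming character codes follow Unicode) are chosen for illustration.

```lisp
;; Sketch only: report what the running implementation says about one
;; character code.  The default 955 assumes Unicode-style char codes.
(defun describe-character-support (&optional (code 955))
  "Return a plist describing how this implementation treats CODE."
  ;; CODE-CHAR is only defined for valid character codes, so check the
  ;; limit first; it may still return NIL for an unassigned code.
  (let ((ch (and (< code char-code-limit) (code-char code))))
    (list :char-code-limit char-code-limit
          :has-character (and ch t)
          :base-char-p (and ch (typep ch 'base-char) t)
          :alpha-char-p (and ch (alpha-char-p ch) t))))
```

Calling (describe-character-support) at the REPL gives a quick picture; the results are implementation-dependent, and the implementation's own documentation remains the authority on its scripts.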
--
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited
"If you don't look after knowledge, it goes away."
  - Terry Pratchett, The Carpet People


Fri, 19 Apr 2002 03:00:00 GMT  
 