Reviews for lisp implementations 

[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]


Quote:
>* Philip Lijnzaad
>|
>| [...] In actual practice, "ij", although one letter (actually, a
>| diphthong), is *always* typed and typeset as an i followed by a j. As
>| far as I'm concerned, I'd be happy to cede this ASCII value to more
>| important purposes (capital sharp s?)  When upcased, both i and j
>| have to be upcased [...]. However, most dictionaries sort the 'ij'
>| as two separate letters. Confusing, sort of.
>Most? From what I've heard (from Dutch sources, BTW) IJ is sorted as a
>separate letter after Z.  Can you elaborate on whether both happen or
>whether I've been misinformed?

I think you've been misinformed; all Dutch dictionaries I've ever seen,
as well as "Het Groene Boekje" (the official list of Dutch words), sort
"ij" as if it's an i followed by a j.

The only exception to this standard rule is the Dutch telephone
book; it sorts the ij as if it is a y.
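
The two conventions can be compared with a pair of sort keys. This is only a rough sketch in Python: the function names `dictionary_key` and `phone_book_key` and the word list are made up for illustration, and the crude text replacement ignores every other rule of Dutch collation.

```python
# Hypothetical sort keys for the two conventions described above.
# Only the handling of "ij" is modelled here.

def dictionary_key(word: str) -> str:
    """Dictionary convention: 'ij' is simply an i followed by a j."""
    return word.lower()

def phone_book_key(word: str) -> str:
    """Telephone-book convention: 'ij' sorts as if it were a 'y'."""
    return word.lower().replace("ij", "y")

words = ["ijs", "iglo", "yoghurt", "ik", "ijzer", "zee"]

# Dictionary order keeps "ijs" and "ijzer" between "iglo" and "ik".
print(sorted(words, key=dictionary_key))

# Telephone-book order moves them next to "yoghurt".
print(sorted(words, key=phone_book_key))
```

With the dictionary key, "ijs" and "ijzer" land between "iglo" and "ik"; with the telephone-book key they move down among the words starting with "y".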

Quote:
>And if it's really sorted separately then I think it makes sense to
>consider it a separate character, as Unicode more or less does
>(although it calls it a ligature): U+0132 and U+0133.

I think the Dutch "ij" is a ligature, even though we are taught
otherwise at school.  As you say, both I and J are upcased together, as
in "IJmuiden", but that holds true for AE ligatures as well.

Casper

--
Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.



Sat, 06 Oct 2001 03:00:00 GMT  
On 20 Apr 1999 13:02:29 GMT,

Quote:
>> Most? From what I've heard (from Dutch sources, BTW) IJ is sorted as a
>> separate letter after Z.  

No, never.

Casper> all Dutch dictionaries I've ever seen, as well as "Het Groene Boekje"
Casper> (the official list of Dutch words), sort "ij" as if it's an i
Casper> followed by a j.

yes, although I remember having used dictionaries in school that had IJ
between X and Z. It's apparently obsolete now, but:

Casper> The only exception to this standard rule is the Dutch telephone
Casper> book; it sorts the ij as if it is a y.

(didn't know that ... a bit strange and confusing, I'd say)

Casper> As you say, both I and J are upcased together as in "IJmuiden" but
Casper> that holds true for AE ligatures as well.

another point is abbreviations: I'm fairly sure that the Dutch 'ij' would be
abbreviated to 'IJ'. Making up an example: Vereniging ter bevordering van de
ijspret would be V.B.IJ, not V.B.I. The abbreviation issue must be correlated
with the capitalization issue, and I suspect it would be the same for
ligatures in other languages/scripts.
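
The capitalization and abbreviation rules suggested here can be sketched in a few lines of Python. The helper names `dutch_capitalize` and `dutch_initial` are invented for this illustration, and the only rule modelled is "a leading ij is upcased as a pair"; nothing else about Dutch orthography is assumed.

```python
def dutch_capitalize(word: str) -> str:
    """Capitalize a word, upcasing a leading 'ij' as the pair 'IJ'."""
    if word[:2].lower() == "ij":
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]

def dutch_initial(word: str) -> str:
    """Initial for an abbreviation: 'IJ' for words that start with 'ij'."""
    return "IJ" if word[:2].lower() == "ij" else word[:1].upper()

print(dutch_capitalize("ijmuiden"))

# Only the content words of the made-up society name are abbreviated.
content_words = ["vereniging", "bevordering", "ijspret"]
print(".".join(dutch_initial(w) for w in content_words))
```

Both functions rest on the same test for a leading "ij", which is one way to read the remark that the abbreviation issue is correlated with the capitalization issue.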

                                                                      Philip
--
To accurately forge this signature, use a lucidatypewriter-medium-12 font
-----------------------------------------------------------------------------

+44 (0)1223 49 4639                 | Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax)           | Cambridgeshire CB10 1SD,  GREAT BRITAIN
PGP fingerprint: E1 03 BF 80 94 61 B6 FC  50 3D 1F 64 40 75 FB 53



Sat, 06 Oct 2001 03:00:00 GMT  

#+:noise-ahead
Isn't Dutch a throat disease? :)

--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa



Sat, 06 Oct 2001 03:00:00 GMT  

Quote:

> ...
> I wondered (as an academic exercise) what should CHAR-UPCASE and
> NSTRING-UPCASE do about LATIN SMALL LETTER Y WITH DIAERESIS (assuming
> STRING-UPCASE is allowed to return a longer string which isn't
> especially nice either).  Signal an error?  Or the implementation
> would state that the character sets it uses do not include this
> letter?  (Making CHAR-UPCASE return two values, like #\I and #\J
> in this case, appears more than perverse, though who knows.)

Careful.  Recall that ANSI CL is an American standard and doesn't make
any attempt to accommodate other collating sequences.

String-upcase and friends are specifically required to work character by
character, without reference to any context:

  "More precisely, each character of the result string is produced by  
  applying the function char-upcase to the corresponding character of
  string."
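
To make the "character by character" rule concrete, here is a small sketch in Python rather than Lisp. The function `upcase_per_character` is hypothetical and is not a model of any particular Lisp implementation; it only shows what a length-preserving, context-free upcase looks like next to Python's built-in `str.upper`, which applies full case mappings and may return a longer string.

```python
def upcase_per_character(s: str) -> str:
    """Upcase each character independently, keeping the length fixed.

    A character whose uppercase form is not a single character is left
    unchanged, so the result always has exactly len(s) characters.
    """
    out = []
    for ch in s:
        up = ch.upper()
        out.append(up if len(up) == 1 else ch)
    return "".join(out)

# Plain i followed by j: each letter is upcased on its own.
print(upcase_per_character("ijmuiden"))

# The single code point U+0133 maps to the single code point U+0132.
print(upcase_per_character("\u0133muiden"))

# Sharp s has no one-character uppercase here, so it is left alone ...
print(upcase_per_character("stra\u00dfe"))

# ... whereas the full mapping lengthens the string.
print("stra\u00dfe".upper())
```

Under such a rule, upcasing "ijmuiden" gives the right result only because i and j are ordinary separate characters; a function limited to one character in, one character out cannot turn a single character into the pair I, J.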

I would have thought that ISO would have addressed this issue more
broadly in ISLisp, but it does not appear that they did.  There is no
string-upcase at all, and string< and friends are specifically defined
to work character by character:

   "Two strings string1 and string2 are in order (string<) if in the
    first position in which they differ the character of string1 is
    char< the corresponding character of string2, ..."

Given that Scheme is an IEEE standard, apparently tries to do either the
right thing or nothing at all, and seems to try to not include useful
utilities which are "obvious" compositions or iterations of other
utilities, I would have expected that Scheme either wouldn't have string
operations at all or would have them do the contextually right thing.
After all, if you just want to map over a sequence with some char< or
such function, just do it.  Of course, I'm wrong.  Scheme also defines
string< to work character by character, but at least it meets my
expectations by failing to define string-upcase at all.  In case I'm
misinterpreting, here's the definition for string<? and friends:

  "These procedures are the lexicographic extensions to strings of the
  corresponding orderings on characters. For example,
  string<? is the lexicographic ordering on strings induced by the
  ordering char<? on characters. If two strings differ in
  length but are the same up to the length of the shorter string, the
  shorter string is considered to be lexicographically less
  than the longer string."
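
The quoted definition is short enough to write out. The following Python sketch is an illustration only: `string_less` and its `char_less` parameter are hypothetical names, and the default character ordering is plain code point comparison, which is an assumption and not something either standard prescribes.

```python
from typing import Callable

def string_less(a: str, b: str,
                char_less: Callable[[str, str], bool] = lambda x, y: x < y) -> bool:
    """Lexicographic extension of char_less from characters to strings."""
    for x, y in zip(a, b):
        if char_less(x, y):
            return True      # first difference decides in favour of a
        if char_less(y, x):
            return False     # first difference decides in favour of b
    # Same up to the length of the shorter string: the shorter one is less.
    return len(a) < len(b)

print(string_less("ijs", "ik"))     # decided at the second character
print(string_less("ijs", "ijsje"))  # decided by length
print(string_less("ijs", "y"))      # decided at the first character
```

Because the comparison only ever looks at one character from each string at a time, no choice of `char_less` can treat the two-character sequence "ij" as a single letter; that would need a preprocessing step or a different string ordering altogether.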



Sat, 06 Oct 2001 03:00:00 GMT  

Quote:


> | Not that any of this has much to do with Lisp, but:
> |
> | - U+00FF (LATIN SMALL LETTER Y DIAERESIS) is described in the Unicode
> |   standard as being French, not Dutch.

>   I said _from_ Dutch "ij".  it's an _imported_ character.  it is used in a
>   bunch of names in Belgia that historically had "ij" in their name.

Could you name two please? I live in Belgium (no need to form a
plural) and I've never seen it.

--

If there are aliens, they play Go. -- Lasker



Sat, 06 Oct 2001 03:00:00 GMT  

Quote:

> #+:noise-ahead
> Isn't Dutch a throat disease? :)

#+:further-noise
No, we just have a fairly complete set of sounds. It helps to
recognize foreigners.

Schild en Vriend?

--

If there are aliens, they play Go. -- Lasker



Sat, 06 Oct 2001 03:00:00 GMT  

| Could you name two please?

  not off-hand.  the rationale for ÿ that I have related here is that given
  to ECMA in 1982-6 when formulating and to ISO in 1987 when adopting ISO
  8859-1 through -4.

| I live in Belgium (no need to form a plural) and I've never seen it.
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^

  _very_ good!  never thought of it as a plural.  we call it "Belgia"
  in Norwegian.  I'm sure it's an editing glitch.  I hope it isn't receding
  language skills.  :)

  I have actually seen it, though, which is why I remember the minutes from
  the ISO work.  it's been a while (11 years), and I regret I'm not able to
  recall them in minute detail, anymore.

#:Erik



Sat, 06 Oct 2001 03:00:00 GMT  

    Erik> I said _from_ Dutch "ij".  it's an _imported_ character.  it
    Erik> is used in a bunch of names in Belgia that historically had
    Erik> "ij" in their name.

    Lieven> Could you name two please? I live in Belgium (no need to
    Lieven> form a plural) and I've never seen it.

Kortrijk springs to mind.  Maybe to yours too; is that why you asked
for two examples?  :-)

I had to glance at a map to find a second one: Nijvel.

Would a Belgian spell these with "y trema"?



Sat, 13 Oct 2001 03:00:00 GMT  
 