rfc: w.r.t. [c\"] - mapping [\'] to ["] - good practice? 

I noticed that some documentation explaining the backslash
sequences of backslash string literals describes
a mapping of \' (escaped single quote) to " (one double quote).

When I implemented backslash string literals, it turned out
that I had implemented it exactly as in C, with a mapping of
\" to " and \' to ', but now I wonder whether that may turn out to be an
unusual implementation strategy. What's your take? How do you think
it should be (and have you implemented backslash string literals)?

I wonder whether some undocumented peculiarity led to
defining this \'-to-" mapping. Maybe some systems have problems with
a sequence ] c" hello \" world \"." [ because of the
whitespace following the escaped double quote?
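
To make the two candidate mappings concrete, here is a minimal Python sketch. It is a model only, not Forth code from any system mentioned in this thread. The names C_STYLE, QUOTE_MAPPED and decode are hypothetical, the tables list just a few sequences for illustration, and keeping unknown sequences unchanged is an assumption.

```python
# Hypothetical escape tables: C-style versus the \' -> " mapping.
C_STYLE = {'"': '"', "'": "'", "\\": "\\", "n": "\n"}
QUOTE_MAPPED = {"'": '"', "\\": "\\", "n": "\n"}


def decode(body: str, table: dict[str, str]) -> str:
    # Replace each backslash sequence in body according to table.
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Assumption: sequences missing from the table stay as written.
            out.append(table.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With these tables, the body say \'hi\' keeps its single quotes under C_STYLE and comes out with double quotes under QUOTE_MAPPED.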

TIA, Guido



Mon, 22 Mar 2004 19:11:21 GMT  

Quote:

> I noticed that some documentation explaining the backslash
> sequences of backslash string literals describes
> a mapping of \' (escaped single quote) to " (one double quote).

[..]

Quote:
> I wonder whether some undocumented peculiarity led to
> defining this \'-to-" mapping. Maybe some systems have problems with
> a sequence ] c" hello \" world \"." [ because of the
> whitespace following the escaped double quote?

I could imagine the reverse: map " (one double quote) to \' (escaped
single quote). Reason: once \" is allowed, C" and S" and any user words
that scan for " as end of string must be made aware of possible backslash
sequences. This starts to look like a hairy fix, not possible as an
add-on.
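
The concern can be illustrated with a small Python model (not Forth; both function names are made up for this sketch). A word that simply scans for the next double quote stops at an escaped one, while an escape-aware scanner has to skip whatever follows a backslash.

```python
def naive_end(source: str, start: int) -> int:
    # Index of the first double quote at or after start,
    # which is where a plain scan for the delimiter would stop.
    return source.index('"', start)


def escape_aware_end(source: str, start: int) -> int:
    # Index of the first double quote that is not escaped by a backslash.
    i = start
    while i < len(source):
        if source[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if source[i] == '"':
            return i
        i += 1
    raise ValueError("unterminated string")
```

For the line hello \" world" type, the naive scan ends the string at the escaped quote, while the escape-aware scan ends it at the real delimiter.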

What's wrong with using """ instead of \" ?

-marcel



Mon, 22 Mar 2004 19:52:33 GMT  
On Thu, 04 Oct 2001 13:11:21 +0200, Guido Draheim

Quote:

>I noticed that some documentation explaining the backslash
>sequences of backslash string literals describes
>a mapping of \' (escaped single quote) to " (one double quote).
>When I implemented backslash string literals, it turned out
>that I had implemented it exactly as in C, with a mapping of
>\" to " and \' to ', but now I wonder whether that may turn out to be an
>unusual implementation strategy. What's your take? How do you think
>it should be (and have you implemented backslash string literals)?

Since \' as ' is pointless in S" and C" strings, which can simply
embed ' in the string without escaping, \' as " seems like a
straightforward workaround for embedding the delimiter in the string.
However, given what appears to be portable source to redefine S" so
that it accepts "" as an embedded ", it seems to me that it is better to
build on that foundation and omit both \' and \".

(
----------
Virtually,

Bruce McFarling, Newcastle,

)



Mon, 22 Mar 2004 20:25:22 GMT  


Quote:

> I noticed that some documentation explaining the backslash
> sequences of backslash string literals describes
> a mapping of \' (escaped single quote) to " (one double quote).

> When I implemented backslash string literals, it turned out
> that I had implemented it exactly as in C, with a mapping of
> \" to " and \' to ', but now I wonder whether that may turn out to be an
> unusual implementation strategy. What's your take? How do you think
> it should be (and have you implemented backslash string literals)?

> I wonder whether some undocumented peculiarity led to
> defining this \'-to-" mapping. Maybe some systems have problems with
> a sequence ] c" hello \" world \"." [ because of the
> whitespace following the escaped double quote?

    My guess is that it might be done to support command line processing
where the command line would not handle \" properly.

--

-GJC

-Abolish Public Schools.



Tue, 23 Mar 2004 00:38:15 GMT  

Quote:

> On Thu, 04 Oct 2001 13:11:21 +0200, Guido Draheim

> >I noticed that some documentation explaining the backslash
> >sequences of backslash string literals describes
> >a mapping of \' (escaped single quote) to " (one double quote).

> >When I implemented backslash string literals, it turned out
> >that I had implemented it exactly as in C, with a mapping of
> >\" to " and \' to ', but now I wonder whether that may turn out to be an
> >unusual implementation strategy. What's your take? How do you think
> >it should be (and have you implemented backslash string literals)?

> Since \' as ' is pointless in S" and C" strings, which can simply
> embed ' in the string without escaping, \' as " seems like a
> straightforward workaround for embedding the delimiter in the string.
> However, given what appears to be portable source to redefine S" so
> that it accepts "" as an embedded ", it seems to me that it is better to
> build on that foundation and omit both \' and \".

An implementation allowing S" hello ""world""" basically needs to
look ahead in the input stream once the first (") has been parsed.
However, I had problems implementing such an extension on top of some
Forth systems that happened to be around: the various input sources
for the outer Forth interpreter often do not work the same way, and looking
ahead in the input stream often showed unexpected behaviour, especially
at the end of an input line. Basically, I see all words that require
look-ahead parsing as not very portable. (I've seen systems that had
quite interesting change policies for ">IN", and I do not care if anyone
would call such systems broken; they exist and otherwise work fine.)
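
A rough Python model of that look-ahead, assuming the input source is a single line and an integer offset stands in for >IN. The function name is hypothetical and this is not the code of any Forth system; it only shows where the peek past the parsed span and the extra advance of the offset occur.

```python
def doubled_quote_string(line: str, to_in: int) -> tuple[str, int]:
    # Return (text, new offset) for a string body that starts at to_in.
    parts = []
    while True:
        end = line.find('"', to_in)
        if end < 0:
            raise ValueError("unterminated string")
        parts.append(line[to_in:end])
        to_in = end + 1  # where a parse up to the delimiter leaves the offset
        # Look-ahead: inspect the source beyond the span just parsed.
        if to_in < len(line) and line[to_in] == '"':
            parts.append('"')
            to_in += 1  # the extra write to the offset that worries portability
            continue
        return "".join(parts), to_in
```

The model works on a plain Python string, so it says nothing about input sources where the offset cannot be moved forward safely or where the end of the line behaves differently.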

From that perspective, however, I was amazed to see this (\')->(") mapping
in the backslash string literals. When the function that converts the
characters from their input string span to the compiled string representation
reaches the end of the 'parse'd buffer and has an open (\), it can
just call 'parse' again and go on compiling, or throw if the input is
used up to the point that 'parse' returns 0. There is no need to look ahead
and put back a char (or write to >IN).
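
The parse-again idea can be sketched in Python as follows. This is a toy model under stated assumptions: the class Input, its parse method and the helper names are hypothetical stand-ins for a Forth input source and PARSE, only the escaped quote is handled, and all other backslash sequences are left undecoded.

```python
class Input:
    # Toy model of one input line; to_in stands in for >IN.
    def __init__(self, line: str) -> None:
        self.line = line
        self.to_in = 0

    def parse(self, delim: str) -> str:
        # Return the text up to delim and step past the delimiter.
        end = self.line.find(delim, self.to_in)
        if end < 0:  # no delimiter: the rest of the line is the span
            span, self.to_in = self.line[self.to_in:], len(self.line)
        else:
            span, self.to_in = self.line[self.to_in:end], end + 1
        return span


def ends_with_open_backslash(span: str) -> bool:
    # True if the span ends in an odd number of backslashes.
    trailing = len(span) - len(span.rstrip("\\"))
    return trailing % 2 == 1


def backslash_string(inp: Input) -> str:
    # Collect a string body, calling parse again after each escaped quote.
    parts = []
    while True:
        span = inp.parse('"')
        if not ends_with_open_backslash(span):
            parts.append(span)
            return "".join(parts)
        if inp.to_in >= len(inp.line):
            raise ValueError("input used up inside string")
        # The delimiter was escaped: keep it as text and parse on.
        parts.append(span[:-1] + '"')
```

Note that the offset only ever moves forward through calls to parse; nothing is put back and nothing beyond the parsed span is inspected.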

Yes, I know that ("") was in openboot - a required feature there. But I do
not see the benefit, as such an implementation also needs to convert
escaped chars like ("b) to their compiled representation. Having a set of
more complex computations in (s"-with-"x) or in (s\"-with-\x) does not
strike me as much of a difference, other than that the
traditional 'parse place' is quick and simple (and does not need to look
ahead).

However, this is turning into a discussion of the pros and cons of openboot
strings, which I did not want to start originally. I am more interested
in what is wrong with a (\")->(") mapping in backslash string
literals (which is likely the reason for defining a mapping (\')->(")). I
would have preferred a way that mapped (\q)->(") and just
left (\')->(') as one would expect from *all* other usage areas.

cheers,
-- guido                            Edel sei der Mensch, hilfreich und gut



Tue, 23 Mar 2004 01:45:13 GMT  
On Thu, 04 Oct 2001 19:45:13 +0200, Guido Draheim

Quote:

>Yes, I know that ("") was in openboot - a required feature there.

No, I was talking about

http://home.earthlink.net/~neilbawd/quotstr.html

(
----------
Virtually,

Bruce McFarling, Newcastle,

)



Tue, 23 Mar 2004 10:28:56 GMT  

Quote:

> On Thu, 04 Oct 2001 19:45:13 +0200, Guido Draheim

> >Yes, I know that ("") was in openboot - a required feature there.

> No, I was talking about

> http://home.earthlink.net/~neilbawd/quotstr.html

which again *writes* to ">IN" ... and writing to >IN behind the back
of some system functions in some existing forth implementations - that
is what happens to be not quite portable... although I admit that Neil's
code looks damn innocent ;-)

Neil's implementation, however, a) increases >IN beyond the place where it
was originally, assuming it can add 1 and the underlying system will
accept that as a continuation point, and b) takes advantage of SOURCE, whose
existence in ANS Forth is due to problems with the various input sources that
might be available to Forth systems. SOURCE would be hard to define for
any system that is slightly older than that, and I still keep some
of my apps portable to such systems. Backslash string literals are portable to these too.

But that again leads to a discussion about the interpretations of the
parse area (=> http://forth.sf.net/word/parse-area-store ), which
interests me only insofar as it might be related to the reasons that
swiftforth started to map (\')->(") and to not allow any (") in the
input string area, even one that could be escaped (\"). Why is it bad to
escape a (") as (\") and carry on `parse`ing if an escaped end-of-string
is met?

cheers,
-- guido                            Edel sei der Mensch, hilfreich und gut

http://forth.sf.net/std/dpans/a0006.htm

Quote:
> [..](*question*)
> I originally interpreted a previous dpANS document as saying that the user
> input buffer, the one pointed to by the obsolescent TIB and #TIB , is
> read-only. That makes a different kind of sense to me. In some systems it
> could be a read-only pipe from some other process, or it could be some
> special hardware read-only buffer. It makes sense to me that the user input
> buffer should never be modified by a standard program.
> [..](*answer*)
> It is logically possible for an application to alter the 'input buffer' supplied to
> EVALUATE while that 'input buffer' is being processed. This possibility
> exists as a special case because the application actually "owns" and has full
> control over the buffer in question. As owner of the buffer, and both producer
> and consumer of the data it contains, the application can alter this buffer and
> manipulate >IN with deterministic results, assuming that it knows the buffer
> to exist in physically writable memory.

http://forth.sf.net/std/dpans/dpansa6.htm#A.6.1.2216
Quote:
> A.6.1.2216 SOURCE

> SOURCE simplifies the process of directly accessing the input buffer by
> hiding the differences between its location for different input sources. This
> also gives implementors more flexibility in their implementation of buffering
> mechanisms for different input sources. The committee moved away from an
> input buffer specification consisting of a collection of individual variables,
> declaring TIB and #TIB obsolescent.

> SOURCE in this form exists in F83, POLYFORTH, LMI's Forths and others.
> In conventional systems it is equivalent to the phrase





Tue, 23 Mar 2004 16:51:32 GMT  
On Fri, 05 Oct 2001 10:51:32 +0200, Guido Draheim

Quote:


>> On Thu, 04 Oct 2001 19:45:13 +0200, Guido Draheim

>> >Yes, I know that ("") was in openboot - a required feature there.
>> No, I was talking about
>> http://home.earthlink.net/~neilbawd/quotstr.html
>which again *writes* to ">IN" ... and writing to >IN behind the back
>of some system functions in some existing forth implementations - that
>is what happens to be not quite portable... although I admit that Neil's
>code looks damn innocent ;-)

NB. This is a question distinct from the specification of the escape
character handling: the only printable character with trouble in S" ."
is the quote ... therefore, a "" as " escape could be seen as a level
below a full S^" style escape string.

OTOH, while the implementation of "" escape in quotstr.html raises a
separate question, the question is interesting in itself: how can the
implementation in quotstr.html be made as portable as possible?

(
----------
Virtually,

Bruce McFarling, Newcastle,

)



Wed, 24 Mar 2004 11:30:07 GMT  

Quote:

> On Fri, 05 Oct 2001 10:51:32 +0200, Guido Draheim


> >> On Thu, 04 Oct 2001 19:45:13 +0200, Guido Draheim

> >> >Yes, I know that ("") was in openboot - a required feature there.
> >> No, I was talking about
> >> http://home.earthlink.net/~neilbawd/quotstr.html

> >which again *writes* to ">IN" ... and writing to >IN behind the back
> >of some system functions in some existing forth implementations - that
> >is what happens to be not quite portable... although I admit that Neil's
> >code looks damn innocent ;-)

> NB. This is a question distinct from the specification of the escape
> character handling: the only printable character with trouble in S" ."
> is the quote ... therefore, a "" as " escape could be seen as a level
> below a full S^" style escape string.

> OTOH, while the implementation of "" escape in quotstr.html raises a
> separate question, the question is interesting in itself: how can the
> implementation in quotstr.html be made as portable as possible?

It already is the most portable implementation of a word that looks ahead
in the input stream without calling a parse function. It just depends on
'source' (i.e. the "input span") having some features that are needed for
its operation. If there is an older system without 'source', it would not work,
and if 'source'/'parse' works a bit differently, then it is not the way to do
it either. You may like (""), but it would not be available on such systems.
That's about it.

Oh, as a comment: the code uses "1 +!", which is traditionally correct
since ">IN" shall (according to dpans94) represent the offset in "CHARS"
from the start of the current input "SOURCE". Well, and "1" is correct
as long as a char has size 1...

(Hmm, what happens if 'source' is in utf-8, or 16-bit unicode, or
 whatever the current input stream happens to like, '>in' points
 to the current read pointer in that parse area, and 'parse' returns
 a(nother!) buffer with the input chars converted to the current 'char'
 encoding? You can put the ">in" variable back to any value that was valid
 before, which is the only thing I remember to be at least somewhat portable.
 One may claim that 'source' must already be in the system char encoding, as
 '>in' is defined to be in the system char encoding, but that will probably
 lead the system implementor to leave the word 'source' undefined, or to just
 return the last parsed area buffer for such input streams, which will
 not allow you to look ahead even though the source text in the editor has a
 (") directly following the string literal. But that's just theoretical; the only
 actual problems I know of are with \r\n and other whitespace-equivalent escape sequences
 in the input stream that 'parse' handles but some lookahead operation does not.
 For the specific \r\n problem, Neil's code will probably be correct; I
 did not test it on any of the old machines I have around. No time, no time...)
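
The character-size concern can be shown with a short Python sketch, assuming a UTF-8 encoded source (the function name is hypothetical). Once a multi-byte character precedes the quote, the offset counted in characters and the offset counted in bytes no longer agree, so "add 1" means different things depending on which unit the offset uses.

```python
def offsets_of_quote(text: str) -> tuple[int, int]:
    # Return (character offset, UTF-8 byte offset) of the first double quote.
    char_off = text.index('"')
    byte_off = text.encode("utf-8").index(b'"')
    return char_off, byte_off
```

For pure ASCII text both numbers are equal; for text such as caf\u00e9" they differ by one because the accented letter takes two bytes in UTF-8.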

-- guido



Wed, 24 Mar 2004 14:14:20 GMT  

Quote:

>On Fri, 05 Oct 2001 10:51:32 +0200, Guido Draheim


>>> On Thu, 04 Oct 2001 19:45:13 +0200, Guido Draheim

>>> >Yes, I know that ("") was in openboot - a required feature there.
>>> No, I was talking about
>>> http://home.earthlink.net/~neilbawd/quotstr.html

>>which again *writes* to ">IN" ... and writing to >IN behind the back
>>of some system functions in some existing forth implementations - that
>>is what happens to be not quite portable... although I admit that Neil's
>>code looks damn innocent ;-)

>NB. This is a question distinct from the specification of the escape
>character handling: the only printable character with trouble in S" ."
>is the quote ... therefore, a "" as " escape could be seen as a level
>below a full S^" style escape string.

>OTOH, while the implementation of "" escape in quotstr.html raises a
>separate question, the question is interesting in itself: how can the
>implementation in quotstr.html be made as portable as possible?

May I cast one vote in favor of "" to denote a " in a string?
Advantages:
1. There is just one character involved. No need to remember
   that \ behaves specially.
2. The parser has to force a blank after a string anyway.
   S" aap"TYPE would work if S" uses PARSE, but I think it is
   bad style, even if it is (were?) allowed by ISO.
   So a " after what otherwise looks like a string sets you
   thinking.
3. S" style strings cannot contain " , so it is upwards compatible.
   S" aap\' noot " or such would behave differently for ISO.
   (But \" doesn't have this disadvantage.)

There is a disadvantage that doesn't count for ciforth:
1. An interpreted string has to be copied before it can be
used. (A compiled string is copied anyway in a non-tricky
implementation.)

(In ciforth I allocate *all* strings into the dictionary,
so all strings are permanent. I have yet to run out of
my 640 Mbyte (64 Mbyte on Linux) dictionary.)

Another point where ALGOL68 (yes that is 1968) was an improvement over
all its successors. Study it. You will be surprised what they did
right. And this language was designed behind the desk!

ALGOL68 was dismissed as being too big at the time. (The *full*
description, I/O libraries, rationale and all run to 60 pages, and
language implementations require up to 100 kbyte.)
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
To suffer is the prerogative of the strong. The weak -- perish.



Sun, 28 Mar 2004 17:10:44 GMT  

Quote:



> >On Fri, 05 Oct 2001 10:51:32 +0200, Guido Draheim


> >>> On Thu, 04 Oct 2001 19:45:13 +0200, Guido Draheim

> >>> >Yes, I know that ("") was in openboot - a required feature there.
> >>> No, I was talking about
> >>> http://home.earthlink.net/~neilbawd/quotstr.html

> >>which again *writes* to ">IN" ... and writing to >IN behind the back
> >>of some system functions in some existing forth implementations - that
> >>is what happens to be not quite portable... although I admit that Neil's
> >>code looks damn innocent ;-)

> >NB. This is a question distinct from the specification of the escape
> >character handling: the only printable character with trouble in S" ."
> >is the quote ... therefore, a "" as " escape could be seen as a level
> >below a full S^" style escape string.

> >OTOH, while the implementation of "" escape in quotstr.html raises a
> >separate question, the question is interesting in itself: how can the
> >implementation in quotstr.html be made as portable as possible?

> May I cast one vote in favor of "" to denote a " in a string?
> Advantages:
> 1. There is just one character involved. No need to remember
>    that \ behaves specially.
> 2. The parser has to force a blank after a string anyway.
>    S" aap"TYPE would work if S" uses PARSE, but I think it is
>    bad style, even if it is (were?) allowed by ISO.
>    So a " after what otherwise looks like a string sets you
>    thinking.
> 3. S" style strings cannot contain " , so it is upwards compatible.
>    S" aap\' noot " or such would behave differently for ISO.
>    (But \" doesn't have this disadvantage.)

> There is a disadvantage that doesn't count for ciforth:
> 1. An interpreted string has to be copied before it can be
> used. (A compiled string is copied anyway in a non-tricky
> implementation.)

Ahhh, that's a strong argument against "" in normal s" strings,
as one cannot just return the string span inside the input
area, and that is what makes parsing the input source fast.
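
A small Python sketch of that difference (a model with hypothetical function names, not Forth). A plain string can be reported as an offset and length into the input line, whereas a body containing a doubled quote decodes to text that does not appear contiguously in the input, so it has to be built in a separate buffer.

```python
def plain_span(line: str, start: int) -> tuple[int, int]:
    # Plain string: the result is an (offset, length) pair into the input.
    end = line.index('"', start)
    return start, end - start


def decode_doubled(raw_body: str) -> str:
    # Doubled-quote string: each pair of quotes in the raw body
    # becomes one quote, which produces a new string (a copy).
    return raw_body.replace('""', '"')
```

For the input say ""hi""" type, the decoded text is not a substring of the line, which is why no span into the input area can represent it.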

Quote:

> (In ciforth I allocate *all* strings into the dictionary,
> so all strings are permanent. I have yet to run out of
> my 640 Mbyte (64 Mbyte on Linux) dictionary.)

> Another point where ALGOL68 (yes that is 1968) was an improvement over
> all its successors. Study it. You will be surprised what they did
> right. And this language was designed behind the desk!

> ALGOL68 was dismissed as being too big at the time. (The *full*
> description, I/O libraries, rationale and all run to 60 pages, and
> language implementations require up to 100 kbyte.)

Personally, I see some analogies between algol and perl. Perl is
great at expressing regex routines, and with all its hashes and arrays
it can build text processors in a few lines, but this brevity and
the overuse of special syntax to achieve it make it somewhat
unreadable at first glance. With algol you can express complicated
matrix-using algorithms in a few lines, but don't ask me which one after I
have been reading it for a minute or so, 'cause I wouldn't know...

... guido



Tue, 30 Mar 2004 02:47:29 GMT  
 