UNIX, MS-DOS, Macintosh, etc return character? 

is it the same with ALL platforms?!?!

like \n  is used with MacPerl, but does \n also signify a UNIX newline?
What about a MS-DOS newline?!?!

HELP!!

Thanks a lot, everybody.

Please reply via personal e-mail, as I have little time to scan
the newsgroup for replies to this article. Please. Thanks.

-T-


*NO* chain letters of any kind!    World Wide Web Page designer, ask for it
PGP available.   I'll offer to anyone who has a valid reason for needing it.



Thu, 03 Dec 1998 03:00:00 GMT  

[a copy of this article was emailed to the original poster.]

Quote:
>is it the same with ALL platforms?!?!
>like \n  is used with MacPerl, but does \n also signify a UNIX newline?
>What about a MS-DOS newline?!?!

Since Standard C specifies that the end-of-line symbol is the single
character "\n", the standard C library on operating systems whose
end-of-line token is not "\n" does some sort of conversion. (Although
I know that people will disagree with me, I'm including MS-DOS as an
operating system.) In those libraries, the native EOL symbol is
converted into a single "\n" on input, and "\n" is converted to the
native EOL on output. (In MS-DOS C libraries, this involves stripping
"\r" on input and adding "\r" on output. On the Mac, it involves
substituting "\r" for "\n".) Since perl is linked with the C
compiler's standard C library, it imitates that behavior. (I think I
read that CodeWarrior, which the distributed binaries of MacPerl are
compiled with, does things differently: it takes the character
sequence "\n" to be the ASCII CR character. The result is similar as
long as you use "\n" to mean end of line and don't try to use
something like chr(10).)

The end result: for text files, always use "\n" as end of line, and
perl (through the C compiler's libraries) will do the right thing
for text files native to the operating system on which it is running.
If you move files from one operating system to another, they may no
longer be proper text files for the target OS. You will either need
conversion utilities to fix them, or you can write your own if you
know the native format for each operating system you are dealing with:
"\n" for Unix and the Amiga; "\r\n" for MS-DOS, Windows (again, some
people may object to using the term operating system to describe
Windows), and Windows NT; "\r" for the MacOS.
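
As an illustration of the three native formats listed above, here is a minimal sketch that guesses which convention a chunk of raw bytes uses. The helper name guess_eol is hypothetical, and the code uses octal escapes rather than "\n" and "\r" so that it behaves the same on every platform. It assumes the data was read in binary mode.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the line-ending convention of raw bytes.
# Octal escapes are used so the result does not depend on what "\n"
# means to the local perl.
sub guess_eol {
    my ($data) = @_;
    my $crlf = () = $data =~ /\015\012/g;        # CR LF pairs (MS-DOS, Windows)
    my $cr   = () = $data =~ /\015(?!\012)/g;    # CR not followed by LF (MacOS)
    my $lf   = () = $data =~ /(?<!\015)\012/g;   # LF not preceded by CR (Unix, Amiga)
    my %count = (dos => $crlf, mac => $cr, unix => $lf);
    # Pick the most frequent convention; ties are broken alphabetically.
    my ($best) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
    return $count{$best} ? $best : 'none';
}

print guess_eol("one\015\012two\015\012"), "\n";   # prints "dos"
```

A file with mixed endings simply reports whichever convention occurs most often, which is a design choice of this sketch rather than anything the C library does.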

Because of the different EOL characters, the standard C library also
provides two modes of accessing a file: binary and text. On Unix
systems the two are treated the same. On systems whose EOL symbol is
not "\n", the conversion is done only in text mode; binary mode passes
the characters unchanged. Perl has a function called binmode() which
switches an open filehandle from text mode to binary mode. There is no
way of changing an open filehandle back to text mode.
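
To make the binary-mode behavior concrete, here is a small sketch that writes three raw bytes to a scratch file and reads them back with binmode() on both handles. It assumes a modern perl (lexical filehandles, three-argument open, and the core File::Temp module); the variable names are only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write three raw bytes, then read them back with no EOL translation.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);                      # binary mode: bytes pass through unchanged
print {$out} "A\015\012";           # 'A', CR, LF as explicit octal escapes
close($out) or die "close failed: $!";

open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);                       # call this before the first read
my $raw = do { local $/; <$in> };   # slurp the whole file
close($in);

# Show each byte in octal so the CR and LF are visible.
my @bytes = map { sprintf '%03o', ord } split //, $raw;
print "@bytes\n";                   # prints "101 015 012"
```

Without the binmode() calls, a perl on MS-DOS would be expected to alter the CR LF pair on the way through, while a Unix perl would show the same three bytes either way.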
--
Andrew Langmead



Fri, 04 Dec 1998 03:00:00 GMT  

: is it the same with ALL platforms?!?!

: like \n  is used with MacPerl, but does \n also signify a UNIX newline?
: What about a MS-DOS newline?!?!

Well, yes, all versions and ports of Perl use \n to indicate the
newline character.  All versions and ports of Perl use \r to indicate
a return character.

What are you trying to do?  You did not really say what the problem
was...

Also, MS-DOS appends a ^M *Control-M* to the end of each line; UNIX
adds nothing, and I'm not sure about Macs.

Hope this helps.

Nate



Fri, 04 Dec 1998 03:00:00 GMT  


Quote:

> : is it the same with ALL platforms?!?!

> : like \n  is used with MacPerl, but does \n also signify a UNIX newline?
> : What about a MS-DOS newline?!?!

> Well, yes, all versions and ports of Perl use \n to indicate the
> newline character.  All versions and ports of Perl use \r to indicate
> a return character.

> What are you trying to do?  You did not really say what the problem
> was...

I was asking because I use a Macintosh, and I often get files called
Info-Mac Digests, which are made on a Mac and then somehow get
converted to UNIX EOL (end of line) characters. I use StuffIt Deluxe
for changing that EOL character, but instead of launching that huge
program just to change the EOL characters, I was going to write a perl
script.

I want my script to read a text file, see if it has UNIX EOLs, MS-DOS
EOLs, etc., and then translate those to the Macintosh format that I need.

I use a program (another one) that will "dissect" the text file into proper
sections IF THE EOLs ARE IN MACINTOSH FORMAT; otherwise it puts the whole
text file in as one "chapter" instead of "chapters" and "subsections".

That's why I asked what I asked in the newsgroup.  And thank you for
responding.

Quote:
> Also, MS-DOS appends a ^M *Control-M* to the end of each line; UNIX
> adds nothing, and I'm not sure about Macs.

> Hope this helps.

> Nate





Fri, 04 Dec 1998 03:00:00 GMT  

Quote:


> > : is it the same with ALL platforms?!?!

> > : like \n  is used with MacPerl, but does \n also signify a UNIX newline?
> > : What about a MS-DOS newline?!?!

> I was asking because I use a Macintosh, and I often get files called
> Info-Mac Digests, which are made on a Mac and then somehow get
> converted to UNIX EOL (end of line) characters. I use StuffIt Deluxe
> for changing that EOL character, but instead of launching that huge
> program just to change the EOL characters, I was going to write a perl
> script.

> I want my script to read a text file, see if it has UNIX EOL's, MS-DOS
> EOL's, etc and then translate those to the Macintosh format that I need.

Then, you don't care what "\n" means, since that only has meaning for files
that are in the native text file format for the local OS/filesystem.

From Perl's (and C's) point of view, you are reading binary files, and
converting them to native text files.

The steps have been discussed on the MacPerl list, but the rough approach to
safely translate from any (major) file system's format to that of your local
file system is:
    binmode( STDIN ); # in case we're on MS-DOS
    undef $/;         # slurp mode: read the entire file at once
    $_ = <STDIN>;     # read from the handle binmode() was applied to
    s/\015\012/\n/g;  # translate MS-DOS CRLF to newline
    s/\015/\n/g;      # translate ASCII CR to newline
    s/\012/\n/g;      # translate ASCII NL to newline
    print;

Yes, one of the last two substitutions is not needed, depending on your
platform.  You can protect against it, if you like.
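
The filter above always writes whatever "\n" means to the local perl. If the target is a specific format regardless of where the script runs (the classic Mac bare CR, for instance), a variation along these lines may help. This is only a sketch, not the version from the MacPerl list, and convert_eol is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite any mix of CRLF, CR, and LF endings to a
# single target sequence. The alternation tries CRLF first, so a CR LF
# pair is replaced once rather than being treated as two line endings.
sub convert_eol {
    my ($data, $target) = @_;
    $data =~ s/\015\012|\015|\012/$target/g;
    return $data;
}

my $mixed = "dos\015\012unix\012mac\015";
my $mac   = convert_eol($mixed, "\015");   # classic MacOS wants a bare CR

(my $shown = $mac) =~ s/\015/<CR>/g;        # make the CRs visible
print "$shown\n";                           # prints "dos<CR>unix<CR>mac<CR>"
```

When writing the converted data to a file, the output handle would also need binmode() so that the C library does not translate the bytes a second time.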




Fri, 11 Dec 1998 03:00:00 GMT  

Quote:

> Since Standard C specifies that the end of line symbol is the single
> character "\n", the standard C library for operating systems (although
> I know that people will disagree with me, I'm including MS-DOS as an
> operating system.) in which the end of line token is not "\n" do some
> sort of conversion.

Not quite true.  C specifies that "\n" represents a newline.  It does
not specify any bit pattern for the representation.  Bit patterns are
only defined by character sets.

ASCII, as a character set, defines CR as \015 and NL as \012.  These are
just names for bit patterns, and have no relationship to the "\n" of
ANSI C.

Various file systems have different bit patterns (and lengths of bit
patterns) that they use to denote the boundary between two lines.  It's
the job of the C compiler and its associated Standard C library to
perform the mapping between "\n" and the file system when performing
text-mode I/O (the Perl default).

Where a lot of confusion arises is that most compilers targeted at
file systems that use a single octet to denote the text line boundary
also use that same bit pattern for "\n".  This saves translation when
performing text-mode I/O.  People who use Unix, therefore, often assume
that "\n" is the same as ASCII LF, but it isn't.

Normally, this distinction is not of interest to the average user.
However, when dealing with text files from other file systems, it can
get confusing until one gets very specific about where bit patterns are
defined, and what the meta symbols mean.




Fri, 11 Dec 1998 03:00:00 GMT  

Quote:


> : is it the same with ALL platforms?!?!

> : like \n  is used with MacPerl, but does \n also signify a UNIX newline?
> : What about a MS-DOS newline?!?!

> Well, yes, all versions and ports of Perl use \n to indicate the
> newline character.  

Yes.  However, remember that the meta character "\n" stands for "newline"
and has no defined bit pattern!  That is left up to the compiler
implementation, which often uses the same bit pattern as the file system
(where possible).

Quote:
> All versions and ports of Perl use \r to indicate
> a return character.

Not sure what you mean by "a return character".  As I understand it, ANSI C
defines "\r" to be the bit pattern not represented by "\n" from the set
"\012" (ASCII LF), "\015" (ASCII CR).  I've yet to find a use for it.

Stylistically, I use the meta symbols ("\n", "\r", etc.) when I intend what
the symbol means.  I use octal definitions( "\015", etc) when I intend a
specific bit pattern.
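
A short sketch of that stylistic rule, for illustration only. The first line prints whatever values the local perl assigns to "\n" and "\r", so no particular output is claimed for it; the variable names are made up for this example.

```perl
use strict;
use warnings;

# What "\n" and "\r" are on this particular build of perl.
printf "\\n is octal %03o, \\r is octal %03o\n", ord("\n"), ord("\r");

# Meta symbol: "end a line of text, whatever that means here".
my $text_line = "hello" . "\n";

# Octal escapes: exact bytes, such as the CR LF pair that many network
# protocols require no matter which platform the script runs on.
my $wire_line = "hello" . "\015\012";

printf "text line: %d chars, wire line: %d chars\n",
    length($text_line), length($wire_line);
```

Inside the program "\n" is a single character, so the text line is six characters long; the wire line is always seven.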




Fri, 11 Dec 1998 03:00:00 GMT  

: Yes.  However, remember that the meta character "\n" stands for "newline"
: and has no defined bit pattern!  That is left up to the compiler
: implementation, which often uses the same bit pattern as the file system
: (where possible).

        Whoever taught you C was cruel.  "\n" is "newline", i.e. "linefeed",
which is always a linefeed anywhere you go (in ASCII, anyway).  What makes
the difference is how the program you're using determines what the
end-of-line character is.  This is normally the same as the operating
system's (not the file system's) idea of an end-of-line character, but it
doesn't necessarily have to be.  In Unix, end of line is represented by a
linefeed, and '\n' == 0x0a; in DOS, end of line is represented by a CRLF,
and '\n' == 0x0a.

: > All versions and ports of Perl use \r to indicate
: > a return character.

: Not sure what you mean by "a return character".  As I understand it, ANSI C
: defines "\r" to be the bit pattern not represented by "\n" from the set
: "\012" (ASCII LF), "\015" (ASCII CR).  I've yet to find a use for it.

        I believe "a return character" was referring to carriage return,
0x0d.  You would have a very sickly compiler if it thought "\r" was a
newline and "\n" was a return.
--
IPA.net Sysadmin         My girlfriend asked me which one I like better.

|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________



Wed, 16 Dec 1998 03:00:00 GMT  
 