Bug using UNIX file types 

In both VC5 and VC6 there is a problem when opening a UNIX-type file
in text mode and then trying to use ftell()/fseek() or fgetpos()/fsetpos().  You
will end up positioned at the wrong place in the file.

Has anyone else seen this?  This bug must have been around quite a while.
Does Microsoft simply not care about ANSI compliance?



Mon, 15 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types
Not really a bug.  It's because you have the file in text mode and therefore
newline characters are translated into carriage return, line feed sequences
(i.e. \n becomes \r\n).  This will cause the problems you have alluded to.
Open the file in binary mode instead and do your own translation if needed.
Quote:

>In both VC5 and VC6 there is a problem when opening a UNIX-type file
>in text mode and then trying to use ftell()/fseek() or fgetpos()/fsetpos().  You
>will end up positioned at the wrong place in the file.

>Has anyone else seen this?  This bug must have been around quite a while.
>Does Microsoft simply not care about ANSI compliance?



Tue, 16 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types

Quote:

>Not really a bug.  It's because you have the file in text mode and therefore
>newline characters are translated into carriage return, line feed sequences
>(i.e. \n becomes \r\n).  This will cause the problems you have alluded to.
>Open the file in binary mode instead and do your own translation if needed.

I think you can use "setmode" or something to tell text mode to use UNIX-style
line feeds or DOS-style carriage return/line feed pairs -- I'm almost
positive that there is a way to set this flag, although the exact API call
eludes me.  Happy hunting...

--Richard



Wed, 17 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types
Several responses, including by email, have said that there is no bug and
suggested workarounds.  I had already switched to opening the files in
binary mode...and explicitly making sure I wasn't sensitive to the extra
"\r" that was sometimes there.

I appreciate the responses, but I have to disagree about whether this
is a bug or not.

The code base I am working in has been around for over 10 years and has been
compiled on a minimum of 3 different compilers during its lifetime.  We have
recently been moving to NT and VC++.  VC++ is the first compiler to be confused by
files opened in text mode.  ftell()/fseek() and fgetpos()/fsetpos() are ANSI
standard functions, and there is no caveat in the function descriptions that
says they won't work with UNIX-style files opened in text mode.

(I call them UNIX style files because my editors usually differentiate between
 UNIX and DOS files, namely "\n" terminated lines versus "\r\n")

We originally chose to write files in binary mode and open them in text mode.  With this
choice, what is written into a file is exactly what you say to write, with no extra
control characters added by the write function.  When reading a file, you
are insensitive to extra control characters that someone may have accidentally
added by using an editor that wrote out the file, or lines in the
file, in text mode, adding a "\r" that wasn't originally there.
This combination has worked perfectly until now.


Quote:


>>Not really a bug.  It's because you have the file in text mode and therefore
>>newline characters are translated into carriage return, line feed sequences
>>(i.e. \n becomes \r\n).  This will cause the problems you have alluded to.
>>Open the file in binary mode instead and do your own translation if needed.

>I think you can use "setmode" or something to tell text mode to use UNIX-style
>line feeds or DOS-style carriage return/line feed pairs -- I'm almost
>positive that there is a way to set this flag, although the exact API call
>eludes me.  Happy hunting...

>--Richard



Wed, 17 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types

Quote:

> Several responses, including by email, have said that there is no bug and
> suggested workarounds.  I had already switched to opening the files in
> binary mode...and explicitly making sure I wasn't sensitive to the extra
> "\r" that was sometimes there.

> I appreciate the responses, but I have to disagree about whether this
> is a bug or not.

You can disagree all you want but the situation
that caused your problem has been reflected
in the C++ standard.  That's why it provides
for binary or text mode operations on files.
As far as I can see, there is no departure
from what the standard requires.  So your
"bug" is better described as "something
you wish would work differently".

Quote:
> The code base I am working in has been around for over 10 years and has been
> compiled on a minimum of 3 different compilers during its lifetime.  We have
> recently been moving to NT and VC++.  VC++ is the first compiler to be confused by
> files opened in text mode.

The compiler is not the least bit confused.
You are confused as to what ftell() and
fseek() should do.  The fact that the code
was written 10 years ago with non-portable
assumptions embedded in it does not
somehow make a platform buggy just
because those assumptions do not hold.

Quote:
> ftell()/fseek() and fgetpos()/fsetpos() are ANSI
> standard functions, and there is no caveat in the function descriptions that
> says they won't work with UNIX-style files opened in text mode.

If you look at what K&R says about fseek()
and text streams, you would realize there
is trouble awaiting programs that mix text
and binary mode on the same files.  The
idea that text-mode files might transport
across platforms is nowhere supported by
ANSI -- it's an invention that many people
know does not hold.  That's why FTP has
always had ascii and binary commands.

Quote:
> (I call them UNIX style files because my editors usually differentiate between
>  UNIX and DOS files, namely "\n" terminated lines versus "\r\n")

> We originally chose to write files in binary and open in text mode.

That was an error if portability was intended.

Quote:
> With this
> choice, what is written into a file is exactly what you say to write, with no extra
> control characters added by the write function.  When reading a file, you
> are insensitive to extra control characters that someone may have accidentally
> added by using an editor that wrote out the file, or lines in the
> file, in text mode, adding a "\r" that wasn't originally there.
> This combination has worked perfectly until now.

The fault is in the assumption that such
code would be portable.  It isn't and that
fact could be predicted.  You seem to
believe that "worked perfectly" is some
kind of validation of the whole set of
assumptions buried in that code.  You
are seriously mistaken about that.

--
--Larry Brasfield
Above opinions may be mine alone.



Wed, 17 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types

Quote:

>> Several responses, including by email, have said that there is no bug and
>> suggested workarounds.  I had already switched to opening the files in
>> binary mode...and explicitly making sure I wasn't sensitive to the extra
>> "\r" that was sometimes there.

>> I appreciate the responses, but I have to disagree about whether this
>> is a bug or not.

>You can disagree all you want but the situation
>that caused your problem has been reflected
>in the C++ standard.  That's why it provides
>for binary or text mode operations on files.
>As far as I can see, there is no departure
>from what the standard requires.  So your
>"bug" is better described as "something
>you wish would work differently".

I stand corrected.  I've just perused the ANSI standards and, to my
surprise, opening a file in text mode is not part of the standard.

Since it's not in the standard, Microsoft can choose to implement
this common language extension any way it sees fit, even if that
way is different than the way many other compilers choose.
Even if, under common circumstances, the results are unexpected
and unusable, since there is not a defined standard for proper behavior,
there is no bug.

Technically, there is no bug.

End of Thread.



Thu, 18 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types

[snip]
There seems to be some controversy as to
whether the ANSI C standard library provides
for file streams to have binary or text modes.

Quote:
> I stand corrected.  I've just perused the ANSI standards and, to my
> surprise, opening a file in text mode is not part of the standard.

I don't happen to have a copy of the ANSI C
standard here, but your statement is at odds
with a couple sources I do have.  One is the
man entry for fopen() on my Linux system,
which states that fopen() accepts a "b" as
part of its second parameter "for compatibility
with ANSI  C3.159-1989 (``ANSI C'')".  The
other is the 2nd edition of "The C Programming
Language", by Brian W. Kernighan and Dennis
M. Ritchie which claims in its preface that it
"describes C as defined by the ANSI standard".
In its Appendix B, it states "This appendix is a
summary of the library defined by the ANSI
standard."  Later, in section B1.1, fopen() is
said to take a final "b" in its second parameter
as indicating a binary file.  Then, in section
B1.6, the fseek() functionality is said to depend
on whether it operates on a "binary file" or a
"text stream".  (I posted this fact earlier, but
its relevance appears to have not struck with
sufficient force, yet.)

So, unless you are prepared to state that
these sources have incorrectly portrayed
those portions of the standard C library,
I think you have to allow that binary and
text mode are recognized in the standard.

I suspect your error is in not recognizing
that for streams, text mode is the default
unless overridden globally or per fopen()
with the "b" indicator.  You should have
looked for binary mode, not text mode.
(And don't get hung up on Microsoft's
extension where a "t" indicator can be
used to locally override a global binary
mode override.  The manner by which
a file gets into binary or text mode is
not particularly germane here.)

Quote:
> Since it's not in the standard, Microsoft can choose to implement
> this common language extension any way it sees fit, even if that
> way is different than the way many other compilers choose.

Nonsense.  Both the ANSI C standard and the
ANSI/ISO C++ standard prescribe how variations
in newline representation are handled portably
among platforms.  That is why both standards
describe binary and text mode operations.

Quote:
> Even if, under common circumstances, the results are unexpected
> and unusable, since there is not a defined standard for proper behavior,
> there is no bug.

I've been using both binary and text mode on
that platform for years, with the VC compiler
and others, always with results that were
expected, useful and usable.

Quote:
> Technically, there is no bug.

Technically and practically.

Quote:
> End of Thread.

Let's hope so.

--
--Larry Brasfield
Above opinions may be mine alone.



Thu, 18 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types

Quote:

>Since it's not in the standard, Microsoft can choose to implement
>this common language extension any way it sees fit, even if that
>way is different than the way many other compilers choose.
>Even if, under common circumstances, the results are unexpected
>and unusable, since there is not a defined standard for proper behavior,
>there is no bug.

    I think part of the problem we are having here is that you never explained
exactly what problem you are having.  Reading between the lines of
your posts, it sounds like you are calling ftell() at one point when the file
is opened in binary mode, saving the value, and then later expecting that calling
fseek() with that value will return you to that point in the file when the
file is opened in text mode.   I don't believe ftell()/fseek() is guaranteed
to work across the file being closed & reopened, but even if it is, it's
not expected to work across the mode being changed.

--
Truth,
   James [MVP]
http://www.NJTheater.Com       -and-
http://www.NJTheater.Com/JamesCurran



Fri, 19 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types
I was content to let the thread die, but on the chance that there was
a misunderstanding when I originally stated the problem, or non-problem,
this is the symptom:

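char str1[MAXSIZE], str2[MAXSIZE], str3[MAXSIZE], str4[MAXSIZE];
long pos;
FILE *fp;

fp = fopen (filename, "rt");    /* open in text mode */

fgets (str1, MAXSIZE, fp);      /* read line 1 */

pos = ftell(fp);                /* remember the start of line 2 */

fgets (str2, MAXSIZE, fp);      /* read line 2 */
fgets (str3, MAXSIZE, fp);      /* read line 3 */

fseek (fp, pos, SEEK_SET);      /* go back to the start of line 2 */

fgets (str4, MAXSIZE, fp);      /* expected to re-read line 2 */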

I would expect str2 and str4 to be the same.
It turns out that if the file I am reading has lines terminating
with "\n" and not "\r\n", then str4 contains the last
4 characters of str1.  The file pointer was not positioned where I thought
it would be.  

I won't argue any more about whether this is a bug or not.


Quote:


>>Since it's not in the standard, Microsoft can choose to implement
>>this common language extension any way it sees fit, even if that
>>way is different than the way many other compilers choose.
>>Even if, under common circumstances, the results are unexpected
>>and unusable, since there is not a defined standard for proper behavior,
>>there is no bug.

>    I think part of the problem we are having here is that you never explained
>exactly what problem you are having.  Reading between the lines of
>your posts, it sounds like you are calling ftell() at one point when the file
>is opened in binary mode, saving the value, and then later expecting that calling
>fseek() with that value will return you to that point in the file when the
>file is opened in text mode.   I don't believe ftell()/fseek() is guaranteed
>to work across the file being closed & reopened, but even if it is, it's
>not expected to work across the mode being changed.



Sun, 21 Oct 2001 03:00:00 GMT  
 Bug using UNIX file types

Quote:

>I would expect str2 and str4 to be the same.
>It turns out that if the file I am reading has lines terminating
>with "\n" and not "\r\n", then str4 contains the last
>4 characters of str1.  The file pointer was not positioned where I thought
>it would be.

    Now knowing exactly what the problem is, I will agree with you --- that
is a bug.

--
Truth,
   James [MVP]
http://www.NJTheater.Com       -and-
http://www.NJTheater.Com/JamesCurran



Sun, 21 Oct 2001 03:00:00 GMT  
 