big-endian vs. little-endian 

Hello.
I have a problem regarding byte ordering. I have some C software that
generates binary data files. On my PC (running either DOS or OS/2) using
gcc the programs can read the data files with no problems. However, under
UNIX (SunOS) running on a Sparc station, the byte order is reversed
(big-endian) and therefore software compiled on this platform will not
read a binary file written on the other platform. Short of simply
converting the byte orders of the numbers in the data file, is there a
way to handle this problem in ANSI C at runtime in an efficient and
portable way?

--
                                -- Joe Heafner

Joe Heafner, Astronomy and Physics Instructor. Work:(704)327-7000 x246
my surname with my first initial at mercury dot interpath dot com
<URL: http://www.*-*-*.com/ ~heafnerj/>



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian


| I have a problem regarding byte ordering. I have some C software that
| generates binary data files. On my PC (running either DOS or OS/2) using
| gcc the programs can read the data files with no problems. However, under
| UNIX (SunOS) running on a Sparc station, the byte order is reversed
| (big-endian) and therefore software compiled on this platform will not
| read a binary file written on the other platform. Short of simply
| converting the byte orders of the numbers in the data file, is there a
| way to handle this problem in ANSI C at runtime in an efficient and
| portable way?

You have to define what the medium of communication is, whether it be a
file or a packet.  You have to define how a number is formatted, such as
decimal digits high order first, decimal digits low order first, bytes
high order first, bytes low order first.  Some formats do include the
ability to define that on the fly, such as one byte that indicates the
coming order, then data in that order.  The Internet protocols define
the byte order in packet headers to be high order first (big-endian).
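
As a rough sketch of the "one byte that indicates the coming order" idea, a reader could decode each value according to a marker stored ahead of the data. The marker values 'B' and 'L' and the name decode_u32 are made up for illustration; they do not come from any particular file format.

```c
/* Hypothetical marker values; any two distinct bytes would do. */
#define ORDER_BIG    'B'
#define ORDER_LITTLE 'L'

/*
 * Decode a 32-bit unsigned value from four octets, using the byte
 * order announced by the marker.  Returns 0 on success, or -1 if the
 * marker is not recognised.
 */
int decode_u32(unsigned char marker, const unsigned char *p,
               unsigned long *out)
{
    if (marker == ORDER_BIG) {
        /* High-order octet comes first. */
        *out = ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
               ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
        return 0;
    }
    if (marker == ORDER_LITTLE) {
        /* Low-order octet comes first. */
        *out = ((unsigned long)p[3] << 24) | ((unsigned long)p[2] << 16) |
               ((unsigned long)p[1] << 8)  |  (unsigned long)p[0];
        return 0;
    }
    return -1;
}
```

Because the decoder works on individual octets, it gives the same result regardless of the byte order of the machine it runs on.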

There isn't any magical way to have one program write data in one order
and have another program read it in a different order.  They simply are
not compatible because they are dealing with different formats.

An intermediate conversion program would also be non-trivial.  It has to
know other details of the format.  For example, suppose the data has a mix
of 2-byte and 4-byte values (storing binary data at all suggests that
saving space is a goal, so 2-byte values for integers in that range are
plausible).  The program doing the swapping then has to know where the
2-byte values are and where the 4-byte values are in order to swap them
correctly.

And don't even think of attempting this with floating point.

--
 --    *-----------------------------*      Phil Howard KA9WGN       *    --
  --   | Inturnet, Inc.              | Director of Internet Services |   --
   --  | Business Internet Solutions |       eng at intur.net        |  --
    -- *-----------------------------*      philh at intur.net       * --



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian


Quote:

> Hello.
> I have a problem regarding byte ordering. I have some C software that
> generates binary data files. On my PC (running either DOS or OS/2) using
> gcc the programs can read the data files with no problems. However, under
> UNIX (SunOS) running on a Sparc station, the byte order is reversed
> (big-endian) and therefore software compiled on this platform will not
> read a binary file written on the other platform. Short of simply
> converting the byte orders of the numbers in the data file, is there a
> way to handle this problem in ANSI C at runtime in an efficient and
> portable way?

No.  This is a thorn within the industry.  Also note that not all SunOS
machines are big-endian.  We have a similar situation in which we are
sending data across a network from a Sparc (big-endian) processor to an
Intel processor (little-endian).  The data is converted on the Sparc
side before it is shipped and as it is received.

One idea is to write a translation program that converts the file to
big-endian before it is read on the Unix box.  This avoids translating
the file each time it is accessed.

--
Thomas Matthews
StorageTek



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian



Quote:

> Hello.
> I have a problem regarding byte ordering. I have some C software that
> generates binary data files. On my PC (running either DOS or OS/2) using
> gcc the programs can read the data files with no problems. However, under
> UNIX (SunOS) running on a Sparc station, the byte order is reversed
> (big-endian) and therefore software compiled on this platform will not
> read a binary file written on the other platform. Short of simply
> converting the byte orders of the numbers in the data file, is there a
> way to handle this problem in ANSI C at runtime in an efficient and
> portable way?

Probably the best thing to do in this case is convert everything to network
byte order.
Look at the functions:

htons()         host-to-network short
ntohs()         network-to-host short
htonl()         host-to-network long
ntohl()         network-to-host long

It's not ANSI, but it's probably more standard than anything else out
there...
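
A minimal sketch of how these functions might be applied to a data file, assuming a POSIX system that provides <arpa/inet.h> and a compiler with <stdint.h> (neither is ANSI C). The helper names write_u32_net and read_u32_net are invented for this example.

```c
#include <arpa/inet.h>  /* POSIX, not ANSI C: htonl(), ntohl() */
#include <stdint.h>
#include <stdio.h>

/* Write one 32-bit value to the file in network (big-endian) order.
 * Returns 0 on success, -1 on a write error. */
int write_u32_net(FILE *fp, uint32_t value)
{
    uint32_t wire = htonl(value);
    return fwrite(&wire, sizeof wire, 1, fp) == 1 ? 0 : -1;
}

/* Read one 32-bit value stored in network order and convert it to
 * the host's order.  Returns 0 on success, -1 on a read error. */
int read_u32_net(FILE *fp, uint32_t *value)
{
    uint32_t wire;
    if (fread(&wire, sizeof wire, 1, fp) != 1)
        return -1;
    *value = ntohl(wire);
    return 0;
}
```

On a big-endian host the two conversion calls leave the value unchanged; on a little-endian host they reverse the bytes, so the file contents are the same either way.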



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian


Quote:
>I have a problem regarding byte ordering. I have some C software that
>generates binary data files. On my PC (running either DOS or OS/2) using
>gcc the programs can read the data files with no problems. However, under
>UNIX (SunOS) running on a Sparc station, the byte order is reversed
>(big-endian) and therefore software compiled on this platform will not
>read a binary file written on the other platform. Short of simply
>converting the byte orders of the numbers in the data file, is there a
>way to handle this problem in ANSI C at runtime in an efficient and
>portable way?

Sorry, but the only solution is to convert the data in an appropriate
manner at either run-time or with some sort of a utility program. For the
former, you need to be able to distinguish between big-endian and
little-endian files, in most cases. (This actually depends on specifics
of your own environment).

However, you should know that there are several efficient byte-swapping
algorithms available. Any of those can be used on your data.
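
One common way to write such a swap, shown here only as a sketch, uses masks and shifts. The names swap16 and swap32 are illustrative, and the functions assume the values of interest fit in 16 and 32 bits respectively.

```c
/* Reverse the two octets of a 16-bit value held in an unsigned int. */
unsigned int swap16(unsigned int x)
{
    return ((x & 0x00FFu) << 8) | ((x & 0xFF00u) >> 8);
}

/* Reverse the four octets of a 32-bit value held in an unsigned long.
 * Any bits above the low 32 are discarded by the masks. */
unsigned long swap32(unsigned long x)
{
    return ((x & 0x000000FFUL) << 24) |
           ((x & 0x0000FF00UL) << 8)  |
           ((x & 0x00FF0000UL) >> 8)  |
           ((x & 0xFF000000UL) >> 24);
}
```

Applying either function twice returns the original value, so the same routine serves for both directions of the conversion.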
--
  Bob           One good thing about being wrong is the joy it brings to others



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian


Quote:

>Probably the best thing to do in this case is convert eveything to network
>byte order.
>look at the functions:

>htons()             host-to-network short
>ntohs()             network-to-host short
>htonl()             host-to-network long
>ntohl()             network-to-host long

Not recommended. These aren't part of the C language.

Quote:
>It's not ansi, but it's probably more standard than anything else out
>there...

Not true. The most standard way is to write maximally-portable
encoding and decoding routines in ANSI C.

The htons() function, and friends, are only useful in conjunction with
Berkeley sockets; more specifically, the type struct sockaddr_in.

It's an extremely bad idea to use them for anything else.



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian



Quote:

>Hello.
>I have a problem regarding byte ordering. I have some C software that
>generates binary data files. On my PC (running either DOS or OS/2) using
>gcc the programs can read the data files with no problems. However, under
>UNIX (SunOS) running on a Sparc station, the byte order is reversed
>(big-endian) and therefore software compiled on this platform will not
>read a binary file written on the other platform. Short of simply
>converting the byte orders of the numbers in the data file, is there a
>way to handle this problem in ANSI C at runtime in an efficient and
>portable way?

Binary files do not have to be non-portable. Just look at GIF, ZIP
or TAR files, for instance.

There are two separate issues to consider. First, you must decide on
a file format, which is precisely specified down to the individual octet.
Secondly, you must decide whether you will write portable code to
read and write that format, or whether you will write platform-specific
code.

By far the easiest thing to do is to write highly portable code. This
means that you must treat the binary file as a sequence of bytes,
each of which is used to hold an octet (eight bits) of data. Divide
your file manipulation module into routines which write the data,
and routines that read the data. To be portable, the routines
should read and write individual bytes, or arrays of bytes,
and use highly portable operations to disassemble and reassemble the
data.

An example of a portable operation would be the use of shifting and
masking to reconstruct an unsigned integer that has been broken into a
sequence of octets. An example of a non-portable operation would be
to directly treat an array of four chars as though it were a data object of
type long int.
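
The shifting-and-masking approach could look roughly like the following. This sketch assumes the file format stores 32-bit unsigned integers most significant octet first; the names put_u32_be and get_u32_be are made up for the example.

```c
/* Break the low 32 bits of v into four octets, most significant first. */
void put_u32_be(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFF);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}

/* Reassemble a 32-bit unsigned integer from four octets stored most
 * significant first.  Each byte is masked to eight bits in case the
 * platform's char is wider than an octet. */
unsigned long get_u32_be(const unsigned char *p)
{
    return ((unsigned long)(p[0] & 0xFF) << 24) |
           ((unsigned long)(p[1] & 0xFF) << 16) |
           ((unsigned long)(p[2] & 0xFF) << 8)  |
            (unsigned long)(p[3] & 0xFF);
}
```

The buffer filled by put_u32_be can be written with fwrite() and read back with fread() on any platform, since nothing in these routines depends on how the host lays out a long in memory.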


Quote:
>--
>                                -- Joe Heafner

>Joe Heafner, Astronomy and Physics Instructor. Work:(704)327-7000 x246
>my surname with my first initial at mercury dot interpath dot com
><URL:http://mercury.interpath.com/~heafnerj/>



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian

Quote:


>| I have a problem regarding byte ordering. I have some C software that

...

Quote:
>You have to define what the medium of communication is, whether it be a
>file or a packet.  You have to define how a number if formatted, such as
...
>An intermediate conversion program would also be non-trivial.  It has to
>know other details of the format.  For example if the data had a mix of

All of this has been invented before.

Have a look at XDR (library routines for external data representation):

...
SYNOPSIS AND DESCRIPTION

     These routines allow C programmers to describe arbitrary data
     structures in a machine-independent fashion.  Data for remote
     procedure calls are transmitted using these routines.  

Wolfgang Denk



Sendmail may be safely run set-user-id to root.
                        -- Eric Allman, "Sendmail Installation Guide"



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian

On 22 Jan 98 15:31:54 GMT, Joe Heafner - Astronomer

Quote:

>Hello.
>I have a problem regarding byte ordering. I have some C software that
>generates binary data files. On my PC (running either DOS or OS/2) using
>gcc the programs can read the data files with no problems. However, under
>UNIX (SunOS) running on a Sparc station, the byte order is reversed
>(big-endian) and therefore software compiled on this platform will not
>read a binary file written on the other platform. Short of simply
>converting the byte orders of the numbers in the data file, is there a
>way to handle this problem in ANSI C at runtime in an efficient and
>portable way?

You can try to determine the byte order dynamically by making the first
number in the file a known value.  When your program reads in the first
number in the file, it can check that it equals the expected value.  If
it doesn't, you can byte-swap it and check again.  If neither form
matches, you bail out, but if one of them matches, then you know whether
or not to perform the swaps.
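
The check described above might be sketched like this. The value 0x01020304 is a hypothetical choice of known first number (it must not read the same when its bytes are reversed), and needs_swap is an invented name. The argument is the first 32-bit number exactly as the program read it from the file.

```c
#define MAGIC         0x01020304UL  /* hypothetical known first number */
#define MAGIC_SWAPPED 0x04030201UL  /* the same number, bytes reversed */

/*
 * Decide what to do with the rest of the file based on its first number.
 * Returns 0 if the data can be used as read, 1 if every number must be
 * byte-swapped, and -1 if the file is not in a recognised format.
 */
int needs_swap(unsigned long first)
{
    if (first == MAGIC)
        return 0;
    if (first == MAGIC_SWAPPED)
        return 1;
    return -1;
}
```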

--

http://www.cs.wustl.edu/~jxh/        Washington University in Saint Louis

Quote:
>>>>>>>>>>>>> I use *SpamBeGone* <URL:http://www.internz.com/SpamBeGone/>



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian

Joe Heafner - Astronomer writes:

Quote:

> Hello.
> I have a problem regarding byte ordering. I have some C software that
> generates binary data files. On my PC (running either DOS or OS/2) using
> gcc the programs can read the data files with no problems. However, under
> UNIX (SunOS) running on a Sparc station, the byte order is reversed
> (big-endian) and therefore software compiled on this platform will not
> read a binary file written on the other platform. Short of simply
> converting the byte orders of the numbers in the data file, is there a
> way to handle this problem in ANSI C at runtime in an efficient and
> portable way?

There have been many useful responses to this post, but I would like
to add that I strongly recommend looking into the netCDF library
from unidata.ucar.edu. This library allows you to read and write your
data across platforms, and it allows you to include meta-data in the
data files, so the format can be self-documenting. Then you have
extension modules to interpreted languages like Perl and Python
that can also read this data, so you can have interactive,
interpreted scripts for peeking at your data, etc. In a research
environment, the benefits of transitioning to netCDF (or something
similar) cannot be overestimated, even though it might mean modifying
your data-collection programs. Another option that I use on a system that
captures data very quickly is to capture and write the binary data,
then use another program to convert it to netCDF files that will then
be used by all of the different post-processing programs on different
platforms, languages, etc.

later,
jlp



Mon, 10 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian



Quote:


> > Intel processor (little-endian).  The data is converted on the Sparc
> > side before it is shipped and as it is received.

> Why? The logical solution would be to do the conversion on the
> side that uses the "wrong" byte order. The "network" order is
> big endian, so the SPARC is right. Lookup htonl(), ntohl(), htons(),
> and ntohs().

It's not Intel's fault that the network order is wrong.  Of *course*
less-significant bytes should go into lower-numbered memory addresses.
It never ceases to amaze me how many otherwise intelligent people fail
to see this.  ;-)  <- please note the smiley.  I point it out because
there is a distinct lack of sense of humor among some people on this
NG.

--Mike Smith



Tue, 11 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian

Quote:

>It's not Intel's fault that the network order is wrong.  Of *course*
>less-significant bytes should go into lower-numbered memory addresses.
>It never ceases to amaze me how many otherwise intelligent people fail
>to see this.  ;-)  <- please note the smiley.  I point it out because
>there is a distinct lack of sense of humor among some people on this
>NG.

Not trying to get into an endianness war, but big endian seems better to me.

One reason is that I sometimes read numbers in Hex dumps, and it is much
easier in big endian form.  Or you can do what the VAX/VMS DUMP program does
and print the rows from high address to low address.  Now that seems a silly
solution.

The other reason, which is sometimes claimed as an advantage for little endian,
is that in little endian if you want to address part of a stored integer
the address is the same.  For example, if you pass an (int*) to a function
expecting a (short*) the function will get the right value in a little
endian machine but not a big endian machine.  Why do I call this an advantage
for big endian?  On a little endian machine I won't catch this mistake until
the value gets larger than the largest value for a short.  On a big endian
machine it fails all the time, and I know I have to fix it.  
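
To make the scenario concrete, here is a small sketch of what a function expecting a (short*) would see if it were handed the address of an int. It uses memcpy() rather than the mismatched pointer cast so that the example itself stays well defined; the name short_at_int_address is invented.

```c
#include <string.h>

/*
 * Return the first sizeof(short) bytes of an int, reinterpreted as a
 * short.  This is the value a callee would read through a short* that
 * actually points at the int.
 */
short short_at_int_address(int value)
{
    short s;
    memcpy(&s, &value, sizeof s);
    return s;
}
```

On a typical little-endian machine where int is wider than short, small values such as 5 come back unchanged, so the mistake stays hidden. On a typical big-endian machine the same call picks up the high-order bytes and yields 0, which exposes the bug at once.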

Then again, I used big endian machines first, and that might be part of
the reason for the preference.  But I don't believe it is the only reason.

-- glen



Tue, 11 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian

[snip]

Quote:
> The other reason, and is sometimes claimed as an advantage for little endian,
> is that in little endian if you want to address part of a stored integer
> the address is the same.  For example, if you pass an (int*) to a function
> expecting a (short*) the function will get the right value in a little
> endian machine but not a big endian machine.  Why do I call this an advantage
> for big endian?  On a little endian machine I won't catch this mistake until
> the value gets larger than the largest value for a short.  On a big endian
> machine it fails all the time, and I know I have to fix it.

Quite wrong.  Endianness is a matter of hardware design preference at the
microcode level.  Little endian, e.g. Intel, simplifies the hardware
(processor) design.  The big endian format is easier for humans to read
but requires more hardware to decode it.  To my recollection, it has to
do with putting values into shift registers and the tricks you can do
with it.

Intel made their choice back in the early days of 4 and 8 bit
processors.  But alas, the evil term of "backward compatibility" does
not allow Intel to change the byte ordering on future 80x86 chips
(including the Pentium series).

The wealthy person will be the one who invents a hardware method of
interchanging the bytes that is accessible on all platforms.

Notice, I haven't stated whether big endian is better or worse.  That is
a religious issue.

--
Thomas Matthews
StorageTek



Tue, 11 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian


: Not trying to get into an endianess war, but Big endian seems better to me.

<snip>

: Then again, I used big endian machines first, and that might be part of
: the reason for the preference.  But I don't believe it is the only reason.

Exactly.  My opinion is the exact opposite for the same reason.
I find little-endian easier to read.  Using 6502's and up and now
Intel I am used to looking for lowbyte before hi.  

I think it's all preference.  There are inherent advantages and disadvantages
to either structure.  And it's silly to argue which is better because
they are both here and not going away.

Joe

--

+-----------------------------------------------------------------------
| If dolphins are so smart, then why do they live in igloos?
+-----------------------------------------------------------------------



Tue, 11 Jul 2000 03:00:00 GMT  
 big-endian vs. little-endian



Quote:
>Quite wrong.  Endianess is a matter of hardware design preference at the
>microcode level.  Little endian, i.e. Intel, simplifies the hardware
>(processor) design.  The big endian format is easier for humans to read

That is false.

Quote:
>but requires more hardware to decode it.  To my recollection, it has to
>do with putting values into shift registers and the tricks you can do
>with it.

Nonsense. Either one is just as easy.

Quote:
>The wealthy person will be one who invents a hardware method of
>interchanging the bytes, and it is accessible on all platforms.

It's been done already. MIPS processors can be wired to be little
endian or big endian.

The only difference between a big and little endian processor is how the
load and store datapaths are set up for quantities larger than a byte.

When a big-endian processor performs a store operation on a four-byte
word, the most significant byte travels to the lowest address. When
a little endian processor does the same thing, the most significant
byte goes to the highest address. Basically, the only difference is in the
wiring of the circuit which connects the processor's internal data bus to the
external one.  At least, on a microprocessor which can only do whole word
transfers, that would be the only difference.
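
The effect of that wiring can be observed from C by looking at the bytes a stored word occupies in memory. The following is only a sketch with invented names, and it assumes the host is either plainly big-endian or plainly little-endian.

```c
#include <string.h>

/* Returns 1 if the least significant byte of an unsigned int is stored
 * at the lowest address (little-endian), 0 otherwise. */
int host_is_little_endian(void)
{
    unsigned int probe = 1u;
    return *(const unsigned char *)&probe == 1;
}

/* Copy the bytes of a stored unsigned long, lowest address first,
 * into out, which must hold sizeof(unsigned long) bytes. */
void stored_bytes(unsigned long word, unsigned char *out)
{
    memcpy(out, &word, sizeof word);
}
```

Storing 0x01020304 and inspecting the result shows the 0x04 byte at the lowest address on a little-endian host and at the highest address on a big-endian one.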



Tue, 11 Jul 2000 03:00:00 GMT  
 