Calculating the bit order on a given system
Calculating the bit order on a given system

Hallo,

I need a way to find out the bit order in memory on an arbitrary
system.

Basically I am reading some bytes from a file. I know the bit order
within these bytes (I also know the byte order, but I have a way of
dealing with that).
If the bit order is LSB first in the file, and MSB first in memory,
obviously I am going to have to reverse the bit order within each
byte, but I'm not sure how to find the bit order in memory.

If anyone can help I'd be delighted.

Mon, 07 Feb 2005 19:10:00 GMT
Calculating the bit order on a given system

Quote:
> Hallo,

> I need a way to find out the bit order in memory on an arbitrary
> system.

It's not possible since the smallest addressable unit in C is the byte. The
hardware bit-ordering is hidden from a strictly conforming program.

Quote:
> Basically I am reading some bytes from a file. I know the bit order
> within these bytes

Quote:
> (I also know the byte order, but I have a way of
> dealing with that).

??

Quote:
> If the bit order is LSB first in the file, and MSB first in memory,
> obviously I am going to have to reverse the bit order within each
> byte, but I'm not sure how to find the bit order in memory.

There isn't one. In C, you can use << and >> to shift bits "left" or
"right". I suppose that makes "right" the least significant in the sense
that 1 >> 1 is 0, and 1 << 1 is 2. But as you can see, that's
conceptual to the abstract machine of the C standard.

--
Peter

Mon, 07 Feb 2005 20:10:09 GMT
Calculating the bit order on a given system
On 22 Aug 2002 04:10:00 -0700, Daragh Byrne said:

Quote:
> I need a way to find out the bit order in memory on an arbitrary
> system.

#include <stdio.h>

/* A 16-bit bit-field; its layout in memory is implementation-defined. */
struct test_bit_order {
    unsigned int bf:16;
};

int main(void)
{
    struct test_bit_order test = {0xABCD};
    /* Look at the first byte of the object's representation. */
    unsigned char *bits = (unsigned char *)&test;

    if(bits[0] == 0xAB)
    {
        fprintf(stderr, "big endian\n");
    }
    else if(bits[0] == 0xCD)
    {
        fprintf(stderr, "little endian\n");
    }
    else
    {
        fprintf(stderr, "Funny endian\n");
    }

    return 0;
}

Quote:
> Basically I am reading some bytes from a file. I know the bit order
> within these bytes (I also know the byte order, but I have a way of
> dealing with that).
> If the bit order is LSB first in the file, and MSB first in memory,
> obviously I am going to have to reverse the bit order within each
> byte, but I'm not sure how to find the bit order in memory.

I'm not sure I understand your problem. Someone else may be able
to give you more meaningful advice without the need to write
endian-specific code. I suspect that you don't need to do so. But
like I said, I don't really follow what the problem is.

Cheers,
Dave.

--
David Neary,
E-Mail: bolsh at gimp dot org
CV: http://www.redbrick.dcu.ie/~bolsh/CV/CV.html

Mon, 07 Feb 2005 20:34:18 GMT
Calculating the bit order on a given system
On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

Quote:

>> I need a way to find out the bit order in memory on an arbitrary
>> system.

> It's not possible since the smallest addressable unit in C is the byte.

This is untrue.

Quote:
>> Basically I am reading some bytes from a file. I know the bit order
>> within these bytes

I see what the problem is now. Let's say we're talking about file
transfer from a big endian to little endian system... We'll make
it easy & stick with a 2 byte header, avoiding the rest of the
file altogether.

Let's say the header format is
x    - 1 bit, on or off, indicating whether the file is compressed
xxx  - 3 bits corresponding to an isometry of a square (say)
xxxx - 4 bits giving the horizontal size of something or other
xxxx - 4 bits for the vertical size of the something or other
xx   - 2 bits, indicating a direction.
xx   - final 2 bits, padding :)

We could have the following struct to represent it...

struct {
    unsigned int compressed:1;
    unsigned int isometry:3;
    unsigned int hsize:4;
    unsigned int vsize:4;
    unsigned int direction:2;
} file_header;

Now for reading the info from the file, given a stream fd (a
FILE *), we would do something like this...
{
    int ch;
    ch = fgetc(fd);
    if(ch == EOF)
    {
        /* Something's wrong */
    }

    /* And here's the problem - say the header is
       01001010 11000100
       and that CHAR_BIT is 8. At this point, is ch = 74 or 82? In
       other words, do we get file_header.compressed by doing ch&1
       or ch>>7? And if CHAR_BIT isn't 8, we have even bigger
       problems...
    */
}

I see now. Not sure of the proper way to handle this. Or even if
it's a problem. One of the (dis)advantages of having had to code
on relatively few OSes.
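
One way to sidestep the question is to never map a bit-field struct onto the file at all, and instead pull each field out of the byte *values* with shifts and masks. The following is only a sketch: it assumes 8-bit octets and assumes the file format packs the fields starting at the most significant bit of the first byte, which the header description above does not actually state. The names struct header and parse_header are made up for illustration.

```c
/* Hypothetical in-memory form of the 2-byte header described above. */
struct header {
    unsigned compressed; /* 1 bit  */
    unsigned isometry;   /* 3 bits */
    unsigned hsize;      /* 4 bits */
    unsigned vsize;      /* 4 bits */
    unsigned direction;  /* 2 bits */
};

/* Decode two octets, b0 being the first one read from the file.
 * ASSUMPTION: fields are packed from the most significant bit down;
 * only the file format's documentation can confirm this. */
struct header parse_header(unsigned b0, unsigned b1)
{
    struct header h;
    h.compressed = (b0 >> 7) & 0x1u;
    h.isometry   = (b0 >> 4) & 0x7u;
    h.hsize      =  b0       & 0xFu;
    h.vsize      = (b1 >> 4) & 0xFu;
    h.direction  = (b1 >> 2) & 0x3u;
    return h;            /* the low 2 bits of b1 are padding */
}
```

Because only values are involved, the result is the same on every conforming implementation; if the format turns out to pack fields from the least significant bit instead, only the shift counts change.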

Cheers,
Dave.

--
David Neary,
E-Mail: bolsh at gimp dot org
CV: http://www.redbrick.dcu.ie/~bolsh/CV/CV.html

Mon, 07 Feb 2005 21:03:46 GMT
Calculating the bit order on a given system

> On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

> >> I need a way to find out the bit order in memory on an arbitrary
> >> system.
> >
> > It's not possible since the smallest addressable unit in C is the byte.
>
> This is untrue.

It is true.

> >> Basically I am reading some bytes from a file. I know the bit order
> >> within these bytes
> >
> > Then what's your problem?
>
> I see what the problem is now. Let's say we're talking about file
> transfer from a big endian to little endian system...

Big endian vs. little endian has nothing to do with it.  And there is
nothing in the transfer of a file that would do anything about the
bit order.  If mysteriously the most significant bit and least
significant bit changed places, etc., the text "TRr" would become
"*JN".  Transfer protocols are made such that this does not happen.
The same is the case with transfers between memory and registers.  So,
there is absolutely no way to determine the bit order of a machine.

I think the OP has something like: bit 2 of the byte denotes something
and he is wondering what bit 2 is.  As I see it it can be 1<<1, 1<<2,
1<<5 or 1<<6, depending on the numbering used.  Only the documentation
can help.
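
To make the numbering ambiguity concrete, here is a small sketch of the four usual documentation conventions for "bit n" of an 8-bit byte. The function names are invented for this example, and the 8-bit width is an assumption; nothing here detects which convention a given file format uses.

```c
/* Mask for "bit n" of an 8-bit byte under four numbering conventions.
 * All names are hypothetical; the byte width of 8 is an assumption. */

/* Bits numbered from 0, bit 0 is the least significant. */
unsigned mask_lsb0(unsigned n) { return 1u << n; }

/* Bits numbered from 1, bit 1 is the least significant. */
unsigned mask_lsb1(unsigned n) { return 1u << (n - 1u); }

/* Bits numbered from 0, bit 0 is the most significant. */
unsigned mask_msb0(unsigned n) { return 1u << (7u - n); }

/* Bits numbered from 1, bit 1 is the most significant. */
unsigned mask_msb1(unsigned n) { return 1u << (8u - n); }
```

Which of these applies is a property of the file specification, not of the machine running the program.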
--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/

Mon, 07 Feb 2005 22:02:06 GMT
Calculating the bit order on a given system

Quote:

> > On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

> > >> I need a way to find out the bit order in memory on an arbitrary
> > >> system.

> > > It's not possible since the smallest addressable unit in C is the byte.

> > This is untrue.

> It is true.

> > >> Basically I am reading some bytes from a file. I know the bit order
> > >> within these bytes

> > > Then what's your problem?

> > I see what the problem is now. Let's say we're talking about file
> > transfer from a big endian to little endian system...

> Big endian vs. little endian has nothing to do with it.  And there is
> nothing in the transfer of a file that would do anything about the
> bit order.  If mysteriously the most significant bit and least
> significant bit changed places, etc., the text "TRr" would become
> "*JN".  Transfers protocols are made such that this does not happen.
> The same is the case with transfers between memory and registers.  So,
> there is absolutely no way to determine the bit order of a machine.

I can't see that there's even such a thing as bit order.  The byte order
specifies whether the LSB or MSB of a word has the lower byte address -
but there's no such thing as a bit address that is distinct from the
significance of the bit.  Looked at another way, it's meaningless to say
whether the least significant bit is on the 'left' or the 'right' of a
byte on a given architecture.

Quote:

> I think the OP has something like: bit 2 of the byte denotes something
> and he is wondering what bit 2 is.  As I see it it can be 1<<1, 1<<2,
> 1<<6 or 1<<7, depending on the numbering used.  Only the documentation
> can help.

Yes, I agree.  It's just a property of the original file specification.

- Kevin.

Mon, 07 Feb 2005 22:23:20 GMT
Calculating the bit order on a given system

Quote:
>I see now. Not sure of the proper way to handle this. Or even if
>it's a problem. One of the (dis)advantages of having had to code
>on relatively few OSes.

don't attempt to read and write structures directly.  use an intermediary
format, typically plain text makes sense.
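
As a rough illustration of the plain-text approach, the header fields discussed earlier in the thread could be written as decimal numbers on one line and parsed back. This is a sketch only: the function names and the space-separated layout are assumptions of this example, and snprintf requires C99.

```c
#include <stdio.h>

/* Write five header fields as a line of decimal text.
 * Returns the number of characters snprintf would have written. */
int header_to_text(char *buf, size_t n, const unsigned f[5])
{
    return snprintf(buf, n, "%u %u %u %u %u\n",
                    f[0], f[1], f[2], f[3], f[4]);
}

/* Parse the same line back; returns 1 on success, 0 otherwise. */
int header_from_text(const char *buf, unsigned f[5])
{
    return sscanf(buf, "%u %u %u %u %u",
                  &f[0], &f[1], &f[2], &f[3], &f[4]) == 5;
}
```

The text form costs a few extra bytes but carries no bit or byte ordering at all.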

--
bringing you boring signatures for 17 years

Mon, 07 Feb 2005 22:21:50 GMT
Calculating the bit order on a given system

Quote:
> On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

>>> I need a way to find out the bit order in memory on an arbitrary
>>> system.

>> It's not possible since the smallest addressable unit in C is the byte.

> This is untrue.

--
Chris "eqivocating hedgehog" Dollin
C FAQs at: http://www.faqs.org/faqs/by-newsgroup/comp/comp.lang.c.html
C welcome: http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html

Mon, 07 Feb 2005 22:21:35 GMT
Calculating the bit order on a given system

Quote:

> On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

> >> I need a way to find out the bit order in memory on an arbitrary
> >> system.

> > It's not possible since the smallest addressable unit in C is the byte.

> This is untrue.

I'm astonished to hear this.  Would you mind elaborating?

On the larger issue, the "order" of bits within the smallest
addressable unit (presumably a byte until we learn otherwise) is
not only unknown, but meaningless.  Thought experiment: I build
a computer whose fundamental hardware elements are four-state
flip-flops -- and store a single base-four digit in each such
gadget.  It's easy to see how I can make this computer appear to
operate in base two even though all its circuits actually work
in base four, and it's easy to see there would be no special
barriers to putting a conforming C implementation on this machine.
Now: what is the relative "order" of the two lowest-order bits
in an int?  Keep in mind that neither bit has any physical
existence independent of the other; the bit pair is encoded as
one of four states of a single gadget.

(If this sounds far-fetched, consider how modems use phase-
shift encoding to transmit several bits in one change of the
signal state; where are the individual bits in one such state
change, and what order do they appear in?)

C's operators work with values, not with representations.
Even the bit-wise operators operate on values; their operation
is described in terms of a binary notation, but that's just
the description, not an implementation stricture.  That's
easy to see: UCHAR_MAX<<1 gives the same value on both Big-
and Little-Endian systems with compatible data ranges, even
though the upmost 1-bit moves into a "leftward" byte on one
machine and a "rightward" byte on the other.  Only the value
matters, not the bits.
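
The point can be checked in a few lines. This sketch (the function name is invented here) compares shift results with ordinary arithmetic on values; it never inspects how the bits are stored, so it behaves identically whatever the byte order.

```c
#include <limits.h>

/* Returns 1 when the shift results match plain arithmetic on values. */
int shifts_act_on_values(void)
{
    unsigned max = UCHAR_MAX;          /* widen before shifting */
    return (max << 1) == 2u * max      /* left shift doubles the value */
        && (4u >> 1) == 2u             /* right shift halves it */
        && (1u >> 1) == 0u;            /* the lowest bit falls off */
}
```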

Actual computer systems define mappings between internal
and external data representations on various media, and these
mappings are often concerned with bit order.  But the "order"
is an artifact of the mapping, not of the internal arrangement.
Again, this is easy to see: the mapping is obviously different
for bits sent to a hard disk than for bits sent to a speaker;
neither the hard disk nor the speaker controls how the bits
are organized inside the machine.

     Indeed, it's quite usual for a machine to have multiple
different "bit orders" for values stored in various subsystems.
Consider a typical system with a main memory, a couple of levels of
cache memory, assorted CPU registers, and (usually) inaccessible
special-purpose gadgets within the ALU.  Where is it required
that all these different subsystems use the same "bit order"?

--

Mon, 07 Feb 2005 22:35:47 GMT
Calculating the bit order on a given system

Quote:

[snip]

>> Big endian vs. little endian has nothing to do with it.  And there is
>> nothing in the transfer of a file that would do anything about the
>> bit order.  If mysteriously the most significant bit and least
>> significant bit changed places, etc., the text "TRr" would become
>> "*JN".  Transfers protocols are made such that this does not happen.
>> The same is the case with transfers between memory and registers.  So,
>> there is absolutely no way to determine the bit order of a machine.

> I can't see that there's even such a thing as bit order.  The byte order
> specifies whether the LSB or MSB of a word has the lower byte address -
> but there's no such thing as a bit address that is distinct from the
> significance of the bit.  Looked at another way, it's meaningless to say
> whether the least significant bit is on the 'left' or the 'right' of a
> byte on a given architecture.

Hmm I think I was a little hasty.  It seems natural to define 'left' and
'right' for bits in terms of the left-shift and right-shift operators.
In which case, the answer is "in any conforming C implementation, the
least significant bits are considered the rightmost bits".  ie, (4 >> 1)
== 2 should always evaluate to true.

Mon, 07 Feb 2005 23:12:20 GMT
Calculating the bit order on a given system

Quote:

> Hallo,

> I need a way to find out the bit order in memory on an arbitrary
> system.

> Basically I am reading some bytes from a file. I know the bit order
> within these bytes (I also know the byte order, but I have a way of
> dealing with that).

You know the bit order within the bytes? Great! :-)

Quote:
> If the bit order is LSB first in the file, and MSB first in memory,
> obviously I am going to have to reverse the bit order within each
> byte, but I'm not sure how to find the bit order in memory.

Say, for the sake of argument, that you know that the bit order is LSB
first in the file.

Okay, let's start off with a test byte (in your file), which you know
has the bit pattern 10000000. (For now, we'll assume CHAR_BIT is 8,
since I don't want to add to your troubles at this stage.) Since LSB
comes first in your file, this byte has the *value* 1.

Read your test byte into an unsigned char, using unsigned char ch;

Now, if(ch == 1) then you know that memory organises bits in the same
way that your file does. Otherwise, you know it doesn't.

If you are confident that both the file and memory use 8-bit bytes, that
should be sufficient to turn the trick.
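
If the test byte does come out with its bits mirrored, each byte has to be reversed after reading. A minimal sketch of that step, assuming 8-bit units as above; reverse8 is a name chosen for this example.

```c
/* Reverse the order of the low 8 bits of v.
 * ASSUMPTION: the data consists of 8-bit octets. */
unsigned reverse8(unsigned v)
{
    unsigned r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 1) | (v & 1u);   /* append the current lowest bit of v */
        v >>= 1;                   /* move on to the next bit */
    }
    return r;
}
```

With this, a test byte that should have the value 1 but reads as 128 is corrected by reverse8(128), and applying the function twice gives back the original value.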

--

"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Mon, 07 Feb 2005 20:53:03 GMT
Calculating the bit order on a given system
On Thu, 22 Aug 2002 14:02:06 GMT, in comp.lang.c , "Dik T. Winter"

Quote:

> > On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

> > >> I need a way to find out the bit order in memory on an arbitrary
> > >> system.

> > > It's not possible since the smallest addressable unit in C is the byte.

> > This is untrue.

I guess you're thinking of bitfields?

Quote:
>It is true.

I'm not so sure. According to C99 3.5(2) "It need not be possible to
express the address of each individual bit of an object.". This
implies that it _may_ be possible.  So it's probably implementation
defined and it's safer to say that the smallest portable addressable
unit is a char.

(BTW my reading of the standard does not forbid a byte being large
enough to hold two or more uniquely addressable chars. Is that
feasible?)

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>

Tue, 08 Feb 2005 06:35:01 GMT
Calculating the bit order on a given system

Quote:

>> On Thu, 22 Aug 2002 22:10:09 +1000, Peter Nilsson said:

>>>> I need a way to find out the bit order in memory on an arbitrary
>>>> system.

>>> It's not possible since the smallest addressable unit in C is the byte.

>> This is untrue.

Read the below from C99 again....

3.5
1 bit
unit of data storage in the execution environment large enough to hold
an object that may have one of two values
2 NOTE It need not be possible to express the address of each
individual bit of an object.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>

Tue, 08 Feb 2005 06:36:17 GMT
Calculating the bit order on a given system

> On Thu, 22 Aug 2002 14:02:06 GMT, in comp.lang.c , "Dik T. Winter"
> > > > It's not possible since the smallest addressable unit in C is the byte.
> > >
> > > This is untrue.
>
> I guess you're thinking of bitfields?
>
> >It is true.
>
> I'm not so sure.

I agree. Let us adjust to: the smallest addressable unit in C that is
guaranteed by the standard is the byte.  However, when you attempt
to address a bit on a system that has addressable bits (and they do
exist), the standard requires a diagnostic.  There is a constraint
that the operand of the address operator shall not be a bit-field,
and bit-fields are the only way to get at bits.

> I'm not so sure. Accoring to C99 3.5(2) "It need not be possible to
> express the address of each individual bit of an object.". This
> implies that it _may_ be possible.  So its probably implementation
> defined and its safer to say that the smallest portable addressable
> unit is a char.
>
> (BTW my reading of the standard does not forbid a byte being large
> enough to hold two or more uniquely addressable chars. Is that
> feasible?)

6.2.6.1.  Except for bit-fields, objects are composed of contiguous
sequences of one or more bytes, ...
So, no, it is not feasible, because an object of type char is composed
of at least one byte.  There is other text that limits it to one
byte only.  Note that also an object of type _Bool is composed of
at least one byte.  Note however that after:
char c = 'abcd';
the char variable may contain all four characters, especially if a
byte is 32 bits.
--
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/

Tue, 08 Feb 2005 08:22:22 GMT
Calculating the bit order on a given system

Quote:

>I need a way to find out the bit order in memory on an arbitrary
>system.

As others noted, this question is effectively meaningless in C.

For an illustration of where *else* things go wrong, even though
you (the generic "you" :-) ) think you know what "bit order" means
on your hardware, find the original MIT 680x0 port of the PCC C
compiler and the original Sun Microsystems version of the same C
compiler.  Use the following program:

#include <stdio.h>

int main(void) {
    struct bits {
        unsigned int a:1, b:30, c:1;
    };
    union u {
        unsigned int value;
        struct bits bits;
    } u;    /* declare an object, not just the union type */

    u.value = 0;
    u.bits.a = 1;
    printf("when a=1, c=0: u.value = 0x%08x\n", u.value);
    u.bits.a = 0;
    u.bits.c = 1;
    printf("when a=0, c=1: u.value = 0x%08x\n", u.value);
    return 0;
}

Compile and run this, on the same hardware, using the two compilers.
Observe that the answer is different -- one compiler says:

when a=1, c=0: u.value = 0x80000000
when a=0, c=1: u.value = 0x00000001

The other compiler says:

when a=1, c=0: u.value = 0x00000001
when a=0, c=1: u.value = 0x80000000

So which bit order does the hardware use?  Or does the hardware
magically change when the C compiler changes?  :-)

The key to understanding endianness is that it ONLY OCCURS WHEN
YOU TAKE VALUES APART AND PUT THEM TOGETHER AGAIN LATER.  Bit- and
byte-ordering issues only arise when you take the bits and bytes
out of the box, as it were, and line them up.  Sometimes the C
system provides a way to do it automatically, e.g.:

int i = some_value();
unsigned char *p = (unsigned char *)&i;

Now p[0] through p[(sizeof i) - 1] are the various bytes -- as C
defines them (CHAR_BIT units; CHAR_BIT might still be 16 or 32
though) -- as taken apart by the C system.  As long as (sizeof i)
is at least 2, and you think of 0 as "coming before" 1 (and so on),
this defines a byte order.  This is your C system's byte order,
which you have allowed it to impose upon you; it may or may not be
the hardware's byte order, if the hardware even *has* a byte order,
or it might just be something the C compiler writer dreamed up to
make life difficult.

On the other hand, *you* can take control, and dictate the order
yourself.  Suppose, for instance, you want to tell a user that
the value of "i" is thirty-two.  First you tell him it has three
tens, then you tell him it has two more ones.  In fact, this is
a heck of a lot easier to compute if you do it in the opposite
order:

/* assuming i is non-negative */
ones = i % 10;
tens = (i / 10) % 10;
hundreds = (i / 100) % 10;
/* ... and so on */

but once you have computed it, in whatever order you like, you can
then present it to the user, in whatever (perhaps different) order
you like.

You can do the same with "bytes".  Even if your C system has 16-bit
"C bytes", you can *still* break up a value into two 8-bit units,
and display those to a user, or put them in a file, in whichever
order *you* choose:

/* still assuming i is non-negative */
low_half = i % 256;
high_half = (i / 256) % 256;

/* or equivalently, if i is unsigned (to make >> defined) */
low_half = i & 0xff;
high_half = (i >> 8) & 0xff;

Now that you have the "low" and "high" halves, you can present them
in an order *you* choose.
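
Carried through to a file buffer, the same idea might look like the sketch below. The function names are invented for this example, and it assumes the value fits in 16 bits; the point is only that the order of the two octets is fixed by the code, not by the machine.

```c
/* Store the low 16 bits of v as two octets, high half first. */
void put16_high_first(unsigned char *out, unsigned long v)
{
    out[0] = (unsigned char)((v >> 8) & 0xffu);
    out[1] = (unsigned char)(v & 0xffu);
}

/* Store the low 16 bits of v as two octets, low half first. */
void put16_low_first(unsigned char *out, unsigned long v)
{
    out[0] = (unsigned char)(v & 0xffu);
    out[1] = (unsigned char)((v >> 8) & 0xffu);
}

/* Rebuild the value from two octets written high half first. */
unsigned long get16_high_first(const unsigned char *in)
{
    return ((unsigned long)in[0] << 8) | (unsigned long)in[1];
}
```

A reader that uses get16_high_first on data written by put16_high_first recovers the value on any implementation, whatever order the C system itself uses for the bytes of an int.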

The question is not "what is the byte order", but rather, "who is
to be the master -- you, or your C-compiler vendor?"
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)

Tue, 08 Feb 2005 09:00:20 GMT
