C portability [slightly OT] 
[clc - there's a bunch of source code at the bottom of this article
which was originally posted to sci.crypt in Oct 2000, and which AFAIK
hasn't yet been peer-reviewed by anyone with a brain (unless Doug Gwyn
took a look, of course, which isn't impossible). So if anyone wants to
rip into it, please feel free.]

Quote:

<snip>

> Ok, let's put this in context.

> I want to put the values in consecutive 8-bit units of memory to send over
> [say] TCP/IP.

No, you don't. You just think you do.

Quote:

> How do I do that in portable ISO C?

You can't. There's no ISO C defined interface to TCP/IP. BUT... what you
really mean is, how can you jam those bits together, right? Okay, here's
one way coming right up.

Do you remember that fight I had with David Scott in s.c about a year
ago wrt the code quality of scott19u? (Yes, this is relevant.)

Here's a link that'll drop you right into the middle of it:

http://groups.google.com/groups?q=g:thl1115158400d&selm=39F9F783.1C55...

(You might have to mend it for line wrap.)

As you might recall, Mr Scott jumped through some rather ghastly
gcc-shaped hoops in order to get 19-bit integers, and claimed this could
not be done portably. (About now, I need to say "tee-hee"...) So I
decided to show that it could be.

Here is the source code I presented during that debate (except that I've
mended the line-wrap on the #defines). I think you'll find it's a direct
answer to your question. Admittedly, the test-driver at the bottom uses
19 a lot (guess why), but you should be able to plug 8 straight into it
without difficulty.

Now, just before we get to the code - a quick request. Could we please
conduct this discussion in just *one* newsgroup? Preferably this one,
since you're vastly more on-topic here than in sci.crypt. Thanks.

/* Pretend you can see an LGPL licence here. This code, written by
Richard Heathfield in October 2000, provides routines for packing a
bunch of smaller-than-the-maximum-size ints into a bit array, and for
ripping them out again. */

#include <stdio.h>
#include <limits.h>

#define SET_BIT(a, n) (a)[(n) / CHAR_BIT] |= \
                    (unsigned char)(1U << ((n) % CHAR_BIT))
#define CLEAR_BIT(a, n) (a)[(n) / CHAR_BIT] &= \
                   (unsigned char)(~(1U << ((n) % CHAR_BIT)))
#define TEST_BIT(a, n) (((a)[(n) / CHAR_BIT] & \
                   (unsigned char)(1U << ((n) % CHAR_BIT))) ? 1 : 0)

/* Debugging function, used for printing len * CHAR_BIT
 * bits from s.
 */
int print_bits(unsigned char *s, int len)
{
  int i, j;
  for(i = 0; i < len; i++)
  {
    for(j = 0; j < CHAR_BIT; j++)
    {
      printf("%d", TEST_BIT(s, i * CHAR_BIT + j) ? 1 : 0);
    }
    printf(" ");
  }
  printf("\n");
  return 0;
}

unsigned int BitsInUnsignedInt(void)
{
  static unsigned int answer = 0;
  unsigned int testval = UINT_MAX;
  if(answer == 0)
  {
    while(testval > 0)
    {
      ++answer;
      testval >>= 1;
    }
  }

  return answer;
}

/* This function gets the Indexth n-bit unsigned int field from the bit
 * array. To do this, it builds the unsigned int value bit by bit.
 *
 * Example call:
 *
 * unsigned int val;
 * val = Get_nBit_Int(MyBitArray,   this is the base address
 *                    19,           get a 19-bit number
 *                    13,           get the 14th number (0 to max - 1)
 *                     7);          skip 7 leading bits at the start of
 *                                  the array
 *
 */
unsigned int Get_nBit_Int(unsigned char *BitArray,
                          unsigned int n,
                          unsigned int Index,
                          unsigned int BaseBit)
{
  unsigned int Value = 0;
  unsigned int j;
  unsigned int i = Index * n;

  if(n <= BitsInUnsignedInt())
  {
    i += BaseBit;
    BitArray += i / CHAR_BIT;
    i %= CHAR_BIT;

    for(j = 0; j < n; j++)
    {
      /* Move the populated bits out of the way.
       * Yes, this means that the first iteration
       * of the loop does a useless shift. I think
       * I can live with that. :-)
       */
      Value <<= 1;

      /* Populate the low bit */
      Value |= TEST_BIT(BitArray, i + j);
    }
  }

  return Value;
}

void Put_nBit_Int(unsigned char *BitArray,
                  unsigned int n,
                  unsigned int Index,
                  unsigned int BaseBit,
                  unsigned int Value)
{
  unsigned int j;
  unsigned int i = Index * n;

  /* Same limit as Get_nBit_Int: a field cannot be wider than Value. */
  if(n <= BitsInUnsignedInt())
  {
    i += BaseBit;

    BitArray += i / CHAR_BIT;
    i %= CHAR_BIT;

    j = n;
    while(j--)
    {
      /* Use the rightmost bit */
      if(Value & 1)
      {
        SET_BIT(BitArray, i + j);
      }
      else
      {
        CLEAR_BIT(BitArray, i + j);
      }
      /* Throw the rightmost bit away, moving the next bit into
       * position. On the last iteration of the loop, this
       * instruction is pointless. <shrug>
       */
      Value >>= 1;
    }
  }
}

/* Driver requires at least 19-bit ints. :-) */

int main(void)
{
  unsigned char test_array[9] = {0};

  print_bits(test_array, 9);
  printf("Storing the 19-bit value 0x7FFFF starting at bit 3.\n");
  Put_nBit_Int(test_array, 19, 0, 3, 0x7FFFF);
  print_bits(test_array, 9);
  printf("Retrieving the 19-bit value starting at bit 3: %X\n",
          Get_nBit_Int(test_array, 19, 0, 3));
  printf("Storing the 19-bit value 0x7EDCB "
         "starting at bit 3 + (2 * 19).\n");
  Put_nBit_Int(test_array, 19, 2, 3, 0x7EDCB);
  print_bits(test_array, 9);
  printf("Retrieving the 19-bit value "
         "starting at bit 3 + (2 * 19):%X\n",
         Get_nBit_Int(test_array, 19, 2, 3));

  return 0;
}
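
For the 8-bit case that was actually asked about, here is a minimal sketch of what "plugging 8 in" might look like. It is self-contained rather than reusing the routines above, and the names put_octet, get_octet, pack_u32 and unpack_u32 are made up for illustration; they are not part of the code posted in October 2000. It keeps the same convention as that code: the most significant bit of each field lands at the lowest bit index.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: store the low 8 bits of value as the index-th
 * 8-bit field of a bit array. The array must hold at least
 * (index + 1) * 8 bits.
 */
void put_octet(unsigned char *bits, size_t index, unsigned int value)
{
  size_t base = index * 8;
  size_t j;

  assert(bits != NULL);
  for(j = 8; j-- > 0; )
  {
    size_t pos = base + j;
    unsigned char mask = (unsigned char)(1U << (pos % CHAR_BIT));

    /* Use the rightmost bit of value, then discard it. */
    if(value & 1)
      bits[pos / CHAR_BIT] |= mask;
    else
      bits[pos / CHAR_BIT] &= (unsigned char)~mask;
    value >>= 1;
  }
}

/* Hypothetical helper: fetch the index-th 8-bit field, building the
 * result bit by bit.
 */
unsigned int get_octet(const unsigned char *bits, size_t index)
{
  size_t base = index * 8;
  size_t j;
  unsigned int value = 0;

  assert(bits != NULL);
  for(j = 0; j < 8; j++)
  {
    size_t pos = base + j;
    value <<= 1;
    value |= (bits[pos / CHAR_BIT] >> (pos % CHAR_BIT)) & 1U;
  }
  return value;
}

/* Split a 32-bit quantity into four consecutive 8-bit fields,
 * most significant field first.
 */
void pack_u32(unsigned char *bits, unsigned long v)
{
  size_t k;
  for(k = 0; k < 4; k++)
    put_octet(bits, k, (unsigned int)((v >> (8 * (3 - k))) & 0xFF));
}

/* Reassemble the 32-bit quantity from the four fields. */
unsigned long unpack_u32(const unsigned char *bits)
{
  unsigned long v = 0;
  size_t k;
  for(k = 0; k < 4; k++)
    v = (v << 8) | get_octet(bits, k);
  return v;
}
```

Nothing here depends on CHAR_BIT being 8; the fields are located purely by bit index. The flip side is that the layout is only meaningful to code that uses the same bit numbering, so both ends need to agree on these routines (or an equivalent).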

HTH. HAND.

--

"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton



Mon, 28 Jun 2004 04:02:22 GMT  
 C portability [slightly OT]



Quote:



> > > You're telling me there is no portable way for a machine with "char != 8
> > > bits" to talk to a HTTP [for example] server?

> > There's no portable way for /any/ program written in /any/ programming
> > language to talk to an HTTP server.

> I'm not sure what criteria you're applying for portability here.

Works everywhere. Even on my MS-DOS box.

Quote:
> I would consider CGI a portable means for a C program to talk to
> an HTTP server, because it works without using any language
> extensions. Also, I think you're overgeneralizing: I'm sure
> there are languages that have built-in means to talk to HTTP
> servers, so programs written in those languages for talking to
> HTTP servers are as portable as the languages themselves.

<shrug> Find such a language, write a program in it to talk to an HTTP
server, and run it on my MS-DOS box (which has no NIC, no modem, and no
HTTP server running on it). I'll be *really* impressed. :-)




Mon, 28 Jun 2004 04:12:36 GMT  
 C portability [slightly OT]

Quote:

<snip>

> So to get respect from a group I should code my stuff for every imaginable
> platform even ones where the library doesn't apply or not likely to be used
> in the first place?

Not at all. Rather, you must make a decision about what platforms you
intend to support. Having made that decision, publish it. In other
words, don't say "This is 100% ISO C" when it isn't. Don't say "this
runs on anything" when you mean "this runs on Windows, and probably
Linux too".

There's nothing terribly shameful about writing non-portable code. I
suspect most of the comp.lang.c regulars write quite a bit of n-p stuff.
The trick is to know what is portable and what is not, and to know how
important portability is (or is not) to your project.

(Of course, in comp.lang.c, we discuss portable code, so if you want
help with non-portable constructs you'll need to seek a
platform-specific newsgroup.)

Quote:
> This also horribly slows down all the code.

IME portability can indeed have a performance cost, but it's rarely a
/huge/ cost. And the benefit is that, when a super-duper new platform
comes along that can really zing your code along, you can move the code
over to it with minimum (or even zero) hassle. :-)

Quote:
> I have to do "& 255" whenever I touch a byte [primary example], to me
> anyone using a platform that cannot work on a byte unit [and requires
> byte unit arithmetic] is cheating themselves.

All C implementations can do arithmetic on bytes, so I don't see your
point here.




Mon, 28 Jun 2004 04:21:37 GMT  
 C portability [slightly OT]

[...]

Quote:
> So to get respect from a group I should code my stuff for every imaginable
> platform even ones where the library doesn't apply or not likely to be used
> in the first place?  I have to do "& 255" whenever I touch a byte [primary
> example],

Are you talking about practical portability (in which case you are most
likely not limited to ISO C, but to a more powerful standard like POSIX),
or impressing comp.lang.c?  They're two different things.  :)

Quote:
> This also horribly slows down all the code.

Have you actually done so (& 255) and observed a difference?

Also, it's not at all difficult for a compiler to optimize away any
& or % operation that has no effect, because all it has to do is
compare the operand (255 for &, or 256 for %) with the range of the
type.  It's similar to the very common optimization of replacing
division by a power of two with a shift.

More importantly to me, your code would be visually misleading.  I
imagine the most common application of your technique would be something
like this:

  unsigned long a;
  unsigned char buffer[...];

  buffer[i] = a;
  buffer[i+1] = a >> 8;
  buffer[i+2] = a >> 16;
  buffer[i+3] = a >> 24;

which takes advantage of the assignments naturally chopping off the
unnecessary bits on the "left" side on 8-bit byte platforms.  This
code is not as clear to me as:

  buffer[i] = a & 0xFF;
  buffer[i+1] = (a >> 8) & 0xFF;
  buffer[i+2] = (a >> 16) & 0xFF;
  buffer[i+3] = (a >> 24) & 0xFF;

which is idiomatic - you are extracting bits.  Given that CPU
performance progress continues to outpace I/O performance progress,
does this really slow down your server to a degree that deserves the
adjective "horrible"?
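
To make the masked version concrete, here is a minimal sketch of the idiom wrapped in a pair of helper functions. The names put_u32_le and get_u32_le are hypothetical (nothing in this thread defines them), and the least-significant-octet-first order simply follows the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store the low 32 bits of a into four chars,
 * one octet per char, least significant octet first.
 */
void put_u32_le(unsigned char *buffer, unsigned long a)
{
  assert(buffer != NULL);
  buffer[0] = (unsigned char)(a & 0xFF);
  buffer[1] = (unsigned char)((a >> 8) & 0xFF);
  buffer[2] = (unsigned char)((a >> 16) & 0xFF);
  buffer[3] = (unsigned char)((a >> 24) & 0xFF);
}

/* Hypothetical helper: the inverse. The masks on the way in guard
 * against stray high bits when chars are wider than 8 bits.
 */
unsigned long get_u32_le(const unsigned char *buffer)
{
  assert(buffer != NULL);
  return  (unsigned long)(buffer[0] & 0xFF)
       | ((unsigned long)(buffer[1] & 0xFF) << 8)
       | ((unsigned long)(buffer[2] & 0xFF) << 16)
       | ((unsigned long)(buffer[3] & 0xFF) << 24);
}
```

Because the octets are produced by shifting and masking the value, the result does not depend on the host's byte order or on CHAR_BIT being exactly 8.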

Quote:
> to me anyone using a platform that
> cannot work on a byte unit [and requires byte unit arithmetic] is
> cheating themselves.

If the {*filter*} application of your software system does not require
byte-oriented processing (DSP, for instance), then it's reasonable not
to add a traditional CPU just to do a little work.

Remember that a lot of non-technical considerations and historical
quirks caused the great similarities in computer architecture you see
today.  If the economics change in the future, it won't be surprising
to see more 24-bit CPUs dedicated to video processing, for example.
The one-billion potential computer users in China all need more than
8 bits for one Chinese character, so if China and other countries
become richer, it won't be surprising that it becomes too expensive to
separately design CPUs with 8-bit bytes for Western use.

I'm not saying you should write all your code assuming China will
become rich.  :)  I'm saying that you could:

  assert(CHAR_BIT == 8);

when you write this kind of code, especially in comp.lang.c.
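
As a rough sketch of how that assumption can be stated in the source itself: the assert() checks at run time, and a preprocessor test can reject the build earlier. The function name require_octet_chars is invented for this example.

```c
#include <assert.h>
#include <limits.h>

/* Compile-time statement of the assumption: refuse to build at all
 * on an implementation where char is not exactly 8 bits wide.
 */
#if CHAR_BIT != 8
#error "This code assumes CHAR_BIT == 8"
#endif

/* Run-time statement of the same assumption (hypothetical name).
 * Note that assert() compiles to nothing when NDEBUG is defined.
 */
void require_octet_chars(void)
{
  assert(CHAR_BIT == 8);
}
```

Either form documents the limitation where a maintainer will see it, which is the point being made above.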



Mon, 28 Jun 2004 07:17:20 GMT  
 C portability [slightly OT]

[...]

Quote:
> You know what... {*filter*} it.  Big first page in my manual will say "char == 8
> bits".

Don't get too frustrated.  Different people judge code by different
standards.  I think you should expect comp.lang.c to frown upon any
code that assumes beyond what C guarantees, evaluate that objection
based on how much effort it takes to comply and how much benefit you
derive, and proceed from there.  If you go to comp.arch.9{*filter*}ar,
wouldn't you expect some heat?  :)

Quote:
> I really can't stand this shit.  "char == 9 bits" has NO practical value in
> a modern widely deployed computer.

I understand your frustration, but if this is true, many DSP-based
systems are either not practical, not modern, or not widely-deployed,
assuming that you really mean the more general "char != 8 bits" rather
than strictly "char == 9 bits".

Quote:
> You guys point out esoteric embedded systems or mainframes, but 99% of the
> people who will use my code are on home PCs [i.e. x86 systems, or 68k
> systems].  So if someone on a PIC [etc] microcontroller can't use my crypto
> lib [which wouldn't fit in the memory of a typical PIC anyways] then tough.

Please note:

  1.  The 1% of your users might ship 10,000,000 units of their
      embedded systems, and supply a much larger share of your
      royalties.  (A similar point goes even for free software.)

  2.  A PIC is unlikely to support hosted ISO C in any practical
      way, but there's much more to embedded systems than just
      microcontrollers.

  3.  I can see plenty of uses for a crypto library in embedded
      systems.



Mon, 28 Jun 2004 07:36:22 GMT  
 C portability [slightly OT]



Quote:

> > > Ok, let's put this in context.

> > > I want to put the values in consecutive 8-bit units of memory
> > > to send over [say] TCP/IP.

> > Why do they need to be in consecutive 8-bit units of memory?

> > Many platforms do not have consecutive 8-bit units of memory.

> Such as?

> I consider most platforms x86, 68k or Alpha derivatives.

> > I think you just want them in consecutive bytes.

> > > How do I do that in portable ISO C?

> > unsigned long tomsbits = 0x55443322;

> > unsigned char consecutive[4];

> > consecutive[0] = (tomsbits & 0xFF000000) >> 24;
> > consecutive[1] = (tomsbits & 0xFF0000  ) >> 16;
> > consecutive[2] = (tomsbits & 0xFF00    ) >>  8;
> > consecutive[3] =  tomsbits & 0xFF;

> > Then send consecutive to whatever function expects four bytes of memory
> > to send over TCP/IP.

> Arrg...

> You know what... {*filter*} it.  Big first page in my manual will say "char == 8
> bits".

> I really can't stand this shit.  "char == 9 bits" has NO practical value
> in a modern widely deployed computer.

> You guys point out esoteric embedded systems or mainframes, but 99% of
> the people who will use my code are on home PCs [i.e. x86 systems, or 68k
> systems].

But only a tiny fraction of the users of programs written in C
use your code.

Quote:
> So if someone on a PIC [etc] microcontroller can't use my crypto
> lib [which wouldn't fit in the memory of a typical PIC anyways] then tough.

I don't see any problem with you focusing on a particular
range of platforms only; this is often a 'real world' issue.
It means you'd be sacrificing portability, but for many that's
not really an issue.  You'll have to decide for yourself.

-Mike



Mon, 28 Jun 2004 09:20:26 GMT  
 C portability [slightly OT]


Quote:
>... This also horribly slows down all the code.  I have to do "& 255"
>whenever I touch a byte [primary example] ...

This should, from any reasonably competent compiler used with
optimization enabled, have no effect on the generated code.

Consider, for instance, gcc's output:

    extern void use(int);

    /* in f0(), mask out unwanted high bits if any, just in case
       CHAR_BIT > 8 */
    void f0(unsigned char *p) {
        use(p[0] & 255);
        use(p[1] & 255);
    }

    /* in f1, just assume CHAR_BIT == 8, or that if CHAR_BIT > 8, no
       unwanted high bits are ever set */
    void f1(unsigned char *p) {
        use(p[0]);
        use(p[1]);
    }

I compiled this with "cc -O2 -mregparm=3 -S" and got this code (plus
of course the usual scaffolding bits, which I deleted):

f0:
        pushl %ebp
        movl %esp,%ebp
        subl $20,%esp
        pushl %ebx
        movl %eax,%ebx
        movzbl (%ebx),%eax
        call use
        movzbl 1(%ebx),%eax
        call use
        popl %ebx
        leave
        ret

f1:
        pushl %ebp
        movl %esp,%ebp
        subl $20,%esp
        pushl %ebx
        movl %eax,%ebx
        movzbl (%ebx),%eax
        call use
        movzbl 1(%ebx),%eax
        call use
        popl %ebx
        leave
        ret

Note that the generated code is absolutely identical in the two
functions.  The "& 255" is redundant because CHAR_BIT is 8 and thus
UCHAR_MAX is 255; so the compiler omitted it.  (This only tests
the "load" case, but I tried the "store" case too and gcc omits
unnecessary masks there as well.)

Note that you can, if you like, test UCHAR_MAX explicitly
as well:

    /* octet() converts an unsigned char to an "octet", i.e., 8 bits */
    #if UCHAR_MAX == 255 /* we know UCHAR_MAX >= 255 */
    #define octet(x) (x) /* if it is exactly 255, it is already 8 bits */
    #else                /* otherwise it must be more bits, so mask */
    #define octet(x) ((x) & 255)
    #endif

Now you can use octet() and be sure that even poorly-optimizing
compilers (such as the old VAX PCC) on typical 8-bit machines will
generate no unnecessary masks.
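
For illustration, here is one way the macro might be used. This is only a sketch: octet_sum is a made-up function, and the 16-bit additive checksum is just a convenient example of code that reads octets from a buffer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* octet() as defined above: a no-op when chars are exactly 8 bits. */
#if UCHAR_MAX == 255
#define octet(x) (x)
#else
#define octet(x) ((x) & 255)
#endif

/* Hypothetical example: add up the low 8 bits of each element,
 * keeping the running total within 16 bits.
 */
unsigned int octet_sum(const unsigned char *p, size_t len)
{
  unsigned int sum = 0;
  size_t i;

  assert(p != NULL || len == 0);
  for(i = 0; i < len; i++)
    sum = (sum + octet(p[i])) & 0xFFFFU;
  return sum;
}
```

On an implementation with UCHAR_MAX == 255 the macro expands to its argument, so the loop body contains no mask on the load at all, whatever the optimizer does.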
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)





Mon, 28 Jun 2004 09:38:30 GMT  
 C portability [slightly OT]
On Wednesday, in article


...

Quote:
>> Come on, Tom - it's not /that/ hard to write portable code. :-)

>It is if you require a conforming compiler.

Having a conforming compiler makes writing portable code easier.

Quote:
>Also I make silly assumptions like

>char == 8-bits

OK.

Quote:
>and bits loaded in order 01234567.

This makes no sense. In C, individual bits don't have addresses, so
talking about bit ordering in a char is meaningless. There is no
way to make C code dependent on this.
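
A small sketch of the point: shifts and masks are defined on the value of a char, so "bit k" can only mean the bit with value 2 to the power k, however the hardware stores it. The function name value_bit is invented for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: return bit k of c, where bit 0 is the bit
 * with value 1 and bit CHAR_BIT - 1 is the most significant bit.
 * The result is defined by arithmetic on the value alone.
 */
int value_bit(unsigned char c, unsigned int k)
{
  assert(k < CHAR_BIT);
  return (int)((c >> k) & 1U);
}
```

For instance, value_bit(0x80, 7) is 1 on every conforming implementation; there is no operation in the language that could reveal a different physical ordering.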




Mon, 28 Jun 2004 09:55:47 GMT  
 C portability [slightly OT]
On Wednesday, in article


...

Quote:
>So to get respect from a group I should code my stuff for every imaginable
>platform even ones where the library doesn't apply or not likely to be used
>in the first place?

It isn't a matter of respect; you can write non-portable code if you like,
as long as you are clear about it. However, you do make the problem sound
much worse than it is. You don't have to consider each and every imaginable
platform; you simply use the facilities and behaviour that are guaranteed
by the standard.

Quote:
>This also horribly slows down all the code.

I have to disagree with that.

Quote:
> I have to do "& 255" whenever I touch a byte [primary example], to me
> anyone using a platform that cannot work on a byte unit [and requires
> byte unit arithmetic] is cheating themselves.

Consider

extern unsigned char byte;

void foo(unsigned value)
{
    byte = value;
}

void bar(unsigned value)
{
    byte = value & 255;
}

MSVC6 compiles this to (unrelated bits trimmed):

_foo    PROC NEAR                                       ; COMDAT

; 5    :     byte = value;

  00000 8a 44 24 04      mov     al, BYTE PTR _value$[esp-4]
  00004 a2 00 00 00 00   mov     BYTE PTR _byte, al

; 6    : }

  00009 c3               ret     0
_foo    ENDP

_bar    PROC NEAR                                       ; COMDAT

; 10   :     byte = value & 255;

  00000 8a 44 24 04      mov     al, BYTE PTR _value$[esp-4]
  00004 a2 00 00 00 00   mov     BYTE PTR _byte, al

; 11   : }

  00009 c3               ret     0

I.e. it generates the same code for foo() and bar(). The & 255 operation
has absolutely no overhead here; eliminating it is a trivial optimisation
for compilers targeting 8-bit character types.




Mon, 28 Jun 2004 09:59:42 GMT  
 C portability [slightly OT]

Quote:




> > > > You're telling me there is no portable way for a machine with "char != 8
> > > > bits" to talk to a HTTP [for example] server?

> > > There's no portable way for /any/ program written in /any/ programming
> > > language to talk to an HTTP server.

> > I'm not sure what criteria you're applying for portability here.

> Works everywhere. Even on my MS-DOS box.

> > I would consider CGI a portable means for a C program to talk to
> > an HTTP server, because it works without using any language
> > extensions. Also, I think you're overgeneralizing: I'm sure
> > there are languages that have built-in means to talk to HTTP
> > servers, so programs written in those languages for talking to
> > HTTP servers are as portable as the languages themselves.

> <shrug> Find such a language, write a program in it to talk to an HTTP
> server, and run it on my MS-DOS box (which has no NIC, no modem, and no
> HTTP server running on it). I'll be *really* impressed. :-)

"Portable" doesn't imply "universal".  If I write a program which
works on the system on which I wrote it, and it would work exactly the
same on any other type of system, it's portable.  Just not very.

Micah



Mon, 28 Jun 2004 15:35:20 GMT  
 