Behaviour of char type. 

Hi,

I am writing an application that will make extensive use of 8-bit values.

Given:
unsigned char mybyte;

I have these questions:-
1) sizeof(char) is guaranteed to return 1, but the actual
implementation of a char may be any size (8 bits, 16 bits or greater). Is
this correct?

2) Given this code
    unsigned char mybyte;
    int count;
    for (count=0; count < 600; count++)
    {
        mybyte = count;
        printf("Value: %i, Modulo: %i\n", mybyte, mybyte % 256);
    }
the printed values are identical, i.e. mybyte is behaving like an 8-bit
value and wraps around to 0 without requiring a modulo operation. Why is
this?

3) Given this code (using decls above)
        mybyte = 256;
        printf("%i, %i, %i\n", mybyte+1, mybyte+258, (mybyte+258)%256);
gives: 1, 258, 2 respectively.
Why does mybyte+1 wrap around when mybyte+258 doesn't?

Can someone please give me some super-safe guidelines on using unsigned
chars to hold byte data? Thanks a lot. matthew.



Sun, 03 Apr 2005 09:34:41 GMT  
Submitted by "Matthew" to comp.lang.c:

Quote:
> Hi,

> I am writing an application that will make extensive use of 8-bit values.

> Given:
> unsigned char mybyte;

> I have these questions:-
> 1) sizeof(char) is guaranteed to return 1, but the actual
> implementation of a char may be any size (8 bits, 16 bits or greater). Is
> this correct?

sizeof(char) is always 1.  The number of bits in a char is
CHAR_BIT (in <limits.h>).

Quote:

> 2) Given this code
>     unsigned char mybyte;
>     int count;
>     for (count=0; count < 600; count++)
>     {
>         mybyte = count;
>         printf("Value: %i, Modulo: %i\n", mybyte, mybyte % 256);
>     }
> the printed values are identical. ie: mybyte is behaving like an 8 bit
> value and wraps around to 0 without requiring a modulo operation. Why is
> this?

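You need to look at UCHAR_MAX in <limits.h>.  A value assigned to an
unsigned char is reduced modulo UCHAR_MAX + 1, which is 256 when
CHAR_BIT is 8.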

Quote:
> 3) Given this code (using decls above)
>         mybyte = 256;
>         printf("%i, %i, %i\n", mybyte+1, mybyte+258, (mybyte+258)%256);
> gives: 1, 258, 2 respectively.
> Why does mybyte+1 wrap around when mybyte+258 doesn't?

Because mybyte is converted to an int before the addition takes
place.

Quote:

> Can someone please give me some super-safe guidelines on using unsigned
> chars to hold byte data? Thanks a lot. matthew.

--

-----------------------------+------ This post ends with :wq


Sun, 03 Apr 2005 09:43:54 GMT  

Quote:

> unsigned char mybyte;

> I have these questions:-
> 1) sizeof(char) is guaranteed to return 1, but the actual
> implementation of a char may be any size (8 bits, 16 bits or
> greater). Is this correct?

Yes, char may be any size as long as it has at least 8 bits.

Quote:
> 2) Given this code
>     unsigned char mybyte;
>     int count;
>     for (count=0; count < 600; count++)
>     {
>         mybyte = count;
>         printf("Value: %i, Modulo: %i\n", mybyte, mybyte % 256);
>     }
> the printed values are identical. ie: mybyte is behaving like an 8 bit
> value and wraps around to 0 without requiring a modulo operation. Why
> is this?

All unsigned types work this way in C.  Unsigned arithmetic is
performed modulo Utype_MAX + 1.

Quote:
> 3) Given this code (using decls above)
>         mybyte = 256;

This gives `mybyte' the value 0 on an 8-bit char system.

Quote:
>         printf("%i, %i, %i\n", mybyte+1, mybyte+258, (mybyte+258)%256);
> gives: 1, 258, 2 respectively.
> Why does mybyte+1 wrap around when mybyte+258 doesn't?

There is no wrap-around involved in either one.  `mybyte + 1' is
0 + 1, which has the value 1, `mybyte + 258' is 0 + 258, which is
258.

There is some subtlety here, though.  In an expression like
`mybyte + 1', implicit conversions are going on.  First, the
"integer promotions" are applied to `mybyte'.  These cause the
value of `mybyte' to be converted to `int', if `int' can
represent all the values in `unsigned char', which is typically
the case; otherwise, it is converted to `unsigned'.  Second, the
"usual arithmetic conversions" are applied to this value and the
literal 1 (which has type `int') in order to bring them to a
common type.  If the first operand was converted to type `int',
then nothing needs to be done, since both operands are then of
the same type; otherwise, the `int' operand is converted to type
`unsigned'.

You should read the standard if you're interested in these
details.

Quote:
> Can someone please give me some super-safe guidelines on using
> unsigned chars to hold byte data? Thanks a lot. matthew.

Not sure what to offer you here.  Perhaps Chris Torek will
provide a tutorial...
--
"This is a wonderful answer.
 It's off-topic, it's incorrect, and it doesn't answer the question."
--Richard Heathfield


Sun, 03 Apr 2005 09:39:04 GMT  
On Wed, 16 Oct 2002 14:34:41 +1300, Matthew

<snip>

Quote:

>Can someone please give me some super-safe guidelines on using unsigned
>chars to hold byte data? Thanks a lot. matthew.

This probably depends on what you mean by "byte". In C, a byte is the
storage required for a char (which could be 8 bits or anything
larger). You probably want to deal with octets (8 bit values).

Basically, this would be my advice:

Don't assume anything not guaranteed by the standard. In your case,
this means that you should keep in mind that unsigned char may be more
than 8 bits. Use the limits.h macros if you need to perform
calculations that depend on the size. CHAR_BIT tells you the number of
bits chars use, UCHAR_MAX tells you the largest value an unsigned char
can store. You might want to define your own:

#define BYTE_BIT (CHAR_BIT*sizeof(byte))
#define BYTE_MAX ((byte)-1)

(I know that the first one looks silly, but I did it that way for a
reason. I'll explain in a bit.)

You can mask off extra bits after a calculation like this:

unsigned char a, b, c;

...

a = (b + c) & 0xff;

This will ensure that if the system uses larger unsigned chars, the
values will be the same as if it didn't.

If you need to do some kind of file or port I/O, you may have
problems. On a system with larger chars, you probably won't be able to
read or write octets.

You might be able to test your code by replacing

typedef unsigned char byte;

with

typedef unsigned long byte;

and seeing if it still works correctly. The versions of BYTE_BIT and
BYTE_MAX I gave will auto-update for whatever type you define byte to
be (as long as it is an unsigned type). That's why I put the
apparently redundant "*sizeof(byte)" in the BYTE_BIT macro.

I'm not really sure how effective this kind of testing would be.
Several standard library functions might have trouble with it unless
you multiply by sizeof(byte) a lot:

byte byte_buff[SIZE];
memset(byte_buff, 0, SIZE*sizeof(byte));

In which case your maintenance programmer will probably think you're
nuts.

Good luck.

-Kevin



Sun, 03 Apr 2005 11:19:35 GMT  

Quote:
> I am writing an application that will make extensive use of 8-bit values.

At least in C90, the C language has no concept of fixed-width types.

Quote:
> unsigned char mybyte;

> I have these questions:-
> 1) sizeof(char) is guaranteed to return 1, but the actual

Yes.

Quote:
> implementation of a char may be any size (8 bits, 16 bits or greater). Is
> this correct?

Yes.

Quote:
> 2) Given this code
>     unsigned char mybyte;
>     int count;
>     for (count=0; count < 600; count++)
>     {
>         mybyte = count;

If you want 8-bit, you must trim the value:

         mybyte = count & 0xFF;

Quote:
>         printf("Value: %i, Modulo: %i\n", mybyte, mybyte % 256);
>     }
> the printed values are identical. ie: mybyte is behaving like an 8 bit

Because your implementation has 8-bit chars.

Quote:
> value and wraps around to 0 without requiring a modulo operation. Why is
> this?

By chance. On DSPs, where char is often wider, you may have surprises...

Quote:
> 3) Given this code (using decls above)
>         mybyte = 256;

The result depends on the implementation's UCHAR_MAX: the value is reduced
modulo UCHAR_MAX + 1. If your chars are 8-bit, you will have 0.

Quote:
>         printf("%i, %i, %i\n", mybyte+1, mybyte+258, (mybyte+258)%256);

Assuming 8-bit chars, this evaluates to
0+1 = 1 (int)
0+258 = 258 (int)
(0+258)%256 = 258 % 256 = 2 (int)

Quote:
> gives: 1, 258, 2 respectively.

Correct on an 8-bit char.

Quote:
> Why does mybyte+1 wrap around when mybyte+258 doesn't?

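Neither expression wraps. mybyte already holds 0, and in both sums the
char operand is promoted to int before the addition. (printf() being
variadic, a bare char argument would be promoted to int as well.)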

--
-ed- emdel at noos.fr ~]=[o
FAQ de f.c.l.c : http://www.isty-info.uvsq.fr/~rumeau/fclc/
C-library: http://www.dinkumware.com/htm_cl/index.html
"Mal nommer les choses c'est ajouter du malheur au monde."
-- Albert Camus.



Sun, 03 Apr 2005 13:59:38 GMT  
 