Beware of Type Casting in C - How to do it Right

Introduction

In C programming, type casting refers to converting a value from one data type to another. There are two kinds of type casting:

implicit casting: When an expression involves multiple data types, the compiler automatically performs the appropriate conversion for you (automatic).
explicit casting: You, as the programmer, intentionally convert a value from one data type to another with a cast operator (not automatic).

Type casting should be done carefully because it can result in loss of precision or even change the value entirely. This is especially easy to miss if you do not understand how the different data types are represented in memory. This article focuses on explicit type casting.

Explicit Type Casting Example

To explain the principle, consider the example below, which uses printf() to print variables cast to different data types so that we can observe the effect of each cast.

#include <stdio.h>

// Assumes a typical platform: 32-bit int, 16-bit short, 8-bit char.
void castingExample(void)
{
	int myint = 80000;
	int myint2 = 5000;
	char mychar = -1; // stored as 11111111 in 2's complement

	printf("myint cast to short = %d\n", (short) myint);   // value does not fit in a short
	printf("myint2 cast to short = %d\n", (short) myint2); // value fits in a short
	printf("myint cast to unsigned int = %u\n", (unsigned int) myint);
	printf("myint cast to unsigned long = %lu\n", (unsigned long) myint);
	printf("mychar cast to unsigned char = %d\n", (unsigned char) mychar);
}

int main(void)
{
	castingExample();
	return 0;
}

The example above produces the output below on a typical platform with a 4-byte int and a 2-byte short. As you can see, some of the casts do change the value. Let’s take a closer look.

myint cast to short = 14464
myint2 cast to short = 5000
myint cast to unsigned int = 80000
myint cast to unsigned long = 80000
mychar cast to unsigned char = 255

Loss of Precision Case

myint is an int with the value 80000, but it becomes 14464 after being cast to short. Why?

80000 represented in binary is:

byte3    byte2    byte1    byte0
00000000 00000001 00111000 10000000

As you can see, it takes at least 3 bytes to represent the number 80000. If we cast it to short, which is 2 bytes, we are effectively discarding byte3 and byte2 from the binary representation above, and the bit that was set in byte2 is lost. So it becomes:

byte1    byte0
00111000 10000000

This binary value (00111000 10000000) is 14464 in decimal, which is what printf() prints.

No Loss of Precision Case

myint2 is an int with the value 5000, and it is still 5000 after being cast to short. 5000 is small enough to be represented in 2 bytes, so the discarded bytes contain only zeros and nothing is lost.

From this representation:

byte3    byte2    byte1    byte0
00000000 00000000 00010011 10001000

To this one below. Both represent 5000 in decimal.

byte1    byte0
00010011 10001000

How Negative Numbers Are Represented

In the same example, mychar is initialized to -1 but shows 255 once it is cast to unsigned char. What is happening here? It comes down to how negative numbers are represented in binary, using a scheme called 2's complement.

To represent -1, we first look at the value without the negative sign in binary:

byte0
00000001

Complement it (change 0 to 1 and 1 to 0)

byte0
11111110

Then we add 1 to it:

11111111

Binary 11111111, read as an unsigned 8-bit value, equals 255. The cast to unsigned char keeps these bits and interprets them as unsigned, which explains why mychar with the value -1 is printed as 255.

So, How do you Cast?

If you have to cast in your C application, be aware of the sizes involved when casting from a larger data type to a smaller one, to avoid data loss. This normally requires being on top of your application logic, for example knowing the maximum value a variable can hold in your application. That way, you know whether data loss can occur after the cast and can adjust your logic accordingly.

Generally speaking, when you are working with small non-negative values, casting from unsigned to signed, from 4 bytes to 2 bytes, and so on is fine, because the value still fits in the destination type. It is when the values of your variables are large or negative that you need to be careful.