# questions - bitwise xor c++

## Bitwise '&' with signed vs unsigned operand (5)

I faced an interesting scenario in which I got different results depending on the right operand type, and I can't really understand the reason for it.

Here is the minimal code:

```
#include <iostream>
#include <cstdint>
int main()
{
uint16_t check = 0x8123U;
uint64_t new_check = (check & 0xFFFF) << 16;
std::cout << std::hex << new_check << std::endl;
new_check = (check & 0xFFFFU) << 16;
std::cout << std::hex << new_check << std::endl;
return 0;
}
```

I compiled this code with g++ (gcc version 4.5.2) on 64-bit Linux: *g++ -std=c++0x -Wall example.cpp -o example*

The output was:

ffffffff81230000

81230000

I can't really understand the reason for the output in the first case.

Why at some point would any of the intermediate results be promoted to a **signed 64-bit** value (`int64_t`), resulting in the sign extension?

I would accept a result of '0' in both cases if a 16-bit value is shifted 16 bits left in the first place and then promoted to a 64-bit value. I also accept the second output if the compiler first promotes `check` to `uint64_t` and then performs the other operations.

But how come `&` with `0xFFFF` (`int32_t`) vs. `0xFFFFU` (`uint32_t`) would result in those two different outputs?

`0xFFFF` is a signed int. So after the `&` operation, we have a 32-bit signed value:

```
#include <stdint.h>
#include <type_traits>

uint64_t foo(uint16_t a) {
    auto x = (a & 0xFFFF);
    static_assert(std::is_same<int32_t, decltype(x)>::value, "not an int32_t");
    static_assert(!std::is_same<uint16_t, decltype(x)>::value, "still a uint16_t");
    return x;
}
```

Your original 16 bits are then left-shifted, which results in a 32-bit value with the high bit set (0x80000000U), so it has a negative value. During the conversion to 64 bits, sign extension occurs, populating the upper 32 bits with 1s.

Let's take a look at

```
uint64_t new_check = (check & 0xFFFF) << 16;
```

Here, `0xFFFF` is a signed constant, so `(check & 0xFFFF)` gives us a signed integer by the rules of integer promotion.

In your case, with a 32-bit `int` type, the MSbit of this integer after the left shift is 1, so the conversion to 64-bit unsigned performs sign extension, filling the bits to the left with 1s. Interpreted as a two's complement representation, that gives the same negative value.

In the second case, `0xFFFFU` is unsigned, so we get unsigned integers and the left-shift operator works as expected.

If your toolchain supports `__PRETTY_FUNCTION__`, a most handy feature, you can quickly determine how the compiler perceives expression types:

```
#include <iostream>
#include <cstdint>

template<typename T>
void typecheck(T const& t)
{
    std::cout << __PRETTY_FUNCTION__ << '\n';
    std::cout << t << '\n';
}

int main()
{
    uint16_t check = 0x8123U;
    typecheck(0xFFFF);
    typecheck(check & 0xFFFF);
    typecheck((check & 0xFFFF) << 16);
    typecheck(0xFFFFU);
    typecheck(check & 0xFFFFU);
    typecheck((check & 0xFFFFU) << 16);
    return 0;
}
```

### Output

```
void typecheck(const T &) [T = int]
65535
void typecheck(const T &) [T = int]
33059
void typecheck(const T &) [T = int]
-2128412672
void typecheck(const T &) [T = unsigned int]
65535
void typecheck(const T &) [T = unsigned int]
33059
void typecheck(const T &) [T = unsigned int]
2166554624
```

The `&` operation has two operands. The first is an `unsigned short`, which undergoes the usual promotions to become an `int`. The second is a constant, in one case of type `int`, in the other case of type `unsigned int`. The result of the `&` is therefore `int` in one case, `unsigned int` in the other. That value is shifted to the left, resulting either in an `int` with the sign bit set, or an `unsigned int`. Converting a negative `int` to `uint64_t` sign-extends, giving a huge unsigned value.

Of course you should always follow the rule: If you do something, and you don't understand the result, then don't do that!

The first thing to realize is that binary operators like `a & b` for built-in types only work when both sides have the same type. (With user-defined types and overloads, anything goes.) This is achieved via implicit conversions.

Now, in your case, there definitely is such a conversion, because there simply isn't a binary operator `&` that takes a type smaller than `int`. Both sides are converted to at least `int` size, but to what exact types?

As it happens, on your GCC `int` is indeed 32 bits. This is important, because it means that all values of `uint16_t` can be represented as an `int`. There is no overflow.

Hence, `check & 0xFFFF` is a simple case. The right side is already an `int`, the left side promotes to `int`, so the result is `int(0x8123)`. This is perfectly fine.

Now, the next operation is `0x8123 << 16`. Remember, on your system `int` is 32 bits, and `INT_MAX` is `0x7FFF'FFFF`. In the absence of overflow, `0x8123 << 16` would be `0x81230000`, but that is clearly bigger than `INT_MAX`, so there is in fact overflow.

Signed integer overflow in C++11 is undefined behavior. Literally any outcome is correct, including `purple` or no output at all. At least you got a numerical value, but GCC is known to outright eliminate code paths that unavoidably cause overflow.

[edit]
Newer GCC versions support C++14, where *this particular form of overflow* has become implementation-defined - see Serge's answer.

Your platform has 32-bit `int`.

Your code is exactly equivalent to

```
#include <iostream>
#include <cstdint>

int main()
{
    uint16_t check = 0x8123U;
    auto a1 = (check & 0xFFFF) << 16;
    uint64_t new_check = a1;
    std::cout << std::hex << new_check << std::endl;
    auto a2 = (check & 0xFFFFU) << 16;
    new_check = a2;
    std::cout << std::hex << new_check << std::endl;
    return 0;
}
```

What's the type of `a1` and `a2`?

- For `a2`, the result is promoted to `unsigned int`.
- More interestingly, for `a1` the result is promoted to `int`, and then it gets sign-extended as it's widened to `uint64_t`.

Here's a shorter demonstration, in decimal so that the difference between signed and unsigned types is apparent:

```
#include <iostream>
#include <cstdint>

int main()
{
    uint16_t check = 0;
    std::cout << check
              << " " << (int)(check + 0x80000000)
              << " " << (uint64_t)(int)(check + 0x80000000) << std::endl;
    return 0;
}
```

On my system (also 32-bit `int`), I get

```
0 -2147483648 18446744071562067968
```

showing where the promotion and sign-extension happens.