c++ linking - How does this program work?





files to (12)


hey it had to print something so it printed a 0. Remember in C 0 is everything else!

#include <stdio.h>

int main() {
    float a = 1234.5f;
    printf("%d\n", a);
    return 0;
}

It displays a 0!! How is that possible? What is the reasoning?


I have deliberately put a %d in the printf statement to study the behaviour of printf.




It won't convert automatically float to integer. Because both has different format of storage. So if you want to convert, use (int) typecasting.

#include <stdio.h>

int main() {
    float a = 1234.5f;
    printf("%d\n", (int)a);
    return 0;
}



The reason is that printf() is a pretty dumb function. It does not check types at all. If you say the first argument is an int (and this is what you are saying with %d), it believes you and it takes just the bytes needed for an int. In this case, asuming your machine uses four-byte int and eight-byte double (the float is converted to a double inside printf()), the first four bytes of a will be just zeroes, and this gets printed.




Because you invoked undefined behaviour: you violated the contract of the printf() method by lying to it about its parameter types, so the compiler is free to do whatever it pleases. It could make the program output "dksjalk is a ninnyhead!!!" and technically it would still be right.




That's because %d expects an int but you've provided a float.

Use %e/%f/%g to print the float.


On why 0 is printed: The floating point number is converted to double before sending to printf. The number 1234.5 in double representation in little endian is

00 00 00 00  00 4A 93 40

A %d consumes a 32-bit integer, so a zero is printed. (As a test, you could printf("%d, %d\n", 1234.5f); You could get on output 0, 1083394560.)


As for why the float is converted to double, as the prototype of printf is int printf(const char*, ...), from 6.5.2.2/7,

The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument promotions are performed on trailing arguments.

and from 6.5.2.2/6,

If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions.

(Thanks Alok for finding this out.)




Since you tagged it with C++ as well, this code does the conversion as you probably expect:

#include <iostream.h>

int main() {
    float a = 1234.5f;
    std::cout << a << " " << (int)a << "\n";
    return 0;
}

Output:

1234.5 1234



You want %f not %d




You just need to use the appropriate format specifier (%d,%f,%s,etc.) with the relevant data type (int,float, string, etc.).




%d is decimal

%f is float

see more of these here.

You are getting 0 because floats and integers are represented differently.




It's not an integer. Try using %f.




The %d specifier tells printf to expect an integer. So the first four (or two, depending on the platform) bytes of the float are intepreted as an integer. If they happen to be zero, a zero is printed

The binary representation of 1234.5 is something like

1.00110100101 * 2^10 (exponent is decimal ...)

With a C compiler which represents float actually as IEEE754 double values, the bytes would be (if I made no mistake)

01000000 10010011 01001010 00000000 00000000 00000000 00000000 00000000

On an Intel (x86) system with little endianess (i.e. the least significant byte coming first), this byte sequence gets reversed so that the first four bytes are zero. That is, what printf prints out ...

See This Wikipedia article for floating point representation according to IEEE754.




The guide to getting fired: How to abuse function pointers in GCC on x86 machines by compiling your code by hand:

These string literals are bytes of 32-bit x86 machine code. 0xC3 is an x86 ret instruction.

You wouldn't normally write these by hand, you'd write in assembly language and then use an assembler like nasm to assemble it into a flat binary which you hexdump into a C string literal.

  1. Returns the current value on the EAX register

    int eax = ((int(*)())("\xc3 <- This returns the value of the EAX register"))();
    
  2. Write a swap function

    int a = 10, b = 20;
    ((void(*)(int*,int*))"\x8b\x44\x24\x04\x8b\x5c\x24\x08\x8b\x00\x8b\x1b\x31\xc3\x31\xd8\x31\xc3\x8b\x4c\x24\x04\x89\x01\x8b\x4c\x24\x08\x89\x19\xc3 <- This swaps the values of a and b")(&a,&b);
    
  3. Write a for-loop counter to 1000, calling some function each time

    ((int(*)())"\x66\x31\xc0\x8b\x5c\x24\x04\x66\x40\x50\xff\xd3\x58\x66\x3d\xe8\x03\x75\xf4\xc3")(&function); // calls function with 1->1000
    
  4. You can even write a recursive function that counts to 100

    const char* lol = "\x8b\x5c\x24\x4\x3d\xe8\x3\x0\x0\x7e\x2\x31\xc0\x83\xf8\x64\x7d\x6\x40\x53\xff\xd3\x5b\xc3\xc3 <- Recursively calls the function at address lol.";
    i = ((int(*)())(lol))(lol);
    

Note that compilers place string literals in the .rodata section (or .rdata on Windows), which is linked as part of the text segment (along with code for functions).

The text segment has Read+Exec permission, so casting string literals to function pointers works without needing mprotect() or VirtualProtect() system calls like you'd need for dynamically allocated memory. (Or gcc -z execstack links the program with stack + data segment + heap executable, as a quick hack.)


To disassemble these, you can compile this to put a label on the bytes, and use a disassembler.

// at global scope
const char swap[] = "\x8b\x44\x24\x04\x8b\x5c\x24\x08\x8b\x00\x8b\x1b\x31\xc3\x31\xd8\x31\xc3\x8b\x4c\x24\x04\x89\x01\x8b\x4c\x24\x08\x89\x19\xc3 <- This swaps the values of a and b";

Compiling with gcc -c -m32 foo.c and disassembling with objdump -D -rwC -Mintel, we can get the assembly, and find out that this code violates the ABI by clobbering EBX (a call-preserved register) and is generally inefficient.

00000000 <swap>:
   0:   8b 44 24 04             mov    eax,DWORD PTR [esp+0x4]   # load int *a arg from the stack
   4:   8b 5c 24 08             mov    ebx,DWORD PTR [esp+0x8]   # ebx = b
   8:   8b 00                   mov    eax,DWORD PTR [eax]       # dereference: eax = *a
   a:   8b 1b                   mov    ebx,DWORD PTR [ebx]
   c:   31 c3                   xor    ebx,eax                # pointless xor-swap
   e:   31 d8                   xor    eax,ebx                # instead of just storing with opposite registers
  10:   31 c3                   xor    ebx,eax
  12:   8b 4c 24 04             mov    ecx,DWORD PTR [esp+0x4]  # reload a from the stack
  16:   89 01                   mov    DWORD PTR [ecx],eax     # store to *a
  18:   8b 4c 24 08             mov    ecx,DWORD PTR [esp+0x8]
  1c:   89 19                   mov    DWORD PTR [ecx],ebx
  1e:   c3                      ret    

  not shown: the later bytes are ASCII text documentation
  they're not executed by the CPU because the ret instruction sends execution back to the caller

This machine code will (probably) work in 32-bit code on Windows, Linux, OS X, and so on: the default calling conventions on all those OSes pass args on the stack instead of more efficiently in registers. But EBX is call-preserved in all the normal calling conventions, so using it as a scratch register without saving/restoring it can easily make the caller crash.





c++ c memory printf endianness