[C++] GCC -fPIC option


Answers

I'll try to explain what has already been said in a simpler way.

Whenever a shared lib is loaded, the loader (the code on the OS which load any program you run) changes some addresses in the code depending on where the object was loaded to.

In the above example, the "111" in the non-PIC code is written by the loader the first time it was loaded.

For not shared objects, you may want it to be like that because the compiler can make some optimizations on that code.

For shared object, if another process will want to "link" to that code he must read it to the same virtual addresses or the "111" will make no sense. but that virtual-space may already be in use in the second process.

Question

I have read about GCC's Options for Code Generation Conventions, but could not understand what "Generate position-independent code (PIC)" does. Please give an example to explain me what does it mean.




Adding further...

Every process has same virtual address space (If randomization of virtual address is stopped by using a flag in linux OS) (For more details Disable and re-enable address space layout randomization only for myself)

So if its one exe with no shared linking (Hypothetical scenario), then we can always give same virtual address to same asm instruction without any harm.

But when we want to link shared object to the exe, then we are not sure of the start address assigned to shared object as it will depend upon the order the shared objects were linked.That being said, asm instruction inside .so will always have different virtual address depending upon the process its linking to.

So one process can give start address to .so as 0x45678910 in its own virtual space and other process at the same time can give start address of 0x12131415 and if they do not use relative addressing, .so will not work at all.

So they always have to use the relative addressing mode and hence fpic option.




A minor addition to the answers already posted: object files not compiled to be position independent are relocatable; they contain relocation table entries.

These entries allow the loader (that bit of code that loads a program into memory) to rewrite the absolute addresses to adjust for the actual load address in the virtual address space.

An operating system will try to share a single copy of a "shared object library" loaded into memory with all the programs that are linked to that same shared object library.

Since the code address space (unlike sections of the data space) need not be contiguous, and because most programs that link to a specific library have a fairly fixed library dependency tree, this succeeds most of the time. In those rare cases where there is a discrepancy, yes, it may be necessary to have two or more copies of a shared object library in memory.

Obviously, any attempt to randomize the load address of a library between programs and/or program instances (so as to reduce the possibility of creating an exploitable pattern) will make such cases common, not rare, so where a system has enabled this capability, one should make every attempt to compile all shared object libraries to be position independent.

Since calls into these libraries from the body of the main program will also be made relocatable, this makes it much less likely that a shared library will have to be copied.