compiler-construction difference - What is compiler, linker, loader?




7 Answers

=====> COMPILATION PROCESS <======

                     |
                     |---->  Input is Source file(.c)
                     |
                     V
            +=================+
            |                 |
            | C Preprocessor  |
            |                 |
            +=================+
                     |
                     |--->  Pure C file (cmd: cc -E <file.name>)
                     |
                     V
            +=================+
            |                 |
            | Lexical Analyzer|
            |                 |
            +-----------------+
            |                 |
            | Syntax Analyzer |
            |                 |
            +-----------------+
            |                 |
            | Semantic Analyze|
            |                 |
            +-----------------+
            |                 |
            | Pre Optimization|
            |                 |
            +-----------------+
            |                 |
            | Code generation |
            |                 |
            +-----------------+
            |                 |
            | Post Optimize   |
            |                 |
            +=================+
                     |
                     |--->  Assembly code (cmd: cc -S <file.name>)
                     |
                     V
            +=================+
            |                 |
            |   Assembler     |
            |                 |
            +=================+
                     |
                     |--->  Object file (.o/.obj) (cmd: cc -c <file.name>)
                     |
                     V
            +=================+
            |     Linker      |
            |      and        |
            |     loader      |
            +=================+
                     |
                     |--->  Executable (.exe/a.out) (cmd: cc <file.name>)
                     |
                     V
            Executable file(a.out)

C preprocessor :-

C preprocessing is the first step in the compilation. It handles:

  1. #define directives.
  2. #include directives.
  3. Conditional compilation directives (#if, #ifdef, #ifndef, #else, #endif).
  4. Macro expansion.

The purpose of this unit is to convert the C source file into a pure C code file, with no preprocessor directives left in it.
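
As a minimal sketch of what the preprocessor does with these directives, consider the file below. The macro names BUFFER_SIZE, SQUARE, VERBOSE_BUILD and LOG_LEVEL and both functions are made up for illustration. Running cc -E on such a file should show the macros already expanded and only one branch of the #ifdef left:

```c
#define BUFFER_SIZE 64                 /* object-like macro (hypothetical name) */
#define SQUARE(x)   ((x) * (x))        /* function-like macro; parentheses keep precedence intact */

#ifdef VERBOSE_BUILD                   /* hypothetical flag, e.g. cc -DVERBOSE_BUILD */
#define LOG_LEVEL 2
#else
#define LOG_LEVEL 0
#endif

/* After preprocessing, the body reads: return ((n + 1) * (n + 1)) + 64; */
int square_plus_size(int n)
{
    return SQUARE(n + 1) + BUFFER_SIZE;
}

/* After preprocessing, the body reads: return 0; (unless VERBOSE_BUILD is defined) */
int log_level(void)
{
    return LOG_LEVEL;
}
```

The compiler proper never sees the names SQUARE or BUFFER_SIZE; it only sees the substituted text.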

C compilation :

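There are six steps in this unit:

1) Lexical Analyzer:

It combines characters in the source file to form a "TOKEN". A token is the smallest meaningful unit of the language: a keyword, identifier, constant, operator or punctuator. Whitespace ('space', 'tab' and 'new line') separates tokens but is not itself a token. Therefore this unit of compilation is also called the "TOKENIZER". In C, comments have already been replaced by whitespace during preprocessing. Identifiers found here are entered into the symbol table; relocation table entries are produced later, by the assembler.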
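
A toy tokenizer can make this concrete. The sketch below is an assumption-laden simplification, not how a real C lexer works: it only knows identifiers/numbers and single-character operators, and the names tokenize and MAX_TOKEN_LEN are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_TOKEN_LEN 32   /* hypothetical limit, including the terminating '\0' */

/* Splits src into identifier/number tokens and one-character operator
   tokens. Stores at most max_tokens tokens and returns how many were stored. */
size_t tokenize(const char *src, char tokens[][MAX_TOKEN_LEN], size_t max_tokens)
{
    size_t count = 0;

    while (*src != '\0' && count < max_tokens) {
        if (isspace((unsigned char)*src)) {      /* whitespace only separates tokens */
            src++;
            continue;
        }

        size_t len = 0;
        if (isalnum((unsigned char)*src) || *src == '_') {
            /* identifier or number: consume the whole run of word characters */
            while (isalnum((unsigned char)*src) || *src == '_') {
                if (len < MAX_TOKEN_LEN - 1)
                    tokens[count][len++] = *src; /* overly long tokens are truncated */
                src++;
            }
        } else {
            tokens[count][len++] = *src++;       /* one-character operator or punctuator */
        }
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

Given the text "d = a + b - c * ;" this yields the tokens d, =, a, +, b, -, c, * and ; in that order. Note that the tokenizer has no objection to the missing operand; spotting that is the job of the next phase.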

2) Syntactic Analyzer:

This unit checks the syntax of the code. For example:

{
    int a;
    int b;
    int c;
    int d;

    d = a + b - c *   ;
}

The above code will generate a parse error because the expression is incomplete: the '*' operator has no right-hand operand. This unit checks this internally by generating the parse tree as follows:

                            =
                          /   \
                        d       -
                              /     \
                            +           *
                          /   \       /   \
                        a       b   c       ?

Therefore this unit is also called PARSER.
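
To illustrate how a parser notices the missing operand, here is a small recursive-descent sketch. It assumes a hypothetical toy grammar (single-letter operands, the operators + - * and a trailing semicolon); the function names are invented for this example and it builds no tree, it only accepts or rejects.

```c
#include <ctype.h>

static const char *cursor;   /* current position in the input text */

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cursor))
        cursor++;
}

/* operand := a single letter (toy simplification) */
static int parse_operand(void)
{
    skip_spaces();
    if (!isalpha((unsigned char)*cursor))
        return 0;            /* e.g. ';' where an operand was expected */
    cursor++;
    return 1;
}

/* term := operand { '*' operand } */
static int parse_term(void)
{
    if (!parse_operand())
        return 0;
    for (;;) {
        skip_spaces();
        if (*cursor != '*')
            return 1;
        cursor++;
        if (!parse_operand())
            return 0;        /* '*' without a right-hand operand */
    }
}

/* expression := term { ('+' | '-') term } */
static int parse_expression(void)
{
    if (!parse_term())
        return 0;
    for (;;) {
        skip_spaces();
        if (*cursor != '+' && *cursor != '-')
            return 1;
        cursor++;
        if (!parse_term())
            return 0;
    }
}

/* statement := operand '=' expression ';'   Returns 1 if text is valid, else 0. */
int is_valid_assignment(const char *text)
{
    cursor = text;
    if (!parse_operand())
        return 0;
    skip_spaces();
    if (*cursor++ != '=')
        return 0;
    if (!parse_expression())
        return 0;
    skip_spaces();
    if (*cursor++ != ';')
        return 0;
    skip_spaces();
    return *cursor == '\0';
}
```

With this sketch, is_valid_assignment("d = a + b - c * e;") returns 1, while the statement from the example above, "d = a + b - c *   ;", returns 0 because parse_operand finds ';' where the '?' sits in the tree.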

3) Semantic Analyzer:

This unit checks the meaning of the statements, for instance whether the types on both sides of an assignment agree. For example:

{
    int i;
    int *p;

    p = i;
    -----
    -----
    -----
}

The above code generates the error "Assignment of incompatible type", because p is a pointer to int while i is a plain int.
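
For contrast, a version that passes the semantic check might look like the sketch below. The function name and the value 42 are made up; the point is only that both sides of the assignment now have type int *.

```c
/* Returns the value of i, read back through the pointer p. */
int read_through_pointer(void)
{
    int i = 42;
    int *p;

    p = &i;        /* int * on the left, address of an int on the right: types agree */
    return *p;     /* dereferencing yields the int stored in i */
}
```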

4) Pre-Optimization:

This unit is independent of the CPU. There are two types of optimization:

  1. Pre-optimization (CPU independent)
  2. Post-optimization (CPU dependent)

This unit optimizes the code in the following forms:

  • I) Dead code elimination
  • II) Common subexpression elimination
  • III) Loop optimization

I) Dead code elimination:

For example:

{
    int a = 10;
    if ( a > 5 ) {
        /*
        ...
        */
    } else {
       /*
       ...
       */
    }
}

Here, the compiler knows the value of 'a' at compile time, therefore it also knows that the if condition is always true. Hence it eliminates the else part in the code.
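
A runnable sketch of the same idea, with hypothetical function names and return values chosen so the two versions can be compared. Whether a given compiler actually performs this depends on the compiler and the optimization level.

```c
/* As written in the source: the else branch can never run. */
int pick_branch_before(void)
{
    int a = 10;

    if (a > 5) {
        return 1;      /* always taken, since a is known to be 10 */
    } else {
        return 2;      /* dead code */
    }
}

/* What an optimizer may effectively reduce the function to. */
int pick_branch_after(void)
{
    return 1;
}
```

Both functions behave identically for every caller, which is the condition any such optimization has to preserve.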

II) Common subexpression elimination:

For example:

{
    int a, b, c;
    int x, y;

    /*
    ...
    */

    x = a + b;
    y = a + b + c;

    /*
    ...
    */
}

can be optimized as follows:

{
    int a, b, c;
    int x, y;

    /*
     ...
    */

    x = a + b;
    y = x + c;      // a + b is replaced by x

    /*
     ...
    */
}
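
The two fragments can be wrapped into functions and compared directly. This is a sketch with invented names (sums_before, sums_after, x_out); it assumes, as the fragment above does, that a and b do not change between the two statements.

```c
/* Original form: a + b is computed twice. Returns y, stores x in *x_out. */
int sums_before(int a, int b, int c, int *x_out)
{
    int x = a + b;
    int y = a + b + c;

    *x_out = x;
    return y;
}

/* After common subexpression elimination: a + b is computed once and reused. */
int sums_after(int a, int b, int c, int *x_out)
{
    int x = a + b;
    int y = x + c;     /* a + b is replaced by x */

    *x_out = x;
    return y;
}
```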

III) Loop optimization:

For example:

{
    int a, i;
    for (i = 0; i < 1000; i++ ) {

    /*
     ...
    */

    a = 10;

    /*
     ...
    */
    }
}

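In the above code, the assignment a = 10 produces the same result on every iteration. If 'a' is local and nothing else in the loop modifies it, the assignment can be moved out of the loop (loop-invariant code motion):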

{
    int a, i;
    a = 10;
    for (i = 0; i < 1000; i++ ) {
        /*
        ...
        */
    }
}
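
As a sketch of why the transformation is safe, the two functions below (hypothetical names, with a made-up loop body that reads a) compute the same result; the second simply does the invariant assignment once instead of n times.

```c
/* Original form: a = 10 is executed on every iteration. */
long total_before(int n)
{
    long total = 0;
    int a, i;

    for (i = 0; i < n; i++) {
        a = 10;                 /* loop-invariant: same value every time */
        total += (long)a * i;
    }
    return total;
}

/* After hoisting: the invariant assignment runs once, before the loop. */
long total_after(int n)
{
    long total = 0;
    int a, i;

    a = 10;
    for (i = 0; i < n; i++) {
        total += (long)a * i;
    }
    return total;
}
```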

5) Code generation:

Here, the compiler generates the assembly code so that the more frequently used variables are stored in the registers.

6) Post-Optimization:

Here the optimization is CPU dependent. For example, if a jump leads straight to another jump, the two are converted into one:

            -----
        jmp:<addr1>
<addr1> jmp:<addr2>
            -----
            -----

The control jumps to <addr2> directly.
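
The collapsing of jump chains can be modelled in a few lines of C. This is only an illustrative model: the array jump_to and the function final_target are hypothetical, and a real compiler performs this on its own internal representation of the instructions.

```c
/* jump_to[i] holds the target index if instruction i is an unconditional
   jump, or -1 if instruction i is anything else.
   Starting from target, follows jump-to-jump chains and returns the index
   of the instruction where control finally lands. The hop counter stops
   the walk if the jumps form a cycle. */
int final_target(const int *jump_to, int count, int target)
{
    int hops = 0;

    while (target >= 0 && target < count &&
           jump_to[target] >= 0 && hops < count) {
        target = jump_to[target];   /* the target is itself a jump: skip over it */
        hops++;
    }
    return target;
}
```

For a program where instruction 0 jumps to 1 and instruction 1 jumps to 3, the first jump can be rewritten to go to 3 directly.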

The last phase is linking, which creates an executable or a library. When the executable is run, the loader brings it, and the shared libraries it requires, into memory.


I wanted to know, in depth, the meaning and working of a compiler, linker and loader, with reference to any language, preferably C++.




Hope this helps you a little more.

First, go through this diagram:

                         [Figure: diagram of the stages described below (img source: internet)]

You write a piece of code and save the file (source code), then:

Preprocessing :- As the name suggests, it happens before the actual compilation. Preprocessor directives instruct the compiler to do the required pre-processing first. You can call this phase text substitution, or the interpretation of special preprocessor directives denoted by #.

Compilation :- Compilation is a process in which a program written in one language gets translated into another, target language. If there are errors, the compiler will detect and report them.

Assembly :- Assembly code gets translated into machine code. You can call the assembler a special type of compiler.

Linking :- If this piece of code needs other object files or libraries to be linked in, the linker links them together to make an executable file.

Many things happen after that. Yes, you guessed it right, here comes the role of the loader:

Loader :- It loads the executable code into memory; the program's stack and data areas are created, and registers get initialized.

Little Extra info :- http://www.geeksforgeeks.org/memory-layout-of-c-program/ , you can see the memory layout over there.




Wikipedia ought to have a good answer; here are my thoughts:

  • Compiler: reads something.c source, writes something.o object.
  • Linker: joins several *.o files into an executable program.
  • Loader: code that loads an executable into memory and starts it running.



(Explained with respect to Linux/Unix-based systems, though it's a basic concept for all other computing systems.)

Linkers and Loaders from LinuxJournal explains this concept with clarity. It also explains how the classic name a.out (assembler output) came about.

A quick summary,

c program --> [compiler] --> objectFile --> [linker] --> executable file (say, a.out)

We have the executable; now give this file to your friend or to your customer who needs this software :)

When they run this software, say by typing ./a.out on the command line:

execute in command line ./a.out --> [Loader] --> [execve] --> program is loaded in memory

Once the program is loaded into memory, control is transferred to it by making the PC (program counter) point to the first instruction of a.out.
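
On a POSIX system, the step where a running process asks the kernel to load and start another executable can be sketched with fork and execve. The wrapper name run_with_execve is made up for this example, and the sketch assumes a Unix-like environment; it shows roughly what a shell does when you type ./a.out, not the inner workings of the loader itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the current process environment, passed on to the new program */

/* Runs the executable at path with the given argument vector and returns
   its exit status, or -1 if it could not be started or did not exit normally. */
int run_with_execve(const char *path, char *const argv[])
{
    pid_t child = fork();            /* create a new process */
    if (child < 0)
        return -1;

    if (child == 0) {
        /* In the child: replace this process image with the program at path.
           On success execve never returns; the new program starts at its entry point. */
        execve(path, argv, environ);
        _exit(127);                  /* reached only if execve failed */
    }

    int status = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with "/bin/sh" and the arguments {"sh", "-c", "exit 7", NULL} should return 7 once the child shell has been loaded, run and reaped.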




The compiler checks your source code for errors and changes it into object code. Once linked, this is the code that the operating system loads and runs.

You often don't write a whole program in a single file, so the linker links all your object code files.

Your program won't get executed unless it is in main memory; putting it there is the loader's job.




A compiler is a software program that compiles program source code files into an executable program. It is included as part of the integrated development environment (IDE) with most programming software packages. The compiler takes source code files that are written in a high-level language, such as C, BASIC, or Java, and compiles the code into a low-level language, such as machine code or assembly code. This code is created for a specific processor type, such as an Intel Pentium or PowerPC. The program can then be recognized by the processor and run from the operating system.

A loader is an operating system utility that copies programs from a storage device to main memory, where they can be executed. In addition to copying a program into main memory, the loader can also replace virtual addresses with physical addresses. Most loaders are transparent, i.e., you cannot directly execute them, but the operating system uses them when necessary.

A linker is a program that adjusts two or more machine-language program segments so that they may be simultaneously loaded and executed as a unit. Also called a link editor or binder, a linker combines object modules to form an executable program. Many programming languages allow you to write different pieces of code, called modules, separately. This simplifies the programming task because you can break a large program into small, more manageable pieces. Eventually, though, you need to put all the modules together. This is the job of the linker.




A compiler translates lines of code from the programming language into machine language.

A linker combines two or more separately compiled pieces of a program into one executable.

A loader loads the program into main memory so that it can be run.



