
The Four Stages Of Life

July 09, 2020

 

Okay, I have to admit it. The title is unintentional and involuntary clickbait: this is not a blog post about the four stages in the life of an individual, namely Brahmacharya, Grihastha, Vanaprastha, and Sannyasa.

That is not what I would write about, well, at least in this post.

 

 

The Four Stages in the Life of a C Program

 

 

Compiling a program written in C is a multi-stage process that proceeds through four stages: Preprocessing, Compilation, Assembly, and Linking.


#include<stdio.h>
//File name: Sample.c
//This is a Macro
#define programming_standard "Hello World!\n"
int main()
{
    printf(programming_standard);
    return 0;
}

A sample program in C language.


Let's see what happens in each of these stages.

 

The Preprocessing stage 


This is the first stage of compiling a C program.

The preprocessor understands neither the syntax nor the semantics of the C code; it works on the source purely as text. It performs the following jobs:

 

  • Removes comments from the code
  • Expands macros wherever they are used
  • Inserts the contents of the files included in the program

 

The C code given above includes a few lines starting with the # character. These lines are called preprocessor directives and are exactly what their name suggests: directives, or commands, to the preprocessor.

The result of the preprocessing stage can be printed on the command-line interface by compiling the program with the -E option flag.


gcc -E Sample.c


Exercise for the reader: To understand what is meant by an eon, it is highly recommended that you compile a C++ program with the -E option flag while including the <bits/stdc++.h> header.


Alternatively, you can write the preprocessed code to a file by adding the -o flag followed by a file name. By convention, preprocessed C code is stored in a file with a .i extension.


...

__attribute__((__cdecl__)) __attribute__((__nothrow__)) wint_t fgetwchar (void);
__attribute__((__cdecl__)) __attribute__((__nothrow__)) wint_t fputwchar (wint_t);
__attribute__((__cdecl__)) __attribute__((__nothrow__)) int getw (FILE *);
__attribute__((__cdecl__)) __attribute__((__nothrow__)) int putw (int, FILE *);

# 2 "Sample.c" 2
# 6 "Sample.c"
int main()
{
 printf("Hello World!\n");
 return 0;
}

Output after preprocessing Sample.c


The Compilation Stage


Oddly enough, the second stage of compiling a C program is itself named the Compilation stage.

This stage converts the preprocessed code to assembly code. *Terms and conditions apply.

This is the stage where we are made aware of any syntax violations we may have committed in the program. If the program is free of such errors, that is, if the terms and conditions are satisfied, the preprocessed code is translated into assembly instructions specific to the CPU architecture we are targeting.

 

We can view the result of this stage by using the -S flag. This creates a file with a .s extension containing the generated assembly instructions.


gcc -S Sample.c


Okay great, but what is the deal with assembly code, and what is an assembler?

Assembly code is an intermediate human-readable language. 


Why intermediate? 

Well, this is the lowest level code that is understood by us and is just one step away from being understood by the CPU, and that last step is what an assembler is responsible for. An assembler converts assembly code, the intermediate human-readable language, to binary instructions that are understood by the processor.



The Assembly Stage


The third stage uses an assembler to translate the assembly instructions into binary instructions, also known as object code.

Therefore, the output of this stage consists of actual instructions to be run by the target processor.

Like the other stages, we can view the result of this stage as well. We use the -c option flag.


gcc -c Sample.c


Running the above command will create a file named Sample.o, containing the object code of the program. The contents of this file are in a binary format and can be inspected using the od command:

od -c Sample.o


The Linking Stage


Linking is the final stage of compiling a C program. Although the object code obtained in the assembly stage consists of machine instructions that the processor understands, some pieces of the program might still be missing.

Example: it is good practice to define the interface and the implementation of a program separately, in the form of client and server files. The compiler looks at only one file at a time. So, when it does not find a definition for a function called in the client file, it assumes the function is defined in some other file. When the program is eventually executed, however, the processor needs to know exactly what to do when that function is called.
Therefore, to produce the executable file, the existing pieces have to be rearranged and the missing ones need to be filled in.
This process is called linking and, as the name suggests, it refers to the creation of a single executable file from multiple object files.

The linker arranges the pieces of object code together so that functions in some piece can successfully summon functions in another one.
While the compilation stage highlighted any syntactical violations, this is the stage where we might face complaints regarding multiple definitions of the same function and about functions that may not have been defined.


There are two kinds of linking:

  • Static Linking: the process of copying all library modules used in the program into the final executable image. The linker combines library routines with our code to resolve external references and to generate an executable image suitable for loading into memory.
  • Dynamic Linking: the kind of linking where only the names of the external library files are placed in the final executable. This way, much of the linking process is deferred, and the actual linking takes place at run time, when both the executable file and the libraries are loaded into memory.


Therefore, the result of this stage is the final executable file. 

When run without options, GCC names this file a.out on a Linux machine and a.exe on a Windows machine.

To give the executable file a different name, we pass the desired name with the -o option flag.


gcc Sample.c -o Sample


Opinion: Compiling is not the same as creating an executable!
When talking about programs, the colloquial use of the phrase “it compiles fine” to convey that a program works can be confusing.
Since the creation of an executable is a multi-stage process that includes both compilation and linking, a program that "compiles fine" might still not work owing to errors during the linking phase.
Though this may not be a frequent problem, it is better to refer to the complete process of going from source code to an executable as a build.