What’s the Journey of a Single Line of Compiled Code Like?

A look at the complete process of transforming source code into an executable format

Ambreen H.
Better Programming


Photo by Max Böttinger on Unsplash

At a high level, I’ll be looking at the output of each stage of compiling a simple C++ program using Clang. I’ll also follow our simple code in the disassembly output a little more closely and discuss parts of an ELF file.

When you fully compile your program, it produces an executable binary. For example, this simple program…

…produces binary code. Something like this (as viewed in a hex editor):

Executable binary

If you run this program, the output is as expected:

What’s Up?
20

At this point, all of your code and data are transformed into binary in an appropriate format that your computer can execute. It’s not that comprehensible for us, but to get a better appreciation, you can change the static string part of this program to output something different when executed.

For example, in this binary, I changed the bytes 5768 6174 2773 2055 703f 000a, which hold the string What’s Up? followed by a NUL byte and a newline, to 576f 6e64 6572 6675 6c21 000a. Running the modified binary produces this output:

Wonderful!
20

Yes! Wonderful, isn’t it?

I don’t know — should I be worried? This was just swapping some ASCII code around. Changing behavior will be a lot more difficult, so there’s probably nothing to worry about.

Regardless, we were talking about the process involved in taking a source file and compiling it into a binary. I’ll keep it high-level, though.

Several stages take place when transforming source code into an executable format. According to the Clang documentation, the steps are:

1. Pre-Processing
2. Parsing and Semantic Analysis
3. Code Generation and Optimization
4. Assembly
5. Linking
Stages of compilation

Let’s look at the output of each of the above stages.

Pre-Processing

Clang documentation describes this as:

“This stage handles tokenization of the input source file, macro expansion, #include expansion and handling of other preprocessor directives.”

The output for our program at the end of this stage shows that the macros are expanded. Notice that we have std::cout << "What's Up?" << "\n";, while the original code was std::cout << MSG << "\n";.

…..
namespace std __attribute__ ((__visibility__ ("default")))
{
# 60 "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/iostream" 3
extern istream cin;
extern ostream cout;
extern ostream cerr;
extern ostream clog;
extern wistream wcin;
extern wostream wcout;
extern wostream wcerr;
extern wostream wclog;
static ios_base::Init __ioinit;
}
# 2 "pass_by_reference_example.cpp" 2
void addTen(int& num) {
  num += num + 10;
}
int main(int argc, const char* argv[]) {
  int a_something = 5;
  std::cout << "What's Up?" << "\n";
  addTen(a_something);
  std::cout << a_something << "\n";
  return 0;
}

Note: I had to clip a lot of lines off the top. They were mostly C++ template code pulled in by the iostream header.

Parsing and Semantic Analysis

Clang documentation describes this as:

“This stage parses the input file, translating preprocessor tokens into a parse tree. Once in the form of a parse tree, it applies semantic analysis to compute types for expressions as well and determine whether the code is well formed. This stage is responsible for generating most of the compiler warnings as well as parse errors. The output of this stage is an ‘Abstract Syntax Tree’ (AST).”

This stage produced an AST. You can see how a variable such as a_something is represented in a hierarchy. So is the rest of our code. Again, I clipped a lot of lines so the output would stay within the realm of the familiar and simple.

AST

Code Generation and Optimization

Clang documentation describes this as:

“This stage translates an AST into low-level intermediate code (known as ‘LLVM IR’) and ultimately to machine code. This phase is responsible for optimizing the generated code and handling target-specific code generation. The output of this stage is typically called a ‘.s’ file or ‘assembly’ file.”

This stage eventually produces target-specific assembly code. I’m inserting a screenshot of the output here. Later in the article, we’ll walk through the objdump disassembly instead, since it interleaves the source code with the assembly. That’s where we’ll follow some of the code we wrote.

Assembly output

Assembly

Clang documentation describes this as:

“This stage runs the target assembler to translate the output of the compiler into a target object file. The output of this stage is typically called a ‘.o’ file or ‘object’ file.”

The assembler produces an object file. Our platform is Ubuntu, so the object file is in the ELF format. Specifically, it’s:

ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), with debug_info, not stripped

This is a relocatable file, and hence not all memory addresses are resolved. The screenshot below is the header information of the ELF file:

ELF header of object file

As you can see above, the entry point address is 0x0. The file isn’t an executable yet, so no entry point in a virtual address space has been assigned.

Linking

Clang documentation describes this as:

“This stage runs the target linker to merge multiple object files into an executable or dynamic library. The output of this stage is typically called an ‘a.out,’ ‘.dylib’ or ‘.so’ file.”

Finally, the linker takes the object file(s) and creates an executable, resolving the resolvable addresses.

An ELF binary consists of an executable header, zero or more program headers, and zero or more section headers. Let’s take a brief look at these components.

Executable header

The screenshot below shows the ELF header output of our executable. It gives us information about the kind of file it is, where to find other contents in the file, etc.

The Magic field shows the 16-byte identification array. It begins with the 4-byte magic value 7f 45 4c 46 (0x7f followed by the ASCII letters E, L, and F), which marks this as an ELF file.

ELF header of executable file

As the header output above shows, the file now has an entry point, 0x4010d0, which is the virtual memory address where execution starts. Its type is executable.

Sections

Sections organize data and code logically. They provide an organized view for the linker. Let’s explore the sections present in our binary using readelf.

Sections

I think we can look at some of the more well-known sections:

.text

.text contains the main executable code. There’s a lot of output, so I’ll just show screenshots of the few that we can recognize. Let’s start with the entry point address that we found in our ELF header. Here’s the disassembly for that:

Start method

So this isn’t our main function. It’s _start, the startup code that sets up and initializes the program before main runs.

The rdi register carries the first argument of a call. Here it’s loaded with 0x4011e0, which is the address of our main function, as the screenshot below shows.

4010f1: 48 c7 c7 e0 11 40 00 mov rdi,0x4011e0
Disassembly of main method

We can see here that a_something is created at [rbp-0x14], and its value is set to 0x5.

int a_something = 5;
4011f6: c7 45 ec 05 00 00 00 mov DWORD PTR [rbp-0x14],0x5

In our code, we call addTen and pass the reference of a_something to it. The disassembly for it is:

addTen(a_something);
401228: 48 8d 7d ec lea rdi,[rbp-0x14]
40122c: 48 89 45 e0 mov QWORD PTR [rbp-0x20],rax
401230: e8 8b ff ff ff call 4011c0 <_Z6addTenRi>

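I’m not sure what line 40122c accomplishes by moving rax to [rbp-0x20]. My best guess is that it saves the value returned by the preceding output call to a stack slot, which unoptimized builds tend to do. What we do know is that lea loads the effective address [rbp-0x14], i.e., the address of a_something, into rdi. rdi carries the first argument, and the call instruction then transfers control to addTen (_Z6addTenRi is its mangled name):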

Disassembly of addTen method

The function starts with the usual stack-frame bookkeeping: the caller’s base pointer (rbp) is saved, and rbp is then set to the stack pointer (rsp), which always points to the top of the stack.

4011c0: 55       push rbp
4011c1: 48 89 e5 mov rbp,rsp

rdi holds the address of a_something. That address is now copied to [rbp-0x8] and then further copied to the rax register.

4011c4: 48 89 7d f8 mov QWORD PTR [rbp-0x8],rdi
4011c8: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8]

Now the content at the address of a_something is copied to the ecx register.

4011cc: 8b 08 mov ecx,DWORD PTR [rax]

0xa is 10 in decimal, and it’s being added to the content of ecx, which holds the value of a_something.

4011ce: 83 c1 0a add ecx,0xa

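The instructions between 4011ce and 4011d7 aren’t shown here; presumably they reload the address and add the original value of a_something once more, which would account for the += and the final result of 20. The result is then stored at the address held in the rax register: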

4011d7: 89 08 mov DWORD PTR [rax],ecx

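It’s interesting that the assembly from the Clang output had the MSG macro expanded, but the objdump output kept it intact. That’s presumably because objdump prints the interleaved lines straight from the original source file, which it locates through the debug info, rather than from the preprocessed text.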

.rodata

This section contains read-only data. So, for us, it has our MSG string.

.rodata

.bss

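This holds uninitialized data, such as global and static variables without an initializer. It takes up no space in the file; the memory is zero-filled when the program is loaded.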

Segments

Segments provide information used by the operating system and the dynamic linker to set up and load the process for execution. This is what it looks like for our process:

Segments

The LOAD entries are the segments that get loaded into memory. We can see that our main code is in segment 03, which consists of the sections .init, .plt, .text, and .fini. It’s set to read (R) and execute (E).

Our .rodata is in segment 04, together with .eh_frame_hdr and .eh_frame, and it’s set to read (R).

Conclusion

And that’s it for now. Obviously, there are a lot of things to understand and go deeper with, but getting an overall view is also interesting in itself. If I get a chance, I’ll write about what this little process looks like in memory.
