Every operating system uses slightly different mechanisms, but the overall flow is the same:

You launch the program
The OS loader reads the binary file format (EXE/PE on Windows, ELF on Linux)
The loader maps code/data into memory
The loader sets up the process environment
The CPU's instruction pointer (IP/RIP) jumps to the entry point
Execution begins

Let’s break it down carefully.


1. You launch the binary

When you:

type ./program in a Linux shell, or
double-click program.exe in Windows, or
start it from code through the CreateProcess API (Windows) or fork() + execve() (Linux)

…the OS takes over.

The OS:

Creates a new process control block (PCB)
Gives the process a unique process ID (PID)
Starts preparing virtual memory for the program

This is the “birth” of a process.
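To see this step from the outside, here is a minimal sketch that asks the OS to create a process and then reads back the PID it was given. The helper name launch_child is made up for this example, and the child is simply another Python interpreter.

```python
import subprocess
import sys


def launch_child(code: str) -> tuple[int, int]:
    """Start a new Python process running `code`; return (pid, exit_code)."""
    # Popen asks the OS to create the process (CreateProcess on Windows,
    # a fork/exec-style call on Unix-like systems).
    proc = subprocess.Popen([sys.executable, "-c", code])
    pid = proc.pid           # the unique process ID assigned by the OS
    exit_code = proc.wait()  # block until the child terminates
    return pid, exit_code


if __name__ == "__main__":
    pid, exit_code = launch_child("print('hello from the child')")
    print(f"child pid={pid}, exit code={exit_code}")
```

The PID differs on every run, because the OS hands out a fresh one each time a process is created.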

2. The OS loader identifies the binary format

Different OSes use different binary formats:

Windows → PE / PE32+ (Portable Executable)

Linux → ELF (Executable and Linkable Format)

macOS → Mach-O
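Each of these formats starts with a recognizable "magic" byte sequence, which is how a loader (or a tool like file) can tell them apart. The sketch below is a simplified check, not what any real loader does in full: it only looks at the first few bytes, and for Mach-O it only recognizes thin little-endian files. The function name identify_format is an assumption for this example.

```python
def identify_format(data: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ":
        # "MZ" is the DOS stub; a real PE file also carries a "PE\0\0"
        # signature at the offset stored at 0x3C, which this sketch skips.
        return "PE (MZ stub)"
    if data[:4] in (b"\xcf\xfa\xed\xfe", b"\xce\xfa\xed\xfe"):
        # 64-bit and 32-bit Mach-O magic as stored on little-endian machines
        return "Mach-O"
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"\x7fELF\x02\x01\x01"))  # ELF
    print(identify_format(b"MZ\x90\x00"))           # PE (MZ stub)
```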

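The loader reads the binary’s headers, which include:

CPU architecture (x86, x64, ARM)
Entry point address
Segment and section layout (.text, .data, .bss, .rodata/.rdata, etc.): the ELF loader works from program headers (segments), the PE loader from the section table
Dynamic linking info (libraries needed)
Memory layout requirements (addresses, sizes, alignment, permissions)
Stack and heap hints (reserve/commit sizes in PE, stack permissions via PT_GNU_STACK in ELF)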

The binary header is like a blueprint.
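To make the "blueprint" idea concrete, here is a rough sketch that decodes a few fields of a 64-bit little-endian ELF header with Python's struct module. The field layout follows the ELF64 header definition. The helper names and the sample header built by make_sample_header are invented for this example, so it runs without needing a real binary on disk.

```python
import struct

# ELF64 header, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx  (64 bytes in total)
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def parse_elf64_header(data: bytes) -> dict:
    """Return architecture, entry point and program-header info."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only little-endian ELF64")
    fields = ELF64_HEADER.unpack_from(data)
    e_machine, e_entry, e_phoff, e_phnum = fields[2], fields[4], fields[5], fields[10]
    return {
        "machine": MACHINES.get(e_machine, f"unknown ({e_machine})"),
        "entry": e_entry,   # virtual address where execution starts
        "phoff": e_phoff,   # file offset of the program header table
        "phnum": e_phnum,   # number of program headers (segments)
    }


def make_sample_header(entry: int = 0x401000) -> bytes:
    """Build a synthetic header (hypothetical values) for demonstration."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)  # 64-bit, LE, version 1
    return ELF64_HEADER.pack(ident, 2, 62, 1, entry, 64, 0, 0, 64, 56, 1, 64, 0, 0)


if __name__ == "__main__":
    header = parse_elf64_header(make_sample_header())
    print(header["machine"], hex(header["entry"]))
```

To try it on a real file, pass the first 64 bytes of any 64-bit little-endian ELF binary to parse_elf64_header instead of the synthetic header.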

3. Loader maps sections into memory (Virtual Memory Mapping)

Every program has sections like:

Section	Meaning
.text	machine code (instructions)
.data	initialized global variables
.bss	uninitialized global variables (zero-filled at load time)
.rodata (.rdata in PE)	read-only data (strings, constants)
.reloc / .plt / .got	relocation & linking info

The loader sets up page-table entries, which the MMU (Memory Management Unit) then enforces, to map:

.text → read + execute
.rodata → read only
.data → read + write
.bss → read + write
stack → read + write
heap → read + write, expandable

The binary is not read into memory as one chunk. It is mapped in pages (typically 4 KB each on x86), and most pages are only read from disk when they are first touched (demand paging).
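Because mappings are page-granular, a loader has to round each segment out to page boundaries and attach permissions to the result. The sketch below shows that arithmetic and decodes ELF-style permission bits (PF_R, PF_W, PF_X). The 4 KB page size and the example address are assumptions, and the function names are made up for illustration.

```python
PAGE_SIZE = 4096                   # assumed page size (typical on x86)
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4   # ELF program-header permission bits


def page_span(vaddr: int, memsz: int, page_size: int = PAGE_SIZE):
    """Return (start, end, page_count) of the pages covering a segment."""
    start = vaddr & ~(page_size - 1)                          # round down
    end = (vaddr + memsz + page_size - 1) & ~(page_size - 1)  # round up
    return start, end, (end - start) // page_size


def perm_string(p_flags: int) -> str:
    """Render permission bits the way /proc/<pid>/maps does, e.g. 'r-x'."""
    return (
        ("r" if p_flags & PF_R else "-")
        + ("w" if p_flags & PF_W else "-")
        + ("x" if p_flags & PF_X else "-")
    )


if __name__ == "__main__":
    # A hypothetical code segment: 0x2000 bytes starting at 0x401234
    start, end, pages = page_span(0x401234, 0x2000)
    print(hex(start), hex(end), pages, perm_string(PF_R | PF_X))
```

Note that a segment of 0x2000 bytes (two pages' worth of data) can still touch three pages when it does not start on a page boundary.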

4. Loader resolves dynamic libraries (DLL / Shared libs)

Programs often depend on system libraries such as:

Windows: kernel32.dll, user32.dll, ntdll.dll
Linux: libc.so, libpthread.so

The loader:

Finds required shared libraries
Maps them into the process
Fixes up import table (IAT on Windows, GOT/PLT on Linux)
Applies relocations (adjusting address references)

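This is why even a C "Hello World" pulls in shared libraries: typically the C library plus the dynamic linker itself on Linux, and ntdll.dll, kernel32.dll, and the C runtime DLLs on Windows.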
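The import fix-up and relocation steps can be pictured with a toy model. This is only a sketch of the idea, not how ld.so or the Windows loader is implemented: libraries are plain dictionaries, and every name, base address, and offset below is hypothetical.

```python
def resolve_imports(imports, loaded_libs):
    """Fill a GOT/IAT-like table mapping each imported symbol to an address.

    loaded_libs maps library name -> (base_address, {symbol: offset}).
    """
    table = {}
    for name in imports:
        for base, exports in loaded_libs.values():
            if name in exports:
                table[name] = base + exports[name]  # absolute address
                break
        else:
            raise LookupError(f"unresolved symbol: {name}")
    return table


def relocate(addresses, preferred_base, actual_base):
    """Shift addresses by the difference between preferred and actual base."""
    delta = actual_base - preferred_base
    return [addr + delta for addr in addresses]


if __name__ == "__main__":
    # Hypothetical library mapped at a made-up base address
    libs = {"libc.so.6": (0x7F0000000000, {"puts": 0x80E50, "exit": 0x455F0})}
    got = resolve_imports(["puts", "exit"], libs)
    print({name: hex(addr) for name, addr in got.items()})
    print([hex(a) for a in relocate([0x400100], 0x400000, 0x500000)])
```

Real loaders add many details on top of this (symbol versioning, lazy binding through the PLT, search paths), but the core job is the same lookup-and-patch loop.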

5. The loader sets up the initial process environment

Before the CPU runs your code, the OS prepares:

1. Stack

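Allocated, and the stack pointer (ESP/RSP) is initialized.

On Linux, the initial stack contains:

argc
argv[]
environment variables
the auxiliary vector (auxv)

On Windows, the command line and environment are reached through the Process Environment Block (PEB) instead.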
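You can watch arguments and environment variables arrive in a new process without touching the raw stack. In this small sketch, a parent starts a child Python interpreter with two arguments and one extra environment variable, and the child reports what it received. DEMO_VAR and run_child are names invented for the example.

```python
import json
import os
import subprocess
import sys

# Code run by the child: report the arguments and one environment variable.
CHILD = (
    "import json, os, sys; "
    "print(json.dumps({'argv': sys.argv[1:], 'demo': os.environ.get('DEMO_VAR')}))"
)


def run_child(args, extra_env):
    """Launch a child with the given args/env and return what it saw."""
    env = dict(os.environ, **extra_env)  # inherit, then add our variables
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(run_child(["alpha", "beta"], {"DEMO_VAR": "42"}))
```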

2. Heap

Sets the base for dynamic memory (malloc, new).

3. Thread info

Creates the main thread and assigns it a TCB (Thread Control Block, called the TEB on Windows).

4. CPU State

Instruction pointer → entry point
Registers defaulted/reset
Flags cleared/set as required

6. Loader jumps to the program’s ENTRY POINT

Every binary has an entry address:

In ELF → e_entry
In Windows PE → AddressOfEntryPoint

This is not main().

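In C/C++ binaries, the entry point is runtime initialization code:

Windows: mainCRTStartup → CRT startup code (__tmainCRTStartup in older MSVC runtimes) → main()

Linux: _start → __libc_start_main → main()

For a dynamically linked ELF binary, the kernel first transfers control to the dynamic linker (ld.so), which then jumps to e_entry.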

This startup code sets up:

global constructors
memory allocators
TLS (thread-local storage)
exception stack frames

Only after this does your main() begin.
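The ordering is easier to remember with a toy model. The function below is a stand-in for the C runtime startup code, not a real implementation: it runs "constructors" first, then calls main, then passes main's return value to an exit function. All names here are hypothetical.

```python
def runtime_start(main, constructors, argv, exit_fn):
    """Toy stand-in for C runtime startup (_start / mainCRTStartup)."""
    for ctor in constructors:        # run global constructors first
        ctor()
    status = main(len(argv), argv)   # only now does "main" run
    exit_fn(status)                  # main's return value becomes the exit code


log = []
runtime_start(
    main=lambda argc, argv: log.append(f"main argc={argc}") or 0,
    constructors=[lambda: log.append("ctor A"), lambda: log.append("ctor B")],
    argv=["./program", "--flag"],
    exit_fn=lambda status: log.append(f"exit({status})"),
)
print(log)
```

The log shows both constructors before main, and the exit call last, which mirrors the order described above.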

7. The CPU begins execution

The loader hands control to the program by:

RIP = entry_point

The CPU now:

Fetches the instruction (from .text section)
Decodes it
Executes it
Moves to next instruction

This is the classic fetch–decode–execute cycle.
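Here is the same cycle as a tiny interpreter loop. It is a deliberately simplified sketch: the four-instruction "machine" (LOAD, ADD, JMP, HALT) is invented for this example and has nothing to do with real x86 encoding, but the fetch, decode, execute, advance rhythm is the same.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs; return the accumulator."""
    ip = 0    # instruction pointer, like RIP
    acc = 0   # a single register
    while True:
        opcode, operand = program[ip]  # fetch
        ip += 1                        # move to the next instruction
        if opcode == "LOAD":           # decode + execute
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "JMP":
            ip = operand               # a jump just overwrites the pointer
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")


if __name__ == "__main__":
    print(run([("LOAD", 2), ("ADD", 3), ("HALT", 0)]))
```

Notice that JMP works exactly like the loader's hand-off: control moves simply because the instruction pointer is given a new value.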

The binary is now running like any other process on the CPU.

8. Program ends → cleanup and exit

When your program calls exit() or returns from main():

Exit code is stored until the parent collects it (wait() on Linux, GetExitCodeProcess on Windows)
OS destroys process environment
Frees memory mappings
Closes file handles
Notifies parent process
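The hand-off of the exit code to the parent is easy to observe. In this sketch the child registers a cleanup handler, exits with status 3, and the parent reads both the handler's output and the exit code. The status value and the helper name run_and_collect are arbitrary choices for the example.

```python
import subprocess
import sys

# Code run by the child: register a cleanup handler, then exit with status 3.
CHILD = """
import atexit, sys
atexit.register(lambda: print("cleanup handler ran"))
sys.exit(3)
"""


def run_and_collect():
    """Run the child and return (exit_code, captured_output)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD], capture_output=True, text=True
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print(run_and_collect())
```

The cleanup handler runs inside the child before it terminates, and the parent only sees the exit code once the OS has torn the process down.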
