Understanding ELF Entry Points vs. main: A Common Misconception in Reverse Engineering

One of the first surprises students encounter when analyzing Linux executables is that the program entry point reported by tools like readelf does not match the address of main shown in GDB. This often leads to confusion: “Why does readelf say the program starts at 0x401020, but GDB shows main at 0x401110?”


1. The ELF Entry Point

Every ELF executable has an entry point address stored in its header (e_entry). You can see it with:

readelf --file-header ./a.out

Example output:

Entry point address: 0x401020

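This is the address where the program's own code starts executing. The kernel transfers control here directly for a statically linked executable; for a dynamically linked one, the dynamic linker runs first and then jumps to this address. It is not your main function. In a typical C program it is a routine called _start, which comes from the C runtime (CRT) startup object (such as crt1.o) that the linker adds to the executable.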
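To make the header field concrete, here is a minimal Python sketch that pulls e_entry out of the first bytes of an ELF file, which is the value readelf prints. The function names read_entry_point and entry_point_of_file are made up for this illustration. The field offsets follow the ELF specification: e_entry sits at byte 24 and is 4 bytes wide in ELF32 and 8 bytes wide in ELF64.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_entry_point(header: bytes) -> int:
    """Return e_entry given at least the first 32 bytes of an ELF file."""
    if len(header) < 32 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file (bad magic or too short)")

    elf_class = header[4]  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    data_enc = header[5]   # EI_DATA: 1 = little-endian, 2 = big-endian
    if elf_class not in (1, 2) or data_enc not in (1, 2):
        raise ValueError("unsupported ELF class or data encoding")

    endian = "<" if data_enc == 1 else ">"
    width = "I" if elf_class == 1 else "Q"
    # e_ident (16 bytes) + e_type (2) + e_machine (2) + e_version (4) = 24
    (entry,) = struct.unpack_from(endian + width, header, 24)
    return entry


def entry_point_of_file(path: str) -> int:
    """Convenience wrapper: read the header bytes from a file on disk."""
    with open(path, "rb") as f:
        return read_entry_point(f.read(64))
```

For the example binary, hex(entry_point_of_file("./a.out")) would be expected to match the "Entry point address" line from readelf.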


2. The Role of _start

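_start is the first code in the executable itself to run. It is short, and its responsibilities include:

  • Clearing the frame pointer and aligning the stack that the kernel set up.
  • Picking up the command‑line arguments (argc, argv) from that initial stack; the environment variables sit just after them.
  • Handing these to __libc_start_main, which initializes the C runtime (libc).
  • Finally, __libc_start_main calls your main and passes its return value to exit.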

So the CPU begins at _start (e.g., 0x401020), not at main.


3. The main Function

When you disassemble in GDB:

disas main

You might see something like:

Dump of assembler code for function main:
0x0000000000401110 <+0>: push %rbp
...
0x0000000000401186 <+118>: ret

This is your program’s main function, compiled from your source code. It lives at a different address (e.g., 0x401110) because it is just another function symbol inside the binary. It is called by libc, not directly jumped to by the loader.
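The <+N> annotations in the listing are byte offsets from the start of the function, so the absolute addresses can be checked with simple arithmetic. A small Python sketch, using only the example addresses from this article (the helper name gdb_offset_to_address is hypothetical):

```python
ENTRY_POINT = 0x401020  # example entry point address reported by readelf
MAIN_START = 0x401110   # example address of main reported by GDB


def gdb_offset_to_address(function_start: int, offset: int) -> int:
    """Convert GDB's <+offset> notation into an absolute address."""
    return function_start + offset


ret_address = gdb_offset_to_address(MAIN_START, 118)
print(hex(ret_address))          # 0x401186, the address of the final ret
print(MAIN_START - ENTRY_POINT)  # 240, bytes between the entry point and main
```

In this example main sits 240 bytes past the entry point, which is one more sign that the two addresses name different pieces of code.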


4. Why the Confusion Happens

  • readelf reports the entry point address from the ELF header, which is the address of _start.
  • In GDB, disas main shows the address and disassembly of the main symbol.
  • Students often assume the entry point must be main, but in reality it is the runtime bootstrap.
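One way to reconcile the two views is to ask which symbol sits at the entry address. The Python sketch below parses symbol listings in the default three-column format printed by nm (address, type letter, name). SAMPLE_NM is a hand-written stand-in built from this article's example addresses, not captured tool output, and the function names are hypothetical.

```python
def parse_nm_output(text: str) -> dict:
    """Map symbol names to addresses from nm-style 'address type name' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols have three fields. Undefined ones (type U) have
        # no address column, so they split into two fields and are skipped.
        if len(parts) != 3:
            continue
        address, _kind, name = parts
        try:
            symbols[name] = int(address, 16)
        except ValueError:
            continue  # first field was not a hex address
    return symbols


def symbols_at(symbols: dict, address: int) -> list:
    """Return every symbol name located at the given address, sorted."""
    return sorted(name for name, addr in symbols.items() if addr == address)


# Hand-written sample in nm's layout, using the article's example addresses.
SAMPLE_NM = """\
                 U __libc_start_main
0000000000401020 T _start
0000000000401110 T main
"""

table = parse_nm_output(SAMPLE_NM)
print(symbols_at(table, 0x401020))  # ['_start']
print(symbols_at(table, 0x401110))  # ['main']
```

With real data, the text would come from running nm on the binary and the address from the readelf header output.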

5. How to Trace the Flow

To see the full path:

  1. Disassemble _start in GDB:
    disas _start
    
  2. Observe how _start calls __libc_start_main.
  3. Inside __libc_start_main, your main function is invoked.

This shows the chain:
Loader → _start → __libc_start_main → main → your code.


6. Real‑World Analogy

Think of a theater:

  • Entry point (_start) = the front door where the audience enters.
  • Runtime setup (__libc_start_main) = the lobby and ticket check.
  • main = the stage where the play happens.

You don’t jump straight to the stage; you go through the proper setup first.


7. Why It Matters in Reverse Engineering

  • Exploit developers and reverse engineers must know where execution really begins.
  • Malware may place code in _start or in runtime initialization (for example, constructor functions listed in .init_array) so that it runs before main is ever reached.
  • Misinterpreting entry points can lead to incorrect assumptions about control flow.

Conclusion

The key lesson: in a normally linked C program, the ELF entry point is _start, not main. main is just a function called later by the runtime. readelf and GDB show different addresses because they answer different questions: readelf reports where execution of the program starts, while disas main in GDB reports where main lives.

For reverse engineering students, understanding this distinction is crucial: it prevents misreading binaries and builds a solid foundation for analyzing exploits and runtime behavior.