x86 and x64 Registers and Calling Conventions

A practical guide to CPU registers, partial register access, flags, and how function arguments are passed on x86 and x64 Linux.

Prerequisites

  • Basic Linux command line knowledge
  • Familiarity with hexadecimal notation

Part 1 of 12 in Linux Exploitation Fundamentals

Every exploitation technique — buffer overflows, ROP chains, shellcode — ultimately comes down to controlling what’s in the CPU’s registers. Before you can redirect execution or set up a syscall, you need to understand what each register does, how they relate to each other, and how the calling convention determines where function arguments live.

This tutorial covers the registers and calling conventions for both x86 (32-bit) and x64 (64-bit) Linux, with a focus on what matters for exploit development.

x86 Registers (32-bit)

General-Purpose Registers

x86 has eight 32-bit general-purpose registers. While any of them can hold arbitrary data, each has a conventional role:

Register  Name               Conventional Use
EAX       Accumulator        Return values, syscall numbers
EBX       Base               Base pointer for memory access, syscall arg 1
ECX       Counter            Loop counters, syscall arg 2
EDX       Data               I/O operations, syscall arg 3
ESI       Source Index       Source pointer for string operations
EDI       Destination Index  Destination pointer for string operations
EBP       Base Pointer       Points to the base of the current stack frame
ESP       Stack Pointer      Points to the top of the stack

Special-Purpose Registers

Register  Purpose
EIP       Instruction Pointer: address of the next instruction to execute
EFLAGS    Status flags set by arithmetic and comparison operations

EIP is the register you’re trying to control in a buffer overflow. You cannot write to it directly with mov — it’s modified implicitly by instructions like ret, call, and jmp.

x64 Registers (64-bit)

x64 widens the eight x86 general-purpose registers, EIP, and EFLAGS to 64 bits and adds eight new general-purpose registers:

General-Purpose Registers

Register  Conventional Use
RAX       Return values, syscall numbers
RBX       Callee-saved (preserved across function calls)
RCX       4th function argument, syscall clobbered
RDX       3rd function argument, syscall arg 3
RSI       2nd function argument, syscall arg 2
RDI       1st function argument, syscall arg 1
RBP       Base pointer (or general-purpose if frame pointer omitted)
RSP       Stack pointer
R8        5th function argument
R9        6th function argument
R10       Syscall arg 4 (replaces RCX for syscalls)
R11       Syscall clobbered
R12–R15   Callee-saved

Special-Purpose Registers

Register  Purpose
RIP       Instruction Pointer
RFLAGS    Status flags (extended from EFLAGS)

Partial Register Access

One of the most important concepts for shellcode development: x86 and x64 registers can be accessed in smaller pieces.

x86 Partial Access

Each 32-bit register contains smaller accessible portions:

EAX (32 bits)
├── AX (lower 16 bits)
│   ├── AH (upper 8 bits of AX)
│   └── AL (lower 8 bits of AX)

The same pattern applies to EBX/BX/BH/BL, ECX/CX/CH/CL, and EDX/DX/DH/DL.

ESI, EDI, EBP, and ESP only expose a 16-bit lower half (SI, DI, BP, SP) — no 8-bit access on x86.

x64 Partial Access

x64 extends this scheme. Every 64-bit register provides access to its lower 32, 16, and 8 bits, including RSI, RDI, RBP, and RSP, whose low bytes become addressable as SIL, DIL, BPL, and SPL:

RAX (64 bits)
├── EAX (lower 32 bits)
│   ├── AX (lower 16 bits)
│   │   ├── AH (upper 8 of AX)
│   │   └── AL (lower 8 of AX)

The new registers R8–R15 use a different naming convention:

R8 (64 bits)
├── R8D (lower 32 bits)
│   ├── R8W (lower 16 bits)
│   │   └── R8B (lower 8 bits)
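
The nesting shown for RAX above can be read as a set of masks and shifts over one 64-bit value. The Python sketch below models that; the helper name views and the sample value are illustrative only and do not come from any tool.

```python
def views(rax: int) -> dict:
    """Return the value seen through each partial name of a 64-bit register."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep exactly 64 bits
    return {
        "rax": rax,
        "eax": rax & 0xFFFFFFFF,       # lower 32 bits
        "ax":  rax & 0xFFFF,           # lower 16 bits
        "ah":  (rax >> 8) & 0xFF,      # bits 8-15, the upper byte of AX
        "al":  rax & 0xFF,             # bits 0-7
    }

for name, value in views(0x1122334455667788).items():
    print(f"{name:>3} = {value:#x}")
```

The same masks apply to R8/R8D/R8W/R8B, except that the new registers have no counterpart to AH.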

Why This Matters for Exploitation

Partial register access is essential for avoiding null bytes in shellcode. Compare:

mov eax, 0xb         ; b8 0b 00 00 00  — contains three null bytes
xor eax, eax         ; 31 c0           — zeros EAX, no nulls
mov al, 0xb          ; b0 0b           — sets only the low byte

Both approaches (the single mov, and the xor followed by mov al) leave EAX = 0x0000000b, but only the second encodes without null bytes.
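
A quick way to check an encoding for null bytes is to scan the raw opcode bytes. This minimal Python sketch uses the byte sequences from the listing above; the helper name null_offsets is made up for illustration.

```python
# Machine-code bytes copied from the listing above.
DIRECT = bytes.fromhex("b80b000000")    # mov eax, 0xb
NULL_FREE = bytes.fromhex("31c0b00b")   # xor eax, eax ; mov al, 0xb

def null_offsets(code: bytes) -> list:
    """Return the offset of every 0x00 byte in an encoded instruction stream."""
    return [i for i, b in enumerate(code) if b == 0]

print(null_offsets(DIRECT))     # [2, 3, 4]
print(null_offsets(NULL_FREE))  # []
```

The same scan works on a full shellcode blob before it goes through a string-based input such as strcpy.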

Note: On x64, writing to a 32-bit register (like EAX) automatically zeros the upper 32 bits of the full 64-bit register. Writing to 16-bit or 8-bit portions does not zero the upper bits. This behavior matters when constructing values in shellcode.
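
The difference between the two write behaviors is easy to get wrong, so here is a small Python model of it. It is only a sketch of the rule stated in the note: write_partial is a hypothetical helper, and the 8-bit case models the low byte (AL), not AH.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def write_partial(reg: int, value: int, width: int) -> int:
    """Model writing `value` to the low `width` bits of a 64-bit register."""
    if width == 64:
        return value & MASK64
    if width == 32:
        # 32-bit writes zero-extend: the upper half is cleared.
        return value & 0xFFFFFFFF
    if width in (8, 16):
        # Narrower writes merge: bits above `width` are preserved.
        low_mask = (1 << width) - 1
        return (reg & ~low_mask & MASK64) | (value & low_mask)
    raise ValueError("width must be 8, 16, 32, or 64")

rax = 0xDEADBEEFCAFEBABE
print(hex(write_partial(rax, 0xB, 8)))    # 0xdeadbeefcafeba0b
print(hex(write_partial(rax, 0xB, 32)))   # 0xb
```

This is why x64 shellcode usually clears the full register first (for example with xor eax, eax) before writing AL.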

The Flags Register

The EFLAGS/RFLAGS register contains status bits set by arithmetic and comparison instructions. The flags you’ll encounter most in exploitation:

Flag  Name            Set When
ZF    Zero Flag       Result of last operation was zero
CF    Carry Flag      Unsigned overflow occurred (carry or borrow)
SF    Sign Flag       Result was negative (MSB is 1)
OF    Overflow Flag   Signed overflow occurred
DF    Direction Flag  Set by std, cleared by cld; controls string operation direction

How Flags Drive Execution

Conditional jumps read flags to make decisions:

cmp eax, 0           ; Sets ZF=1 if EAX is zero
je  target           ; Jump if ZF=1 (Equal / Zero)
jne target           ; Jump if ZF=0 (Not Equal / Not Zero)
jl  target           ; Jump if SF≠OF (Less Than, signed)
jb  target           ; Jump if CF=1 (Below, unsigned)
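
To see why signed and unsigned jumps can disagree, it helps to compute the flags by hand. The following Python sketch models a 32-bit cmp as a subtraction and evaluates the four jump conditions listed above; cmp32 and taken are hypothetical helper names.

```python
def cmp32(a: int, b: int) -> dict:
    """Compute the flags that `cmp a, b` (a - b) would set for 32-bit operands."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    result = (a - b) & 0xFFFFFFFF
    return {
        "ZF": int(result == 0),
        "CF": int(a < b),                      # unsigned borrow
        "SF": result >> 31,                    # top bit of the result
        "OF": ((a ^ b) & (a ^ result)) >> 31,  # signed overflow
    }

def taken(jump: str, f: dict) -> bool:
    """Evaluate the jump conditions from the listing above."""
    return {
        "je":  f["ZF"] == 1,
        "jne": f["ZF"] == 0,
        "jl":  f["SF"] != f["OF"],
        "jb":  f["CF"] == 1,
    }[jump]

flags = cmp32(0xFFFFFFFF, 1)   # -1 vs 1 when signed, 4294967295 vs 1 when unsigned
print(flags, taken("jl", flags), taken("jb", flags))
```

For this input jl is taken and jb is not, because 0xFFFFFFFF is less than 1 only under a signed interpretation.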

In GDB-PEDA, the EFLAGS display lists every flag in parentheses after the raw value:

EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)

Uppercase means the flag is set; lowercase means cleared.
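
The raw value can be decoded by testing individual bits. This Python sketch reproduces the display format shown above from the standard EFLAGS bit positions; decode_eflags is an illustrative helper, not PEDA's own code.

```python
# Bit positions of the flags, in the order the display above lists them.
FLAG_BITS = [
    ("carry", 0), ("parity", 2), ("adjust", 4), ("zero", 6), ("sign", 7),
    ("trap", 8), ("interrupt", 9), ("direction", 10), ("overflow", 11),
]

def decode_eflags(value: int) -> str:
    """Render an EFLAGS value with set flags in uppercase, cleared in lowercase."""
    names = [n.upper() if (value >> bit) & 1 else n for n, bit in FLAG_BITS]
    return " ".join(names)

print(decode_eflags(0x246))
# carry PARITY adjust ZERO sign trap INTERRUPT direction overflow
```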

Calling Conventions

A calling convention defines how functions receive arguments and return values. Getting this right is critical for ROP chains and ret2libc attacks — if you put arguments in the wrong place, the target function won’t see them.

x86: cdecl (C Declaration)

On 32-bit Linux, the dominant calling convention is cdecl:

  • Arguments: pushed onto the stack right-to-left
  • Return value: in EAX
  • Caller-saved: EAX, ECX, EDX (caller must save these if it needs them after the call)
  • Callee-saved: EBX, ESI, EDI, EBP (callee must restore these before returning)
  • Stack cleanup: caller removes arguments after the call returns

Example — calling printf("Value: %d\n", 42):

push 42               ; Second argument (pushed first — right-to-left)
push format_string     ; First argument
call printf
add esp, 8            ; Caller cleans up 2 arguments (4 bytes each)

Stack layout at the moment printf begins executing:

        ┌─────────────────────┐
ESP →   │ return address      │  ← pushed by CALL
        ├─────────────────────┤
        │ format_string ptr   │  ← arg 1
        ├─────────────────────┤
        │ 42                  │  ← arg 2
        └─────────────────────┘

This is why ret2libc payloads on x86 are structured as:

[ function address ][ return address ][ arg1 ][ arg2 ][ ... ]

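When the vulnerable function returns, its ret pops the function address into EIP. The called function then finds the next word at [ESP], where it expects its return address, and reads its arguments from the stack slots above that.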
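
In practice this layout is usually assembled with a short script. The Python sketch below packs it as little-endian 32-bit words; every address and the padding length are hypothetical placeholders that would have to come from the actual target.

```python
import struct

def p32(value: int) -> bytes:
    """Pack an integer as a 32-bit little-endian word, as x86 stores it."""
    return struct.pack("<I", value)

# All values below are hypothetical placeholders, not real addresses.
OFFSET_TO_RET = 76          # padding needed to reach the saved return address
SYSTEM_ADDR = 0xF7E1C2D0    # address of system()
EXIT_ADDR = 0xF7E0F1A0      # return address that system() will return to
BINSH_ADDR = 0xF7F5B3C8     # address of the string "/bin/sh"

payload = b"A" * OFFSET_TO_RET
payload += p32(SYSTEM_ADDR)  # overwrites the saved return address
payload += p32(EXIT_ADDR)    # what system() sees at [ESP] on entry
payload += p32(BINSH_ADDR)   # what system() sees at [ESP+4]: arg 1

print(payload[OFFSET_TO_RET:].hex())
```

Using a real function such as exit() for the return-address slot lets the process terminate cleanly instead of crashing once system() returns.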

x64: System V AMD64 ABI

On 64-bit Linux, function arguments go in registers first:

Argument  Register
1st       RDI
2nd       RSI
3rd       RDX
4th       RCX
5th       R8
6th       R9
7th+      Stack (right-to-left)

  • Return value: in RAX (and RDX for 128-bit returns)
  • Caller-saved: RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11
  • Callee-saved: RBX, RBP, R12, R13, R14, R15
  • Stack alignment: RSP must be 16-byte aligned before a call instruction

Example — calling write(1, buf, len):

mov rdi, 1             ; fd = stdout
mov rsi, buf           ; buffer address
mov rdx, len           ; byte count
call write

No stack manipulation needed for six or fewer arguments. This is why x64 ROP chains require pop rdi; ret gadgets — you need to load argument registers from the stack.
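
As a rough illustration of loading argument registers from the stack, the Python sketch below lays out a chain that performs write(1, buf, 16). The gadget and function addresses are invented for the example, and a real binary may not contain all three pop gadgets.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as a 64-bit little-endian word."""
    return struct.pack("<Q", value)

# Hypothetical gadget and function addresses; real ones come from the target.
POP_RDI = 0x401233   # pop rdi; ret
POP_RSI = 0x401231   # pop rsi; ret
POP_RDX = 0x40123A   # pop rdx; ret
WRITE = 0x401060     # write()
BUF = 0x404080       # buffer to print

def call3(func: int, rdi: int, rsi: int, rdx: int) -> bytes:
    """Chain that loads three argument registers, then returns into func."""
    return b"".join(p64(v) for v in (
        POP_RDI, rdi,
        POP_RSI, rsi,
        POP_RDX, rdx,
        func,
    ))

chain = call3(WRITE, 1, BUF, 16)   # write(1, buf, 16)
print(len(chain))                  # 56 bytes: seven 8-byte slots
```

Each gadget pops the following slot into its register and then returns into the next gadget, so the chain reads top to bottom in execution order.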

16-Byte Stack Alignment

The System V AMD64 ABI requires RSP to be 16-byte aligned at the point of a call instruction. Since call pushes an 8-byte return address, RSP is 8 bytes past a 16-byte boundary when the called function starts, and compiled code is laid out on that assumption. Functions that use alignment-sensitive SSE instructions such as movaps (common in libc) can crash with a segfault if RSP is off by 8.

In ROP chains, if your exploit crashes inside a library function, try adding a single ret gadget before the function call to adjust alignment:

[ pop rdi; ret ] [ "/bin/sh" addr ] [ ret ] [ system() addr ]

The extra ret pops 8 more bytes off the stack, shifting RSP by 8 so that system() starts with the same alignment it would see after an ordinary call.
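
Whether the extra ret is needed depends on where the chain sits in memory, which can be worked out with simple arithmetic. The Python sketch below is one way to reason about it; the chain base address and helper names are assumptions made for the example.

```python
def rsp_at_entry(chain_base: int, slot_index: int) -> int:
    """RSP when the function in chain slot `slot_index` starts executing.

    Every slot before it, and the slot itself, has been popped by then,
    and each pop advances RSP by 8 bytes.
    """
    return chain_base + 8 * (slot_index + 1)

def looks_like_normal_call(rsp: int) -> bool:
    """After a real `call`, RSP is 8 past a 16-byte boundary."""
    return rsp % 16 == 8

CHAIN_BASE = 0x7FFFFFFFE018   # hypothetical address of the first chain slot

# [pop rdi; ret][arg][system]       -> system is slot 2
# [pop rdi; ret][arg][ret][system]  -> system is slot 3
for label, slot in (("without ret", 2), ("with ret", 3)):
    rsp = rsp_at_entry(CHAIN_BASE, slot)
    print(label, hex(rsp), looks_like_normal_call(rsp))
```

With this particular base address only the variant with the extra ret passes the check. A chain that starts 8 bytes higher or lower would give the opposite result, which is why the extra gadget is something to try rather than a fixed rule.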

Syscall Conventions

Syscalls use a different convention from function calls. This matters when you’re writing shellcode or building ROP chains that invoke syscalls directly.

x86 Syscalls (int 0x80)

Purpose         Register
Syscall number  EAX
Arg 1           EBX
Arg 2           ECX
Arg 3           EDX
Arg 4           ESI
Arg 5           EDI
Arg 6           EBP

x64 Syscalls (syscall instruction)

Purpose         Register
Syscall number  RAX
Arg 1           RDI
Arg 2           RSI
Arg 3           RDX
Arg 4           R10
Arg 5           R8
Arg 6           R9

Note: The x64 syscall convention uses R10 instead of RCX for the 4th argument. This is because the syscall instruction clobbers RCX (it stores the return address in RCX). Similarly, R11 is clobbered (it stores RFLAGS).
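
The two tables can be captured as a lookup that maps a syscall and its arguments onto registers, which is handy when planning shellcode or a syscall ROP chain. This is a small Python sketch; syscall_setup and the "/bin/sh" address are hypothetical, while the execve numbers (11 on x86, 59 on x64) are the standard Linux values.

```python
# Register order for each convention, taken from the tables above.
SYSCALL_REGS = {
    "x86": ("eax", ["ebx", "ecx", "edx", "esi", "edi", "ebp"]),
    "x64": ("rax", ["rdi", "rsi", "rdx", "r10", "r8", "r9"]),
}

def syscall_setup(arch: str, number: int, *args: int) -> dict:
    """Map a syscall number and its arguments onto registers."""
    number_reg, arg_regs = SYSCALL_REGS[arch]
    if len(args) > len(arg_regs):
        raise ValueError("Linux syscalls take at most six arguments")
    regs = {number_reg: number}
    regs.update(zip(arg_regs, args))
    return regs

BINSH = 0x404080  # hypothetical address of "/bin/sh"

# execve("/bin/sh", NULL, NULL): syscall 11 on x86, 59 on x64
print(syscall_setup("x86", 11, BINSH, 0, 0))
print(syscall_setup("x64", 59, BINSH, 0, 0))
```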

Observing Registers in GDB-PEDA

Viewing All Registers

gdb-peda$ info registers

PEDA’s context display shows the registers automatically every time the program stops, coloring each value by the kind of memory it points to (code, data, or read-only data).

Viewing Specific Registers

gdb-peda$ i r rdi rsi rdx
rdi            0x7fffffffe5a0   140737488348576
rsi            0x0              0
rdx            0x0              0

Modifying Registers

Useful for testing what happens if a register has a specific value:

gdb-peda$ set $rdi = 0x41414141
gdb-peda$ set $rax = 59

Watching Register Changes

Set a breakpoint and step through instructions, watching how registers change:

gdb-peda$ b *main
gdb-peda$ r
gdb-peda$ si

PEDA reprints the register context after each step, so you can compare values from one instruction to the next and follow data flow.

Key Takeaways

  • x86 has 8 general-purpose registers; x64 extends these to 64 bits and adds R8–R15.
  • EIP/RIP is the instruction pointer you’re trying to control in a buffer overflow. It cannot be set directly — only through ret, call, jmp, or similar instructions.
  • Partial register access (AL, AX, EAX, RAX) is essential for writing null-free shellcode.
  • On x86, function arguments go on the stack (cdecl). On x64, the first six go in registers: RDI, RSI, RDX, RCX, R8, R9 (System V AMD64 ABI).
  • Syscall conventions differ from function call conventions — notably, x64 syscalls use R10 instead of RCX for the 4th argument.
  • x64 requires 16-byte stack alignment before call instructions. If your ROP chain crashes in a library function, add a ret gadget to fix alignment.