Every embedded device runs code. Most of that code was never written to be secure, and most manufacturers don’t expect anyone to look at it. But firmware updates are usually downloadable, unencrypted, and packed with binaries that were cross-compiled with minimal hardening. Getting from a .bin file on a vendor’s support page to readable decompiled code in Ghidra is often surprisingly straightforward.
This tutorial walks through the full workflow: downloading firmware, using binwalk to identify and extract the contents, navigating the filesystem to find high-value targets, and reverse engineering those targets in Ghidra to locate real vulnerabilities — hardcoded credentials, command injection, authentication bypasses.
No physical device or soldering required. We work entirely with firmware images.
The target mindset
Before touching any tools, think about what you’re looking for. Firmware reverse engineering isn’t about disassembling every function. It’s about identifying the small number of binaries that handle untrusted input and auditing those.
High-value targets in embedded firmware:
| Binary | Why | Common vulns |
|---|---|---|
| Web server / CGI handlers | Directly reachable from the network | Command injection, auth bypass, stack overflow |
| DHCP/DNS daemons | Handle network input | Buffer overflow, format string |
| Update/upgrade handlers | Parse downloaded files | Path traversal, code execution |
| Configuration daemons | Store and apply settings | Hardcoded credentials, shell injection |
| Custom proprietary daemons | Often least audited | All of the above |
The proprietary binaries — anything not from an open-source project — are the most interesting. Vendors write custom code for device management, and that code rarely gets the scrutiny of mainstream projects.
Setting up the toolkit
Install the tools on your workstation.
# Debian/Ubuntu
sudo apt install -y binwalk squashfs-tools cramfstools \
  p7zip-full unzip mtd-utils gzip bzip2 lzma xz-utils \
  python3 python3-pip
# jefferson and ubi_reader are both published as Python packages
pip3 install jefferson ubi_reader
# Ghidra: download from https://ghidra-sre.org/
# Unpack and add to PATH
# Arch Linux
sudo pacman -S binwalk squashfs-tools p7zip unzip mtd-utils \
python python-pip
# Ghidra from AUR
yay -S ghidra
Note
jefferson is a JFFS2 extraction tool. ubi_reader handles UBI/UBIFS images common on NAND-based devices. Install both: you won’t know which filesystem format a firmware uses until you look.
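Before reaching for either tool, the magic bytes at the start of a filesystem image usually give the format away. The sketch below is a minimal illustration of that idea; `identify_filesystem` and the `SIGNATURES` table are hypothetical names invented for this example, not part of binwalk. The JFFS2 check matches only two bytes, so treat a hit there as a hint rather than proof.

```python
# Magic bytes for filesystems that commonly appear in firmware images.
# Longer signatures come first so the 2-byte JFFS2 magic is tried last.
SIGNATURES = {
    b"hsqs": "squashfs (little endian)",
    b"sqsh": "squashfs (big endian)",
    b"UBI#": "UBI erase block (use ubi_reader)",
    b"\x31\x18\x10\x06": "UBIFS node (use ubi_reader)",
    b"\x45\x3d\xcd\x28": "cramfs (little endian)",
    b"\x85\x19": "JFFS2 (little endian, use jefferson)",
    b"\x19\x85": "JFFS2 (big endian, use jefferson)",
}


def identify_filesystem(data: bytes, offset: int = 0) -> str:
    """Return a label for the filesystem whose magic starts at `offset`."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic, offset):
            return label
    return "unknown"
```

Pass the offset that binwalk reports for a region, or 0 for a file that is already a bare filesystem image.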
Optional: SPI flash tools
If you’re extracting from physical hardware (not covered here, but good to have):
sudo apt install -y flashrom
# For UART: minicom or picocom
sudo apt install -y picocom
Getting firmware
Firmware images come from several sources, roughly in order of effort:
- Vendor download page: check the support/downloads section for the device model
- Firmware update traffic: intercept the device’s update check with a proxy
- Flash memory dump: desolder or clip the SPI flash chip and read it with flashrom (hardware needed)
For this tutorial, download a firmware image from a vendor’s public support page. Router firmware is ideal because it’s widely available, contains a full Linux system, and the web management interface is a rich attack surface.
mkdir -p ~/firmware-lab && cd ~/firmware-lab
# Example: download a router firmware (substitute with your target)
wget -O firmware.bin "https://support.example.com/firmware/router_v1.2.3.bin"
Warning
Legal considerations: Reverse engineering firmware you own or have authorization to test is permitted in many jurisdictions under interoperability and security research exemptions, but the details vary. Do not distribute extracted proprietary code. Check your local laws.
Analyzing with binwalk
Binwalk scans a binary file for embedded filesystems, compressed archives, and known file signatures. Start with a signature scan.
binwalk firmware.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------
0             0x0             uImage header, image size: 1572864
64            0x40            LZMA compressed data, properties: 0x5D
1572928       0x180040        Squashfs filesystem, little endian, version 4.0,
                              size: 5242880 bytes, 312 inodes, blocksize: 131072
This tells you the firmware contains:
- A U-Boot uImage header at offset 0 (the wrapper U-Boot uses to describe the kernel image that follows)
- LZMA-compressed data at offset 64 (likely the kernel)
- A SquashFS filesystem at offset 0x180040 (the root filesystem — this is what we want)
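The offsets in that scan are not arbitrary: the legacy uImage header is 64 bytes of big-endian fields, and its size field says how long the payload after it is. The sketch below parses that header by hand. The function name and returned keys are illustrative choices for this example, and it assumes the legacy uImage format rather than the newer FIT format.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# magic, header CRC, timestamp, data size, load addr, entry point, data CRC,
# then OS, architecture, image type, compression (1 byte each) and a 32-byte name.
HEADER = struct.Struct(">7I4B32s")  # 64 bytes, big-endian


def parse_uimage_header(data: bytes) -> dict:
    """Decode a legacy U-Boot uImage header from the start of `data`."""
    if len(data) < HEADER.size:
        raise ValueError("need at least 64 bytes")
    (magic, hcrc, _time, size, load, entry, _dcrc,
     _os, _arch, _type, _comp, name) = HEADER.unpack_from(data)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage header")
    # The header CRC is computed with the CRC field itself set to zero.
    zeroed = data[:4] + b"\0\0\0\0" + data[8:HEADER.size]
    return {
        "name": name.rstrip(b"\0").decode("ascii", "replace"),
        "data_size": size,
        "load_addr": load,
        "entry_point": entry,
        "header_crc_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == hcrc,
        "payload_offset": HEADER.size,          # where the kernel data starts
        "next_offset": HEADER.size + size,      # first byte after the payload
    }
```

For the example scan, a 64-byte header plus an image size of 1572864 gives 1572928, which is exactly where binwalk reports the SquashFS filesystem.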
Entropy analysis
Binwalk’s entropy scan reveals encryption and compression visually.
binwalk -E firmware.bin
This generates an entropy plot. High, flat entropy (close to 1.0) means the data is encrypted or compressed. If the entire file is high-entropy with no structure, the firmware is likely encrypted, and you’ll need to find the decryption key (often in the bootloader or a previous unencrypted firmware version).
Typical patterns:
Entropy: 0.1 ████ ← headers, padding (low entropy)
Entropy: 0.99 ████████████████████████████ ← compressed kernel (high, expected)
Entropy: 0.97 ████████████████████████████ ← squashfs (high, expected)
Tip
Encrypted firmware: If the whole image is high-entropy, check for a two-stage update format: an unencrypted header with a decryption routine, followed by encrypted payload. Some vendors ship a “bootstrap” firmware that’s unencrypted. Downgrade to that version first, extract the decryption logic, then decrypt the latest firmware.
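The numbers on an entropy plot are Shannon entropy computed over fixed-size blocks and scaled to the 0 to 1 range. The sketch below reproduces that calculation so you can see where the values come from. The 4096-byte block size and the function names are assumptions made for this example, not binwalk’s internals.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a block, normalised to 0.0..1.0 (bits per byte / 8)."""
    if not block:
        return 0.0
    total = len(block)
    bits = sum(-(n / total) * math.log2(n / total) for n in Counter(block).values())
    return bits / 8


def entropy_profile(data: bytes, block_size: int = 4096) -> list:
    """Return (offset, entropy) pairs, one per block, in file order."""
    return [(offset, shannon_entropy(data[offset:offset + block_size]))
            for offset in range(0, len(data), block_size)]
```

Padding made of a single repeated byte scores 0, while compressed or encrypted regions sit close to 1 across every block. A region that hovers in between is usually plain code or text.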
Extracting the filesystem
Use binwalk’s extraction mode.
binwalk -e firmware.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
-------------------------------------------------------
...
1572928       0x180040        Squashfs filesystem, ...
                              -> extracted to: _firmware.bin.extracted/squashfs-root/
Binwalk creates _firmware.bin.extracted/ with the unpacked contents. The SquashFS root filesystem is in squashfs-root/.
ls _firmware.bin.extracted/squashfs-root/
bin dev etc lib mnt proc sbin sys tmp usr var www
That’s a complete Linux filesystem. If binwalk’s automatic extraction fails (it sometimes does with unusual formats), extract manually:
# Find the offset and extract the squashfs block
dd if=firmware.bin of=rootfs.squashfs bs=1 skip=1572928
# Unsquash it
unsquashfs rootfs.squashfs
# Creates squashfs-root/
For other filesystem types:
# JFFS2 (NOR flash)
jefferson firmware.jffs2 -d jffs2-root/
# UBIFS (NAND flash)
ubireader_extract_files firmware.ubi -o ubifs-root/
# CRAMFS
cramfsck -x cramfs-root/ firmware.cramfs
Understanding the firmware layout
Navigate the extracted filesystem and build a map.
cd _firmware.bin.extracted/squashfs-root
# What architecture?
file bin/busybox
# bin/busybox: ELF 32-bit LSB executable, MIPS, MIPS32 rel2 version 1, dynamically linked, stripped
# What libc?
file lib/libc.so* lib/libc-* 2>/dev/null
# lib/libc.so.0 -> libuClibc-0.9.33.2.so
# What's the init system?
cat etc/inittab 2>/dev/null | head -10
# What services start at boot?
ls etc/init.d/
cat etc/rc.d/rcS 2>/dev/null
The file output tells you:
- Architecture: MIPS 32-bit little-endian (common in routers)
- Linking: dynamically linked (libraries are in /lib)
- Stripped: no debug symbols (normal for production firmware)
- C library: uClibc (lighter than glibc, common in embedded)
Note
The architecture determines which Ghidra language to use and which QEMU variant you need if you want to run the binaries. MIPS and ARM are the most common. See the cross-compiling tutorial for setting up QEMU for these architectures.
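The fields that `file` prints come straight from the first 20 bytes of the ELF header, so the same decision can be scripted. The sketch below reads the class, byte order, and machine type and maps them to a `qemu-user-static` binary name. The `describe_elf` function and both lookup tables are hypothetical and cover only a handful of architectures.

```python
import struct

# e_machine values from the ELF specification (small subset).
MACHINES = {3: "x86", 8: "mips", 20: "ppc", 40: "arm", 62: "x86_64", 183: "aarch64"}

# qemu-user-static binary names for architectures common in routers.
QEMU_USER = {
    ("mips", "little", 32): "qemu-mipsel-static",
    ("mips", "big", 32): "qemu-mips-static",
    ("arm", "little", 32): "qemu-arm-static",
    ("arm", "big", 32): "qemu-armeb-static",
    ("aarch64", "little", 64): "qemu-aarch64-static",
}


def describe_elf(header: bytes) -> dict:
    """Summarise architecture details from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] not in (1, 2) or header[5] not in (1, 2):
        raise ValueError("unrecognised ELF class or byte order")
    bits = 32 if header[4] == 1 else 64              # EI_CLASS
    endian = "little" if header[5] == 1 else "big"   # EI_DATA
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)  # e_machine
    arch = MACHINES.get(machine, "unknown (%d)" % machine)
    return {"arch": arch, "endian": endian, "bits": bits,
            "qemu": QEMU_USER.get((arch, endian, bits))}
```

Reading 20 bytes of `bin/busybox` and passing them in is enough; a `qemu` value of `None` means the table has no entry for that combination.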
Identifying high-value targets
Web interface binaries
The web management interface is the highest-priority target. It’s network-reachable and handles user input.
# Find the web root
ls www/ var/www/ usr/share/www/ 2>/dev/null
# Find CGI binaries (these handle form submissions) and the web server itself
find . \( -path "*/cgi-bin/*" -o -name "*.cgi" -o -name "httpd" \) 2>/dev/null
./www/cgi-bin/admin.cgi
./www/cgi-bin/setup.cgi
./www/cgi-bin/firmware_upgrade.cgi
./usr/sbin/httpd
Each .cgi file is a compiled binary that processes HTTP requests. httpd is the web server itself (often a custom fork of GoAhead, mini_httpd, or a proprietary implementation).
Hardcoded credentials
Search for credentials in plaintext config files and binaries.
# Config files
grep -ri "password\|passwd\|secret\|token\|key" etc/ 2>/dev/null
etc/config/admin.conf:admin_password=admin123
etc/ppp/chap-secrets:* * "ISPpassword"
# Password hashes (shadow lines contain no keyword, so match the account name)
grep -H "^root:" etc/shadow 2>/dev/null
etc/shadow:root:$1$abc$xyz...:0:0:99999:7:::
# Strings in binaries
strings usr/sbin/httpd | grep -iE "password|admin|root|login|auth"
admin
admin123
Authorization: Basic
/etc/config/admin.conf
invalid password
Warning
How common is this? Extremely. Hardcoded credentials in firmware are one of the most frequently found vulnerability classes in IoT devices. They often appear as default passwords, backdoor accounts, or API keys compiled directly into binaries.
# SSH/TLS keys
find . \( -name "*.pem" -o -name "*.key" -o -name "id_rsa" -o -name "id_dsa" \) 2>/dev/null
# Certificate files (shared across all devices of this model)
find . \( -name "*.crt" -o -name "*.cert" \) 2>/dev/null
If you find private keys baked into the firmware, every device of that model shares them, so anyone who extracts the key can impersonate or intercept TLS for all of those devices.
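Keys and credentials are not always stored as separate files; they can also be compiled into a binary or packed into a blob with no telling filename. The sketch below combines a minimal `strings`-style scan with a search for PEM private-key headers. The function names and keyword list are illustrative assumptions, and the 4-character minimum mirrors the default of the `strings` utility.

```python
import re

# Runs of at least four printable ASCII characters, like `strings` reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
KEYWORDS = re.compile(rb"password|passwd|secret|token|admin|root", re.IGNORECASE)
PEM_PRIVATE = re.compile(rb"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----")


def credential_strings(blob: bytes) -> list:
    """Printable strings in `blob` that mention a credential-related keyword."""
    return [m.group().decode("ascii")
            for m in PRINTABLE_RUN.finditer(blob)
            if KEYWORDS.search(m.group())]


def embedded_private_keys(blob: bytes) -> list:
    """Byte offsets of PEM private-key headers found anywhere in `blob`."""
    return [m.start() for m in PEM_PRIVATE.finditer(blob)]
```

Run both over the raw bytes of each binary under `usr/sbin` and `www/cgi-bin`; an offset from `embedded_private_keys` tells you where to start carving the key out.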
Binary security posture
Check what protections the target binaries were compiled with.
# If checksec is available (install from github.com/slimm609/checksec.sh)
for bin in usr/sbin/httpd www/cgi-bin/*.cgi; do
    echo "=== $bin ==="
    checksec --file="$bin" 2>/dev/null
done
=== usr/sbin/httpd ===
RELRO      STACK CANARY      NX            PIE
No RELRO   No canary found   NX disabled   No PIE
No RELRO, no stack canary, no NX, no PIE. This is typical for embedded firmware: cross-compilation toolchains often have protections disabled by default, and vendors rarely enable them.
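If checksec isn’t installed, most of what it reports can be read directly from the ELF program headers. The sketch below is a rough approximation for 32-bit ELF files only, not a reimplementation of checksec. The function name is hypothetical, the canary check is a simple search for the `__stack_chk_fail` symbol name, and a missing `PT_GNU_STACK` header is treated as an executable stack.

```python
import struct

PT_GNU_STACK = 0x6474E551
PT_GNU_RELRO = 0x6474E552
PF_X = 0x1   # segment is executable
ET_DYN = 3   # shared object or position-independent executable


def elf32_protections(data: bytes) -> dict:
    """Rough checksec-style summary for a 32-bit ELF image held in memory."""
    if data[:4] != b"\x7fELF" or data[4] != 1:
        raise ValueError("expected a 32-bit ELF file")
    end = "<" if data[5] == 1 else ">"
    (e_type,) = struct.unpack_from(end + "H", data, 16)
    (e_phoff,) = struct.unpack_from(end + "I", data, 28)
    e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 42)
    stack_flags = None
    relro = False
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type == PT_GNU_STACK:
            (stack_flags,) = struct.unpack_from(end + "I", data, base + 24)  # p_flags
        elif p_type == PT_GNU_RELRO:
            relro = True
    return {
        # Assumption: no PT_GNU_STACK header means the stack is executable.
        "nx": stack_flags is not None and not (stack_flags & PF_X),
        "relro": relro,                         # partial or full, not distinguished
        "pie": e_type == ET_DYN,                # also true for shared libraries
        "canary": b"__stack_chk_fail" in data,  # heuristic: symbol name present
    }
```

Older MIPS toolchains often emit no `PT_GNU_STACK` header at all, which is one reason the `NX disabled` result is so common on router firmware.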
Reverse engineering in Ghidra
Now load the most interesting binary into Ghidra for deep analysis.
Setting up the project
# Launch Ghidra
ghidraRun &
- Create a new project: File → New Project → Non-Shared Project
- Import the target binary: File → Import File → select usr/sbin/httpd
- Ghidra auto-detects the architecture from the ELF header (MIPS:LE:32:default for our example)
- Click Yes when asked to analyze, accept default analyzers, and wait for analysis to complete
Finding input handlers
Start by locating functions that process HTTP requests. These are the entry points for network-reachable attacks.
In Ghidra’s Symbol Tree or Search → For Strings, look for HTTP-related strings.
Search → For Strings:
"GET "
"POST "
"Content-Length"
"password"
"/cgi-bin/"
Double-click a string reference to see where it’s used. Ghidra shows cross-references (XREFs): the functions that reference that string.
Tracing from input to vulnerability
Here’s a systematic approach. Take the admin.cgi binary as an example.
Step 1: Find the entry point for user input.
CGI binaries receive input via environment variables (QUERY_STRING, CONTENT_LENGTH) and stdin. Search for calls to getenv.
Search → For Strings: "QUERY_STRING"
Follow the XREF to find where the query string is read:
// Ghidra decompilation (cleaned up)
char *query = getenv("QUERY_STRING");
char *content_len = getenv("CONTENT_LENGTH");
int len = atoi(content_len);
char *post_data = malloc(len + 1);
fread(post_data, 1, len, stdin);
Step 2: Follow the data through processing.
Find where query or post_data is parsed and used. Look for parameter extraction:
// Common pattern in CGI binaries
char *username = get_param(post_data, "username");
char *password = get_param(post_data, "password");
char *cmd = get_param(post_data, "command");
Step 3: Find the sink — where user data reaches a dangerous function.
The most common vulnerability classes in embedded firmware:
Command injection
Search for calls to system(), popen(), execve().
// Vulnerable pattern
char cmd_buf[256];
sprintf(cmd_buf, "ping -c 1 %s", user_input); // no sanitization
system(cmd_buf);
In Ghidra, list every reference to system: find the system import in the Symbol Tree, then right-click → References → Find References To.
Every call site is a potential injection point. Check whether user input reaches the buffer passed to system() without sanitization.
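To see why an unsanitized parameter is dangerous, it helps to model what the shell receives. The sketch below mirrors the vulnerable `sprintf()` pattern in Python without executing anything, and contrasts it with the check the handler should have made. All three function names are hypothetical, and the metacharacter set is a conservative illustration rather than a complete list for every shell.

```python
import ipaddress

# Characters that let input escape the intended argument of a shell command.
SHELL_METACHARS = set(";|&$`<>(){}\\\"'\n\r*?!#~ \t")


def build_ping_command(target: str) -> str:
    """Mirror of the vulnerable sprintf(): the input is pasted in verbatim."""
    return "ping -c 1 %s" % target


def injected_metachars(target: str) -> list:
    """Shell metacharacters in the input that would change the command's meaning."""
    return sorted(set(target) & SHELL_METACHARS)


def is_plain_ip(target: str) -> bool:
    """The missing validation: accept only a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(target)
        return True
    except ValueError:
        return False
```

With the input `127.0.0.1; cat /etc/shadow`, the built string contains a `;`, so the shell would run `ping` and then a second command. When you audit a call site, ask whether any check equivalent to `is_plain_ip` sits between the parameter parser and `system()`.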
// Decompiled Ghidra output (MIPS, cleaned up)
void handle_ping(char *target_ip) {
    char buf[128];
    // VULNERABILITY: target_ip comes directly from POST parameter
    // No input validation: inject with 127.0.0.1; cat /etc/shadow
    sprintf(buf, "ping -c 4 %s > /tmp/ping_result", target_ip);
    system(buf);
}
Stack buffer overflow
Search for strcpy, sprintf, strcat, gets — unbounded copy functions.
void process_auth(char *post_data) {
    char username[32];
    char password[32];
    // VULNERABILITY: no length check
    strcpy(username, get_param(post_data, "username"));
    strcpy(password, get_param(post_data, "password"));
    ...
}
In Ghidra, the decompiler shows these clearly. Check the buffer size (visible in the stack frame layout) against the input source.
Tip
Stack frame analysis in Ghidra: Click on a function, then open Window → Function Graph or look at the decompiler’s variable declarations. Local variables show their stack offsets. If a char[32] buffer is passed to strcpy with unbounded input, you’ve found an overflow. The difference between the buffer’s stack offset and the offset of the saved return address tells you how many bytes of input it takes to reach the return address.
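The distance to the return address is simple subtraction once you have two stack offsets. The sketch below works it through for a hypothetical frame, assuming Ghidra’s default `local_XX` naming, where `XX` is the variable’s hex offset below the top of the frame. The specific offsets (`local_48` for the buffer, `local_4` for the saved return address) are invented for illustration.

```python
def ghidra_local_offset(name: str) -> int:
    """Convert a Ghidra-style name such as 'local_48' to its hex offset (0x48)."""
    prefix, _, hex_part = name.partition("_")
    if prefix != "local" or not hex_part:
        raise ValueError("unexpected variable name: %s" % name)
    return int(hex_part, 16)


def bytes_until(buffer_name: str, target_name: str) -> int:
    """Bytes written from the start of the buffer before reaching the target slot."""
    distance = ghidra_local_offset(buffer_name) - ghidra_local_offset(target_name)
    if distance <= 0:
        raise ValueError("target is not above the buffer on the stack")
    return distance


# Hypothetical frame: char username[32] at local_48, saved return address at local_4.
padding = bytes_until("local_48", "local_4")  # 0x48 - 0x4 = 0x44 = 68 bytes
overflow_by = padding - 32                    # 36 bytes past the end of the buffer
```

In this frame, input longer than 32 bytes starts corrupting neighbouring locals, and bytes 69 through 72 land on the saved return address. Confirm the real offsets in the target’s own stack frame before relying on numbers like these.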
Authentication bypass
Search for comparison functions and authentication logic.
// Vulnerable pattern: strcmp returns 0 on match
if (strcmp(password, "admin123") == 0) {
    authenticated = 1;
}
// Another pattern: checking a hardcoded hash
if (strcmp(md5(password), "5f4dcc3b5aa765d61d8327deb882cf99") == 0) {
    // MD5 of "password": unsalted, so trivially looked up
    authenticated = 1;
}
Also look for missing authentication: paths that reach privileged functionality without checking auth at all:
// Direct CGI handler — check if auth is verified before processing
void handle_firmware_upload(char *post_data) {
    // Is there an auth check before this function?
    // If not, anyone can upload firmware without logging in
    save_firmware(post_data);
    system("mtd write /tmp/firmware.bin firmware");
}
Building a vulnerability map
As you find issues, document them systematically.
┌─────────────────────────────────────────────────────┐
│ Firmware: Router Model X v1.2.3 │
│ Architecture: MIPS32 LE │
│ Extracted: squashfs, uClibc 0.9.33 │
├─────────────────────────────────────────────────────┤
│ Finding 1: Command injection in admin.cgi │
│ Function: handle_ping @ 0x00401a30 │
│ Sink: system() call with unsanitized POST param │
│ Input: "target" parameter from /cgi-bin/admin.cgi │
│ Auth required: Yes (session cookie) │
│ Severity: High (post-auth RCE) │
├─────────────────────────────────────────────────────┤
│ Finding 2: Hardcoded credentials │
│ File: /etc/config/admin.conf │
│ Creds: admin / admin123 │
│ Severity: Critical │
├─────────────────────────────────────────────────────┤
│ Finding 3: Stack overflow in setup.cgi │
│ Function: process_wifi_config @ 0x00402100 │
│ Buffer: 64 bytes, strcpy from POST "ssid" param │
│ No NX, no canary, no ASLR │
│ Severity: Critical (pre-auth or post-auth RCE) │
├─────────────────────────────────────────────────────┤
│ Finding 4: Shared TLS private key │
│ File: /etc/ssl/server.key │
│ Impact: HTTPS interception for all units │
│ Severity: High │
└─────────────────────────────────────────────────────┘
Running extracted binaries in QEMU
For dynamic analysis, you can run extracted binaries without a full device using QEMU user-mode emulation.
# Install QEMU user-mode for the target architecture
sudo apt install qemu-user-static
# Point ROOTFS at the extracted filesystem so QEMU can find the firmware's libraries
export ROOTFS=~/firmware-lab/_firmware.bin.extracted/squashfs-root
# Run a binary using the firmware's own libraries
qemu-mipsel-static -L "$ROOTFS" "$ROOTFS/usr/sbin/httpd"
If the binary expects specific files or devices, create them.
# Many daemons check /dev/nvram or /proc/mtd
mkdir -p /tmp/fake-nvram
echo "admin_password=admin123" > /tmp/fake-nvram/nvram.ini
# Set environment variables the CGI expects
export QUERY_STRING="target=127.0.0.1;id"
export REQUEST_METHOD="GET"
export CONTENT_LENGTH=0
# Run the CGI directly
qemu-mipsel-static -L "$ROOTFS" "$ROOTFS/www/cgi-bin/admin.cgi"
Warning
NVRAM emulation: Many embedded binaries call nvram_get() to read configuration. If the binary crashes immediately, it’s likely failing to read NVRAM. Tools like nvram-faker (an LD_PRELOAD library) or Firmadyne can emulate NVRAM for you. This is the most common obstacle in dynamic firmware analysis.
For full system emulation with networking (to interact with the web interface), use the QEMU environment from the cross-compiling tutorial with the extracted rootfs mounted via NFS or packed into an image.
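For the user-mode runs above, typing environment variables by hand gets tedious once you start trying many inputs. The sketch below builds the command line and CGI environment for one emulated request. The `cgi_invocation` helper is hypothetical; it sets only the three variables used earlier and assumes the MIPS little-endian QEMU binary unless told otherwise.

```python
def cgi_invocation(rootfs: str, cgi_path: str, query: str, body: bytes = b"",
                   qemu: str = "qemu-mipsel-static"):
    """Return (argv, env, stdin_bytes) for one emulated CGI request."""
    env = {
        "REQUEST_METHOD": "POST" if body else "GET",
        "QUERY_STRING": query,
        "CONTENT_LENGTH": str(len(body)),
    }
    # The CGI path is given relative to the firmware root, e.g. /www/cgi-bin/admin.cgi
    binary = rootfs.rstrip("/") + "/" + cgi_path.lstrip("/")
    argv = [qemu, "-L", rootfs, binary]
    return argv, env, body
```

The three values map directly onto `subprocess.run(argv, env=env, input=body, capture_output=True, timeout=10)`, which keeps each request isolated and bounds any hang caused by missing NVRAM or devices.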
Automating the analysis
Create a script that performs the initial triage automatically.
#!/bin/bash
# firmware-triage.sh - automated first-pass firmware analysis
FIRMWARE=$1
if [ -z "$FIRMWARE" ]; then
    echo "Usage: $0 <firmware.bin>"
    exit 1
fi
echo "=== Firmware Triage: $FIRMWARE ==="
echo ""
echo "--- File Info ---"
file "$FIRMWARE"
echo "Size: $(du -h "$FIRMWARE" | cut -f1)"
echo ""
echo "--- Binwalk Signature Scan ---"
binwalk "$FIRMWARE"
echo ""
echo "--- Extracting ---"
binwalk -e "$FIRMWARE" -q
EXTRACTED="_$(basename "$FIRMWARE").extracted"
if [ ! -d "$EXTRACTED" ]; then
    # Fall back to the most recently created extraction directory
    EXTRACTED=$(ls -dt _*.extracted 2>/dev/null | head -1)
fi
ROOTFS=$(find "$EXTRACTED" -type d \( -name "squashfs-root" -o -name "jffs2-root" -o -name "rootfs" \) 2>/dev/null | head -1)
if [ -z "$ROOTFS" ]; then
    echo "No filesystem extracted. May need manual extraction."
    exit 1
fi
echo "Rootfs: $ROOTFS"
echo ""
echo "--- Architecture ---"
# file exits 0 even when the path is missing, so test for busybox explicitly
if [ -e "$ROOTFS/bin/busybox" ]; then
    file "$ROOTFS/bin/busybox"
else
    file "$ROOTFS"/bin/* 2>/dev/null | head -1
fi
echo ""
echo "--- Interesting Binaries ---"
echo "Web servers:"
find "$ROOTFS" -type f \( -name "httpd" -o -name "lighttpd" -o -name "nginx" -o -name "goahead" \) 2>/dev/null
echo "CGI handlers:"
find "$ROOTFS" -name "*.cgi" 2>/dev/null
echo "Custom daemons:"
find "$ROOTFS" -path "*/sbin/*" -type f 2>/dev/null | while read -r f; do
    file "$f" 2>/dev/null | grep -q "ELF" && echo "$f"
done
echo ""
echo "--- Hardcoded Credentials Search ---"
grep -ri "password\|passwd\|secret\|token" "$ROOTFS/etc/" 2>/dev/null | head -20
echo ""
echo "Strings in web binaries:"
find "$ROOTFS" -type f \( -name "httpd" -o -name "*.cgi" \) 2>/dev/null | while read -r bin; do
    hits=$(strings "$bin" | grep -ciE "password|admin|root|backdoor|secret")
    echo " $bin: $hits credential-related strings"
done
echo ""
echo "--- Binary Protections ---"
find "$ROOTFS" -type f \( -name "httpd" -o -name "*.cgi" \) 2>/dev/null | head -5 | while read -r bin; do
    echo " === $(basename "$bin") ==="
    checksec --file="$bin" 2>/dev/null || echo " (checksec not available)"
done
echo ""
echo "--- SSH/TLS Keys ---"
find "$ROOTFS" -type f \( -name "*.pem" -o -name "*.key" -o -name "id_rsa" -o -name "id_dsa" \) 2>/dev/null
echo ""
echo "=== Triage Complete ==="
chmod +x firmware-triage.sh
./firmware-triage.sh firmware.bin
Limitations and next steps
This tutorial covers the most accessible firmware analysis workflow, but real-world targets introduce complications.
What we didn’t cover:
- Encrypted firmware — some vendors encrypt update files; you need the decryption key (often in the bootloader or an older unencrypted version)
- Custom compression — proprietary packing formats that binwalk doesn’t recognize; look for compression routines in the bootloader
- Bare-metal firmware — devices without Linux (microcontrollers running RTOS or no OS) require different tools (Binary Ninja, IDA Pro, or Ghidra with custom loaders)
- Hardware interfaces — UART shell access, JTAG debugging, and SPI flash dumping give you more information than firmware files alone
Where to go from here:
- Run the extracted binaries under GDB in QEMU and build working exploits for the vulnerabilities you find — the cross-compiling tutorial provides the environment
- Apply the attack surface audit methodology to the extracted filesystem
- Report vulnerabilities responsibly through the vendor’s security contact or a coordinated disclosure program