Reverse Engineering and CTF Challenge Notes
Reverse engineering is the process of understanding how software works without access to its original source code.
In security challenges (CTFs), the goal is often to recover hidden data or logic by dissecting a binary.
This walkthrough documents my first attempt at such a challenge, focusing on ELF-based reverse engineering and the reasoning process behind each step.
1. Static Analysis
Static analysis means inspecting the binary without running it.
file ./binary # Type, architecture, stripped?
strings ./binary # Extract readable strings
checksec ./binary # Security features (PIE, NX, RELRO, Canaries)
ELF Header
readelf -h matryoshka
Shows class, endianness, type, entry point, etc.
Here, Type: DYN is used for both PIE executables and shared libraries, so the type alone does not say which one this is.
The INTERP program header:
readelf -l matryoshka | grep INTERP
# [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
confirms it’s an executable using the dynamic loader.
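To make these header fields concrete, here is a minimal Python sketch that reads `e_type` and the entry point from an ELF64 little-endian header and looks for a `PT_INTERP` program header. The function name `describe_elf` is my own, the offsets follow the standard ELF64 layout, and the sketch handles only 64-bit LSB files, so treat it as an illustration and not as a replacement for readelf.

```python
import struct

ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
PT_INTERP = 3  # program header type that names the dynamic loader


def describe_elf(data: bytes) -> dict:
    """Return e_type, entry point, and interpreter path of an ELF64 LSB image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")

    (e_type,) = struct.unpack_from("<H", data, 16)
    e_entry, e_phoff = struct.unpack_from("<QQ", data, 24)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    interp = None
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, off)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", data, off + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, off + 32)
            raw = data[p_offset:p_offset + p_filesz]
            interp = raw.rstrip(b"\0").decode()

    return {
        "type": ET_NAMES.get(e_type, hex(e_type)),
        "entry": e_entry,
        "interp": interp,
    }
```

Calling `describe_elf(open("matryoshka", "rb").read())` should report the same type and interpreter that readelf prints; a DYN file with no interpreter would more likely be a shared library.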
Quick Summary via file
file matryoshka
# ELF 64-bit LSB pie executable ... stripped
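“Stripped” = static symbol table (.symtab) removed → local function and variable names are lost. The dynamic symbols needed to resolve libc imports are still present.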
Inspect Strings
strings -n 10 matryoshka | less
Look for clues, readable data, or embedded keys.
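For reference, what `strings -n 10` does can be approximated in a few lines of Python. This is a rough sketch: the helper name `printable_strings` is hypothetical, and it only looks for runs of printable ASCII (plus tab), without the encoding options of the real tool.

```python
import re


def printable_strings(data: bytes, min_len: int = 10):
    """Yield (offset, text) for runs of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")
```

Keeping the offset alongside each string is handy later, because it tells you where in the file an interesting string or blob lives.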
Security Properties
checksec --file matryoshka
| Feature | Status | Meaning |
|---|---|---|
| PIE | ✅ | Position-independent (ASLR randomization) |
| NX | ✅ | Non-executable stack |
| Canary | ✅ | Stack overflow detection |
| RELRO | Partial | GOT only partly read-only (the PLT part stays writable) |
| Symbols | ❌ | Stripped |
2. Dynamic Linking & libc Calls
readelf -r matryoshka lists relocations.
Interesting imports included:
memfd_create
ftruncate
fdopen
fwrite
execve
memfd_create stands out — it creates anonymous in-memory files which can later be executed via /proc/self/fd/<n>.
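As a small illustration of the mechanism (not of the challenge binary itself), the sketch below uses Python's `os.memfd_create`, available on Linux with Python 3.8 or later, to write bytes into an anonymous file and read them back through its `/proc/self/fd/<n>` path. The helper name `memfd_roundtrip` is hypothetical.

```python
import os


def memfd_roundtrip(payload: bytes) -> bytes:
    """Write payload to an anonymous memfd, then reopen it via /proc/self/fd/<n>."""
    fd = os.memfd_create("x", os.MFD_CLOEXEC)  # Linux only
    try:
        os.write(fd, payload)
        # The /proc path behaves like a regular file path for this fd,
        # which is why it can also be handed to execve.
        with open(f"/proc/self/fd/{fd}", "rb") as f:
            return f.read()
    finally:
        os.close(fd)
```

Nothing is ever written to disk here: the file exists only as long as some descriptor to it stays open.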
3. Decompilation & Function Discovery
Using radare2 (or Ghidra):
r2 matryoshka
[0x000010d0]> aaa # analyze all
[0x000010d0]> afl # list functions
Functions found included main, fcn.00101344, fcn.001012f4, etc.
The call chain revealed this logic:
main
└── operation()
    ├── xor_cipher()
    ├── write_to_memfd()
    └── exec_memfd_stream()
Decompiled Logic (after renaming)
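/* Single-byte XOR applied in place; running it twice restores the input. */
void xor_cipher(char *buf, int len, uint8_t key) {
    for (int i = 0; i < len; i++)
        buf[i] ^= key;
}

/* Copy the decrypted blob into an anonymous in-memory file. */
FILE *write_to_memfd(char *buf, int len) {
    int fd = memfd_create("x", 1);      /* flag 1 == MFD_CLOEXEC */
    ftruncate(fd, len);
    FILE *f = fdopen(fd, "r+");
    fwrite(buf, 1, len, f);
    rewind(f);                          /* also flushes buffered data to the fd */
    return f;
}

/* Execute the memfd through its /proc/self/fd/<n> path. */
void exec_memfd_stream(FILE *f) {
    char path[64];
    sprintf(path, "/proc/self/fd/%d", fileno(f));
    char *argv[] = { NULL };            /* empty argv, not even argv[0] */
    execve(path, argv, NULL);
}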
main performs:
xor_cipher(magic_blob, len, argv[1][0] - 0x57);
FILE *f = write_to_memfd(magic_blob, len);
exec_memfd_stream(f);
So the program:
- XOR-decrypts an embedded blob.
- Writes it to an in-memory file.
- Executes that file.
If the XOR key is wrong, the blob is garbage → execve fails.
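The key derivation and the "wrong key gives garbage" behaviour can be mirrored in Python. This is a sketch of the decompiled logic as I renamed it, not the binary's actual code; `decrypt_with_arg` and `looks_like_elf` are hypothetical helper names, and the `& 0xFF` mimics the truncation to `uint8_t`.

```python
def decrypt_with_arg(blob: bytes, arg: str) -> bytes:
    """XOR every byte with (first byte of arg - 0x57) mod 256."""
    key = (arg.encode()[0] - 0x57) & 0xFF
    return bytes(b ^ key for b in blob)


def looks_like_elf(data: bytes) -> bool:
    """True if the buffer starts with the ELF magic, i.e. the key was plausible."""
    return data[:4] == b"\x7fELF"
```

Because XOR is its own inverse, calling `decrypt_with_arg` twice with the same argument returns the original bytes, so the same routine serves for encryption and decryption.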
4. Recovering the XOR Key
From .rodata:
magic_blob[0..3] = 70 4A 43 49
Expected ELF magic = 7F 45 4C 46
XOR both sequences:
(70 4A 43 49) ^ (7F 45 4C 46) = 0F 0F 0F 0F
→ XOR key = 0x0F
Since the program subtracts 0x57 from the first input character, the required input is:
input_char = 0x0F + 0x57 = 0x66 = 'f'
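The same arithmetic as a short Python check. It is a known-plaintext attack on a single-byte XOR: every byte position has to yield the same key, otherwise the assumption is wrong. The function name `recover_key` is mine; the byte values are the ones quoted above.

```python
ELF_MAGIC = bytes.fromhex("7f454c46")


def recover_key(cipher_prefix: bytes, known_plain: bytes = ELF_MAGIC) -> int:
    """Recover a single-byte XOR key from ciphertext and its expected plaintext."""
    keys = {c ^ p for c, p in zip(cipher_prefix, known_plain)}
    if len(keys) != 1:
        raise ValueError("not a single-byte XOR of the expected plaintext")
    return keys.pop()


key = recover_key(bytes.fromhex("704a4349"))
input_char = chr((key + 0x57) & 0xFF)  # undo the subtraction done in main
print(hex(key), input_char)  # 0xf f
```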
Run:
./matryoshka f
It silently decrypts the blob, writes it to a memfd (fd 3, reachable as /proc/self/fd/3), and tries to execve it.
5. Tracing Execution
Using strace
strace -f -s 200 -o trace.txt ./matryoshka f
grep execve trace.txt
The second execve call failed with EFAULT (bad argv),
but a valid ELF had already been written to fd 3.
Extract the In-Memory File
Find the PID, then copy the fd while the process is still alive (for example, paused under gdb):
ls -l /proc/<pid>/fd
cp /proc/<pid>/fd/3 /tmp/magic_blob.bin
file /tmp/magic_blob.bin
Result: another valid ELF executable.
6. Nested Binaries (Matryoshka Concept)
Each extracted ELF repeated the same pattern:
- Hard-coded string,
- XOR cipher,
- memfd + execve.
By recursively extracting:
matryoshka → magic_blob.bin → magic_blob2.bin → magic_blob3.bin
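Since every layer hides the next ELF behind a single-byte XOR, the peeling could also be attempted statically. The sketch below scans a file for a 4-byte window that equals the ELF magic XORed with one non-zero byte. Both function names are hypothetical, and there are two loud assumptions: the first hit is the real blob, and decrypting to the end of the file is acceptable (the true blob length is stored in the binary, so trailing bytes are over-decrypted junk).

```python
ELF_MAGIC = b"\x7fELF"


def find_xored_elf(data: bytes):
    """Return (offset, key) of the first window that is ELF_MAGIC XORed with
    a single non-zero byte, or None if no such window exists."""
    for off in range(len(data) - 3):
        key = data[off] ^ ELF_MAGIC[0]
        # key == 0 would match the outer file's own, unencrypted header.
        if key and all(data[off + i] ^ key == ELF_MAGIC[i] for i in range(4)):
            return off, key
    return None


def peel_layer(data: bytes):
    """Decrypt from the first hit to the end of the data, or return None."""
    hit = find_xored_elf(data)
    if hit is None:
        return None
    off, key = hit
    return bytes(b ^ key for b in data[off:])
```

Applying `peel_layer` repeatedly until it returns `None` would walk the chain without running any layer, which is a safer habit when the sample is not a friendly CTF binary.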
The final layer contained:
int main(int argc, char **argv) {
if (atoi(argv[1]) == 9)
puts("u win good job!!!!");
else
fail();
}
./magic_blob3.bin 9
# u win good job!!!!
7. What We Learned
| Concept | Insight |
|---|---|
| ELF structure | Headers, program segments, and section tables |
| PIE & ASLR | Why offsets are used instead of fixed addresses |
| memfd_create | Running programs entirely from RAM |
| XOR cipher | Simple obfuscation, broken here by comparing against the known ELF magic |
| execve | Replaces the process image; also accepts a /proc/self/fd/<n> path |
| strace & gdb | System-call and runtime tracing |
| Nested binaries | “Matryoshka” style layering |
🧠 Reflection
This challenge elegantly combines:
- basic cryptography (XOR),
- Linux process internals,
- ELF familiarity,
- and runtime analysis.
It demonstrates how self-modifying or self-executing code can conceal payloads — a technique used both in CTF puzzles and real-world malware.
Future Ideas
- Write a script to automate recursive extraction of nested ELF layers.
- Explore how memfd-based execution can evade disk forensics.
- Try similar challenges on other CTF sites.
🏁 Final Thoughts
“Each layer you peel back reveals a little more of how the system really works.”
Reverse engineering isn’t just about “breaking” programs —
it’s about learning how compilers, linkers, and the OS cooperate to make software run.