Reverse Engineering and CTF Challenge Notes

Reverse engineering is the process of understanding how software works without access to its original source code.
In security challenges (CTFs), the goal is often to recover hidden data or logic by dissecting a binary.
This walkthrough documents my first attempt at such a challenge, focusing on ELF-based reverse engineering and the reasoning process behind each step.


1. Static Analysis

Static analysis means inspecting the binary without running it.

file ./binary           # Type, architecture, stripped?
strings ./binary        # Extract readable strings
checksec ./binary       # Security features (PIE, NX, RELRO, Canaries)

ELF Header

readelf -h matryoshka

Shows class, endianness, type, entry point, etc.
Here, Type: DYN is consistent with a PIE executable, but shared libraries report DYN too, so the header alone does not settle it.

The INTERP program header:

readelf -l matryoshka | grep INTERP
# [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

confirms it’s an executable using the dynamic loader.
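To make the header fields less abstract, here is a minimal Python sketch that decodes the same leading fields readelf -h prints. The function name parse_elf_header_start and the synthetic sample bytes are my own illustration, not taken from the challenge binary; the offsets are the standard ELF header layout (16-byte e_ident, then 16-bit e_type and e_machine).

```python
import struct

ELF_TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}

def parse_elf_header_start(header: bytes) -> dict:
    """Decode class, endianness, type, and machine from the first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    byte_order = {1: "<", 2: ">"}.get(header[5])   # EI_DATA: 1 = LSB, 2 = MSB
    if byte_order is None:
        raise ValueError("unknown EI_DATA value")
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": {1: "ELF32", 2: "ELF64"}.get(header[4], "unknown"),
        "data": "little endian" if byte_order == "<" else "big endian",
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": e_machine,   # 62 (0x3E) is x86-64
    }

# Synthetic 20-byte header for illustration (not read from the challenge binary).
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
print(parse_elf_header_start(sample))
```

On a real file, the same function could be fed the first 20 bytes read with open(path, "rb").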

Quick Summary via file

file matryoshka
# ELF 64-bit LSB pie executable ... stripped

“Stripped” = symbol table removed → variable/function names lost.

Inspect Strings

strings -n 10 matryoshka | less

Look for clues, readable data, or embedded keys.
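For reference, this is roughly what strings -n 10 does, as a small Python sketch. It is an approximation under my own assumptions (printable ASCII plus tab counts as "readable"), and the helper name printable_runs is hypothetical; the real tool has more options and encodings.

```python
import re

def printable_runs(data: bytes, min_len: int = 10) -> list:
    """Return runs of at least min_len printable ASCII characters (or tabs)."""
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Made-up bytes for illustration: two long runs separated by binary junk.
demo = b"\x00\x01short\x00this is a long string\x00\xff/proc/self/fd/%d\x00"
for run in printable_runs(demo):
    print(run)
```

Lowering min_len surfaces more candidates at the cost of more noise, which is the same trade-off as the -n flag.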

Security Properties

checksec --file matryoshka

Feature   Status     Meaning
PIE       Enabled    Position-independent (ASLR randomization)
NX                   Non-executable stack
Canary               Stack overflow detection
RELRO     Partial    Some GOT protection
Symbols   Stripped

2. Dynamic Linking & libc Calls

readelf -r matryoshka lists the relocations, which name the imported libc functions.
Interesting imports included:

memfd_create
ftruncate
fdopen
fwrite
execve

memfd_create stands out — it creates anonymous in-memory files which can later be executed via /proc/self/fd/<n>.
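To see what an anonymous in-memory file looks like from user space, here is a small Python sketch. It assumes Linux and Python 3.8 or newer (where os.memfd_create exists); the helper name stage_in_memfd is my own, and the sketch only stages bytes without executing anything.

```python
import os

def stage_in_memfd(payload: bytes, name: str = "x"):
    """Write payload into an anonymous in-memory file.

    Returns (fd, path), where path is the /proc/self/fd/<n> alias of the fd.
    Linux only: os.memfd_create is not available on other platforms.
    """
    fd = os.memfd_create(name, os.MFD_CLOEXEC)
    view = memoryview(payload)
    while view:                      # os.write may write fewer bytes than asked
        view = view[os.write(fd, view):]
    return fd, "/proc/self/fd/%d" % fd
```

A call such as stage_in_memfd(b"hello") returns a descriptor that never touches the disk; the name argument is only a label shown in /proc, not a filesystem path.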


3. Decompilation & Function Discovery

Using radare2 (or Ghidra):

r2 matryoshka
[0x000010d0]> aaa        # analyze all
[0x000010d0]> afl        # list functions

Functions found included main, fcn.00101344, fcn.001012f4, etc.
The call chain revealed this logic:

main
 └── operation()
      ├── xor_cipher()
      ├── write_to_memfd()
      └── exec_memfd_stream()

Decompiled Logic (after renaming)

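#define _GNU_SOURCE          /* for memfd_create */
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* XOR every byte of buf with a single-byte key (in place). */
void xor_cipher(char *buf, int len, uint8_t key) {
    for (int i = 0; i < len; i++)
        buf[i] ^= key;
}

/* Copy buf into an anonymous in-memory file and return it as a stream. */
FILE *write_to_memfd(char *buf, int len) {
    int fd = memfd_create("x", MFD_CLOEXEC);   /* flag value 1 in the binary */
    ftruncate(fd, len);
    FILE *f = fdopen(fd, "r+");
    fwrite(buf, 1, len, f);
    rewind(f);                                 /* also flushes the buffered write */
    return f;
}

/* Execute the in-memory file through its /proc/self/fd/<n> path. */
void exec_memfd_stream(FILE *f) {
    char path[64];
    sprintf(path, "/proc/self/fd/%d", fileno(f));
    char *argv[] = { NULL };
    execve(path, argv, NULL);
}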

Through operation(), main effectively performs:

xor_cipher(magic_blob, len, argv[1][0] - 0x57);
FILE *f = write_to_memfd(magic_blob, len);
exec_memfd_stream(f);

So the program:

  1. XOR-decrypts an embedded blob.
  2. Writes it to an in-memory file.
  3. Executes that file.

If the XOR key is wrong, the blob is garbage → execve fails.
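The effect of a right versus wrong key can be sketched in a few lines of Python. The payload here is synthetic (an ELF magic followed by filler), the key 0x2A is arbitrary, and the helper names are mine; the point is only that single-byte XOR is its own inverse and that a wrong key destroys the \x7fELF magic the kernel looks for.

```python
ELF_MAGIC = b"\x7fELF"

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores data."""
    return bytes(b ^ (key & 0xFF) for b in data)

def looks_like_elf(data: bytes) -> bool:
    return data.startswith(ELF_MAGIC)

# Synthetic payload and arbitrary key, for illustration only.
payload = ELF_MAGIC + b"-rest-of-inner-binary"
encrypted = xor_bytes(payload, 0x2A)

print(looks_like_elf(xor_bytes(encrypted, 0x2A)))  # True: correct key
print(looks_like_elf(xor_bytes(encrypted, 0x2B)))  # False: wrong key, garbage
```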


4. Recovering the XOR Key

From .rodata:

magic_blob[0..3] = 70 4A 43 49
Expected ELF magic = 7F 45 4C 46

XOR both sequences:

(70 4A 43 49) ^ (7F 45 4C 46) = 0F 0F 0F 0F

→ XOR key = 0x0F

Since the program subtracts 0x57 from the first input character, the required input is:

input_char = 0x0F + 0x57 = 0x66 = 'f'
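The same arithmetic as a short Python sketch, using the four blob bytes quoted above. The function name recover_key is my own; it assumes a single-byte XOR key and a plaintext that starts with the ELF magic, and it refuses to answer if the bytes disagree.

```python
def recover_key(cipher_prefix: bytes, known_plain: bytes = b"\x7fELF") -> int:
    """Known-plaintext recovery of a single-byte XOR key."""
    candidates = {c ^ p for c, p in zip(cipher_prefix, known_plain)}
    if len(candidates) != 1:
        raise ValueError("prefix is not consistent with a single-byte XOR key")
    return candidates.pop()

key = recover_key(bytes.fromhex("704a4349"))   # magic_blob[0..3] from .rodata
arg_char = chr((key + 0x57) & 0xFF)            # undo the "- 0x57" in main
print(hex(key), arg_char)                      # 0xf f
```

Checking all four bytes rather than just the first is a cheap sanity test that the cipher really is a constant-key XOR.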

Run:

./matryoshka f

It silently decrypts the blob, writes it to an in-memory file (fd 3, visible as /proc/self/fd/3), and tries to execve it.


5. Tracing Execution

Using strace

strace -f -s 200 -o trace.txt ./matryoshka f
grep execve trace.txt

The second execve call failed with EFAULT (bad argv),
but a valid ELF had already been written to fd 3.
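When the trace gets long, a small Python filter can pull out just the execve calls and their results. This is a sketch: the regular expression targets the usual "pid syscall(args) = ret ERRNO (text)" layout of strace -f -o output, and the sample lines below are made up for illustration, not copied from my trace.txt.

```python
import re

EXECVE_RE = re.compile(
    r'^(?:\d+\s+)?execve\("(?P<path>[^"]*)".*\)\s+=\s+(?P<ret>-?\d+)'
    r'(?:\s+(?P<errno>[A-Z]+))?'
)

def execve_calls(trace_text: str) -> list:
    """Return (path, return value, errno name or None) for each execve line."""
    calls = []
    for line in trace_text.splitlines():
        m = EXECVE_RE.match(line)
        if m:
            calls.append((m["path"], int(m["ret"]), m["errno"]))
    return calls

# Illustrative lines only; PIDs, addresses, and argument rendering are invented.
sample_trace = """\
4242 execve("./matryoshka", ["./matryoshka", "f"], 0x7ffc0000 /* 30 vars */) = 0
4242 memfd_create("x", MFD_CLOEXEC) = 3
4242 execve("/proc/self/fd/3", 0x7ffc1000, NULL) = -1 EFAULT (Bad address)
"""
for call in execve_calls(sample_trace):
    print(call)
```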

Extract the In-Memory File

While the process is still alive (for example, paused in gdb), find its PID and copy the fd:

ls -l /proc/<pid>/fd
cp /proc/<pid>/fd/3 /tmp/magic_blob.bin
file /tmp/magic_blob.bin

Result: another valid ELF executable.


6. Nested Binaries (Matryoshka Concept)

Each extracted ELF repeated the same pattern:

  • Hard-coded string,
  • XOR cipher,
  • memfd + execve.

By recursively extracting:

matryoshka → magic_blob.bin → magic_blob2.bin → magic_blob3.bin

The final layer contained:

int main(int argc, char **argv) {
    if (atoi(argv[1]) == 9)
        puts("u win good job!!!!");
    else
        fail();
}

Run:

./magic_blob3.bin 9
# u win good job!!!!

7. What We Learned

Concept           Insight
ELF structure     Headers, program segments, and section tables
PIE & ASLR        Why offsets are used instead of fixed addresses
memfd_create      Running programs entirely from RAM
XOR cipher        Simple but effective obfuscation
execve            Direct system call interface
strace & gdb      System-call and runtime tracing
Nested binaries   “Matryoshka” style layering

🧠 Reflection

This challenge elegantly combines:

  • basic cryptography (XOR),
  • Linux process internals,
  • ELF familiarity,
  • and runtime analysis.

It demonstrates how self-modifying or self-executing code can conceal payloads — a technique used both in CTF puzzles and real-world malware.


🏁 Final Thoughts

“Each layer you peel back reveals a little more of how the system really works.”

Reverse engineering isn’t just about “breaking” programs —
it’s about learning how compilers, linkers, and the OS cooperate to make software run.