From e5e5443fcc3627922a7ee5643d589e23fffb0abd Mon Sep 17 00:00:00 2001
From: Ariadne Conill
Date: Wed, 25 Jan 2023 10:26:43 -0800
Subject: [PATCH] fix up code formatting on qemu blog

---
 ...-emulation-to-reverse-engineer-binaries.md | 108 ++++++++++--------
 1 file changed, 62 insertions(+), 46 deletions(-)

diff --git a/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md b/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md
index cc9970c..4926858 100644
--- a/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md
+++ b/content/blog/using-qemu-user-emulation-to-reverse-engineer-binaries.md
@@ -11,6 +11,7 @@ However, most people don't realize that you can run a `qemu-user` emulator which
 
 You don't need `gdb` for this to be a powerful reverse engineering tool, however.  The emulator itself includes many powerful tracing features.  Lets look into them by writing and compiling a sample program, that does some recursion by [calculating whether a number is even or odd inefficiently](https://ariadne.space/2021/04/27/the-various-ways-to-check-if-an-integer-is-even/):
 
+```c
 #include
 #include
 
@@ -29,84 +30,94 @@ int main(void) {
    printf("isEven(%d): %d\\n", 1025, isEven(1025));
    return 0;
 }
+```
 
 Compile this program with `gcc`, by doing `gcc -ggdb3 -Os example.c -o example`.
 
 The next step is to install the `qemu-user` emulator for your architecture, in this case we want the `qemu-x86_64` package:
 
-$ doas apk add qemu-x86\_64
-(1/1) Installing qemu-x86\_64 (6.0.0-r1)
+```
+$ doas apk add qemu-x86_64
+(1/1) Installing qemu-x86_64 (6.0.0-r1)
 $
+```
 
 Normally, you would also want to install the `qemu-openrc` package and start the `qemu-binfmt` service to allow for the emulator to handle any program that couldn't be run natively, but that doesn't matter here as we will be running the emulator directly.
 
 The first thing we will do is check to make sure the emulator can run our sample program at all:
 
-$ qemu-x86\_64 ./example
+```
+$ qemu-x86_64 ./example
 isEven(1025): 0
+```
 
 Alright, all seems to be well.  Before we jump into using `gdb` with the emulator, lets play around a bit with the tracing features.  Normally when reverse engineering a program, it is common to use tracing programs like `strace`.  These tracing programs are quite useful, but they suffer from a design flaw: they use `ptrace(2)` to accomplish the tracing, which can be detected by the program being traced.  However, we can use qemu-user to do the tracing in a way that is transparent to the program being analyzed:
 
-$ qemu-x86\_64 -d strace ./example
-22525 arch\_prctl(4098,274903714632,136818691500777464,274903714112,274903132960,465) = 0
-22525 set\_tid\_address(274903715728,274903714632,136818691500777464,274903714112,0,465) = 22525
+```
+$ qemu-x86_64 -d strace ./example
+22525 arch_prctl(4098,274903714632,136818691500777464,274903714112,274903132960,465) = 0
+22525 set_tid_address(274903715728,274903714632,136818691500777464,274903714112,0,465) = 22525
 22525 brk(NULL) = 0x0000004000005000
 22525 brk(0x0000004000007000) = 0x0000004000007000
-22525 mmap(0x0000004000005000,4096,PROT\_NONE,MAP\_PRIVATE|MAP\_ANONYMOUS|MAP\_FIXED,-1,0) = 0x0000004000005000
-22525 mprotect(0x0000004001899000,4096,PROT\_READ) = 0
-22525 mprotect(0x0000004000003000,4096,PROT\_READ) = 0
+22525 mmap(0x0000004000005000,4096,PROT_NONE,MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED,-1,0) = 0x0000004000005000
+22525 mprotect(0x0000004001899000,4096,PROT_READ) = 0
+22525 mprotect(0x0000004000003000,4096,PROT_READ) = 0
 22525 ioctl(1,TIOCGWINSZ,0x00000040018052b8) = 0 ({55,236,0,0})
 isEven(1025): 0
 22525 writev(1,0x4001805250,0x2) = 16
-22525 exit\_group(0)
+22525 exit_group(0)
+```
 
 But we can do even more.  For example, we can learn how a CPU would hypothetically break a program down into translation buffers full of micro-ops (these are TCG micro-ops but real CPUs are similar enough to gain a general understanding of the concept):
 
-$ qemu-x86\_64 -d op ./example
+```
+$ qemu-x86_64 -d op ./example
 OP:
-ld\_i32 tmp11,env,$0xfffffffffffffff0
-brcond\_i32 tmp11,$0x0,lt,$L0
+ld_i32 tmp11,env,$0xfffffffffffffff0
+brcond_i32 tmp11,$0x0,lt,$L0
 
 ---- 000000400185eafb 0000000000000000
-discard cc\_dst
-discard cc\_src
-discard cc\_src2
-discard cc\_op
-mov\_i64 tmp0,$0x0
-mov\_i64 rbp,tmp0
+discard cc_dst
+discard cc_src
+discard cc_src2
+discard cc_op
+mov_i64 tmp0,$0x0
+mov_i64 rbp,tmp0
 
 ---- 000000400185eafe 0000000000000031
-mov\_i64 tmp0,rsp
-mov\_i64 rdi,tmp0
+mov_i64 tmp0,rsp
+mov_i64 rdi,tmp0
 
 ---- 000000400185eb01 0000000000000031
-mov\_i64 tmp2,$0x4001899dc0
-mov\_i64 rsi,tmp2
+mov_i64 tmp2,$0x4001899dc0
+mov_i64 rsi,tmp2
 
 ---- 000000400185eb08 0000000000000031
-mov\_i64 tmp1,$0xfffffffffffffff0
-mov\_i64 tmp0,rsp
-and\_i64 tmp0,tmp0,tmp1
-mov\_i64 rsp,tmp0
-mov\_i64 cc\_dst,tmp0
+mov_i64 tmp1,$0xfffffffffffffff0
+mov_i64 tmp0,rsp
+and_i64 tmp0,tmp0,tmp1
+mov_i64 rsp,tmp0
+mov_i64 cc_dst,tmp0
 
 ---- 000000400185eb0c 0000000000000019
-mov\_i64 tmp0,$0x400185eb11
-sub\_i64 tmp2,rsp,$0x8
-qemu\_st\_i64 tmp0,tmp2,leq,0
-mov\_i64 rsp,tmp2
-mov\_i32 cc\_op,$0x19
-goto\_tb $0x0
-mov\_i64 tmp3,$0x400185eb11
-st\_i64 tmp3,env,$0x80
-exit\_tb $0x7f72ebafc040
-set\_label $L0
-exit\_tb $0x7f72ebafc043
-\[...\]
+mov_i64 tmp0,$0x400185eb11
+sub_i64 tmp2,rsp,$0x8
+qemu_st_i64 tmp0,tmp2,leq,0
+mov_i64 rsp,tmp2
+mov_i32 cc_op,$0x19
+goto_tb $0x0
+mov_i64 tmp3,$0x400185eb11
+st_i64 tmp3,env,$0x80
+exit_tb $0x7f72ebafc040
+set_label $L0
+exit_tb $0x7f72ebafc043
+[...]
+```
 
 If you want to trace the actual CPU registers for every instruction executed, that's possible too:
 
-$ qemu-x86\_64 -d cpu ./example
+```
+$ qemu-x86_64 -d cpu ./example
 RAX=0000000000000000 RBX=0000000000000000 RCX=0000000000000000 RDX=0000000000000000
 RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000 RSP=0000004001805690
 R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000000
@@ -127,11 +138,13 @@ DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000
 DR6=00000000ffff0ff0 DR7=0000000000000400
 CCS=0000000000000000 CCD=0000000000000000 CCO=EFLAGS
 EFER=0000000000000500
-\[...\]
+[...]
+```
 
 You can also trace with disassembly for each translation buffer generated:
 
-$ qemu-x86\_64 -d in\_asm ./example
+```
+$ qemu-x86_64 -d in_asm ./example
 ----------------
 IN:
 0x000000400185eafb:  xor    %rbp,%rbp
@@ -152,14 +165,16 @@ IN:
 0x000000400185eb29:  inc    %rax
 0x000000400185eb2c:  test   %rcx,%rcx
 0x000000400185eb2f:  jne    0x400185eb21
-\[...\]
+[...]
+```
 
 All of these options, and more, can also be stacked.  For more ideas, look at `qemu-x86_64 -d help`.  Now, lets talk about using this with `gdb` using qemu-user's gdbserver functionality, which allows for `gdb` to control a remote machine.
 
 To start a program under gdbserver mode, we use the `-g` argument with a port number.  For example, `qemu-x86_64 -g 1234 ./example` will start our example program with a gdbserver listening on port 1234.  We can then connect to that gdbserver with `gdb`:
 
+```
 $ gdb ./example
-\[...\]
+[...]
 Reading symbols from ./example...
 (gdb) target remote localhost:1234
 Remote debugging using localhost:1234
@@ -176,11 +191,12 @@ Breakpoint 1, isEven (x=1025) at example.c:12
 No locals.
 #1  0x0000004000001269 in main () at example.c:16
 No locals.
+```
 
 All of this is happening without any knowledge or cooperation of the program.  As far as its concerned, its running as normal, there is no ptrace or any other weirdness.
 
 However, this is not 100% perfect: a program could be clever and run the `cpuid` instruction and check for `GenuineIntel` or `AuthenticAMD` and crash out if it doesn't see that it is running on a legitimate CPU.  Thankfully, qemu-user has the ability to spoof CPUs with the `-cpu` option.
 
-If you find yourself needing to spoof the CPU, you'll probably have the best results with a simple CPU type like `-cpu Opteron_G1-v1` or similar.  That CPU type spoofs an Opteron 240 processor, which was one of the first x86\_64 CPUs on the market.  You can get a full list of CPUs supported by your copy of the qemu-user emulator by doing `qemu-x86_64 -cpu help`.
+If you find yourself needing to spoof the CPU, you'll probably have the best results with a simple CPU type like `-cpu Opteron_G1-v1` or similar.  That CPU type spoofs an Opteron 240 processor, which was one of the first x86_64 CPUs on the market.  You can get a full list of CPUs supported by your copy of the qemu-user emulator by doing `qemu-x86_64 -cpu help`.
 
 There's a lot more qemu-user emulation can do to help with reverse engineering, for some ideas, look at `qemu-x86_64 -h` or similar.