This is the first post in what should become a blog series following my progress through “Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation”, Bruce Dang, Alexandre Gazet, Elias Bachaalany, Sebastien Josse, ISBN: 978-1-118-78731-1.
The book includes a number of exercises, and the authors encourage readers to blog their solutions. That is exactly what I intend to do.
The source code of the solutions and PDF versions of the write-ups can also be found in the GitHub repository at https://github.com/malchugan/PRE-Exercises.
Chapter1-Exercise1 – Task
“This function uses a combination SCAS and STOS to do its work. First, explain what is the type of the [EBP+8] and [EBP+C] in line 1 and 8, respectively. Next, explain what this snippet does.
01: 8B 7D 08 mov edi, [ebp+8]
02: 8B D7 mov edx, edi
03: 33 C0 xor eax, eax
04: 83 C9 FF or ecx, 0FFFFFFFFh
05: F2 AE repne scasb
06: 83 C1 02 add ecx, 2
07: F7 D9 neg ecx
08: 8A 45 0C mov al, [ebp+0Ch]
09: 8B FA mov edi, edx
10: F3 AA rep stosb
11: 8B C2 mov eax, edx”
Excerpt from: “Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation”,
Bruce Dang, Alexandre Gazet, Elias Bachaalany, Sebastien Josse, ISBN: 978-1-118-78731-1
My short answer
[EBP+8] is a pointer to char (the first element of a null-terminated string).
[EBP+0Ch] is a char (1 byte).
The snippet is the body of a function that overwrites every character of the given string with the supplied character, leaves the null terminator in place, and returns the original pointer. In C terms it behaves like memset(str, ch, strlen(str)).
From what I already knew about how function calls work in assembly, I assumed that the snippet is most probably part of a function body. The suspicion was fed by the use of the EBP register, particularly [EBP+8] and [EBP+0Ch], which usually address the function arguments. With 32-bit stack-based argument passing and the standard prologue (push ebp; mov ebp, esp), the stack layout inside the function is:
[EBP+0Ch] second argument
[EBP+8] first argument
[EBP+4] return address
[EBP] saved EBP of the caller
A detailed explanation of Intel x86 function calls can be found at http://unixwiz.net/techtips/win32-callconv-asm.html
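To make the interpretation concrete, here is a minimal C sketch of what I believe the snippet does. The function name fill_string and the parameter names are my own assumptions; the book does not name the function, and this is only one plausible source-level equivalent, not the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name and signature: the book only shows the function body.
 * str corresponds to [EBP+8], ch corresponds to [EBP+0Ch]. */
char *fill_string(char *str, char ch)
{
    size_t len = strlen(str);  /* lines 03-07: repne scasb, then add/neg */
    memset(str, ch, len);      /* lines 08-10: rep stosb with AL = ch */
    return str;                /* line 11: mov eax, edx */
}
```

Calling fill_string on a writable buffer holding "abcdefgh" with '*' as the second argument should leave eight asterisks followed by the untouched null terminator.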
Proof of concept
I decided to write an assembly program and run it through the IDA debugger to check whether my interpretation holds up, which it did.
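Note: This is a Linux assembly file using AT&T syntax. To assemble and link the code below, execute the following from the command line (on a 64-bit system you will likely need to add --32 to the as command and -m elf_i386 to the ld command, since the code is 32-bit):
$ as -gstabs -o ex1_att.o ex1_att.s
$ ld -o ex1_att ex1_att.o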
As a result you should have an ex1_att executable, which you can examine with GDB, IDA or any other debugger of your choice.
Here is the listing of the program ex1_att.s:
.data
myString:
    .asciz "abcdefgh"

.text
.globl _start
.type TestFunc, @function
TestFunc:
    # function prologue
    push %ebp                   # preserve the old EBP value
    movl %esp, %ebp
    # function body - same as the code from the book but in AT&T syntax
    movl 8(%ebp), %edi          # load the first function parameter in EDI - pointer to myString
    movl %edi, %edx             # store the initial EDI value in EDX
    xor %eax, %eax              # EAX = 0
    or $0xffffffff, %ecx        # set ECX to the maximum representable value
    repne scasb                 # compare the byte at EDI with AL (NULL) until a match is found or ECX becomes 0
    add $2, %ecx                # after repne scasb ECX = -strlen - 2; adding 2 gives ECX = -strlen
    neg %ecx                    # ECX = strlen
    movb 12(%ebp), %al          # load second function argument
    movl %edx, %edi             # load the initial EDI value (first argument) back to EDI
    rep stosb                   # store AL (the character) into strlen bytes starting at EDI
    movl %edx, %eax             # store EDX value in EAX (return the pointer to the string)
    # function epilogue
    movl %ebp, %esp             # restore the old ESP value
    pop %ebp                    # restore the old EBP value
    ret

_start:
    # push the function arguments onto the stack
    push $0x2A                  # ASCII '*'
    push $myString              # pointer to myString
    # call the function
    call TestFunc
    # exit gracefully (int 0x80 with EAX = 1, the exit system call; EBX = exit status)
    movl $1, %eax
    movl $0, %ebx
    int $0x80
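The ECX arithmetic in the function body (or, repne scasb, add, neg) is the least obvious part, so here is a small C model of it. It is only a sketch: the helper name scas_length is hypothetical, the variables merely stand in for the registers, and the direction flag is assumed to be clear so that EDI moves forward.

```c
#include <stdint.h>

/* Hypothetical helper that models lines 03-07 of the book snippet. */
uint32_t scas_length(const char *s)
{
    uint32_t ecx = 0xFFFFFFFFu;   /* or ecx, 0FFFFFFFFh */
    const char *edi = s;          /* mov edi, [ebp+8] */

    /* repne scasb with AL = 0: read a byte, advance EDI, decrement ECX,
     * and stop after the byte that matched AL or when ECX reaches 0. */
    while (ecx != 0) {
        char byte = *edi++;
        ecx--;
        if (byte == 0)
            break;
    }

    /* The scan consumed strlen + 1 bytes, so ECX is now -(strlen + 2)
     * when read as a signed value. */
    ecx += 2;                     /* add ecx, 2  -> -strlen */
    ecx = 0u - ecx;               /* neg ecx     ->  strlen */
    return ecx;
}
```

For myString ("abcdefgh") the scan consumes nine bytes including the terminator, so ECX goes from 0xFFFFFFFF to 0xFFFFFFF6, then to 0xFFFFFFF8 after the add, and finally to 8 after the neg, which is the number of bytes rep stosb overwrites.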