[Reverse Engineering Workshop] The Call Stack & OllyDbg
We hope the previous task went well. If you got stuck, or if something was left unclear - don’t hesitate to ask.
This time we’ll focus on the stack and its state throughout the run-time of a program. This material will be reviewed during the workshop, but don’t think we are wasting your time - this stuff needs reviewing. The more you hear it, the better you understand it. The call stack and calling conventions are confusing topics in x86 architecture. Don’t make this scare you. We will go through this together it will eventually make perfect sense.
Ok, let’s go!
Our second preparation will target the following topic:
The call stack
Time estimation: 2 hours.
The Call Stack & Calling Conventions
In computer science, a call stack is a data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack".
In the previous prep-mail, you’ve read about x86 instructions. Do you remember any instructions related to the stack? push? pop? call? ret? we’ll soon address them again.
Purposes of the Call Stack
1. Storing return addresses
The main reason for having a call stack is to keep track of the point to which each active subroutine should return when it finishes executing. If, for example, a subroutine DrawSquare calls a subroutine DrawLine from four different places, DrawLine must know where to return when its execution completes. To accomplish this, the address following the call instruction - the return address - is pushed onto the call stack with each call.
2. Local-Data storage
A function frequently needs memory space for storing the values of local variables, the variables that are known only within the function and do not store values after it returns. It is convenient to allocate space for these variables by moving the top of the stack towards the lower addresses, to provide the necessary space.
3. Parameter passing
Subroutines often require that parameters values be supplied to them by the code which calls them. The call stack works well as a place for these parameters, especially since each call to a subroutine will be given separate space on the call stack for those values.
Call Stack Structure
A call stack is composed of stack frames - data structures containing subroutine state information. What is state information? Function parameters, local variables, return address and the frame’s base address.
Each stack frame corresponds to a call to a subroutine which has not yet returned.
For example, if a subroutine named DrawLine is currently running (and was called by DrawSquare), the top part of the call stack might be laid out like in the picture (very rough, don’t use as reference):
The stack frame at the top of the stack is for the currently executing routine. The stack frame usually includes at least the following items (in push order):
the arguments (parameter values) passed to the routine (if any);
the return address back to the routine's caller (e.g. in the DrawLine stack frame, an address into DrawSquare’s code); and
space for the local variables of the routine (if any).
Stack & Frame Pointers
The next two paragraphs might be very confusing. If necessary, read them again and again, use a pen and a paper, repeat until you feel comfortable with the presented information. It's important :)
Recall: ESP is the Stack Pointer - it always points to the top of the stack.
When values are pushed onto the stack - it decreases. When values are popped off the stack - it increases.
This register constantly changes.
During a function’s execution, we need a way to access the function data - its parameters and local variables. Due to the dynamic nature of ESP - we can’t use it as an anchor. Instead, we take the value of ESP at the moment when the function is called and we put it in EBP - the Base Pointer. During the function execution, EBP will stay fixed (unlike ESP). Thus, EBP can be used as an anchor to access parameters and variables.
Note that when moving ESP’s value into EBP, we override the value already stored in EBP. For this reason, we need to preserve EBP’s previous value somehow - this is done by pushing EBP onto the stack right before it receives its new value.
Imagine DrawSquare calls DrawLine. DrawSquare’s stack frame will appear first (in higher addresses) on the stack.
When DrawLine is called, the current EBP (Base Pointer of DrawSqaure) will be pushed onto the stack.
EBP will then be assigned the current value of ESP. From now on, EBP is used as DrawLine’s base pointer to access its variables and parameters.
Throughout DrawLine, ESP might change many times, but EBP is fixed.
When DrawLine terminates, ESP will be be set to the value of EBP. This sets the pointer to the top of the stack to be equal to the base pointer, and practically frees the memory allocated on the stack for the local variables.
DrawSquare’s EBP will be popped from the stack. From now on, EBP can be used again as DrawSquare’s base pointer.
So, in fact, for some function func, func's stack frame contains the EBP value of func’s caller, but the EBP register itself is func’s base pointer.
Now watch this video to visualize what you’ve read up till now:
OllyDbg is a 32-bit assembler level analyzing debugger for Windows. Emphasis on binary code analysis makes it particularly useful in cases where source is unavailable.
Now in simple words :D
Olly lets us run a program instruction by instruction, put breakpoints, see the state of the registers, memory and stack and change (assemble) the code.
Download Olly here. We will use it in the last part of our workshop.
Now let’s perform a short sanity check and go through the four most useful shortcuts in Olly.
Download the famous game Minesweeper from this link or from wherever seems benign enough for you ;)
Open OllyDbg. Make sure there are no error messages etc. (you might need to run it as administrator).
In Olly, open Minesweeper.
Hit F9. This makes the program run until the next breakpoint (BP) or until user input is needed.
Find the address 01003E36 and click on it to highlight it. Hit F2 to put a BP on the instruction in that address.
Hit F9. The program should be stopped at address 01003E36.
Hit F7 to step into the function.
Now hit F8 to step over an instruction.
Ok now close everything, we’re done here :D
1. Which register holds the address of the top of the stack?
2. Suppose the function multiply(x, y) creates 2 local variables - tmp1 and tmp2 - during its execution. What instruction do you expect seeing that allocates memory for these variables?
3. In what order will multiply’s parameters, x and y, be pushed onto the stack before the call to multiply?
4. How can each of them be accessed during the function execution? Write the exact expressions.