Assigning Numbers to Memory Cells Using Pointers
Explore how to assign numbers to memory cells through pointers using C, C++, and x64 assembly language. Understand the role of registers like %RAX and %RBX, and practice debugging with GDB commands such as breakpoints, disassembly, and step execution to analyze memory and register changes.
Pointers to assign numbers
The following pseudocode interprets the contents of the %RAX register as the address of a memory cell and then assigns a value to that memory cell:
address a -> rax
1 -> (rax)
Assigning numbers in C and C++
In C and C++, assigning a number through a pointer is called dereferencing the pointer, and we write:
Assigning numbers in assembly language
In assembly language, we write:
lea a, %rax
movl $1, (%rax)
The lea (load effective address) instruction computes an address using one of the Intel processors’ memory addressing modes. It stores that address, not the contents of the memory cell at that address, in the destination register. Again, we see movl instead of mov because integers occupy 32-bit memory cells, and we want to address only a 32-bit memory cell. On x64 Linux, memory cells that contain integers (32-bit) are half the size of memory cells that contain addresses (64-bit).
Assigning numbers in GDB disassembly output
In GDB disassembly output, we expect to see something like this:
Explanation
To illustrate some instructions without depending on how the compiler translates the C and C++ code, we wrote the program in assembler.
- Line 10: Calculates the effective address of a and stores it in the %rax register.
- Line 11: Assigns 1 to the memory cell whose address is in the %rax register (register indirect addressing).
- Line 16: Reads the value from the address held in %rax and assigns it to %edx.
- Line 17: Adds the value of %edx to the value at the address held in %rbx and stores the result at that address.
- Line 19: Increments the long word at the address held in the %rax register.
- Line 22: Performs signed multiplication of the value read from the address held in %rbx by the value of %eax and assigns the result to %eax.
- Line 25: Assigns the value 0x3c (60, the number of the exit system call on x64 Linux) to %rax.
- Line 27: Transfers control to the operating system.
Note: You can practice all the commands in the coding playground provided at the end of the lesson.
We need to compile and link the assembly code first, then load it into GDB and disassemble the main function:
After running the above commands, the output should be as given below:
We put a breakpoint in the main function, run the program until GDB breaks in, and then disassemble the main function:
break main
The breakpoint is shown below:
Now run the program:
set disable-randomization off
run
The output after executing the above command would be as follows:
Now we disassemble the main function:
disass main
The output is given below:
Let’s examine the variables a and b to verify the memory layout using the info variables GDB command:
info variables
We get the following output:
We also verify that the values of the %RAX and %RBX registers are initialized to zero:
info registers rax rbx
The output is given below:
We instruct GDB to automatically display the instruction to be executed next, together with the value of register %RIP, using the GDB command:
display/i $rip
We get the following output:
We display the value of register %RAX using the GDB command:
display/x $rax
The output is given below:
We display register %RBX using the GDB command:
display/x $rbx
The output is shown below:
To see the content of variable a, we use the GDB command:
display/x (int)a
We get the following output:
To see the content of variable b, we use the GDB command:
display/x (int)b
The output is given below:
The figure below shows the pseudocode and assembly language program instructions of number assignment using pointers.
Now we execute the first four instructions, which correspond to our pseudocode, using the stepi (or si) GDB command. It executes one machine instruction at a time. Let’s execute the si command:
si
The CPU executes lea 0x402000, %rax, the instruction that %RIP pointed to. This changes two registers: %RAX is assigned 0x402000, the address of the variable a, and %RIP is updated to point to the next instruction.
Stepping again executes movl $0x1, (%rax). The value 1 is stored in the memory cell whose address is in %rax, that is, in the variable a. %RIP is updated to point to the next instruction.
The third step executes lea 0x402004, %rbx. %RBX is assigned 0x402004, the address of the variable b, and %RIP is updated to point to the next instruction.
The fourth step executes movl $0x1, (%rbx). The value 1 is stored in the memory cell whose address is in %rbx, that is, in the variable b. %RIP is updated to point to the next instruction.
All this corresponds to a memory layout shown in the figure below.
Summary of the lesson
Below is a summary of the commands we learned in this lesson:
- The command to compile and link the program:
- The command to run the gdb and program:
- The command to insert a breakpoint in the program:
break main
- The command to stop GDB from disabling address space randomization:
set disable-randomization off
- The command to run the program:
run
- The command to disassemble the function:
disass main
- The command to list the program's variables and verify the memory layout:
info variables
- The command to check the values of registers:
info registers rax rbx
- The commands to automatically display the current instruction, registers, and variables after each step, entered one by one:
display/i $rip
display/x $rax
display/x $rbx
display/x (int)a
display/x (int)b
- The command stepi (or si) executes the program one instruction at a time:
si
Try it yourself
Click on the “Run” button to execute the following code. Practice all the commands discussed above in the terminal window.
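.global _start

.data
a:  .int 0                  # 32-bit cell for variable a
b:  .int 0                  # 32-bit cell for variable b

.text
main:                       # label used by "break main" in GDB
_start:
    lea   a, %rax           # address a -> rax
    movl  $1, (%rax)        # 1 -> (rax)
    lea   b, %rbx           # address b -> rbx
    movl  $1, (%rbx)        # 1 -> (rbx)
    mov   $0x3c, %rax       # system call number 60 (exit)
    mov   $0, %rdi          # exit status 0
    syscall                 # transfer control to the operating system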