Gdb Walkthrough

So far, we have been using gcc to compile C code into executable programs and, occasionally, into assembly language. For the upcoming homework and labs it we need to expand our palette to include objdump and gdb.

To get started, you need to make sure the tools are installed in your Linux installation (whether a physical machine, a VM, or Windows Subsystem for Linux). If you are using a lab computer, everything is already there. For other cases, use the following commands to install the tools:

sudo apt update
sudo apt install gcc build-essential
sudo apt install gcc-multilib
sudo apt install gdb

gcc-multilib lets you compile for 32-bit as well as 64-bit. gdb installs the debugger.

gcc Compiler/Linker

Compiles code in C, C++, and a variety of other languages and links code from multiple modules and libraries into an executable file.

Commonly-used options

  • -m32
    Compile and link for 32-bit mode
  • -O0
    Turn off all optimizations. Usually this makes assembly code easier to read and debug. (That’s capital “O” followed by zero.)
  • -O1 -O2 -O3
    Add varying degrees of optimization. These certainly make the code more efficient and sometimes make it easier to understand.
  • -Os
    Optimize for size (instead of performance). This may also make the generated code more intelligible.
  • -S
    Compile to assembly language - to see how things are done. (That’s a capital ‘S’.)
  • -S -masm=intel
    Compile to intel-syntax assembly language.
  • -S -masm=att
    Compile to AT&T-syntax assembly language. (Not really necessary because that’s the default.)
  • -c
    Compile to an object file to be linked or dumped.
  • -fno-asynchronous-unwind-tables -fno-pie -no-pie -mpreferred-stack-boundary=3
    Remove extra stuff from the generated code so you can see just what the compiler is doing. For details on what these mean see this article, this article, and this article from StackOverflow.
  • -fverbose-asm
  • -ggdb
    Include debugging symbols

Example

Create a file called add.c with the following contents:

int add(int a, int b)
{
  return a + b;
}

Compile it to assembly using the following command:

gcc -m32 -O0 -S -fno-asynchronous-unwind-tables -fno-pie -mpreferred-stack-boundary=3 add.c

This will produce a file called add.s with the assembly language version of the code.

Try the same thing without the -m32 option to see how 64-bit code differs from the 32-bit version.

objdump Object and executable file dump tool

Dumps the contents of object and executable files including disassembling machine code.

Disassembling means to take machine code and convert it into the equivalent assembly-language code. It’s very useful when you don’t have access to the source code of a program and you want to reverse engineer it to figure out how it works. You will need to disassemble code to complete the upcoming bomb and attack labs.

Commonly-used options

  • -d
    Disassemble all .text (code) sections.
  • -M intel
    Use Intel syntax when disassembling code.
  • -s
    Binary dump all sections.

Example

Use the same add.c file from the gcc example. Compile it to an object file with the following command:

gcc -O0 -c -fno-asynchronous-unwind-tables -fno-pie add.c

This produces an object (machine-language) file called add.o.

Now dump the contents using the following command:

objdump -d -s add.o

You will notice that the object file has three sections. The .text section contains the machine code. Since we used both the -d and the -s options, it also disassembles the .txt section to show us the assembly language version. If you look at the hexadecimal dump of the .text section it’s the same as the hexadecimal machine language to the left of the disassembled code. I think you’ll agree that it’s much easier to interpret the assembly language than the machine language.

gdb GNU Debugger

Loads a program and lets you examine it while running including setting breakpoints, examining register contents, examining variables, changing values and so forth.

Example

Update add.c with the following contents:

#include <stdio.h>

int add(int a, int b)
{
  return a + b;
}

int main(int argc, char **argv)
{
  printf("%d\n", add(3, 5));
}

Compile and link the program with the following command. Note the options that make the code easier to read and include symbols for the debugger to use.

gcc -O0 -fno-asynchronous-unwind-tables -fno-pie -no-pie -ggdb add.c

Start gdb by specifying the program to be debugged on the command line.

gdb a.out

Common GDB Commands

Once GDB is loaded you type commands to debug the code. Here are some of the most commonly-used commands:

  • disassemble <name>
    Disassemble the named function
  • list
    List the source code in the vicinity of the current location.
  • break <name>
    Set a breakpoint on the named function
  • break *<address>
    Set a breakpoint at the specified address
  • run
    Start the program from the beginning
  • continue
    Continue running the program after reaching a breakpoint
  • stepi
    Step through the next assembly instruction, diving into functions when called
  • nexti
    Step through the next assembly instruction, skipping over function calls
  • step
    Step through the next source code line, diving into functions when called
  • next
    Step through the next source code line skipping over function calls
  • info registers
    Show the contents of all registers
  • info register rax
    Show the contents of one register
  • print a
    Print the value of a variable
  • set $rax = 42
    Set the value of a register
  • set var a = 42
    Set the value of a variable
  • layout asm
    Change the screen layout to show the disassembly.
  • layout regs
    Change the screen layout to show the registers.
  • set disassembly-flavor intel
    Disassemble in Intel syntax
  • set disassembly-flavor att
    Disassemble in AT&T syntax (this is the default)
  • exit
    Exit the debugger

With the program loaded into the debugger, use the following commands to get started:

disassemble add
disassemble main
break add
run
list
info registers
backtrace
continue

Through those steps you were able to examine the two functions. Set a breakpoint on the add function. Run until it stopped at the breakpoint. List the source code. Examine the contents of the registers. Look at the call stack. And let the program run to completion.

Next, we’ll do some more advanced work in the debugger. First, we’ll change to “TUI” (Textual User Interface) which keeps valuable information on the screen.

layout regs

This shows the disassembly of the code to be run and the contents of the registers on the screen.

Now, let it run until the breakpoint again.

run

Now you can see that the instruction pointer is positioned on the first actual code of the add function. It has skipped over the preamble. The code you are looking at is about to move the argument a into register %edx.

Let’s single-step through that operation.

stepi

In the registers, it highlights things that have changed. rip changed to point at the next instruction. rdx got loaded with a 3 which is the value of variable a.

We’re going to modify that value to manipulate the output of the program.

set $rdx = 2

Let the program run to completion.

continue

You see that the program printed 7 which is the result of 2 + 5 instead of 3 + 5.

TUI Display Commands

The “Textual User Interface” is helpful as it can show disassembly, source code, registers and more. Here are some of the commands you can use to control the TUI mode.

  • ctrl-x ctrl-a
    Toggle the TUI mode on and off (Hold ctrl while pressing x, a)
  • win
    Turn the TUI on
  • layout src
    Use the source TUI mode which shows the source code near the current instruction pointer
  • layout asm
    Use the asm TUI mode which shows the disassembled machine code
  • layout regs
    Use the regs TUI mode which shows the registers and the disassembled machine code
  • layout split
    Use the split TUI mode which shows the source code and the disassembled machine code
  • layout next and layout prev
    Apply the next or previous TUI layout.
  • help layout
    Show available layout commands.

Experiment with the different display modes and see how they help you understand how the program is working.

Examining Memory

The x command stands for “eXamine.” It is used for examining the contents of memory. Try the following command:

x/16xb add

This tells the debugger to eXamine 16 bytes in hexadecimal format starting at the beginning of the function add. For detail on the options for formatting (hex, decimal, etc.) word size (byte, half word, word, etc.), amount to print and address type help x.

Try different display modes. Place breakpoints in different places. Change values in registers and in variables. See if you can get the program to crash.

We can only get you started in this walkthrough. More detail can be found here: GDB Step By Step Introduction.