Tokugawa Ieyasu
http://toku.es/2010/05/delta-offset/
May 2010
Distributed under a CC Attribution-Share Alike 3.0 Unported License.
We'll begin with something that, sooner or later, every virus written in assembler will need. First of all, let me show you a somewhat unusual program:
It's unusual in the sense that the data lives inside the code section, but that is how viruses are built. To infect other files, code and data must sit contiguously in the same section, so that a simple loop can copy them into a new host (ok, I know that this is not always true, but it's not important right now).
Here is how the above code looks once assembled into an executable:
004004D0:                        'Hello, World!',0x0A
004004DE: BA 0E 00 00 00         mov edx, 0x0E
004004E3: 8D 0C 25 D0 04 40 00   lea ecx, [0x004004D0]
004004EA: BB 01 00 00 00         mov ebx, 0x01
004004EF: B8 04 00 00 00         mov eax, 0x04
004004F4: CD 80                  int 0x80
004004F6: BB 00 00 00 00         mov ebx, 0x00
004004FB: B8 01 00 00 00         mov eax, 0x01
00400500: CD 80                  int 0x80
As the lea instruction shows, the address of the string is hard-coded in the instruction itself, so the code only works when it is loaded at that exact address. This is not a problem most of the time, but a virus runs from a different address within every new host, and must be written so that it can still reach its own code and data in that situation.
Taking advantage of the fact that the call instruction doesn't use fixed addresses to do its work, the problem can be solved as below:
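To make the problem concrete, here is a small Python sketch (not part of the original program) that extracts the address operand from the lea bytes in the listing above. The helper name `absolute_operand` and the new location 0x00500000 are assumptions chosen only for illustration.

```python
import struct

# Bytes of `lea ecx, [0x004004D0]`, copied from the listing above.
LEA_BYTES = bytes.fromhex("8D0C25D0044000")

def absolute_operand(insn: bytes) -> int:
    """Return the 32-bit address stored in the last four bytes (little-endian)."""
    return struct.unpack("<I", insn[-4:])[0]

original_string_addr = 0x004004D0
moved_string_addr = 0x00500000  # hypothetical new location of the string

encoded = absolute_operand(LEA_BYTES)
print(hex(encoded))                     # the address baked into the instruction
print(encoded == original_string_addr)  # correct only at the original location
print(encoded == moved_string_addr)     # copying the bytes does not update it
```

The operand is part of the instruction bytes, so copying those bytes elsewhere leaves it pointing at the old location.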
But before seeing how the trick works, we must understand how call works. This instruction pushes the return address onto the stack, that is, the value of the instruction pointer (EIP, or RIP in 64-bit code such as this), which at that moment already points to the instruction following the call. Then it jumps to the address given by the operand. In this particular case the operand is a relative offset: the processor adds it to the instruction pointer and simply goes on with the execution in the new location.
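The behaviour just described can be modelled in a few lines of Python. This is only a sketch of the 5-byte relative form of call (opcode E8 followed by a 32-bit offset); the function name `near_call` is an assumption, and the address 0x004004DE is the one the call occupies in the disassembly in this article.

```python
def near_call(call_addr: int, rel32: int, stack: list) -> int:
    """Model a 5-byte near relative call (opcode E8 + 32-bit offset)."""
    return_addr = call_addr + 5   # address of the instruction after the call
    stack.append(return_addr)     # the call pushes the return address
    return return_addr + rel32    # execution continues here

stack = []
ip = near_call(0x004004DE, 0, stack)  # offset 0, as in `E8 00 00 00 00`
popped = stack.pop()                  # what a pop right after the call reads
print(hex(ip), hex(popped))
```

With an offset of 0 the jump target and the pushed return address are the same, so a pop placed right after the call receives its own address.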
Now let's see how the above code looks inside executables:
004004D0:                        'Hello, World!',0x0A
004004DE: E8 00 00 00 00         call dword 0x004004E3
004004E3: 5D                     pop rbp
004004E4: 48 81 ED E3 04 40 00   sub rbp, 0x004004E3
004004EB: BA 0E 00 00 00         mov edx, 0x0E
004004F0: 8D 8D D0 04 40 00      lea ecx, [rbp + 0x004004D0]
004004F6: BB 01 00 00 00         mov ebx, 0x01
004004FB: B8 04 00 00 00         mov eax, 0x04
00400500: CD 80                  int 0x80
00400502: BB 00 00 00 00         mov ebx, 0x00
00400507: B8 01 00 00 00         mov eax, 0x01
0040050C: CD 80                  int 0x80
NOTE: Don't be confused by the call address in the disassembled version of the code. As I explained before, the call adds the immediate value 0 to the instruction pointer, but the disassembler shows the address where the execution of the code will continue. It is not a fixed address.
It should now be clear how this trick works, but for the sake of clarity we'll see a practical example. Let's imagine that the code is moved to the address 0x00500000:
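The note can be checked by decoding the call bytes by hand. The sketch below is an illustration in Python, not any disassembler's real code: `call_target` is a hypothetical helper, and 0x0050000E is a second load address chosen here only to show that the same bytes produce a different printed target.

```python
import struct

def call_target(insn: bytes, addr: int) -> int:
    """Compute the address a disassembler would print for a near relative call."""
    assert insn[0] == 0xE8, "only the E8 rel32 form is modelled here"
    (rel32,) = struct.unpack("<i", insn[1:5])  # signed little-endian offset
    return (addr + len(insn) + rel32) & 0xFFFFFFFFFFFFFFFF

CALL_BYTES = bytes.fromhex("E800000000")

# Same five bytes, two different load addresses.
for addr in (0x004004DE, 0x0050000E):
    print(f"{addr:08X}: call 0x{call_target(CALL_BYTES, addr):08X}")
```

Only the offset (here 0) is stored in the instruction; the printed address is derived from wherever the instruction happens to sit.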
00500000:                        'Hello, World!',0x0A
0050000E: E8 00 00 00 00         call dword 0x00500013
00500013: 5D                     pop rbp
00500014: 48 81 ED E3 04 40 00   sub rbp, 0x004004E3
0050001B: BA 0E 00 00 00         mov edx, 0x0E
00500020: 8D 8D D0 04 40 00      lea ecx, [rbp + 0x004004D0]
00500026: BB 01 00 00 00         mov ebx, 0x01
0050002B: B8 04 00 00 00         mov eax, 0x04
00500030: CD 80                  int 0x80
00500032: BB 00 00 00 00         mov ebx, 0x00
00500037: B8 01 00 00 00         mov eax, 0x01
0050003C: CD 80                  int 0x80
After the call and the pop, the rbp register holds the new address of that pop (which is now 0x00500013). If we subtract the original address from this value, we get how many bytes the code has been moved: this is the delta offset. And if we add this quantity to every old address, we get the correct new address wherever one is needed.
0x00500013 - 0x004004E3 = 0x000FFB30
0x000FFB30 + 0x004004D0 = 0x00500000 ('Hello, World!',0x0A)
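The same arithmetic can be written as a short Python sketch. The names `delta_offset` and `relocate` are assumptions made for this illustration; the constants come from the listings above, and the 64-bit mask stands in for the wrap-around of the rbp register.

```python
MASK64 = (1 << 64) - 1  # rbp is a 64-bit register, so arithmetic wraps at 2**64

LINKED_POP = 0x004004E3     # address of the pop when the code was assembled
LINKED_STRING = 0x004004D0  # address of the string when the code was assembled

def delta_offset(runtime_pop: int) -> int:
    """How far the code has moved, given the address the pop actually received."""
    return (runtime_pop - LINKED_POP) & MASK64

def relocate(linked_addr: int, delta: int) -> int:
    """Turn an assemble-time address into the matching run-time address."""
    return (linked_addr + delta) & MASK64

# First the original location, then the moved copy from the example.
for runtime_pop in (0x004004E3, 0x00500013):
    delta = delta_offset(runtime_pop)
    print(hex(delta), hex(relocate(LINKED_STRING, delta)))
```

At the original location the delta is 0 and every address is left as it was. If the code lands at a lower address, the subtraction wraps around, but the later addition wraps back and still gives the right result.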
The code presented has been used since the very beginning, and you can find it in almost any assembler virus. Sometimes it's as simple as you see here, sometimes it's trickier (to fool heuristic analyzers), but the goal is always the same. You also need it, and your viruses will probably get detected because of it (if you don't hide it).
I should also say that the code is not optimized at all, and is intended to run under a 2.6.x Linux kernel, using software interrupts (int 0x80) to make system calls. I'm going to prepare articles about these two subjects as soon as possible.
OS: Ubuntu 10.04 with Linux Kernel 2.6.32-22-generic x86_64
CPU: Intel Core 2 Duo