A few minutes ago I finished a small example that shows how to load and execute unmodified Linux ELF binaries on Windows. Basically, this is the same as Wine, just the other way around. You can download the source code from my GitHub page. The code is very simple and only supports the system calls SYS_write and SYS_exit, just enough to write “Hello, World!” to the screen. Nevertheless, it shows the basic principle.
First, the loadable segments of the ELF binary are copied into memory. The memory is allocated using VirtualAlloc, and the bytes from the file are copied into the reserved block using WriteProcessMemory. Each program header in the ELF file gives the destination address and size of one segment. As the code in the binary is not position-independent, it has to be placed exactly at this location. In the example I assume that it does not overlap with the code of the loader application, since that would require rewriting all absolute addresses.
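To make the loading step more concrete, here is a minimal sketch of how the loadable segments could be located in a 32-bit ELF image that has already been read into a buffer. It is not the code from my example: the names find_load_segments and LoadSegment are made up for illustration, the structs are written out by hand following the ELF specification, and the sketch assumes a little-endian host (as x86 is). The actual allocation and copying is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit ELF file header, laid out as in the ELF specification (52 bytes). */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32Header;

/* 32-bit ELF program header (32 bytes). */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32ProgramHeader;

_Static_assert(sizeof(Elf32Header) == 52, "unexpected padding in Elf32Header");
_Static_assert(sizeof(Elf32ProgramHeader) == 32, "unexpected padding in Elf32ProgramHeader");

enum { ELF_PT_LOAD = 1, ELF_CLASS_32 = 1 };

/* Hypothetical result type: where a segment must be placed and which
 * bytes of the file fill it. */
typedef struct {
    uint32_t vaddr;       /* required destination address */
    uint32_t memsz;       /* size in memory */
    uint32_t file_offset; /* where the segment's bytes start in the file */
    uint32_t filesz;      /* bytes to copy; the rest up to memsz stays zero */
} LoadSegment;

/* Collects all PT_LOAD segments and the entry point.
 * Returns the number of segments found, or -1 if the image is malformed
 * or contains more than max_out loadable segments. */
int find_load_segments(const uint8_t *image, size_t size,
                       LoadSegment *out, int max_out, uint32_t *entry)
{
    assert(image != NULL && out != NULL && entry != NULL);

    Elf32Header eh;
    if (size < sizeof eh) return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0) return -1;
    if (eh.e_ident[4] != ELF_CLASS_32) return -1;
    if (eh.e_phentsize < sizeof(Elf32ProgramHeader)) return -1;

    int count = 0;
    for (uint32_t i = 0; i < eh.e_phnum; i++) {
        uint64_t off = (uint64_t)eh.e_phoff + (uint64_t)i * eh.e_phentsize;
        if (off + sizeof(Elf32ProgramHeader) > size) return -1;

        Elf32ProgramHeader ph;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != ELF_PT_LOAD) continue;  /* skip non-loadable entries */

        if (ph.p_filesz > ph.p_memsz) return -1;
        if ((uint64_t)ph.p_offset + ph.p_filesz > size) return -1;
        if (count == max_out) return -1;

        out[count].vaddr = ph.p_vaddr;
        out[count].memsz = ph.p_memsz;
        out[count].file_offset = ph.p_offset;
        out[count].filesz = ph.p_filesz;
        count++;
    }
    *entry = eh.e_entry;
    return count;
}
```

For each returned segment, a loader would reserve memsz bytes at vaddr and copy filesz bytes from image + file_offset into that block. The value written to entry is the address to jump to once all segments are in place.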
After loading, the instruction pointer is set to the start of the program by jumping to the entry point given in the ELF header.
The interesting part is the emulation of the Linux system calls. On 32-bit x86 Linux, a system call is executed as follows:
1. Put the number of the call in register EAX.
2. Put the parameters in registers EBX, ECX, EDX, ESI, and EDI.
3. Trigger int 0x80.
If this instruction sequence is run on Windows, it results in an exception. So, all that has to be done is to catch the exception, emulate the system call, and resume execution at the instruction following the one that raised the exception. The mechanism for dealing with exceptions in Windows is Structured Exception Handling. It allows the program to register an exception handler function that is called whenever an exception occurs, passing the address of the faulting instruction and the contents of the CPU registers to the function.
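The dispatch logic inside such a handler could look roughly like the sketch below. It is deliberately independent of the Windows API so that it can be compiled anywhere: Registers is a hypothetical stand-in for the register values that the exception handler receives (on 32-bit Windows these would be the Eax, Ebx, ... Eip fields of the CONTEXT record), and GuestMemory is a hypothetical helper that maps guest addresses to host pointers. In the loader described here the binary lives in the loader's own address space, so the address in ECX could be used directly. The sketch also assumes that the saved instruction pointer still points at the int 0x80 instruction (encoded as the two bytes CD 80) when the handler runs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the registers at the time of the exception. */
typedef struct {
    uint32_t eax, ebx, ecx, edx, esi, edi, eip;
} Registers;

/* Hypothetical description of one block of guest memory. */
typedef struct {
    const uint8_t *host; /* host pointer to the start of the block */
    uint32_t base;       /* guest address of the block */
    uint32_t size;       /* size of the block in bytes */
} GuestMemory;

/* i386 Linux system call numbers and errno values. */
enum { LINUX_SYS_EXIT = 1, LINUX_SYS_WRITE = 4 };
enum { LINUX_EBADF = 9, LINUX_EFAULT = 14, LINUX_ENOSYS = 38 };

typedef enum { SYSCALL_NOT_INT80, SYSCALL_CONTINUE, SYSCALL_EXIT } SyscallResult;

/* Translates a guest address range to a host pointer, or NULL if the
 * range is not fully inside the block. */
static const uint8_t *guest_ptr(const GuestMemory *m, uint32_t addr, uint32_t len)
{
    if (addr < m->base) return NULL;
    uint32_t off = addr - m->base;
    if (off > m->size || len > m->size - off) return NULL;
    return m->host + off;
}

/* Emulates one system call. Output for fd 1 and fd 2 goes to `console`. */
SyscallResult emulate_int80(Registers *r, const GuestMemory *m, FILE *console)
{
    assert(r != NULL && m != NULL && console != NULL);

    /* Only handle faults raised by `int 0x80`; anything else is a real error. */
    const uint8_t *insn = guest_ptr(m, r->eip, 2);
    if (insn == NULL || insn[0] != 0xCD || insn[1] != 0x80)
        return SYSCALL_NOT_INT80;

    r->eip += 2; /* resume after the two-byte instruction */

    switch (r->eax) {
    case LINUX_SYS_EXIT:
        return SYSCALL_EXIT; /* exit status is in ebx */
    case LINUX_SYS_WRITE: {
        /* ebx = file descriptor, ecx = buffer address, edx = length */
        const uint8_t *buf = guest_ptr(m, r->ecx, r->edx);
        if (r->ebx != 1 && r->ebx != 2)
            r->eax = (uint32_t)-LINUX_EBADF;
        else if (buf == NULL)
            r->eax = (uint32_t)-LINUX_EFAULT;
        else
            r->eax = (uint32_t)fwrite(buf, 1, r->edx, console);
        return SYSCALL_CONTINUE;
    }
    default:
        /* Linux returns a negative errno in eax for failed calls. */
        r->eax = (uint32_t)-LINUX_ENOSYS;
        return SYSCALL_CONTINUE;
    }
}
```

A real handler would copy the modified register values back into the exception context and tell Windows to continue execution, or terminate the process when SYSCALL_EXIT is returned.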
As mentioned before, only SYS_write and SYS_exit are supported for now. That means that it is not possible to run Linux programs compiled and linked with glibc, as the glibc startup code does a lot of work that requires many more system calls.
Unfortunately, there is another major difference between Linux and Windows programs that cannot be emulated as easily as a system call: thread-local storage (TLS). Usually, all threads in a process share the same address space. However, sometimes variables need to be thread-local, i.e., each thread has its own copy of the variable. The implementations of TLS in Windows and Linux are similar but incompatible, and I could not find a way to emulate the Linux implementation on Windows. In Linux, thread-local variables are stored in distinct memory segments per thread, identified by entries in the GDT. On a context switch, the GDT entry is updated to point to the memory block that contains the per-thread data. The (constant) selector for the GDT entry is stored in the GS register. All memory accesses to thread-local variables are prefixed by %gs, redirecting them to the thread-local memory location. The problem is that it seems to be impossible to manipulate the GDT and set the GS register in Windows, although GS is not used (on 32-bit systems).
It is possible to work around this limitation by rewriting all instructions whose memory operands are prefixed by %gs. However, this is complicated and probably very slow. So far, I haven’t tried it.