Calling Arbitrary Functions In EXEs: Performing Calls to EXE Functions Like DLL Exports

Motivation

When reversing or fuzzing an executable, being able to run an arbitrary function with controlled data is extremely helpful. By iteratively varying the function's parameters and examining the output, we can better understand the function's logic.

Background

A dll (Dynamic-Link Library) containing our target function would let us conveniently review and test the function as we wish. The only problem is that the function we want to examine usually resides in an exe, not a dll. Converting¹ an exe to a dll is a solvable challenge. After all, both an exe and a dll share the same PE (Portable Executable) file format. So let's explore: how can we convert¹ an exe to a dll?
Spoiler: there are a few more steps than just changing the extension 😉
¹ "convert to DLL" = fundamentally behave like a DLL.

I'll use this exe created from the following code and target the decode_string function for demonstration purposes throughout this post.

Challenges

There are 2 primary areas we need to change when converting an exe to a dll:
  1. File Header - Both file types have the same PE header format, but there is one specific flag that is unique to dlls. This is discussed later in detail.
  2. Entry Point Code - Both file types have code that runs once the file is loaded, but there are fundamental differences (in the purpose, integration, and structure) of that code between the two file types. This is also discussed in detail later on.

Step 1: Modify The File Header

The OS differentiates between an exe and a dll by their "Characteristics" field in the PE header. A dll has the IMAGE_FILE_DLL (0x2000) flag set, while an exe doesn't. So to solve our first problem, we'll turn this flag on.

This handy python script turns on the IMAGE_FILE_DLL flag of a PE. Alternatively, you can use your favorite PE editor to do so. Turning on the IMAGE_FILE_DLL flag on our example exe will result in this file.

Fig 1. Comparison Before And After Setting The IMAGE_FILE_DLL Flag
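For readers without the script at hand, here is a minimal sketch of what setting the flag involves. It is not the script referenced above; the function name `set_dll_flag` is my own, and it assumes a well-formed PE file where the Characteristics field sits 18 bytes into the COFF file header that follows the "PE\0\0" signature.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics flag that marks a PE file as a dll


def set_dll_flag(pe_bytes: bytes) -> bytes:
    """Return a copy of a PE image with IMAGE_FILE_DLL turned on.

    Hypothetical helper for illustration; it only edits the header field
    and does not touch the entry point.
    """
    data = bytearray(pe_bytes)
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")

    # e_lfanew (offset 0x3C of the DOS header) points at the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # Signature (4 bytes) + 18 bytes into the COFF header = Characteristics.
    characteristics_offset = pe_offset + 4 + 18
    (characteristics,) = struct.unpack_from("<H", data, characteristics_offset)
    struct.pack_into("<H", data, characteristics_offset,
                     characteristics | IMAGE_FILE_DLL)
    return bytes(data)
```

To use it, read the exe with `open(path, "rb").read()`, pass the bytes through `set_dll_flag`, and write the result to a new file so the original stays untouched.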

Step 2: Patch The Entry Point

After a PE file is loaded, the code at the file's entry point is executed. The entry point's address is set by the AddressOfEntryPoint field in the PE header. For exe files, the entry point is wrapper code that calls the "main" function, but for dll files, the entry point is the "DllMain" function, which behaves differently.

The main() and DllMain() functions have 3 significant differences (as mentioned earlier in challenge #2) that must be overcome for our modified exe to successfully load like a dll:

Purpose

main()'s purpose is to perform and manage the core functionality of the executable. However, DllMain's purpose is to only perform minimal initialization and then return immediately.

Return Value

A successful exit from main() is indicated by returning 0. Conversely, a successful exit from DllMain() is True (any non-zero value). If DllMain() returns False (0) when we try to load it, the OS fails the load with error code 1114 "A dynamic link library (DLL) initialization routine failed."

Function Prototype

The entry point of a .exe isn't called with any parameters on the stack. Instead, the code at the entry point that wraps main() uses Windows functions to prepare main()'s arguments (argv, argc, and envp) as seen in Figure 2:
Fig 2. main()'s Arguments Are Prepared With Function Calls

This is not the case for DllMain. DllMain is given 3 parameters (hinstDLL, fdwReason, lpvReserved), which it must clean from the stack before returning, according to its stdcall calling convention, to ensure stable code continuation.

The Patch

The good news is we can correct these 3 discrepancies between the exe and dll entry points with one simple patch: overwrite the code in the exe entry point to simply "return 1".

Fig 3. New Entry Point Code

Fig 4. Overwriting The .exe Entry Point With The Patch

This patch changes the Entry Point code to "return immediately with a successful code and clean the stack variables."
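The patch can also be applied with a short script instead of a hex editor. The following is a sketch under a few assumptions: the target is a 32-bit (PE32) file, the stub is the 8-byte encoding of "mov eax, 1; ret 0xC", and the entry point lies inside a section that has raw data on disk. The name `patch_entry_point` is hypothetical.

```python
import struct

# mov eax, 1 (B8 01 00 00 00) ; ret 0xC (C2 0C 00)
# Returns TRUE and pops DllMain's three 4-byte stdcall arguments.
STUB = bytes.fromhex("B801000000C20C00")


def patch_entry_point(pe_bytes: bytes) -> bytes:
    """Return a copy of a PE32 image whose entry point code is STUB."""
    data = bytearray(pe_bytes)
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[:2] != b"MZ" or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file")

    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    opt_offset = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, opt_offset)
    if magic != 0x10B:
        raise ValueError("stub is only valid for 32-bit (PE32) files")

    # AddressOfEntryPoint is an RVA stored 16 bytes into the optional header.
    (entry_rva,) = struct.unpack_from("<I", data, opt_offset + 16)

    # Translate the RVA to a file offset using the section table.
    section_table = opt_offset + opt_size
    for index in range(num_sections):
        header = section_table + 40 * index
        _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<IIII", data, header + 8)
        if vaddr <= entry_rva and entry_rva + len(STUB) <= vaddr + raw_size:
            file_offset = raw_ptr + (entry_rva - vaddr)
            data[file_offset:file_offset + len(STUB)] = STUB
            return bytes(data)
    raise ValueError("entry point is not backed by any section's raw data")
```

Overwriting in place keeps the file size and every other header field unchanged, which is why only the first eight bytes at the entry point differ afterwards.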

After this step, we are done patching the target file :). We will now focus on writing the code invoking the call to our new "DLL".

Step 3: Invoke The Call

Now that we have a file that behaves like a DLL, we can load it with LoadLibrary(). LoadLibrary() will load our modified exe into our process's address space and run the Entry Point code.

Calculating RVA

Next, to call our target function we need to calculate its RVA (Relative Virtual Address), the function's offset from the base of the loaded file. This is done by simply subtracting the program's Image Base (can be found using any PE viewer) from the function's offset (the address as it appears in IDA).

Fig 5. Finding The Image Base Using A PE Viewer
As seen in Figure 5 above, my example exe's image base is 0x400000.

Fig 6. Finding the target function's offset with IDA
And as seen in Figure 6, my target function's offset is 0x401040.

By subtracting the Image Base from the target function's offset we compute that the function's RVA = 0x401040 - 0x400000 = 0x1040.
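The same arithmetic, written out as a small sketch. The helper names are my own, and the base address used for the runtime example is an arbitrary illustrative value, since the loader decides where the module actually lands.

```python
IMAGE_BASE = 0x400000   # preferred image base from the PE header (Figure 5)
FUNCTION_VA = 0x401040  # address of the target function as shown in IDA (Figure 6)


def rva_from_va(virtual_address: int, image_base: int) -> int:
    """RVA = address as seen in the disassembler minus the preferred image base."""
    if virtual_address < image_base:
        raise ValueError("address lies below the image base")
    return virtual_address - image_base


def runtime_address(module_base: int, rva: int) -> int:
    """Address of the function once the module is loaded at module_base."""
    return module_base + rva


rva = rva_from_va(FUNCTION_VA, IMAGE_BASE)
print(hex(rva))  # 0x1040
# If the loader happened to map the module at 0x10000000 (illustrative only):
print(hex(runtime_address(0x10000000, rva)))  # 0x10001040
```

Working with the RVA rather than the address from IDA matters because the module is not guaranteed to load at its preferred image base.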

Finally, all that's left is performing a call to LoadLibrary() and applying confusing C function pointer syntax to call the target function. If our exe were more complex, we would probably need additional calls before directly calling the target function - see the Improving Implementation section below. However, with this simple example exe, we are all done.
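For comparison, the same load-and-call sequence can be sketched in Python with ctypes. This is not the invoker from this post, and it rests on several assumptions: it must run on Windows in a 32-bit Python (to match the 32-bit target), the file name passed in is whatever you saved the patched exe as, and the prototype `char *decode_string(char *)` with the cdecl convention is a guess made for illustration, so adjust it to the real function.

```python
import ctypes
import os
import sys

# Hypothetical prototype: char *decode_string(char *encoded), cdecl.
# CFUNCTYPE takes the return type first, then the argument types.
DECODE_PROTO = ctypes.CFUNCTYPE(ctypes.c_char_p, ctypes.c_char_p)


def call_function_at_rva(pe_path: str, rva: int, argument: bytes):
    """Load a patched exe as a dll and call the function located at rva."""
    if sys.platform != "win32":
        raise OSError("LoadLibrary is only available on Windows")

    # ctypes.WinDLL loads the file through the Windows loader, which also
    # runs the (patched) entry point.
    module = ctypes.WinDLL(os.path.abspath(pe_path))

    # The module handle doubles as the address the file was mapped at.
    # _handle is an internal ctypes attribute, used here for brevity.
    module_base = module._handle

    # Build a callable from the raw address: base of the module + RVA.
    target = DECODE_PROTO(module_base + rva)
    return target(argument)
```

A call would look like `call_function_at_rva("patched.exe", 0x1040, encoded_bytes)`, where both the file name and the input are placeholders. If the process crashes on the call, the assumed prototype or calling convention is the first thing to check.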

Click the image below to enlarge it or view the code on GitHub here. I was extra verbose with comments to explain exactly what each line does.

Fig 7. Source Code Of Program Loading Our Modified EXE As A DLL And Invoking Our Target Function
And here is the result running the code:

Fig 8. Successfully Running Our Patched Executable As A .DLL

Yay! Success! We hacked our .exe file into a .dll, and successfully ran a function from it.

Closing Notes

I hope you enjoyed learning a new trick, patching a .exe to use as a .dll with LoadLibrary, to run arbitrary functions in executables!

Thank you to Ben for helping me write and review this post! Follow him on Twitter @B_H101 for infosec tweets.

Self plug: Follow me on Twitter @va_start for new blog posts and infosec tweets too.


Improving Implementation

Something I didn't mention in this post for the sake of brevity is that complex programs usually perform some initialization before they reach certain functions. For example, a logging function might expect the computer's hostname to be in a certain global variable before it runs, and will crash if it's not.

To prevent breaking the code in such a way, we need to allow the initialization code to run. The easiest way to do this is finding the function that performs the initialization, and then directly calling that function before calling the real target function. You can leave a comment on this post or reach me on Twitter if you have any questions or comments 😄

Comments

  1. One thing that is confusing me is that if I try to follow this step by step with another editor, for instance Dev-Cpp, the binary produced is completely different, especially the EntryPoint symbol, and I haven't figured out how to do this with the binary generated here. Could you try to do the same thing but compiling using Dev-Cpp?

  2. Sure :)
    I'll get to it when I free up a bit. I'll update when I have something 👍

  3. WOW, that took a surprising amount of debugging but I found the fault:)! Devcpptools compiles its executables without a dynamic base by default. You can use a tool such as CFF Explorer and view the DllCharacteristics field* (under NT Headers > Optional Header) and see that the "DLL can move" flag* is not set. This means that the executable only "knows" how to load to one address. Therefore, if we try to load the executable into a process where that address is already taken, for example by our invoker.exe, the dll will fail to load, giving the error 487-ERROR_INVALID_ADDRESS.
    This can be fixed by ensuring our invoker.exe isn't loaded at the address our target dll also wants to load to (which is 0x400000). In the compilation process of the invoker exe, we can set a controlled image base instead of the conflicting default. I wasn't sure how to do this in devcpptools (I'm sure it's somewhere in the options though), so I used Visual Studio to compile the invoker and set the image base to 0x800000. In VS, the image base address can be set in the project properties > Configuration Properties > Linker > Advanced > Base Address.
    After compiling the invoker with a new address, loading the converted devcpptools exe as a dll worked! :D
    Also, if you intend to use the stub entry point code ("mov eax, 1; ret 0xC") from the blog, ensure you're compiling both the invoker and target exe as 32-bit.
    Let me know if any other problems arise in your experiments! ;)

    * the field names misleadingly include "DLL", but they apply to EXEs and other PE files alike

  4. How do you deal with global variable relocation, the TLS table, and CRT library relocation?

  5. Is there a tutorial on how to add a non-exported function to the export table?

  6. You don’t need to deal with any relocations :)
    Since we’re using LoadLibrary, the loader automatically takes care of relocations for us.
    If there are any initialization functions that need to run before the code, just call those functions first.

  7. Is it possible to load this dll (exe) directly from memory without storing it on disk first?

  8. You can add a dll export with CFF Explorer. Just go to the exports table and add an entry :)

  9. If you want to load your custom dll from memory, I recommend just keeping it as an exe and using a custom loader such as: https://github.com/polycone/pe-loader
    If you want to load the converted dll anyway, then you could "reverse" the patches in memory (converting it back to an exe), and then use the loader (linked above).

  10. Wouldn't it be easier to use Frida? It can call any function given its memory address, change its args, etc

    Nevertheless, great article and thank you (:

    1. Thanks for the kind words!:)
      Frida could indeed be a great alternative, depending on the requirements. Patching the exe has zero runtime performance overhead, which means this method could be utilized in fuzzing and other performance-hungry applications. I don't know how lean the Frida overhead is. In addition, patching the binary requires no runtime support, which could be useful for environments where you don't want to install Frida. Nonetheless, Frida is a viable alternative in many cases.
      Hope I gave some perspective about when to use each method :D
