Malware authors often use macros in Office documents to launch a legitimate system executable and then replace the code inside it, a technique known as “process hollowing”. The primary goal of this post is to identify this technique and understand how it is employed. I’ve also posted a video on YouTube that walks through shellcode analysis using Ghidra.
Starting with the Macros
To get started, inspect the macros and see where the code begins execution. For this document, execution begins with the Document_Open function, which can be found in the ThisDocument stream.
As is often the case with macro code, there is a substantial amount of obfuscation. Because this document uses shellcode to perform the process hollowing technique, let’s focus on finding where that shellcode is staged in memory.
This document has two streams: ThisDocument and cowkeeper. At the beginning of the “cowkeeper” stream you’ll find a series of aliases, which are used to eventually make Windows API calls. Of these functions, VirtualAllocEx and RtlMoveMemory are likely being used for memory allocation and copying shellcode into the allocation. If you’re unfamiliar with any of these APIs, it is worth taking some time to study them on MSDN.
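To make the alias mapping concrete, here is a minimal Python sketch that extracts the alias-to-API mapping from VBA Declare statements. The three alias names and the APIs they map to come from this document. The Lib names, the parameter lists, and the SAMPLE_DECLARES text itself are assumptions written for illustration and are not the actual macro source.

```python
import re

# Matches VBA API declarations such as:
#   Private Declare PtrSafe Function foo Lib "kernel32" Alias "Bar" (...) As Long
DECLARE_RE = re.compile(
    r'^\s*(?:Public\s+|Private\s+)?Declare\s+(?:PtrSafe\s+)?(?:Function|Sub)\s+'
    r'(?P<name>\w+)\s+Lib\s+"(?P<lib>[^"]+)"'
    r'(?:\s+Alias\s+"(?P<api>[^"]+)")?',
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical declarations: only the alias/API pairs are taken from the post.
SAMPLE_DECLARES = '''
Private Declare Function betterment Lib "kernel32" Alias "VirtualAllocEx" (ByVal a As Long, ByVal b As Long, ByVal c As Long, ByVal d As Long, ByVal e As Long) As Long
Private Declare Sub antecedency Lib "kernel32" Alias "RtlMoveMemory" (ByVal dst As Long, ByRef src As Any, ByVal n As Long)
Private Declare Function cabriolet Lib "kernel32" Alias "EnumDateFormatsW" (ByVal cb As Long, ByVal lcid As Long, ByVal flags As Long) As Long
'''


def map_aliases(vba_source):
    """Map each VBA-side name to the real API name it resolves to."""
    return {
        # Without an Alias clause, the declared name is the API name itself.
        m.group("name"): m.group("api") or m.group("name")
        for m in DECLARE_RE.finditer(vba_source)
    }


if __name__ == "__main__":
    for vba_name, api_name in map_aliases(SAMPLE_DECLARES).items():
        print(f"{vba_name} -> {api_name}")
```

With a mapping like this in hand, you can search the rest of the macro code for each alias and immediately know which API is really being called.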
Let’s trace where these are being used, keeping in mind that the VBA code calls each API by its alias rather than by its real name. Let’s start with betterment (the alias for VirtualAllocEx), as the maldoc needs memory to move the shellcode into. Betterment is called inside the foam function.
Betterment, or VirtualAllocEx, returns a pointer to the newly allocated memory. This pointer is assigned to diener, which is eventually returned by foam. The same function also calls antecedency, the alias for RtlMoveMemory, presumably to copy the shellcode into the allocation. The next step is to find where foam is called. You will find the call inside the ThisDocument stream:
As you trace the return value through the variable bayberry, you’ll see that it is used in some simple addition: bayberry + anklet. The result is assigned to aprum and used as an argument to a function called cabriolet. If you look back at the aliases we analyzed earlier, you’ll see that cabriolet is the alias for EnumDateFormatsW. How could this be used to execute shellcode?
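Look at the function on MSDN: the first argument (which points into our newly allocated memory) expects a pointer to an “application defined callback function”! Windows calls whatever that pointer refers to, so passing the address of the shellcode causes the API to execute it.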
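To see the callback mechanism in isolation, here is a benign sketch using Python’s ctypes. It passes an ordinary Python function as the callback and collects the date format strings Windows hands back. The maldoc abuses the same parameter by passing the address of its shellcode instead. The function name enumerate_date_formats is my own, and the code only does real work on Windows (it returns None elsewhere).

```python
import ctypes
import sys

LOCALE_USER_DEFAULT = 0x0400
DATE_SHORTDATE = 0x00000001


def enumerate_date_formats():
    """Return the short date formats reported by EnumDateFormatsW (None off Windows)."""
    if sys.platform != "win32":
        return None
    from ctypes import wintypes

    # Callback prototype: BOOL CALLBACK proc(LPWSTR lpDateFormatString)
    callback_type = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.LPWSTR)
    seen = []

    def on_format(date_format):
        seen.append(date_format)
        return True  # a non-zero return keeps the enumeration going

    callback = callback_type(on_format)  # keep a reference alive during the call

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.EnumDateFormatsW.argtypes = [callback_type, wintypes.DWORD, wintypes.DWORD]
    kernel32.EnumDateFormatsW.restype = wintypes.BOOL

    # Windows invokes the first argument once per date format.
    if not kernel32.EnumDateFormatsW(callback, LOCALE_USER_DEFAULT, DATE_SHORTDATE):
        raise ctypes.WinError(ctypes.get_last_error())
    return seen


if __name__ == "__main__":
    print(enumerate_date_formats())
```

The API has no way to know whether the pointer it receives is a genuine enumeration callback, which is what makes callback-taking functions like this one attractive for launching shellcode without an obvious thread-creation call.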
The last thing to figure out is what was added to the base address before the call. This is most likely an offset into the shellcode, which you need to know in order to disassemble the code correctly, since it essentially defines the entry point. You can continue to trace through the macros to see what this value is, or you can use the Office VBA IDE to set a breakpoint and inspect the values dynamically. Either is a viable option. For this guide, I’ll use dynamic analysis. If you set a breakpoint on the call to cabriolet, execution will pause before the shellcode runs, allowing you to view the values of the arguments.
In this case, the value of aprum, in hex, was (the address will change between runs):
Viewing the process memory in Process Hacker 2 shows that the base of the RWX allocation is at 0x70D0000, which means the entry point is at an offset of 0xE5D into the shellcode.
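The offset calculation is simple subtraction, sketched below in Python. The base address and the offset are the values given in the text. The callback address used here is reconstructed by adding the two together, so treat it as an assumption rather than a value read from the debugger.

```python
ALLOCATION_BASE = 0x70D0000  # base of the RWX region, from the text
ENTRY_OFFSET = 0xE5D         # offset into the shellcode, from the text


def entry_offset(callback_address, allocation_base):
    """Return how far into the allocation the callback address points."""
    offset = callback_address - allocation_base
    if offset < 0:
        raise ValueError("callback address is below the allocation base")
    return offset


if __name__ == "__main__":
    # Reconstructed (assumed) value of the pointer passed to EnumDateFormatsW.
    callback_address = ALLOCATION_BASE + ENTRY_OFFSET
    print(hex(callback_address))
    print(hex(entry_offset(callback_address, ALLOCATION_BASE)))
```

Because the allocation base changes on every run, the offset is the value worth recording: it stays the same and tells you where to start disassembling.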
And here’s our shellcode:
Analyzing the Shellcode
Once you’ve extracted the shellcode, you’ll want to disassemble it so that you can analyze it. Begin by loading the shellcode into your disassembler; I’ll be using Ghidra for this article. Once the shellcode has been loaded and analyzed, go to the function defined at an offset of +0xE5D.
Since this is shellcode, it will need to construct its own import table; that is, it must resolve Windows API functions on its own. Normally, when a program is loaded into memory, the operating system handles this for the program. Because shellcode does not go through the normal loading process, it is on its own. As you inspect the code, you’ll notice sequences of hex values being moved onto the stack:
You’ll also notice a call to sub_cf1 after this sequence of values. The first technique is called stack strings: the program is building, on the stack, the ASCII name of the API that it needs to resolve. If you right-click on each hex value, you can change the display to “character constant”.
It would make sense that any time these strings are used, there must be functionality to resolve the functions. Throughout most of the shellcode, the stack strings are followed by a call to sub_cf1, which makes it the likely resolver. You can confirm this with dynamic analysis: set a breakpoint on (or just after) the call to sub_cf1, let the call complete, and inspect the contents of the EAX register, which should be a pointer to the function named in the string.
Of course, going through the binary and changing all of these types can be tedious. Since Ghidra has a plugin framework, you could automate this work through such plugins. It’s worth spending some time searching for existing plugins, as one may already exist! Hint: you can find one here.
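If you would rather decode a stack string outside the disassembler, the conversion is easy to script. The sketch below assumes 32-bit values listed in order of ascending stack address, each stored little-endian as on x86. The example values are hypothetical ones I built to spell an API name and are not taken from this sample.

```python
import struct


def decode_stack_string(dwords):
    """Turn 32-bit immediates (lowest stack address first) into an ASCII string."""
    # x86 stores each value little-endian, so the low byte is the first character.
    raw = b"".join(struct.pack("<I", value) for value in dwords)
    # Stop at the first NUL terminator, as a C string would.
    return raw.split(b"\x00", 1)[0].decode("ascii")


# Hypothetical immediates, not from the analyzed shellcode.
EXAMPLE_DWORDS = [0x61657243, 0x72506574, 0x7365636F, 0x00004173]

if __name__ == "__main__":
    print(decode_stack_string(EXAMPLE_DWORDS))  # CreateProcessA
```

If the decoded text comes out scrambled in four-character chunks, the values were most likely collected in push order (highest address first), so reverse the list before decoding.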
You can spend some time analyzing sub_cf1 if you’re interested in learning how the function addresses are being resolved. However, our focus is to find the process hollowing technique. Process hollowing will include the following APIs:
Once you are able to identify these strings, you can trace how, and when, they are used in the shellcode. Most of these are used at an offset of 0x11d5.
CreateProcess will be called to load the desired EXE into memory. One of the arguments will be the path to the EXE, and another will be a flag (CREATE_SUSPENDED) that creates the process in a suspended state, which stops it from starting execution. ZwUnmapViewOfSection will be used to unmap the original image of the chosen binary from the new process. This allows the shellcode to allocate new memory in that process (allocating in another process requires VirtualAllocEx rather than VirtualAlloc) and to call WriteProcessMemory to copy the new code into it. From there, it will call GetThreadContext and update the entry point in the returned thread context so that it points at the new code. The final steps are to call SetThreadContext to apply the modified context and ResumeThread to start execution. Now it appears that an instance of SVCHOST is running from the System32 directory, but the actual code has been replaced!
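Once you have a list of the API names the shellcode resolves or calls, you can check it against the sequence described above. This is a rough triage sketch rather than a detection rule: it assumes the names are listed in call order, it matches by prefix so that variants such as CreateProcessA or VirtualAllocEx are accepted, and the NtUnmapViewOfSection alternative is my own addition.

```python
# Each step lists the acceptable name prefixes for that stage of hollowing.
HOLLOWING_SEQUENCE = [
    ("CreateProcess",),
    ("ZwUnmapViewOfSection", "NtUnmapViewOfSection"),
    ("VirtualAlloc",),
    ("WriteProcessMemory",),
    ("GetThreadContext",),
    ("SetThreadContext",),
    ("ResumeThread",),
]


def looks_like_hollowing(api_names):
    """Return True if every hollowing step appears, in order, in api_names."""
    remaining = iter(api_names)
    # Each any() consumes names up to its match, so later steps must come later.
    return all(
        any(name.startswith(prefixes) for name in remaining)
        for prefixes in HOLLOWING_SEQUENCE
    )


if __name__ == "__main__":
    # Hypothetical trace, for illustration only.
    trace = [
        "LoadLibraryA", "CreateProcessA", "ZwUnmapViewOfSection",
        "VirtualAllocEx", "WriteProcessMemory", "GetThreadContext",
        "SetThreadContext", "ResumeThread",
    ]
    print(looks_like_hollowing(trace))
```

Unrelated calls in between are ignored, so the check still passes when the shellcode resolves other functions along the way.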