Digital Self-Defense Lab - Battlefield Malware Analysis (Part 1)

Posted on 30 Nov 2019 by anonymous

Hello folks! In this blog post series, "Battlefield Malware Analysis", we will investigate tools and techniques that you as a malware analyst, SOC analyst, incident responder (you name it) can use to make your life easier when analyzing malware. The "Battlefield" aspect stems from the fact that we will cover real-life malware analysis problems that we encounter on a daily basis and show you simple yet effective ways to cope with them. The series aims to be as practical as possible: rather than focusing solely on the theoretical aspects of malware analysis, we will get our hands dirty by dissecting malware.

Enough talk. Now it’s time to fasten our seatbelts and dig through some malware!

Introduction

In the first part of “Battlefield Malware Analysis” we will take a look at script-based obfuscation and how it can be defeated quickly and efficiently by using process injection and API hooking. Before we jump into the practical part, let us define what we mean by script-based obfuscation and why it is still so prevalent these days.

Script-based obfuscation is a technique that malware authors use to hide the malicious intent of their scripts from malware analysts and anti-virus software. This is achieved by abusing scripting language features such as eval functions¹, anonymous functions² and string-based encryption, encoding or transformation.
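
To make this more tangible, here is a minimal sketch of what such a transformation could look like. The key, the encoded array and the decode helper are made up for illustration and are not taken from a real sample. The point is simply that the clear-text string only exists at runtime:

```javascript
// Hypothetical toy obfuscation: every character code is shifted by a fixed key,
// so the clear-text string never appears in the script source.
var KEY = 3;
var encoded = [113, 114, 119, 104, 115, 100, 103, 49, 104, 123, 104];

function decode(codes, key) {
    var out = "";
    for (var i = 0; i < codes.length; i++) {
        out += String.fromCharCode(codes[i] - key);
    }
    return out;
}

// eval hides even the name of the decoding routine behind string concatenation.
var command = eval("dec" + "ode(enc" + "oded, KEY)");
console.log(command);
```

A keyword search for “notepad” in this script comes up empty, even though that is exactly the string the script ends up working with.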

When we look at recent malware campaigns, we often see that the initial attack vector used by today’s attackers is still phishing. With the help of legitimate-looking phishing e-mails, attackers are able to gain a first foothold in their target organization.

Back in the golden days of malware (80’s and 90’s) it was very common to see malicious attachments such as “file.pdf.exe” in phishing e-mails. Nowadays people are aware that there is barely any valid reason to deliver executable files as e-mail attachments, and therefore we often see them being blocked by default.

Malware authors adapted to these restrictions by abusing legitimate file formats that are commonly used as e-mail attachments and provide some sort of scripting capability, which allows them to download and execute malicious code on their victim’s host.

The prevalence of script-based obfuscation techniques during the delivery stage of the Cyber Kill-Chain³ is closely related to the fact that executable files are no longer a reliable way to achieve initial infection.

Scenario

Now it’s time to introduce the scenario that we will be dealing with in the first part of our blog post series:

You are a malware analyst tasked with analyzing a bunch of malicious JScript attachments that were
delivered as part of a phishing campaign to a group of c-level executives from your company.

Some of the c(lick) level executives from your company already opened the attachments and now you are
in a hurry because they expect quick answers to clarify what happened.

It's time to boot up your analyst workstation and provide the information they requested from you.

Here you can download the malicious JScript attachment.

Lab Setup

In order to complete the exercise and become the hero of your company we recommend the following tools:

Windows Server 2012 R2 64-Bit
Python 3.7.5
Frida 12.7.20
wscript.exe
(x64dbg [Apr 29 2019])

Other OS/Software versions will probably work too.

Which APIs to look for?

When dealing with malicious scripts written in VBScript or JScript you will notice that the vast majority of them depend on ActiveX controls⁴ / COM objects⁵ to overcome the limitations imposed by the scripting language interpreter. By using ActiveX controls from within JScript, for example, it is possible to accomplish tasks such as writing to the registry, creating files or executing other applications, which would otherwise not be possible without direct access to the Windows API. Malware authors will very likely try to hide these kinds of actions with obfuscation techniques in order to avoid detection by human analysts and anti-virus software. The following JScript named “example.js” is a toy example that demonstrates how ActiveX controls can be used to start another application:

 
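// example.js

function fnShellExecuteJ()
{
    // Create the Shell.Application COM object via its ProgID.
    var objShell = new ActiveXObject("shell.application");

    // Parameters: file, arguments, directory, operation (verb), window style.
    objShell.ShellExecute("notepad.exe", "", "", "open", 1);
}

// Call the function so that the script does something when run by wscript.exe.
fnShellExecuteJ();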

In Microsoft Windows, JScript files are associated with the Windows Script Host⁶ (wscript.exe). This means that when a user double-clicks a JScript file, it is immediately executed by wscript.exe. This is the reason why obfuscated JScript files are so popular among attackers: they provide a simple and effective way of executing malicious code on a target system. In the case of “example.js”, a double click results in a call to the Windows API function ShellExecuteExW from Shell32.dll, which in turn creates the process “notepad.exe”.

But how does wscript.exe know that ShellExecuteExW⁷ is implemented in Shell32.dll? This information can be obtained from the Windows registry in two simple steps:

wscript.exe looks up the ProgID under HKEY_CLASSES_ROOT\Shell.Application\CLSID.
The CLSID = {?} value found there is used by wscript.exe to look up InprocServer32 under HKEY_CLASSES_ROOT\CLSID\{?}, which names the DLL that holds the implementation behind ShellExecuteExW.
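
The two lookups can be reproduced with a few lines of JScript. The following is a minimal sketch: resolveInprocServer is a helper name we made up, and we assume that the RegRead method of WScript.Shell is allowed to read both keys on your analysis machine. The registry reader is passed in as a function so that the lookup logic stays independent of Windows Script Host:

```javascript
// Hypothetical helper that mirrors the two registry lookups.
// regRead is any function that returns the value stored at a registry path.
function resolveInprocServer(progId, regRead) {
    // Step 1: ProgID -> CLSID (a trailing backslash reads the key's default value).
    var clsid = regRead("HKCR\\" + progId + "\\CLSID\\");
    // Step 2: CLSID -> DLL registered as the in-process server.
    var server = regRead("HKCR\\CLSID\\" + clsid + "\\InprocServer32\\");
    return { clsid: clsid, server: server };
}

// This part only runs under Windows Script Host, where WScript exists.
if (typeof WScript !== "undefined") {
    var wshShell = new ActiveXObject("WScript.Shell");
    var found = resolveInprocServer("Shell.Application", function (path) {
        return wshShell.RegRead(path);
    });
    WScript.Echo(found.clsid + " -> " + found.server);
}
```

Run with wscript.exe or cscript.exe, the script should print the CLSID together with the path of the DLL registered for it.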

Now it’s time to take a look at the malicious JScript attachments that were sent to the c-level execs of our company. The first thing we notice when we search for common keywords like ShellExecute is that there is one match. But unlike in “example.js”, the parameters passed to ShellExecute are obfuscated, as shown in figure 1.

Figure 1: obfuscated.js

In the next step we need to get rid of the obfuscation without wasting too much precious time and brain resources on deobfuscating stuff in our head. By debugging wscript.exe and passing “obfuscated.js” as an argument, it is possible to verify that the Windows API function ShellExecuteExW is indeed executed. This can be seen in the debugger window (x64dbg) depicted in figure 2:

Figure 2: Shell32.ShellExecuteExW

The first argument (EBP+8) passed to ShellExecuteExW is a pointer to the struct SHELLEXECUTEINFOW⁸, which contains information such as the application that needs to be executed, its command line arguments and other settings. From a malware analyst’s standpoint the contents of the struct are very valuable because they have to be in deobfuscated form for ShellExecuteExW to be able to execute the intended application. This means that the deobfuscation routine must have run before the arguments are passed to ShellExecuteExW. By setting a breakpoint at ShellExecuteExW we can capture the deobfuscated arguments through the pointer to the struct SHELLEXECUTEINFOW.

In the following step we will take a look at the memory address 004AD94C in the dump section of our debugger. This is the address where the struct SHELLEXECUTEINFOW resides in memory:

Figure 3: SHELLEXECUTEINFOW

The struct contains several fields that are of interest for further analysis. To get a complete understanding of the struct SHELLEXECUTEINFOW we recommend looking up its definition on MSDN (here). In figure 3 we can already see some of the fields, for example lpVerb, lpFile, lpParameters, lpDirectory and nShow.
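
When reading the raw dump it helps to know at which offset each field lives. The sketch below computes the offsets for 32-bit and 64-bit processes. It assumes the field order of the documented SHELLEXECUTEINFOW definition and natural alignment; the name hIconOrMonitor is our own label for the union of hIcon and hMonitor:

```javascript
// Fields of SHELLEXECUTEINFOW in declaration order.
// "ptr" marks pointer- or handle-sized fields, "u32" marks 4-byte fields.
var FIELDS = [
    ["cbSize", "u32"], ["fMask", "u32"], ["hwnd", "ptr"],
    ["lpVerb", "ptr"], ["lpFile", "ptr"], ["lpParameters", "ptr"],
    ["lpDirectory", "ptr"], ["nShow", "u32"], ["hInstApp", "ptr"],
    ["lpIDList", "ptr"], ["lpClass", "ptr"], ["hkeyClass", "ptr"],
    ["dwHotKey", "u32"], ["hIconOrMonitor", "ptr"], ["hProcess", "ptr"]
];

function layout(pointerSize) {
    var offsets = {};
    var offset = 0;
    for (var i = 0; i < FIELDS.length; i++) {
        var size = FIELDS[i][1] === "ptr" ? pointerSize : 4;
        // Natural alignment: round the offset up to a multiple of the field size.
        offset = Math.ceil(offset / size) * size;
        offsets[FIELDS[i][0]] = offset;
        offset += size;
    }
    // The total size is rounded up to the largest alignment (the pointer size).
    offsets.size = Math.ceil(offset / pointerSize) * pointerSize;
    return offsets;
}

var x86 = layout(4);
var x64 = layout(8);
console.log("lpFile:       x86 +" + x86.lpFile + ", x64 +" + x64.lpFile);
console.log("lpParameters: x86 +" + x86.lpParameters + ", x64 +" + x64.lpParameters);
```

For a 32-bit wscript.exe this places lpFile at offset 16 and lpParameters at offset 20 from the start of the struct, which are the two pointers to follow in the dump.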

If our goal is to understand which application gets executed by ShellExecuteExW and what arguments are passed to it during execution, we need to take a look at the fields lpFile and lpParameters. When we inspect these fields we will see that “obfuscated.js” executes the following command:

powershell.exe -exec bypass -command "whoami ; sleep 5"

Success! Now we are able to give our boss the information he needs to calm down the c-level execs of our company. But wait! There are still some malicious JScript files left to analyze. The question now is how you can analyze them without redoing all the steps presented so far. The answer to this question will be covered in the next section.

Another important question is which Windows APIs to look for when dealing with obfuscated scripts. The answer is simple: we don’t know beforehand, because each malicious script is different. The best way to understand what the malware does and to defeat obfuscation is to intercept all interesting APIs. You might ask yourself which APIs are interesting then? Well, it depends, but the blog post “WinDBG and JavaScript Analysis” from Cisco Talos is a good starting point.

Analyzing malicious scripts at scale

In this section we will explain how we can analyze a bunch of malicious JScript attachments without repeating the tedious steps introduced in the last section. The answer is simple! With the help of process injection and API hooking, we are able to analyze the function calls we are interested in. This allows us to bypass obfuscation and get an understanding of what the malicious script tries to achieve on the victim’s machine. But what if we are really lazy people and we don’t want to implement all of this process injection⁹ and hooking¹⁰ stuff on our own? Then Frida is our answer!

But what is Frida? According to the project’s website, Frida is “[…] Greasemonkey for native apps, or, put in more technical terms, it’s a dynamic code instrumentation toolkit. It lets you inject snippets of JavaScript or your own library into native apps on Windows, macOS, GNU/Linux, iOS, Android, and QNX. Frida also provides you with some simple tools built on top of the Frida API. These can be used as-is, tweaked to your needs, or serve as examples of how to use the API.”.

As you might guess from the description above, there are plenty of things that you can do with Frida, but in our case we will focus solely on how it can be used to automate the deobfuscation of “obfuscated.js”. Wait! So you are telling me…

Figure 4: Mandatory Meme

Indeed, we will use Frida’s core, Gum (Instrumentation Library) and Gum’s JavaScript binding GumJS to hook ShellExecuteExW and grab the passed arguments in a deobfuscated state. In theory this is achieved as follows:

Frida core suspends the target process wscript.exe.
Frida core creates a remote thread in the target process which then loads Frida agent (Gum + Google’s V8 Engine) distributed as a shared library.
Gum is used to hook the function ShellExecuteExW from Shell32.dll in the target process wscript.exe.
Every time ShellExecuteExW is called inside the target process, our JavaScript gets executed with the help of Google’s V8 Engine, which gives us full access to the arguments passed to ShellExecuteExW.
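
The hooking step can be pictured with a toy example in plain JavaScript. This is only an analogy under our own assumptions: Gum rewrites native code inside the target process, whereas the sketch below merely wraps a JavaScript function, and the names hook and api are hypothetical. What it does show is the order of events, namely that the callback sees the arguments first and the original function still runs afterwards:

```javascript
// Toy illustration of hooking: replace a function with a wrapper that hands
// the arguments to a callback and then calls the original implementation.
function hook(target, name, onEnter) {
    var original = target[name];
    target[name] = function () {
        onEnter(Array.prototype.slice.call(arguments)); // inspect the arguments
        return original.apply(this, arguments);         // then run the original
    };
}

// Hypothetical stand-in for an API that a script would call.
var api = {
    shellExecute: function (file, params) {
        return "executed " + file + " " + params;
    }
};

var seen = [];
hook(api, "shellExecute", function (args) { seen.push(args); });

var result = api.shellExecute("powershell.exe", "-command whoami");
console.log("intercepted: " + seen[0].join(" "));
console.log("original returned: " + result);
```

No matter how the caller assembled its arguments, the hook receives them in their final, clear-text form.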

Before we start writing the JavaScript code that will be executed whenever ShellExecuteExW is called, we can use frida-trace to create a template for us. Please be aware that when we run frida-trace as presented in figure 5, the obfuscated JScript will be executed. When analyzing an unknown script, this step is not recommended.

Figure 5: frida-trace

After running frida-trace, the directory “__handlers__\SHELL32.dll\” is created in the current path. This directory contains the JavaScript file “ShellExecuteExW.js”, which represents the template mentioned earlier. The JavaScript file “ShellExecuteExW.js” will be used to define the behaviour of the hooked function ShellExecuteExW. Every time the function ShellExecuteExW is called within wscript.exe, the hooking function onEnter will also be executed.

 
/**
 * Called synchronously when about to call ShellExecuteExW.
 */
onEnter: function (log, args, state) {
	// Place your hook functionality here.
},

The arguments passed to ShellExecuteExW will also be available in our hooking function onEnter via the array args. Recall from the previous section that the first and only argument passed to ShellExecuteExW is a pointer to the struct SHELLEXECUTEINFOW. The address of the struct can now be used to access all the interesting fields, which will reveal the purpose of the malicious JScript “obfuscated.js”. Figure 6 depicts the final implementation of the hooking function.

Figure 6: Hooking function onEnter
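
A handler along these lines could look as follows. This is a minimal sketch and not necessarily identical to the implementation in figure 6. It assumes Frida's NativePointer methods add, readPointer, readUtf16String, readInt and isNull as well as Process.pointerSize; the variable name shellExecuteExWHandler is our own, and frida-trace itself only expects the object literal in the handler file:

```javascript
// Sketch of a handler for __handlers__\SHELL32.dll\ShellExecuteExW.js.
// Only the object literal belongs in the handler file; it is assigned to a
// variable here so that the snippet is a complete script on its own.
var shellExecuteExWHandler = {
    onEnter: function (log, args, state) {
        var info = args[0];                // pointer to SHELLEXECUTEINFOW
        var ptrSize = Process.pointerSize; // 4 in a 32-bit target, 8 in a 64-bit one

        // cbSize and fMask take 4 bytes each and hwnd takes one pointer, so
        // lpVerb starts at 8 + ptrSize, followed by three more string pointers.
        var base = 8 + ptrSize;
        var names = ["lpVerb", "lpFile", "lpParameters", "lpDirectory"];
        for (var i = 0; i < names.length; i++) {
            var str = info.add(base + i * ptrSize).readPointer();
            log(names[i] + ": " + (str.isNull() ? "(null)" : str.readUtf16String()));
        }

        // nShow is a 4-byte integer located right after lpDirectory.
        log("nShow: " + info.add(base + 4 * ptrSize).readInt());
    }
};
```

Reading the pointer size from the target process keeps the same handler usable for both the 32-bit and the 64-bit wscript.exe.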

Finally, if we rerun frida-trace with the same command line arguments as in figure 5, the previously defined hooking function will be executed whenever ShellExecuteExW is called within wscript.exe. The hooking function will then print out all of the interesting fields from the struct SHELLEXECUTEINFOW passed to ShellExecuteExW:

Figure 7: Final frida-trace

With the help of frida-trace and the hooking function onEnter from “ShellExecuteExW.js”, we can now automate the analysis of the malicious JScript attachments that were received by the c-level executives of our company.

Final Thoughts

The approach presented in this blog post is based on the hypothesis that malware authors at some point will need to extend the functionality of their malicious scripts by using ActiveX controls / COM objects in order to get access to more powerful APIs. When doing so, it is highly likely that they will try to hide the specifics (function names, arguments) of the APIs used by employing some form of obfuscation. With the help of Frida we can easily intercept all the relevant API calls made by malicious scripts, which allows us to bypass the implemented obfuscation techniques. Another great benefit of Frida is the high degree of automation that can be achieved with it.
