If you are looking to get started with malware analysis, tune into the webcast series I recorded to illustrate key tools and techniques for examining malicious software:
Since the best way to learn malware analysis involves practice, I am happy to provide you with malware samples from each of these webcasts. Just send me an email after you’ve watched the webcast and confirm that you will be taking precautions to properly isolate your laboratory environment.
Examining malicious software involves a variety of tasks, some simpler than others. These efforts can be grouped into stages based on the nature of the associated malware analysis techniques. Layered on top of each other, these stages form a pyramid that grows upwards in complexity. The closer you get to the top, the more burdensome the effort and the less common the skill set.
The easiest way to assess the nature of a suspicious file is to scan it using fully-automated tools, some of which are available as commercial products and some as free ones. These utilities are designed to quickly assess what the specimen might do if it ran on a system. They typically produce reports with details such as the registry keys used by the malicious program, its mutex values, file activity, network traffic, etc.
Fully-automated tools usually don’t provide as much insight as a human analyst would obtain when examining the specimen in a more manual fashion. However, they contribute to the incident response process by rapidly handling vast amounts of malware, allowing the analyst (whose time is relatively expensive) to focus on the cases that truly require a human’s attention.
Static Properties Analysis
An analyst interested in taking a closer look at the suspicious file might proceed by examining its static properties. Such details can be obtained relatively quickly, because they don’t involve running the potentially-malicious program. Static properties include the strings embedded into the file, header details, hashes, embedded resources, packer signatures, meta data such as the creation date, etc.
Looking at static properties can sometimes be sufficient for defining basic indicators of compromise. This process also helps determine whether the analyst should take closer look at the specimen using more comprehensive techniques and where to focus the subsequent steps. Analyzing static properties is useful as part of the incident triage effort.
VirusTotal is an example of an excellent online tool whose output includes the file’s static properties. For a look at some free utilities you can run locally in your lab, see my posts Analyzing Static Properties of Suspicious Files on Windows and Examining XOR Obfuscation for Malware Analysis.
Interactive Behavior Analysis
After using automated tools and examining static properties of the file, as well as taking into account the overall context of the investigation, the analyst might decide to take a closer look at the specimen. This often entails infecting an isolated laboratory system with the malicious program to observe its behavior.
Behavioral analysis involves examining how sample runs in the lab to understand its registry, file system, process and network activities. Understanding how the program uses memory (e.g., performing memory forensics) can bring additional insights. This malware analysis stage is especially fruitful when the researcher interacts with the malicious program, rather than passively observing the specimen.
The analyst might observe that the specimen attempts to connect to a particular host, which is not accessible in the isolated lab. The researcher could mimic the system in the lab and repeat the experiment to see what the malicious program would do after it is able to connect. for example, if the specimen uses the host as a command and control (C2) server, the analyst may be able to learn about specimen by simulating the attacker’s C2 activities. This approach to molding the lab to evoke additional behavioral characteristics applies to files, registry keys and other dependencies that the specimen might have.
Being able to exercise this level of control over the specimen in a properly-orchestrated lab is what differentiates this stage from fully-automated analysis tasks. Interacting with malware in creative ways is more time-consuming and complicated than running fully-automated tools. It generally requires more skills than performing the earlier tasks in the pyramid.
For additional insights related to interactive behavior analysis, see my post Virtualized Network Isolation for a Malware Analysis Lab, a my recorded webcast Intro to Behavioral Analysis of Malicious Software and Part 3 of Jake Williams’ Tips on Malware Analysis and Reverse-Engineering.
Manual Code Reversing
Reverse-engineering the code that comprises the specimen can add valuable insights to the findings available after completing interactive behavior analysis. Some characteristics of the specimen are simply impractical to exercise and examine without examining the code. Insights that only manual code reversing can provide include:
Manual code reversing involves the use of a disassembler and a debugger, which could be aided by a decompiler and a variety of plugins and specialized tools that automate some aspects of these efforts. Memory forensics can assist at this stage of the pyramid as well.
Reversing code can take a lot of time and requires a skill set that is relatively rare. For this reason, many malware investigations don’t dig into the code. However, knowing how to perform at least some code reversing steps greatly increases the analyst’s view into the nature of the malicious program in a comp
To get a sense for basic aspects of code-level reverse engineering in the context of other malware analysis stages, tune into my recorded webcast Introduction to Malware Analysis. For a closer look at manual code reversing, read Dennis Yurichev’s e-book Reverse Engineering for Beginners.
Combining Malware Analysis Stages
The process of examining malicious software involves several stages, which could be listed in the order of increasing complexity and represented as a pyramid. However, viewing these stages as discrete and sequential steps over-simplifies the steps malware analysis process. In most cases, different types of analysis tasks are intertwined, with the insights gathered in one stage informing efforts conducted in another. Perhaps the stages could be represented by a “wash, rinse, repeat" cycle, that could only be interrupted when the analyst runs out of time.
If you’re interested in this topic, check out the malware analysis course I teach at SANS Institute.
P.S. The pyramid presented in this post is based on a similar diagram by Alissa Torres.
Despite its age, Windows XP is useful to have in your IT lab, for instance if you need to experiment with older software or study malware. Microsoft distributes a Windows XP virtual machine called Windows XP Mode, which you can download if you’re running Windows 7, as I explained earlier.
If you’re using Windows 8 or 8.1, you can still get the Windows XP virtual machine, but it requires a bit more work. The initial steps below are based on the instructions documented on the Redmond Pie blog.
First, download Windows XP Mode from Microsoft. You’ll need to go through the validation wizard to confirm you’re running a licensed copy of Windows. Though designed to look for Windows 7, it appears to accept Windows 8.1 as well. After the validation completes, you’ll able to download the Windows XP Mode installation file.
Once you’ve downloaded the Windows XP Mode installation file, don’t run it. Instead, explore its contents using a decompressing utility such as WinRAR or 7-Zip. Inside that file, go into the “sources” directory and extract the file called “xpm”.
The file “xpm” is another compressed archive, whose contents you can navigate using a tool such as 7-Zip. Extract from the “xpm” archive the file called “VirtualXPVHD” and rename it to something like “VirtualXP.vhd”.
This VHD file represents the virtual hard disk of the Windows XP system. You can use VirtualBox to create a virtual machine out of it. To do this, use the VirtualBox wizard for creating a new virtual machine and select “Use an existing virtual hard drive file” when prompted.
To create a VMware virtual machine out of the VHD file you’ll first need to convert it to the VMDK format, which VMware uses to represent virtual disks. The most convenient way to do this might be to use WinImage. If following this approach, select “Convert Virtual Hard Disk image…” from the Disk menu in WinImage, then select “Create Dynamically Expanding Virtual Hard Disk”. When prompted to save the file, select the VMware VMDK format and name the output file something like “VirtualXP.vmdk”.
Once the VMDK file has been saved, you can create a VMware virtual machine out of it by using VMware Workstation. Go to File > New Virtual Machine. Select “Custom (advanced)” when prompted for the configuration type. Accept defaults as you navigate through the wizard. When prompted to select a disk, select “Use an existing virtual disk” and point the tool to the VirtualXP.vmdk file.
Once the VMware virtual machine has been created, launch it, then go through the Windows XP setup wizard within the new virtual machine the same way you would do it for a regular Windows XP system. After Windows XP setup is done, install VMware tools into the Windows XP virtual machine you just created. Take a snapshot of your virtual machine, in case it breaks.
Windows XP installed into the virtual machine in that manner might need to be activated with Microsoft within 30 days of the installation. Be sure to understand and comply with the applicable software licensing agreements.
If this topic is interesting to you, take a look at my Reverse-Engineering Malware course. Other related items:
Over the years, the set of skills needed to analyze malware has been expanding. After all, software is becoming more sophisticated and powerful, regardless whether it is being used for benign or malicious purposes. The expertise needed to understand malicious programs has been growing in complexity to keep up with the threats.
My perspective on this progression is based on the reverse-engineering malware course I’ve been teaching at SANS Institute. Allow me to indulge in a brief retrospective on this course, which I launched over a decade ago and which was recently expanded.
Starting to Teach Malware Analysis
My first presentation on the topic of malware analysis was at the SANSFIRE conference in 2001 in Washington, DC, I think. That was one of my first professional speaking gigs. SANS was willing to give me a shot, thanks to Stephen Northcutt, but I wasn’t yet a part of the faculty. My 2.5-hour session promised to:
Discuss “tools and techniques useful for understanding inner workings of malware such as viruses, worms, and trojans. We describe an approach to setting up inexpensive and flexible lab environment using virtual workstation software such as VMWare, and demonstrate the process of reverse engineering a trojan using a range of system monitoring tools in conjunction with a disassembler and a debugger.”
I had 96 slides. Malware analysis knowledge wasn’t yet prevalent in the general community outside of antivirus companies, which were keeping their expertise close to the chest. Fortunately, there was only so much one needed to know to analyze mainstream samples of the day.
Worried that evening session attendees would have a hard time staying alert after a day’s full of classes, I handed out chocolate-covered coffee beans, which I got from McNulty’s shop in New York.
Expanding the Reverse-Engineering Course
A year later, I expanded the course to two evening sessions. It included 198 slides and hands-on labs. I was on the SANS faculty list! Slava Frid, who helped me with disassembly, was the TA. My lab included Windows NT and 2000 virtual machines. Some students had Windows 98 and ME. SoftICE was my favorite debugger. My concluding slide said:
That advice applies today, though one of the wonderful changes in the community from those days is a much larger set of forums and blogs focused on malware analysis techniques.
By 2004, the course was two-days long and covered additional reversing approaches and browser malware. In 2008 it expanded to four days, with Mike Murr contributing materials that dove into code-level analysis of compiled executables. Pedro Bueno, Jim Shewmaker and Bojan Zdrnja shared their insights on packers and obfuscators.
In 2010, the course expanded to 5 days, incorporating contributions by Jim Clausing and Bojan Zdrnja. The new materials covered malicious document analysis and memory forensics. I released the first version of REMnux, a Linux distro for assisting malware analysts with reverse-engineering malicious software.
Recent Course Expansion: Malware Analysis Tournament
The most recent development related to the course is the expansion from five to six days. Thanks to the efforts of Jake Williams, the students are now able to reinforce what they’ve learned and fine-tune their skills by spending a day solving practical capture-the-flag challenges. The challenges are built using the NetWars tournament platform. It’s a fun game. For more about this expansion, see Jake’s blog and tune into his recorded webcast for a sneak peek at the challenges.
It’s exciting to see the community of malware analysts increase as the corpus of our knowledge on this topic continues to expand. Thanks to all the individuals who have helped me grow as a part of this field and to everyone who takes the time to share their expertise with the community. There’s always more for us to learn, so keep at it.
Sometimes people ask me for career advice related to information security in general and, more specifically, digital forensics and incident response. I’ve written a few articles on this topic, as did many other respected professionals. Below are pointers to some of these tips.
Digital forensics in general:
Specific to malware analysis:
Broader IT and information security career tips:
I’m sure I missed many other excellent articles with practical career tips for digital forensics and related fields. If you’d like to recommend your favorite references, kindly leave a comment.
In the past weeks I published several posts describing malware analysis tools and approaches at other blogs:
Also, on my own blog I took a look at Cylance’s Accelerify tool for speeding up the lab system’s clock for malware analysis.
I’m pleased to announce the release of version 4 of the REMnux Linux distribution for reverse-engineering malicious software. The new version includes a variety of new malware analysis tools and updates the utilities that have already been present on the distro.
What’s new in REMnux v4? See the details below and watch the recorded webcast where I showcase some of the key additions. You can download the latest release at REMnux.org.
What’s New in REMnux v4
REMnux is now available as a Open Virtualization Format (OVF/OVA) file for improved compatibility with virtualization software, including VMware and VirtualBox. (Here’s how to easily install the REMnux virtual appliance.) A proprietary VMware file is also available. You can also get REMnux as an ISO image of a Live CD.
Key updates to existing tools and components:
New tools added to REMnux:
Getting Started With REMnux
The one-page REMnux Usage Tips cheat sheet outlines some of the more popular tools installed on REMnux. Feel free to customize it to incorporate your own tips and tricks.
The recorded Malware Analysis Essentials Using REMnux webcast provides a good overview and examples of some of the tools for performing static malware analysis. I also recorded a webcast to discuss What’s New in REMnux v4 for Malware Analysis and to demonstrate the new tools.
If you find REMnux useful, take a look at the reverse-engineering malware course that my colleagues and I teach at SANS. It makes use of REMnux and various other tools.
If you haven’t already, download the REMnux distro at REMnux.org.
For tips, issues and workarounds related to installing REMnux v4, see REMnux Version 4 Installation Notes.
The need to define custom, incident-specific signatures is slowly gaining traction in the mainstream enterprise. A few years ago this concept, often called Indicators of Compromise (IOCs), was mostly discussed by government organizations and defense contractors who were coming to terms with Advanced Persistent Threat (APT) attacks.
Madiant began popularizing the term IOC around 2007. Kris Kendall’s paper Practical Malware Analysis mentioned IOCs in the context of malware reversing at Black Hat DC 2007. For a precursor to this, see Kevin Mandia’s Foreign Attacks on Corporate America slides from Black Hat Federal 2006. At the time, few organizations saw the need to go beyond antivirus-based detection by analyzing the adversary’s artifacts to define custom host-level signatures.
Now, several years later, the term IOC is pretty well-known in the infosec industry. More companies are adding malware and related analysis skills to incident response teams. As Jake Williams put it, such firms know how to examine new malware and extract IOCs. “These are then fed back into the system and scans are repeated until no new malware is found.” Automated analysis products from vendors such as Norman, Mandiant, FireEye and HB Gary are being increasingly positioned as IR triage-enablers.
That said, the knowledge and skills for deriving and using IOCs is far from being mainstream. Anton Chuvakin highlighted the distinction between security haves and have-nots along the lines of this capability. The haves know how to reverse-engineer malware to “extract the IOCs FAST (or get those IOCs shared with you by trusted friends) and then look for them on other systems.”
IOC techniques haven’t entered the mainstream just yet. But we’re heading in that direction, as more people attain forensics skills and as more tools become available for defining and making use of such custom, incident-specific signatures.
To learn how to define and make use of IOCs, take a look at: