If you are looking to get started with malware analysis, tune into the webcast series I recorded to illustrate key tools and techniques for examining malicious software:
Since the best way to learn malware analysis involves practice, I am happy to provide you with malware samples from each of these webcasts. Just send me an email after you’ve watched the webcast and confirm that you will be taking precautions to properly isolate your laboratory environment.
Examining malicious software involves a variety of tasks, some simpler than others. These efforts can be grouped into stages based on the nature of the associated malware analysis techniques. Layered on top of each other, these stages form a pyramid that grows upwards in complexity. The closer you get to the top, the more burdensome the effort and the less common the skill set.
The easiest way to assess the nature of a suspicious file is to scan it using fully-automated tools, some of which are available as commercial products and some as free ones. These utilities are designed to quickly assess what the specimen might do if it ran on a system. They typically produce reports with details such as the registry keys used by the malicious program, its mutex values, file activity, network traffic, etc.
Fully-automated tools usually don’t provide as much insight as a human analyst would obtain when examining the specimen in a more manual fashion. However, they contribute to the incident response process by rapidly handling vast amounts of malware, allowing the analyst (whose time is relatively expensive) to focus on the cases that truly require a human’s attention.
Static Properties Analysis
An analyst interested in taking a closer look at the suspicious file might proceed by examining its static properties. Such details can be obtained relatively quickly, because they don’t involve running the potentially-malicious program. Static properties include the strings embedded into the file, header details, hashes, embedded resources, packer signatures, meta data such as the creation date, etc.
Looking at static properties can sometimes be sufficient for defining basic indicators of compromise. This process also helps determine whether the analyst should take closer look at the specimen using more comprehensive techniques and where to focus the subsequent steps. Analyzing static properties is useful as part of the incident triage effort.
VirusTotal is an example of an excellent online tool whose output includes the file’s static properties. For a look at some free utilities you can run locally in your lab, see my posts Analyzing Static Properties of Suspicious Files on Windows and Examining XOR Obfuscation for Malware Analysis.
Interactive Behavior Analysis
After using automated tools and examining static properties of the file, as well as taking into account the overall context of the investigation, the analyst might decide to take a closer look at the specimen. This often entails infecting an isolated laboratory system with the malicious program to observe its behavior.
Behavioral analysis involves examining how sample runs in the lab to understand its registry, file system, process and network activities. Understanding how the program uses memory (e.g., performing memory forensics) can bring additional insights. This malware analysis stage is especially fruitful when the researcher interacts with the malicious program, rather than passively observing the specimen.
The analyst might observe that the specimen attempts to connect to a particular host, which is not accessible in the isolated lab. The researcher could mimic the system in the lab and repeat the experiment to see what the malicious program would do after it is able to connect. for example, if the specimen uses the host as a command and control (C2) server, the analyst may be able to learn about specimen by simulating the attacker’s C2 activities. This approach to molding the lab to evoke additional behavioral characteristics applies to files, registry keys and other dependencies that the specimen might have.
Being able to exercise this level of control over the specimen in a properly-orchestrated lab is what differentiates this stage from fully-automated analysis tasks. Interacting with malware in creative ways is more time-consuming and complicated than running fully-automated tools. It generally requires more skills than performing the earlier tasks in the pyramid.
For additional insights related to interactive behavior analysis, see my post Virtualized Network Isolation for a Malware Analysis Lab, a my recorded webcast Intro to Behavioral Analysis of Malicious Software and Part 3 of Jake Williams’ Tips on Malware Analysis and Reverse-Engineering.
Manual Code Reversing
Reverse-engineering the code that comprises the specimen can add valuable insights to the findings available after completing interactive behavior analysis. Some characteristics of the specimen are simply impractical to exercise and examine without examining the code. Insights that only manual code reversing can provide include:
Manual code reversing involves the use of a disassembler and a debugger, which could be aided by a decompiler and a variety of plugins and specialized tools that automate some aspects of these efforts. Memory forensics can assist at this stage of the pyramid as well.
Reversing code can take a lot of time and requires a skill set that is relatively rare. For this reason, many malware investigations don’t dig into the code. However, knowing how to perform at least some code reversing steps greatly increases the analyst’s view into the nature of the malicious program in a comp
To get a sense for basic aspects of code-level reverse engineering in the context of other malware analysis stages, tune into my recorded webcast Introduction to Malware Analysis. For a closer look at manual code reversing, read Dennis Yurichev’s e-book Reverse Engineering for Beginners.
Combining Malware Analysis Stages
The process of examining malicious software involves several stages, which could be listed in the order of increasing complexity and represented as a pyramid. However, viewing these stages as discrete and sequential steps over-simplifies the steps malware analysis process. In most cases, different types of analysis tasks are intertwined, with the insights gathered in one stage informing efforts conducted in another. Perhaps the stages could be represented by a “wash, rinse, repeat" cycle, that could only be interrupted when the analyst runs out of time.
If you’re interested in this topic, check out the malware analysis course I teach at SANS Institute.
P.S. The pyramid presented in this post is based on a similar diagram by Alissa Torres.
When characterizing ill-effects of malicious software, it’s too easy to focus on malware itself, forgetting that behind this tool are people that create, use and benefit from it. The best way to understand the threat of malware is to consider it within the larger ecosystem of computer fraud, espionage and other crime.
A Tip of a Spear
I define malware as code that is used to perform malicious actions. This implies that whether a program is malicious depends not so much on its capabilities but, instead, on how the attacker uses it.
Sometimes malware is compared to a tip of a spear—an analogy that rings true in many ways, because it reminds us that there is a person on the other end of the spear. This implies that information security professionals aren’t fighting malware per se. Instead, our efforts contribute towards defending against individuals, companies and countries that use malware to achieve their objectives.
Understanding the Context
Without the work of personnel that handles technical aspects of malware infections, the malware-empowered threat actors would be unencumbered. Yet, these tactical tasks need to be informed by a strategic perspective on the motivations and operations of the individuals that create, distribute and profit from malware.
To deal with malware-enabled threats, organizations should know how to detect, contain and eradicate infections, but we cannot stop there. We also need to also understand the larger context of the incident. We won’t be able to accomplish this until we can see beyond the malicious tools to understand the perspective of our adversaries. The who is no less important than the what.
Over the years, the set of skills needed to analyze malware has been expanding. After all, software is becoming more sophisticated and powerful, regardless whether it is being used for benign or malicious purposes. The expertise needed to understand malicious programs has been growing in complexity to keep up with the threats.
My perspective on this progression is based on the reverse-engineering malware course I’ve been teaching at SANS Institute. Allow me to indulge in a brief retrospective on this course, which I launched over a decade ago and which was recently expanded.
Starting to Teach Malware Analysis
My first presentation on the topic of malware analysis was at the SANSFIRE conference in 2001 in Washington, DC, I think. That was one of my first professional speaking gigs. SANS was willing to give me a shot, thanks to Stephen Northcutt, but I wasn’t yet a part of the faculty. My 2.5-hour session promised to:
Discuss “tools and techniques useful for understanding inner workings of malware such as viruses, worms, and trojans. We describe an approach to setting up inexpensive and flexible lab environment using virtual workstation software such as VMWare, and demonstrate the process of reverse engineering a trojan using a range of system monitoring tools in conjunction with a disassembler and a debugger.”
I had 96 slides. Malware analysis knowledge wasn’t yet prevalent in the general community outside of antivirus companies, which were keeping their expertise close to the chest. Fortunately, there was only so much one needed to know to analyze mainstream samples of the day.
Worried that evening session attendees would have a hard time staying alert after a day’s full of classes, I handed out chocolate-covered coffee beans, which I got from McNulty’s shop in New York.
Expanding the Reverse-Engineering Course
A year later, I expanded the course to two evening sessions. It included 198 slides and hands-on labs. I was on the SANS faculty list! Slava Frid, who helped me with disassembly, was the TA. My lab included Windows NT and 2000 virtual machines. Some students had Windows 98 and ME. SoftICE was my favorite debugger. My concluding slide said:
That advice applies today, though one of the wonderful changes in the community from those days is a much larger set of forums and blogs focused on malware analysis techniques.
By 2004, the course was two-days long and covered additional reversing approaches and browser malware. In 2008 it expanded to four days, with Mike Murr contributing materials that dove into code-level analysis of compiled executables. Pedro Bueno, Jim Shewmaker and Bojan Zdrnja shared their insights on packers and obfuscators.
In 2010, the course expanded to 5 days, incorporating contributions by Jim Clausing and Bojan Zdrnja. The new materials covered malicious document analysis and memory forensics. I released the first version of REMnux, a Linux distro for assisting malware analysts with reverse-engineering malicious software.
Recent Course Expansion: Malware Analysis Tournament
The most recent development related to the course is the expansion from five to six days. Thanks to the efforts of Jake Williams, the students are now able to reinforce what they’ve learned and fine-tune their skills by spending a day solving practical capture-the-flag challenges. The challenges are built using the NetWars tournament platform. It’s a fun game. For more about this expansion, see Jake’s blog and tune into his recorded webcast for a sneak peek at the challenges.
It’s exciting to see the community of malware analysts increase as the corpus of our knowledge on this topic continues to expand. Thanks to all the individuals who have helped me grow as a part of this field and to everyone who takes the time to share their expertise with the community. There’s always more for us to learn, so keep at it.
There is much we can learn about coordinated online activities of skilled attackers with nation-state affiliations. The following two write-ups provide a wealth of information about one such attack group, which has been targeting organization in South Asia over the past few years and appears to reside in India:
According to these reports, the group engaged in industrial espionage and spying on political activists. The victims resided in many countries, but Pakistan stood out as the most targeted location. The attackers relied on spear phishing to gain initial access to the targeted environment. The emails were thematically appropriate to the targets and included malicious documents that exploited unpatched vulnerabilities. Some of the malware was digitally signed.
The analysts attributed these cyberattack activities to specific source by examining:
As the result, Norman and Shadowserver researchers concluded that the attackers apparently operated from India “and have been conducting attacks against business, government and political organizations.” Similarly, ESET analysts concluded “that the entire campaign originates from India.”
In addition, Norman and Shadowserver researchers concluded that the malicious software used in these campaigns was created by multiple software developers who were “tasked with specific malware deliverances.” The developers collaborated, “working on separate subprojects, but apparently not using a centralized source control system.”
In the past weeks I published several posts describing malware analysis tools and approaches at other blogs:
Also, on my own blog I took a look at Cylance’s Accelerify tool for speeding up the lab system’s clock for malware analysis.
Some organizations have encountered Advanced Persistent Threat over 5 years ago—earlier than most of us. Because of the types of data they process, these initial APT victims were exposed to carefully-orchestrated, espionage-motivated attacks before they spread to a wider range of targets.
Now, half a decade later, might the time to look at the attacks that the initial APT victims are fighting nowadays to forecast the threats that will eventually reach other companies. I am wondering:
It’s hard to answer these questions without first-hand access to the companies that witnessed the first wave of APT attacks. Furthermore, the dilution of the term APT by marketing departments makes it harder to differentiate between reliable APT insights, such as what Mandiant has been publishing, from generic APT-themed sales collateral peppered throughout the web.
Based on public information and observations, I suspect the threat landscape over the next few years will involve:
These are just conjectures. I don’t have the answers to the questions I posed above; however, I thought I’d at least ask them and explore the idea of looking at early APT targets’ current state to anticipate advanced threats that will later affect other organizations.
Related articles you might like:
The need to define custom, incident-specific signatures is slowly gaining traction in the mainstream enterprise. A few years ago this concept, often called Indicators of Compromise (IOCs), was mostly discussed by government organizations and defense contractors who were coming to terms with Advanced Persistent Threat (APT) attacks.
Madiant began popularizing the term IOC around 2007. Kris Kendall’s paper Practical Malware Analysis mentioned IOCs in the context of malware reversing at Black Hat DC 2007. For a precursor to this, see Kevin Mandia’s Foreign Attacks on Corporate America slides from Black Hat Federal 2006. At the time, few organizations saw the need to go beyond antivirus-based detection by analyzing the adversary’s artifacts to define custom host-level signatures.
Now, several years later, the term IOC is pretty well-known in the infosec industry. More companies are adding malware and related analysis skills to incident response teams. As Jake Williams put it, such firms know how to examine new malware and extract IOCs. “These are then fed back into the system and scans are repeated until no new malware is found.” Automated analysis products from vendors such as Norman, Mandiant, FireEye and HB Gary are being increasingly positioned as IR triage-enablers.
That said, the knowledge and skills for deriving and using IOCs is far from being mainstream. Anton Chuvakin highlighted the distinction between security haves and have-nots along the lines of this capability. The haves know how to reverse-engineer malware to “extract the IOCs FAST (or get those IOCs shared with you by trusted friends) and then look for them on other systems.”
IOC techniques haven’t entered the mainstream just yet. But we’re heading in that direction, as more people attain forensics skills and as more tools become available for defining and making use of such custom, incident-specific signatures.
To learn how to define and make use of IOCs, take a look at: