A brief introduction to malware analysis

Kim Crawley
Sep 26, 2024


Christoph Scholz, CC BY-SA 2.0 Attribution-ShareAlike 2.0 Generic

Here’s another gem from my Peerlyst archives that’s still as relevant in 2024 as it was when I wrote it in 2019. Enjoy!

Thank you, patrons!

At the Fan level: Naomi Buckwalter! OMG, thank you!

At the Reader level: New Readers! Sylvain and HTownQueer!

Returning Readers Ryan Wilson, François Pelletier and IGcharlzard!

I will do my best to post something new weekly. If you can, I’d love for you to join my Patreon supporters here. I even have support levels where I can do custom work for you: https://www.patreon.com/kimcrawley

Malware has been one of the most significant cybersecurity problems ever since computers could acquire data from outside sources. It really started to become serious by the 1980s, and today it’s a bigger issue than ever before. In order to prevent malware infections or to contain the damage they can do, we need to learn how malware manifests and behaves. Two of the companies I work for, BlackBerry Cylance and Kaspersky, make antivirus products, so malware research is essential for them. Malware can be researched by analyzing malware samples… malware analysis! It sounds like a really complicated topic, but my brief introduction should make it simple and easy to understand. This isn’t a fully detailed explainer; that would probably take hundreds of thousands of words. But this post should be an excellent starting place for you to learn more. Before I get into the core of this topic, it helps to understand what some of the different types of malware are.

Some common malware types

Laypeople often refer to all malware as viruses. Although antivirus software should protect your computing devices from all types of malware, viruses are only one type of malware. And very often one particular piece of malware can fall into multiple categories. For example, it’s technically possible for malware to be both ransomware and a worm. A lot of more recent malware, especially malware that targets Android devices, can fit into multiple categories simply because a cyber attacker’s command and control servers can upload modules to infected devices that do multiple types of bad things. Malware is simply any software that was designed with malicious intent to do bad things to computers. Sometimes, what counts as malware can even be subjective. A cryptominer isn’t malware if the user consents to the cryptomining… or is it? Some popular spyware Trojans might not be considered to be malware by someone who doesn’t mind the surveillance if it means they can listen to their favourite music. As for myself, I set a pretty low threshold for what is or isn’t malware. Your mileage may vary.

Viruses are malware that replicates itself by modifying other computer programs and inserting its own code upon execution. They are kind of like biological viruses in that way, hence the name.

Worms replicate by copying themselves from one machine to the next, usually across a network. Unlike viruses, worms typically don’t alter the other files on a computer. Think of a worm eating one apple and then moving on to the next apple. Strictly speaking, the terms virus and worm refer to two different ways that malware can replicate, although some malware has combined both techniques. You don’t hear about worms anywhere near as often as you did in the 1990s, but they’re definitely still a thing.

Trojans describe another type of malware transmission. Like the soldier-filled wooden horse of Greek legend, Trojan malware requires direct user interaction in order to infect a machine, and it gets that interaction by fooling a user into thinking it’s something they want. An example of a Trojan is when cyber attackers bind malware to a music or movie file that they upload to P2P or BitTorrent networks. Do you remember those fun free animated cursor collections from the Windows 95 to Windows XP era? Those were probably Trojans. And Trojans can take many other forms as well.

Fileless malware really started to become a thing about ten years ago, and it’s an even worse problem today. There are actually files involved, at least from the cyber attacker’s end. But fileless malware doesn’t write any files to your data storage, such as your hard drive. Fileless malware usually acts by hijacking a process that runs in your computer’s memory. If you open Task Manager on a Windows machine, you’ll see multiple instances of svchost.exe running on behalf of normal Windows services. But sometimes one of your svchost.exe instances could be hijacked by fileless malware! Fileless malware is notoriously difficult to detect, but advanced antivirus software may be able to detect it by examining what’s happening in your computer’s memory.

Ransomware has been around longer than most people think. Contrary to popular belief, its existence predates Bitcoin. Ransomware is never covert; it’s always overt. That’s because ransomware encrypts the files on your computer without giving you access to the decryption keys, and then it has to tell you about it in order to get paid. Ransomware will display a ransom note, often as a text file or local webpage, to extort money from the user in exchange for restoring access to their data. Back in my tech support days around 2007 to 2010, ransomware would demand a credit card number. Now it usually demands some sort of cryptocurrency, and that’s why people erroneously believe that it’s a post-Bitcoin phenomenon. WannaCry is probably the most notorious ransomware ever, and it marks the transition of ransomware from primarily targeting consumers to primarily targeting enterprises and institutions.

Cryptominers use a computer’s computational power to do the complex mathematical work that generates cryptocurrency. When it’s done without a user’s consent, it’s definitely malware. But sometimes people do consent to cryptomining. Malicious cryptominers tend to be really covert and subtle. They’ll do their best to use just a little bit of CPU power and memory per device across a botnet of infected machines, in order to evade detection.

Which segues into botnets and zombie malware! A botnet is a network of malware-infected computers that’s maliciously controlled by a cyber attacker’s command and control servers. Botnets can do a wide variety of malicious things. Very often they’re used to synchronize a large number of computers to perform distributed denial of service attacks. And as I’ve mentioned, botnets often exist for the sake of cryptomining as well. But there aren’t many limits to what botnets can do. The sky’s the limit!

Malware analysis 101

So malware analysis is the craft of examining potentially malicious files and software in order to understand how they work, spread, and behave. A lot of malware analysis is done by people who work for antivirus companies. But oftentimes, people like SOC analysts engage in malware analysis as well, in order to figure out how malware could be affecting the networks they’re supposed to help protect.

Malware analysis can fall into two major categories, static and dynamic.

Static malware analysis is done by examining binary files, and the resources they call upon, without executing them. Sometimes static malware analysis is done by reverse engineering a file with a disassembler program.
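To make the "without executing them" part concrete, here’s a minimal sketch in Python of one of the simplest static checks: reading the first few bytes of a file to guess what kind of file it is. The function names and the tiny signature table are my own illustrative choices, not part of any particular analysis tool, and real tools recognise far more formats than this.

```python
# A few well-known file signatures ("magic bytes"). This table is
# deliberately tiny and purely illustrative.
SIGNATURES = {
    b"MZ": "Windows executable (PE/DOS header)",
    b"\x7fELF": "Linux/Unix executable (ELF)",
    b"PK\x03\x04": "ZIP archive (also DOCX, APK, JAR)",
    b"%PDF": "PDF document",
}


def identify(header: bytes) -> str:
    """Guess a file type from its leading bytes, without running it."""
    for magic, description in SIGNATURES.items():
        if header.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and identify it."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))
```

Calling identify_file on a suspicious download only reads it as data. Nothing in the file ever gets a chance to run, which is the whole point of static analysis.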

Dynamic malware analysis is done by executing a malicious file and seeing how it behaves while it runs. Sandboxing gives a malicious file an environment to execute in without damaging the host computer. Sometimes a sandbox can be provided by running an operating system in a virtual machine, using software such as VMware or Oracle VirtualBox. In the worst case scenario, harm to the host machine can be prevented by simply deleting the virtual machine and deploying a new one. Other times, an application that’s specifically designed to provide a sandbox for malware execution can be deployed instead.

Malware analysis can involve multiple stages, specific to what’s useful for understanding a certain type of malware. The steps involved in malware analysis can vary greatly, but I will describe a few of the possible steps that are often applicable.

Fully-automated malware analysis runs a malicious file through an automated program with little direct interaction from the person doing the malware analysis. The automated program will then generate reports describing how the malware uses the network, how it modifies the configuration of an operating system, and how the file behaves. Fully-automated analysis can have limits to its usefulness, but if a malware analyst has a lot of files to analyze, this method can start their work quickly and efficiently.

Static properties analysis looks at static properties of a malicious file, like embedded strings, embedded resources, header data, and hashes. These are the elements of a malicious file that don’t change, so they can be analyzed in a static manner.
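Two of those static properties, hashes and embedded strings, are easy to sketch with nothing but Python’s standard library. This is only an illustration under my own assumptions: the function names are made up, and the sample bytes are a harmless stand-in that I invented (including the example.invalid address), not real malware.

```python
import hashlib
import re


def file_hashes(data: bytes) -> dict:
    """Return common hashes used to look a sample up in threat databases."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def printable_strings(data: bytes, min_length: int = 4) -> list:
    """Find runs of printable ASCII, similar in spirit to the Unix strings tool."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Harmless, invented stand-in for a sample: binary noise with text embedded in it.
sample = (
    b"\x00\x01MZ\x90\x00http://example.invalid/payload"
    b"\x00\xff\xfecmd.exe /c\x00ab\x00"
)

print(file_hashes(sample)["sha256"])
print(printable_strings(sample))
```

The hash gives an analyst a fingerprint to search for in existing malware databases, and the extracted strings can hint at things like addresses or commands that the file might use.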

Interactive behavior analysis is a type of dynamic malware analysis. An analyst will execute the malware, usually within some sort of sandbox, and watch what it does very carefully. This stage of malware analysis can provide much more detailed information than the fully-automated stage.
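One small piece of "watching what it does" can be sketched as a before-and-after comparison of the files inside the sandbox. This is a rough Python sketch under my own assumptions: the snapshot and diff functions are hypothetical helpers, and real behaviour monitoring also covers things like network traffic, processes, and the registry.

```python
import hashlib
import os


def snapshot(root: str) -> dict:
    """Map every file under root to the SHA-256 hash of its contents."""
    state = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff(before: dict, after: dict) -> dict:
    """Report which files were created, deleted, or modified between snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(
            path for path in before
            if path in after and before[path] != after[path]
        ),
    }
```

The idea is to take one snapshot inside the virtual machine before running the sample, take another afterwards, and pass both to diff to see which files the sample touched.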

Manual code reversal may be the most challenging phase of malware analysis. An analyst will carefully reverse engineer a malicious file to find capabilities that could be missed in the behavioural analysis stages. This stage requires advanced software engineering knowledge to an even greater extent than the other stages, and it can be very tedious work for a human being to do. But the amount of effort can correlate with its effectiveness, so it’s often worth doing.

Some malware analysis applications

I’m going to finish this post by briefly introducing you to some of the more common types of applications that are used in malware analysis.

As I mentioned, dynamic malware analysis is often done by executing malware in a virtual machine. For this purpose, VM clients and the disk images of operating systems become useful tools.

Hex editors are often used in static malware analysis. These applications show human beings the raw bytes of a file as hexadecimal values. Hexadecimal code can look something like this: 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64 2E. If you can translate that into ASCII characters, please let me know in the comments, I will be impressed!
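If you’d rather let a computer do the translating, here’s a minimal Python sketch of what a hex editor does in both directions. The function names are my own, and the layout of the dump (offset, hex bytes, text column) is just one common convention rather than how any specific hex editor works.

```python
def hex_to_ascii(hex_text: str) -> str:
    """Turn space-separated hex pairs back into ASCII text."""
    return bytes.fromhex(hex_text).decode("ascii")


def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes the way a hex editor shows them: offset, hex, text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        # Non-printable bytes are shown as dots in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part.ljust(width * 3 - 1)}  {text_part}")
    return "\n".join(lines)


quoted = "48 65 6C 6C 6F 2C 20 77 6F 72 6C 64 2E"
print(hex_to_ascii(quoted))
print(hex_dump(bytes.fromhex(quoted)))
```

Run it yourself to see the answer to the challenge above. No spoilers from me!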

Disassemblers are similar to hex editors in some ways, but they’re not quite the same thing. They are applications that translate the machine code in a compiled binary file into assembly language, which is a human-readable representation of the instructions that your computer’s CPU works with directly. They can be useful in static malware analysis.

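Debuggers are also often used in malware analysis. A bug is any type of computer programming error, and bugs are often just honest mistakes, not malicious at all. Debuggers were designed to help developers find those errors by letting them pause a running program, step through it one instruction at a time, and inspect its memory. Malware analysts use the same features to watch exactly what malicious code does as it executes, so malware can often be better understood.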

Finally, there are applications that are specifically designed to provide malware analysts with safe sandboxes to work in, so they don’t harm the host machines that they use, but they can still see how malware behaves. VMRay and Joe Sandbox are two examples of this type of application.

So there you have it! You now have a basic understanding of how malware analysis works. As malware has evolved to become increasingly complicated, malware analysis is more challenging than ever. But malware affects all types of computers and all operating systems. If someone tells you a certain type of computer or operating system can’t get malware, don’t believe them. All computers that can get data from outside sources like removable media or networks can be infected with malware. So I must salute the world’s malware analysts, because they have their work cut out for them.

Written by Kim Crawley

I research and write about cybersecurity topics — offensive, defensive, hacker culture, cyber threats, you-name-it. Also pandemic stuff.
