How to Detect Metamorphic Malware

SOC Prime
Jul 25, 2022

It sounds scary: malware that can rewrite its entire code once it is inside the victim's system and still stay undetected. Although the first metamorphic virus was found in 1998, interesting samples keep spreading in the wild today, infecting everything from Windows devices to Bayer's radiology machines.

Attackers' growing creativity has produced malware that changes its mutation rules in every new variant, tracks its mutated copies, delays disassembly, and more. If you can't look under its chameleon skin, you don't know what the code does, and so you can't stop it. That's why detecting metamorphic malware is something of an art form 🎨🧚🍷.

"Perks" That Come With Metamorphic Code

Metamorphic malware uses obfuscation techniques, packers, and PE infectors to evade detection, on top of altering its main code. Some strains are all-in-one. For example, Virlock combines ransomware, a file infector, a polymorphic algorithm, a metamorphic algorithm, and a screen locker.

PE infectors can embed a malicious payload in legitimate EXE and DLL files. Since the Portable Executable (PE) format is not very resistant to modification, it's relatively easy for the malware to infiltrate.

There are many types of self-modifying code, but essentially they are all programmed to perform similar functions:

  • Establishing a connection and exchanging data with a command-and-control server or a peer-to-peer botnet
  • Using anti-debugging and anti-VM techniques
  • Encrypting and obfuscating data
  • Modifying system processes

Most packers, mutating engines, and PE infectors that accompany metamorphic code can be detected with behavior-based techniques. We can trace patterns such as the spatial correlation between functions or the use of jump instructions. Even a handful of strings can look suspicious on their own.
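As a rough illustration of the string heuristic, the sketch below pulls printable ASCII strings out of a byte buffer and reports the ones that match a watchlist. The watchlist entries and the helper names are assumptions made for this example, not a vetted indicator set.

```python
import re
from typing import List

# Hypothetical watchlist: Windows API names that often appear in
# anti-debugging and process-tampering code. Illustrative, not exhaustive.
WATCHLIST = {
    "IsDebuggerPresent",
    "VirtualProtect",
    "WriteProcessMemory",
    "CreateRemoteThread",
}

def extract_strings(data: bytes, min_len: int = 5) -> List[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

def suspicious_strings(data: bytes) -> List[str]:
    """Return the extracted strings that contain a watchlist entry."""
    return [s for s in extract_strings(data) if any(w in s for w in WATCHLIST)]
```

A match is only a hint: legitimate software calls these APIs too, so a hit like this is best treated as one signal among several.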

Advanced malware can tamper with audit logs, so it's better to support SIEM rules with machine-learning tools and multi-layer heuristic detection.

If the malware employs advanced techniques such as code transposition combined with instruction substitution and dead-code insertion, or rewriting based on Tzeitzin semi-Thue systems, it calls for custom detection algorithms created on demand. Visit our SOC Prime Detection as Code platform to learn more.

How to Detect Metamorphic Code

Metamorphic malware is the next level up from polymorphic malware. Confusingly, the latter shares its name with a core concept of object-oriented programming (OOP), but the two are unrelated.

For many people who have spent long enough in software engineering, polymorphic malware is boring. Nevertheless, the notorious WannaCry was polymorphic, too.

While polymorphic code relies on encryption keys, metamorphic malware works without them and relies on mutating engines instead. And while all copies of a polymorphic code share the same kernel, every metamorphic copy has a different one.
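A toy sketch of the polymorphic case: the same body is XOR-encrypted under two different keys, so the stored bytes differ while the decrypted body is identical. The body is a harmless placeholder string, and single-byte XOR is an assumption chosen for brevity, not a description of any real sample.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (the operation is its own inverse)."""
    return bytes(b ^ key for b in data)

# Placeholder standing in for the constant "kernel" shared by all copies.
BODY = b"placeholder body shared by every copy"

copy_a = xor_bytes(BODY, 0x5A)
copy_b = xor_bytes(BODY, 0xC3)

# The stored copies differ byte for byte...
print(copy_a != copy_b)
# ...but decrypting each one recovers the same body.
print(xor_bytes(copy_a, 0x5A) == xor_bytes(copy_b, 0xC3))
```

Because the decrypted body stays constant, a scanner that sees the code after decryption can still match it. Metamorphic code offers no such constant body, which is what makes it harder to pin down.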

A common polymorphic kernel looks like this:

[Figure: a common polymorphic kernel]

Mutating engines usually translate the initial binary code into a temporary representation, then edit that representation to produce completely different code that assembles back into machine code and does the same thing as before.
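From the defender's side, the same idea can be run in reverse: reduce each variant to a canonical form before comparing. The sketch below is a toy under stated assumptions. The instruction listings, the rewrite table, and the dead-code set are hypothetical, and the table ignores side effects such as CPU flags that a real normalizer would have to model.

```python
import hashlib

# Hypothetical rewrite table: map equivalent instructions to one canonical form.
# Flag side effects are deliberately ignored in this toy example.
CANONICAL = {
    "xor eax, eax": "mov eax, 0",
    "sub eax, eax": "mov eax, 0",
    "add ebx, 1": "inc ebx",
}

# Instructions treated as dead code (no effect on program state) in this toy.
DEAD = {"nop", "mov eax, eax", "xchg ebx, ebx"}

def normalize(listing):
    """Drop dead instructions and rewrite the rest to their canonical form."""
    cleaned = (" ".join(line.lower().split()) for line in listing)
    return [CANONICAL.get(ins, ins) for ins in cleaned if ins and ins not in DEAD]

def digest(listing):
    """SHA-256 over the listing, standing in for an exact-match signature."""
    return hashlib.sha256("\n".join(listing).encode()).hexdigest()

variant_a = ["mov eax, 0", "inc ebx", "ret"]
variant_b = ["nop", "xor eax, eax", "mov eax, eax", "add ebx, 1", "nop", "ret"]

# Exact signatures differ, but the normalized forms match.
print(digest(variant_a) == digest(variant_b))
print(digest(normalize(variant_a)) == digest(normalize(variant_b)))
```

Real engines use far richer transformations than a lookup table can undo, so this only shows why exact-match signatures fail and why comparison has to happen at a more abstract level.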

The structure of a mutating engine looks something like this:

[Figure: structure of a mutating engine]

As you can see, the engine itself is much bigger than the payload and relies on low-level assembly instructions.

Some peculiarities of such engines allow us to detect their suspicious behavior. Even if we don't know for sure whether a virus has infiltrated, we can take some preventive measures. For example, grant a memory region either write permission or execute permission, but never both. Since metamorphic malware needs both to rewrite and then run its own code, it stands out and becomes possible to detect.
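The write-or-execute rule can also be checked statically on a PE file by looking at section flags. The sketch below uses the documented `IMAGE_SCN_MEM_EXECUTE` and `IMAGE_SCN_MEM_WRITE` bit values. The section list is hand-written for illustration, and in practice it would come from a PE parser.

```python
# Section characteristic flags from the PE/COFF format.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

def writable_and_executable(sections):
    """Return the names of sections whose flags allow both write and execute."""
    both = IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE
    return [name for name, flags in sections if (flags & both) == both]

# Hypothetical (name, characteristics) pairs for illustration.
sections = [
    (".text", 0x60000020),  # code, execute, read
    (".data", 0xC0000040),  # initialized data, read, write
    (".xyz", 0xE0000020),   # code, execute, read, write
]

print(writable_and_executable(sections))  # ['.xyz']
```

Some legitimate packers and JIT-style programs also produce writable, executable sections, so a hit narrows the search rather than proving infection.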

Also, we can measure similarity with fuzzy hashing algorithms. This is not a far-fetched future: the algorithms exist right now and have been in use for years at major agencies. For example, NIST has applied ssdeep and sdhash to its National Software Reference Library (NSRL). Algorithms like these look for similarities between files at the binary level, which makes it easier to detect metamorphic files, packers, engines, and so forth.
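To show the idea of binary-level similarity without depending on an external library, the sketch below compares byte n-gram sets with Jaccard similarity. This is a simplified stand-in, not the ssdeep or sdhash algorithm, and the sample buffers are synthetic.

```python
def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all n-byte windows in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of byte n-gram sets, from 0.0 to 1.0."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)

# Synthetic buffers: a base file, a copy with eight bytes inserted,
# and an unrelated byte sequence.
base = bytes(range(256)) * 4
variant = base[:500] + b"\x90" * 8 + base[500:]
unrelated = bytes((i * 7 + 3) % 251 for i in range(1024))

print(similarity(base, variant) > similarity(base, unrelated))
```

A cryptographic hash of `base` and `variant` would differ completely, while the similarity score stays high because the small insertion leaves most n-grams intact.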

Aside from known tools, you can also train a neural network if you have good samples. Machine learning in threat detection is highly popular now, so you can use ready-made models, which saves time and effort.

We like the following models:

  1. Jian Jiang, Fen Zhang, "Detecting Portable Executable Malware by Binary Code Using an Artificial Evolutionary Fuzzy LSTM Immune System," Security and Communication Networks, vol. 2021, Article ID 3578695, 12 pages, 2021. https://doi.org/10.1155/2021/3578695
  2. Campion, M., Dalla Preda, M. & Giacobazzi, R. Learning metamorphic malware signatures from samples. J Comput Virol Hack Tech 17, 167–183 (2021). https://doi.org/10.1007/s11416-021-00377-z
  3. Bergenholtz, E., Casalicchio, E., Ilie, D., Moss, A. (2020). Detection of Metamorphic Malware Packers Using Multilayered LSTM Networks. In: Meng, W., Gollmann, D., Jensen, C.D., Zhou, J. (eds) Information and Communications Security. ICICS 2020. Lecture Notes in Computer Science, vol 12282. Springer, Cham. https://doi.org/10.1007/978-3-030-61078-4_3
  4. Ling, Y.T., Sani, N.F.M., Abdullah, M.T. et al. Nonnegative matrix factorization and metamorphic malware detection. J Comput Virol Hack Tech 15, 195–208 (2019). https://doi.org/10.1007/s11416-019-00331-0

Finally, keep in mind that even the most accurate ML models are a useful aid to the security engineer's work, not a substitute for it. They are good for working with large amounts of data and obtaining valuable research findings. But an enterprise's security posture depends on how you act on those findings.
