Dodging the Guardian: How Malware Evades EDR Detections


— Published on June 11th 2024 by Jordi Iglesias (d4skor) at Iglenson Security.

In this article, we will showcase how “evasive” malware is made and how it successfully bypasses a major EDR vendor’s detection mechanisms. The goal is to give non-security people some insight into this sometimes underground niche, demystify how malware is actually developed, and explain why EDRs alone are not enough to stop attackers.

A Stealth Plane’s “Under the Radar” principle applies to Evasive Malware as well.

· About Malware, Anti Virus and EDRs
∘ Detection and Response (Sometimes)
∘ Behind the EDR
· Understanding EDRs and how they are fooled
∘ The Good, The Bad, and The Ugly: Windows APIs from Hell
∘ Quick Introduction to API Hooking
∘ Setting the Mouse Trap
∘ Breaking free from the EDR’s Hooks: The Syscall Saga
· Bypassing a major EDR with a custom built malware
∘ The Malware Artifact
∘ Slipping By: Proof of Concept evading the EDR
· Conclusion

About Malware, Anti Virus and EDRs

To give some context to the general public, malware is just normal software with unconventional, malicious (and rather shady) purposes. That means it is developed just like any other program, usually in low-level languages such as C/C++ or even assembly. It uses Windows APIs, memory management functions, and other common building blocks, because malware tries to look like a legitimate program, so it is usually coded to blend into the environment as much as possible.

Now regarding the watchdogs: Antivirus (AV) products are the most common tools for detecting files that contain malware, as well as some malicious behaviors happening on the system. Initially, AV detections were what’s known as “signature based” or static. This means that they took a suspicious file and compared its contents, metadata or byte patterns with the vendor’s database of known malware. Of course, malware developers easily bypassed this, for example by “obfuscating” the original code and recompiling the program, since this changes most of the static signatures.
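To make the idea of a static signature concrete, here is a minimal sketch in Python. It assumes the simplest possible signature, a SHA-256 digest of the whole file; real engines also use byte patterns and metadata. The database contents and the function name `is_known_malware` are made up for illustration.

```python
import hashlib

# Hypothetical signature database: digests of files already known to be malicious.
KNOWN_BAD_HASHES = {
    hashlib.sha256(b"pretend this is a known malicious file").hexdigest(),
}

def is_known_malware(file_bytes: bytes) -> bool:
    """Return True when the file's digest matches an entry in the database."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_BAD_HASHES

original = b"pretend this is a known malicious file"
# One extra byte stands in for the effect of obfuscating and recompiling.
recompiled = original + b" "

print(is_known_malware(original))    # True
print(is_known_malware(recompiled))  # False
```

Any change to the file, however small, produces a different digest, which is why purely static signatures are so brittle.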

The industry has of course moved forward, and modern Antivirus solutions offer more advanced detections, such as in-memory scanning (again searching for known patterns, but in memory instead of on disk) and behavioral or sandbox detections (running the suspicious file in a safe sandbox to see what it does), often paired with cloud analysis to offer more powerful inspection without being heavy on the endpoint’s hardware resources.

Detection and Response (Sometimes)

So now, let’s talk about the often misunderstood big boys of the industry: EDRs (Endpoint Detection and Response). They are usually treated as some magical black box that stops every single threat. But if I had to summarize them quickly, I’d say that an EDR is a security product that collects enormous amounts of “telemetry” data from the endpoint it is installed on, such as process, user, network and service activity among many others, gathering it from several different sources.

EDRs are designed to work alongside Antivirus and manage its alerts and detections. Some EDR solutions come with a bundled AV engine, while others rely on Microsoft Defender’s default one. In any case, the EDR should never replace the AV, or vice versa.

A quick glance over the following table can help you get an idea on the large amount of data sources that different commercial EDRs have.

Almost everything a benign user does, as well as anything a malware or an attacker does, generates telemetry from these sources. This data is interpreted by the EDR’s “algorithm”, which raises alerts or takes certain actions depending on how it is programmed and configured.

Behind the EDR

Even if an EDR misses a threat and is successfully “bypassed”, the telemetry it continuously generates from the malicious actor’s actions will very probably expose them if properly analyzed by a Blue Team.

So, if we ignore the EDR sales pitches given to the average IT team, the real value of this massive amount of telemetry data lies in the fact that it is a big feeding pipe for Security Operations Center (SOC) analysts, Threat Hunters, and other “Blue Team” defensive professionals who are tasked with processing it and determining whether some activity is malicious or not. In essence, an EDR process can be considered a “sensor” that provides endpoint and user information to defensive teams in real time.

The key difference between EDRs is not how much malware they detect by themselves in certain tests (which is important too), but the visibility and information they give professionals to further analyze, detect and respond. If an EDR is blind, so is the Blue Team for the most part. So you had better pick an EDR that has a lot of eyes to see with.

Now, the main point of this article is that EDRs are not enough by themselves. EDRs are designed as a tool for giving visibility to security analysts. If no one is managing them, most of the telemetry they generate is ignored, and the EDR is left to its own judgment. By default, they will fail to detect and block some modern and complex attacks, because these are genuinely hard to identify, but they will almost certainly log and provide everything a Blue Team needs to actually hunt the attackers down. If EDRs were supposed to autonomously block all sorts of attacks, as salespeople often tell sysadmins, one could also expect an unsustainable number of False Positive alerts coming from them.

That’s why the best case scenario would be for a company to have some good EDRs deployed, backed by a 24/7 Security Operations Center and a Threat Hunting team that investigates anything suspicious in real time, among other security specialists. But obviously, an internal “Blue Team” is far from affordable for the vast majority of small and medium-sized companies out there.

This is the main reason for the growing number of additional EDR-related services that we are seeing in the market, mainly targeting SMBs. Among them, Managed EDR services are probably the best option. Thanks to economies of scale and outsourcing, they usually provide an EDR “sensor” for each endpoint, as well as 24/7 SOC monitoring of all the telemetry being generated. On top of that, a real-time Threat Hunting service is included, which will dive into any suspicious incident to detect and hunt the attackers if they have bypassed the first line of defense, the EDR and Antivirus themselves. And all of this at a really competitive price. Wonders of capitalism one could say :)

Understanding EDRs and how they are fooled

Before getting to the malware showcase, we must first explain what malware typically does, and how EDRs detect it at a lower level.

For this, there will also be an introduction to Windows API Hooking — a common EDR detection technique — and how it is circumvented to help the malware remain undetected.

The Good, The Bad and The Ugly: Windows APIs from Hell

EDRs and other security solutions that inspect suspicious software’s behavior often try to find known patterns of consecutive actions that can be considered an Indicator Of Compromise (IOC), as they are almost exclusive to malicious activities.

The idea in any “process injection” technique is to make a target process execute what’s known as the “payload”, which is usually the actual code that makes the victim computer connect back to the attacker when executed. This payload is normally found within the malware, and can be anything from a cmd command to a reverse shell or, of course, an agent that connects to an attacker-controlled C2 (Command & Control) server. The targeted process can be the malware’s own process or, for more stealth, a remote one, with notepad.exe as a generic example.

One infamous procedure is the classic process injection technique. It consists of using three Windows APIs:

VirtualAllocEx: Allocates Virtual Memory on a remote process, where the Payload will be copied into.
VirtualProtectEx: Sets the memory permissions on the remote process’ allocated memory (which now contains our Payload) to RWX — Read, Write, Execute — so that in the next step it can be executed properly.
CreateRemoteThread: Starts a new thread on the remote process that will run the allocated Payload code, since that memory space is now executable.
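From the defender’s point of view, the three calls above form an ordered pattern that can be searched for in telemetry. The following is a minimal sketch in Python of that kind of correlation, assuming a simplified, hypothetical event format of `(pid, api_name)` tuples in chronological order; real EDR telemetry and detection logic are far richer than this.

```python
from collections import defaultdict

# The ordered pattern described above, treated as one behavioral signature.
CLASSIC_INJECTION = ("VirtualAllocEx", "VirtualProtectEx", "CreateRemoteThread")

def find_injection_suspects(events, pattern=CLASSIC_INJECTION):
    """Return the pids whose calls contain the pattern in order.

    Unrelated calls may be interleaved between the pattern steps.
    """
    progress = defaultdict(int)  # pid -> number of pattern steps matched so far
    suspects = set()
    for pid, api in events:
        step = progress[pid]
        if step < len(pattern) and api == pattern[step]:
            progress[pid] = step + 1
            if progress[pid] == len(pattern):
                suspects.add(pid)
    return suspects

# Made-up telemetry for three processes.
telemetry = [
    (100, "CreateFileW"),
    (200, "VirtualAllocEx"),
    (100, "ReadFile"),
    (200, "VirtualProtectEx"),
    (300, "VirtualAllocEx"),
    (200, "CreateRemoteThread"),
]
print(find_injection_suspects(telemetry))  # {200}
```

Only process 200 completes the full sequence, so it is the only one flagged; process 300 performs a single step, which on its own is ordinary behavior.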

So, keeping in mind that the malware developer’s goal is to make the program look as legitimate as possible, one would obviously want to avoid this super common and easy to detect process injection technique. We’ll see later in this article, when showcasing the custom malware we made, a different process injection procedure used as an example.

If you ever get to read a bit about process injection techniques, you will be amazed by the sheer number of original ways to execute payloads that exist. Here lies the relative beauty of malware development, which consists of being creative enough to fool EDRs and even Blue Team specialists by programming a malware to do absolutely unexpected things, in a way that does not correlate with any other known malware, effectively disguising it as benign software.

Quick Introduction to API Hooking

One of the best-known techniques EDRs use to determine malicious behavior is what’s known as Windows API “Hooking”.

Among the many techniques that EDRs have up their sleeve to catch malware, we will look at this one in particular because it is one of the most popular. Since there are countless articles covering it in depth, the idea here is not to bore you by explaining Windows Internals or every API Hooking evasion technique, but to help you understand how malware developers keep finding new ways around every detection feature that is put out there. Hopefully, with this API Hooking evasion example, you’ll get an idea of why this is often called a “Cat and Mouse Game”.

Note that, due to the quick development of detection mechanisms, the use of API Hooking in modern EDRs decreases year after year. It may soon become an obsolete technique, although it is still useful as an example for this article.

API hooking is a generic technique used to intercept and modify the behavior of an API function (for example, one of the three mentioned previously). This is commonly used for debugging or reverse engineering. It involves replacing the original implementation of an API function with a custom version that performs some additional actions before or after calling the original function. This allows one to modify the behavior of a program without modifying its source code, but only the imported APIs.
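The generic idea can be illustrated without touching Windows at all. Below is a minimal sketch in Python, where a function is replaced by a wrapper that performs an extra action (recording the call) before forwarding to the original. The names `sensitive_api`, `install_hook` and `call_log` are hypothetical, and this is an analogy for the concept rather than how hooks are installed in native code.

```python
import functools

def sensitive_api(size: int) -> str:
    """Stand-in for an API function that some program calls."""
    return f"allocated {size} bytes"

call_log = []

def install_hook(func):
    """Return a replacement that records the call, then runs the original."""
    @functools.wraps(func)
    def hooked(*args, **kwargs):
        call_log.append((func.__name__, args, kwargs))  # extra action first
        return func(*args, **kwargs)                    # then the original
    return hooked

# The calling code is untouched; only the function it reaches is swapped.
sensitive_api = install_hook(sensitive_api)

result = sensitive_api(4096)
print(result)    # allocated 4096 bytes
print(call_log)  # [('sensitive_api', (4096,), {})]
```

The caller gets the same result as before, but every call now leaves a trace that some other component can inspect.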

Windows API hooking is one of the techniques used by AV/EDR solutions to determine whether code is malicious. By hooking the APIs commonly used for process injection, the security product gains insight into what an executable is attempting to do.

The classical way of implementing API hooking is done via trampolines. A trampoline is a “Jump” that is used to alter the code execution path by jumping to another address in the process’ memory.

The trampoline’s jump is inserted at the beginning of the function, resulting in the function becoming hooked.
When the hooked function is called, the trampoline jump is triggered instead.
Then, the execution flow is passed and altered to another address thus resulting in a different function being executed instead.

Don’t worry if this seems a bit confusing; the next section should make it clearer.

Setting the Mouse Trap

Let’s see the EDR’s implementation of API Hooking to detect malicious behaviors in critical Windows APIs.

Windows APIs are an abstraction over the NT or Native API, meant to make a regular programmer’s life easier. Native API functions usually start with “Nt” to be distinguishable, and are exported from a Windows DLL (library) named ntdll.dll.

Whenever we call a Windows API, under the hood it is the Native API that’s being executed. EDRs usually won’t hook the Windows API itself, but the Native version of it. Some examples of Windows APIs and their respective Native APIs are:

VirtualAllocEx -> NtAllocateVirtualMemory
VirtualProtectEx -> NtProtectVirtualMemory
CreateRemoteThread -> NtCreateThreadEx

Here are a couple of diagrams to help you visualize and understand the EDR’s hooking procedure better. The first one shows the unhooked VirtualProtectEx, and the second one shows how the EDR hooked its underlying Native function, NtProtectVirtualMemory.

Time for a real world example on how our target EDR attempts to hook one of the three infamous Windows APIs (it’s hooking their Native version as we just mentioned). The API we will analyze is VirtualProtectEx, which underneath executes the Native NtProtectVirtualMemory function. The program running will be a simple Hello World! in C.

To get a bit more clarity on how this is implemented, we are going to look at a few screenshots from a debugger, showing the assembly instructions that are finally executed when the API is called. You don’t need to understand the assembly instructions themselves, just that when “syscall” is executed, execution transitions to the kernel, where the request is actually carried out. So, if something diverts the execution between “mov r10, rcx” and “syscall”, the API will never run.

This is the original state of the Native API, with no EDR installed: as you can see, mov r10,rcx …. syscall.

But, in this second screenshot, with the EDR already installed, we can see that the first instructions have been modified, and now contain a JUMP instruction. This is the actual HOOK or Trampoline as we mentioned.
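The difference between the two states described above can be summarized as a check on the first bytes of the function. The sketch below, in Python, works purely on byte strings that stand in for what a debugger would display. The sample bytes are made up for illustration (the number after `mov eax` and the jump offset are arbitrary placeholders), and a real hook may use a different instruction than the one assumed here.

```python
MOV_R10_RCX = bytes([0x4C, 0x8B, 0xD1])  # x86-64 encoding of `mov r10, rcx`
JMP_REL32 = bytes([0xE9])                # opcode of a near relative `jmp`

def describe_prologue(first_bytes: bytes) -> str:
    """Classify the start of a Native API function from its first bytes."""
    if first_bytes.startswith(MOV_R10_RCX):
        return "original"
    if first_bytes.startswith(JMP_REL32):
        return "hooked (jmp)"
    return "unknown"

# Placeholder samples, not taken from any real system.
unhooked_sample = bytes.fromhex("4c8bd1b8aa000000")  # mov r10, rcx; mov eax, 0xAA
hooked_sample = bytes.fromhex("e911223344")          # jmp <somewhere else>

print(describe_prologue(unhooked_sample))  # original
print(describe_prologue(hooked_sample))    # hooked (jmp)
```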

So, you might be wondering, where does this jump go? It redirects the execution flow to… the EDR! And how does it do that? EDRs usually have a DLL (library) component which is injected into every single process on the system upon its creation. Yes, even into this simple Hello World program. Once injected, the EDR installs a “hook” on every critical or sensitive Native API function out there (like the three mentioned), and that hook makes a jump into the EDR’s DLL.

Thus, every time one of these hooked functions is called, the EDR’s DLL will receive the execution before running “syscall”, and since it has an internal logic to determine if one function is being abused or not, it will return back to the original execution and run “syscall” if no threats have been found, or else it will stop the execution and raise an alert.
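The “inspect first, then forward or block” logic can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the function names, the single rule (block a request for Read/Write/Execute memory on another process) and the alert format do not describe any real vendor’s logic. Only the two protection constants carry the values of the Windows constants of the same name.

```python
PAGE_READWRITE = 0x04
PAGE_EXECUTE_READWRITE = 0x40

class BlockedByMonitor(Exception):
    """Raised by the toy monitor when it refuses to forward a call."""

alerts = []

def native_protect_memory(pid: int, protection: int) -> str:
    """Stand-in for the original function, where `syscall` would finally run."""
    return f"pid {pid}: protection set to {protection:#04x}"

def monitored_protect_memory(pid: int, protection: int, caller_pid: int) -> str:
    """Toy hook handler: inspect the request, then forward it or block it."""
    if pid != caller_pid and protection == PAGE_EXECUTE_READWRITE:
        alerts.append((caller_pid, pid, protection))  # raise an alert
        raise BlockedByMonitor("RWX requested on another process")
    return native_protect_memory(pid, protection)     # nothing suspicious

allowed = monitored_protect_memory(pid=10, protection=PAGE_READWRITE, caller_pid=10)
print(allowed)  # pid 10: protection set to 0x04

try:
    monitored_protect_memory(pid=20, protection=PAGE_EXECUTE_READWRITE, caller_pid=10)
except BlockedByMonitor as exc:
    print("blocked:", exc)  # blocked: RWX requested on another process

print(alerts)  # [(10, 20, 64)]
```

A single rule like this would generate plenty of false positives in practice, which is one reason real products correlate many signals before acting.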

Debugger showing the loaded “modules” in the Hello World process. We can see kernelbase.dll, ntdll.dll and other expected DLLs, but the red one belongs to the EDR. The DLL name is blurred to avoid legal issues with the vendor.

At this point, it’s game over, right? We shouldn’t be able to execute any juicy API function without the EDR knowing, right? … Continue reading :)

Before getting to how the hooks are evaded, if you still can’t put the pieces together, just imagine a security system for a house. The EDR is the control panel, and on every door and window there is a sensor that notifies the control panel whenever it opens. Each sensor is a hook on a sensitive Windows API. Every time a door is opened (a hook is triggered), the control panel (EDR) checks whether something weird is going on, or whether it’s just the kids going to the backyard (regular usage). So, if at 3am the backyard window opens (a sensitive hooked API is called), and the garden’s motion sensor is triggered as well (another sensitive hooked API is used), that is definitely an “Indicator Of Compromise”.

Breaking free from the EDR’s Hooks: The Syscall Saga

Ever since the EDRs’ hopes of catching malware became a reality thanks to their fantastic hooks, malware developers have reverse engineered them and, of course, found 101 ways to evade them. The most popular methods to avoid the hooks are direct and indirect syscalls.

Note: “syscall”, aside from being an assembly instruction, is also used to refer to the Native APIs themselves, like NtProtectVirtualMemory. So Native API and syscall may be used interchangeably.

Even if we manually called the Native API corresponding to a hooked Windows API like VirtualProtect, the hook would still catch us, because the hook lives inside the Native API function itself. So, the starting point for most EDR hooking evasion techniques is to manually implement the assembly code for the desired “syscalls” (Native API functions) and call them directly.

Reminder of what an original, Unhooked Syscall/Native API looks like in assembly

The standard hook-triggering procedure would be calling the Windows API, that then calls the Native API where the hook is placed. Here, the execution JUMPS to the EDR’s DLL, to determine if the behavior is malicious or not. If not, the execution will return to the Native API’s assembly code and finally the syscall instruction will be executed:

This would be the regular Windows API usage, which ends up falling in the EDR’s hook as expected.

To avoid the hooks, instead of this typical procedure we will execute the syscalls directly, which means manually running the same assembly instructions that the target function would run in the end if no hooks existed, without going through the APIs and the hooks placed on them. This is where the name “direct syscalls” comes from.

So this is what happens when calling the syscalls or Native APIs “directly”, in a way in which we completely avoid the EDR’s hooks:

By doing this simple action, the EDR’s hooks have been effectively bypassed.

We can now finally proceed to the malware Proof of Concept.

That said, life would be too easy for malware developers if evading EDRs consisted only of this. Note that we have skipped the part about obtaining the “SSN” (System Service Number), which is essentially the ID of the Native API / syscall; obtaining it is a complex task for which several different techniques have been developed.
In any case, direct syscalls are now largely obsolete: the procedure we just explained started getting detected, so Indirect Syscalls came next, along with a ton of techniques for performing them. Then EDRs started monitoring call stacks, so malware had to perform Call Stack Spoofing, until EDRs learned to gather telemetry from the kernel itself, which made much of what was done in “Userland” useless. Hopefully this gives you an idea of how this Cat and Mouse game works.

Bypassing a major EDR with a custom built malware

To prove the points made so far in this article, this section is dedicated to bypassing a well-known EDR. So, for those who still believe that these products are 100% bulletproof, as the salesperson said, read on. (The product and vendor names will be blurred to avoid legal issues.)

This should work as counter-propaganda against the aggressive and misleading sales statements made by EDR vendors. And while EDRs are still valuable and necessary, I hope that by now you already know that they are not enough by themselves if no one is looking at their telemetry, because their standalone detections can be bypassed, even though they keep generating valuable data from anything the malware does.

We will showcase how a custom built and relatively simple trojan malware slips through the detections and establishes a remote access backdoor on the EDR-protected computer.

The Malware Artifact

Basically, when designing this malware, we tried to avoid common patterns like “The Good, The Bad, and The Ugly” among many others.

Firstly, regarding the payload itself, we will be using a Havoc C2 (Command & Control) agent. This is the code that, when executed, will make the victim computer connect to our “attacker” C2 server, establishing an interactive backdoor on the victim and giving us remote access. This payload will be encrypted inside the malware code and will only be decrypted at runtime, so we can avoid “static” or signature detections. This step is important because this C2 project is open source and well-known, so Antivirus products have plenty of “signatures” to detect its generated payloads when analyzing files.
The process injection (moving the payload into the remote process’ memory) will leverage a technique known as “Mapping Injection”, which consists of moving the payload into a “Mapped” memory region to avoid the typical “Private” memory sections created when using VirtualAllocEx. EDRs are really suspicious of private memory regions with Executable permissions, which is exactly what we would end up with if we used VirtualAllocEx.
For the payload execution, the main idea is to avoid using CreateRemoteThread at all costs, since we already know it is “The Ugly” API. In this example we will use a newer technique which abuses a Windows feature called thread pools. With thread pools, we don’t have to create a thread at all; instead, we can let the pool handle the execution of our payload, greatly increasing the chances of slipping by undetected.
Regarding the API hook evasion, we are using an Indirect Syscalls technique named “Hell’s Hall”, an evolution of Hell’s Gate, Halo’s Gate, and Tartarus Gate (yes, evasion technique names are that weird). These are all iterations of the base concept of direct syscalls which we have already explained, but refined in certain ways.

There are of course more features on this malware, like anti-analysis stuff or anti-virtual environment functions, which prevent the malware from running if it is being scanned in an EDR or Antivirus’ Sandbox. But this is off topic for now.

Slipping By: Proof of Concept evading the EDR

For this malware demonstration, the remote process into which our malicious program injects and runs the payload will be Microsoft Edge, for stealth reasons. If we injected into Notepad.exe, for example, I guess everyone would be suspicious if it started connecting over the Internet to our attacker C2 server :) But Edge is always generating Internet traffic, so it wouldn’t look weird at all.

In the following demo video, you’ll see two screens:

The left one represents the victim computer, which will run the malware, and has the EDR dashboard open.
The right one is the Havoc Command & Control server interface. This is what the attacker would see, and where one can interact with the victim computer from. An example whoami command will be executed, and a screenshot of the victim’s desktop will be taken.

Also you’ll notice that the malware execution is delayed. This is because for the first 30 seconds it will do random stuff like solving math problems to distract the EDR’s initial attention.

Conclusion

As a quick recap of everything that we have explained throughout this article, the main takeaways are the following:

EDRs are really good detection tools, but one shouldn’t rely on them alone. Instead, try to have a dedicated security team backing them up, be it a Managed EDR service or a fully fledged Blue Team. Generally speaking, this should not be a task delegated to the IT team or the sysadmin in place.
Malware development is not witchery, it’s just making regular software that simply has undesirable intentions and is often coded to look as benign as possible.
Attackers and malware developers always have an advantage over the Blue Team (defenders) because a human’s original thoughts when programming malware are unpredictable by nature. This also explains why the patterns an EDR recognizes are easily fooled by simply being creative and doing the unexpected.
There’s no such thing as an EDR, or any security solution at all, that will detect 100% of the threats. It will never exist. Aside from the security products in place, all companies have to ensure proper hardening and penetration testing of their internal and external infrastructure to reduce the chances of an attack becoming critical if the first line of malware defense (EDRs and Antivirus) is bypassed.

Thanks a lot for reading, and see you in the next articles!

Special thanks to @waldoirc for his useful corrections and insights.