Why security software misfires, and 6 things software authors can do about it
Have you ever wondered why, in the era of deep learning and hoverboards, security software can still mess up? Why is it so challenging to distinguish clean files from malware? How can you make sure your software won’t be blasted off customers’ machines?
Here’s what the past 10 years in Symantec Security Response’s Content Compliance team have taught me.
Whether it’s Endpoint Protection software, a security appliance, Advanced Threat Protection, or an intrusion prevention system, at some stage a security solution needs to make a judgement call: Is this file a part of legitimate harmless software? Does this traffic originate from a legitimate process? What do I know about this process, and how can I quantify the risk to the system?
The protection software will gather intelligence about the traffic, processes, or files involved, and match these attributes against known-good or known-suspicious patterns. For example, it may use a library of behavioral heuristic signatures, or it may calculate a “risk” score based on behavior, embedded strings, traffic type, community rating, and so on.
If the match is close enough, or if too little information is available, blunders can happen.
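To make that concrete, here is a toy sketch of how such attribute matching can collapse into a single risk score and a block/allow decision. The signal names, weights, and threshold are all invented for illustration; no vendor publishes its real ones.

```python
# Toy risk-scoring sketch: each observed signal contributes a weight,
# and the sum is compared against a threshold. All values are invented.

WEIGHTS = {
    "packed": 0.3,                # file uses a runtime packer
    "no_signature": 0.2,          # binary is not digitally signed
    "suspicious_strings": 0.25,   # strings resemble known malware
    "low_community_rating": 0.25, # rarely seen or poorly rated file
}

BLOCK_THRESHOLD = 0.6  # illustrative cut-off

def risk_score(signals: dict) -> float:
    """Sum the weights of every signal that fired."""
    return sum(WEIGHTS[name] for name, fired in signals.items() if fired)

def verdict(signals: dict) -> str:
    """Block the file if the combined score crosses the threshold."""
    return "block" if risk_score(signals) >= BLOCK_THRESHOLD else "allow"
```

A file sitting just above the threshold, or one for which most signals are simply unknown, is exactly where the blunders described here happen.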
Here are some of the main reasons.
Packers

Packers bundle, compress, and encrypt binary files. They are the software equivalent of wearing a balaclava and ski goggles. You may be an innocent skier, but you could have trouble getting into a bank.
Runtime packers, particularly polymorphic ones, are a popular way to compress and obfuscate malware. Roughly 90 percent of Symantec’s malicious Portable Executable (PE) sample submissions are packed. However, popular packers such as ASpack, Armadillo, Themida, and UPX are also used by legitimate software to protect intellectual property and reduce the size of the binaries. About 5 percent of clean harmless software is packed.
In these cases, the security solution must rely on other means to choose a course of action, such as in-memory scans, emulation in a sandbox, network traffic, and the file’s source. This poses a number of problems:
- The metadata on which the decision is based is reduced. There is no community rating if the packer is polymorphic, and no “known good” or “known bad” hash-based signatures will match. The strings and API calls used by the software are masked until decryption happens at runtime.
- The metadata that can be used is not available to all components of the security stack, or at the same stage of protection. A file has to be fully downloaded on the endpoint before it can be emulated; it has to be executed (even for a small fraction of the code) before it can be scanned in-memory. This means the protection will happen later than desired, or, in very conservative setups, the earlier protection layers will be prone to false positives (FPs).
- Code used by packers is often very similar to that used by viruses and self-decrypting malware.
Malware authors frequently customize their packers or create bespoke ones. They also rarely sign their malware (nearly all the signed files that fall within a detection policy are adware or grayware, not malware).
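One reason packed files attract heuristic attention is simple byte statistics: compressed or encrypted data has much higher entropy than plain code or text. A minimal sketch of such a check follows; the 7.2 bits-per-byte cut-off is purely illustrative, not a figure any vendor publishes.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0-8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Compressed or encrypted (i.e. packed) data approaches 8 bits/byte,
# while plain machine code or text sits noticeably lower.
PACKED_THRESHOLD = 7.2  # illustrative cut-off

def looks_packed(data: bytes) -> bool:
    """Flag data whose entropy suggests compression or encryption."""
    return shannon_entropy(data) > PACKED_THRESHOLD
```

Real engines apply this kind of check per section of a PE file rather than to the whole file, but the principle is the same: a legitimate packed binary and an encrypted piece of malware look statistically alike.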
What to do?
If compression and encryption must be used, it is crucial to use a well-known, off-the-shelf packer, to avoid any “polymorphic” option it may offer, and to sign the binaries cryptographically.
Ambiguous behavior

Even when the behavior of the analyzed process is clear, the desired response from the security solution is not always obvious.
Consider an app that displays the following capabilities:
- Track the location of the mobile device
- Remotely lock the device
- Prevent factory reset unless a passcode is provided
From these behavior traits, it is impossible to determine whether this is ransomware or theft protection software.
How do you distinguish a team of movers loading your neighbor’s furniture onto a truck, from a team of robbers doing the same? What distinguishes proxy software, or a cloud backup solution, from spyware? A remote access solution from a backdoor?
In some cases, the answer is stealth. Is the process visible? Is there an easy way to halt and uninstall the software? Is the user interface (UI) clear enough that a customer would know what the software is doing, and how it came to be on the system? Then again, there can also be legitimate reasons for clean software to be stealthy.
What to do?
To prevent false positives in such cases, I would recommend being conservative in the use of persistence and stealth techniques, working on a clear UI and EULA, as well as reaching out to proactive whitelisting programs such as IEEE’s Clean Metadata Exchange.
Distribution methods

Modern security software relies on an array of clues to determine a course of action. As mentioned earlier, the contents of a file may not be available to a given layer of the security stack, or may not contain enough information to derive intent. When available, the security software will also look at the origin of the file: the process that created it, the URL from which it was downloaded, and so on.
Sites such as renowned hardware vendors (hosting their tools and drivers), or official app stores with a stringent vetting process, would influence the decision positively. Peer-to-peer networks and anonymous hosting would not. In addition, software that is commonly available through public-facing distribution portals can be easily added to whitelists, or machine-learning training sets, thus reducing the risk of accidental detection.
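The whitelisting mentioned above is, at its simplest, an exact hash lookup against binaries collected from trusted distribution portals. The sketch below illustrates the mechanism; the “known good” entry is invented for illustration.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """SHA-256 digest of a file's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()

# In practice this set is built by hashing binaries published on trusted
# stores; here it is seeded with one fake "known good" file.
KNOWN_GOOD = {sha256_of(b"example clean installer v1.0")}

def is_whitelisted(file_bytes: bytes) -> bool:
    """Exact-match lookup: any modification to the file, such as an
    affiliate bundling extra payloads, produces a different hash and
    falls off the whitelist."""
    return sha256_of(file_bytes) in KNOWN_GOOD
```

This is also why affiliate-modified builds are so risky: a single changed byte removes the file from every hash-based reputation list its clean original had earned.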
One particularly risky distribution method, in terms of detections, is affiliate programs. When affiliates are remunerated per install, there will be some who choose to bundle the software with adware, or other software of dubious nature, to maximize the revenue. In other cases, they have been known to alter the software slightly to install it silently while bundled with a small game or popular tool, effectively creating a Trojan. This is the software equivalent of falling in with a bad crowd.
When thousands of customers report the software to their security vendor, because they do not understand how it came to be on their system, nor did they consent to it, the vendor will have to issue a Potentially Unwanted Application (PUA) detection.
The same will happen to software with very alarmist behavior: making exaggerated claims about problems supposedly found on the system, then offering to fix them for a fee. While such programs pose no direct threat to the system, this behavior decidedly qualifies them as PUAs.
These detections are not false positives: they merely signal to the user that they are installing an app that some people installed unknowingly, and reported as inconvenient. It is up to the user then to confirm their intention to install it or not.
What to do?
While this is not a false positive, it can affect a company’s reputation and license sales. I would recommend selecting affiliate programs very carefully, or choosing a different distribution method, and making the program (or a trial version) available on public-facing trusted stores to improve its reputation. You can also sign up to a “clean practices” charter and get certified through companies such as AppEsteem.
Dual-use tools

Some software is designed for administrators who know what they are getting into: tools that identify or fingerprint computers, remote administration software, vulnerability scanners, and tools for transferring data. These tools generally have legitimate use cases.
Unfortunately, these tools make it easier for an attacker to gain access to a system, and it is the security stack’s responsibility to ensure that an administrator is aware of the presence of such tools, and that they have not been installed by an unauthorized user. If a remote access utility appears on your home computer without your consent, that makes it a threat. For the same reason people need gun permits or driving licenses, a hacktool needs an exclusion rule to be allowed on a system.
What to do?
These are not false positives. The correct course of action for these detections is to rely on an administrator to allow the tools to run via antivirus exclusions.
Bugs and corruption
One day, a reliable company created a piece of software that needed full administrative access. Unfortunately, if the provided uninstaller was called without any parameters, its default behavior was to halt and delete everything on the system. This slight oversight posed such a serious risk to the end user that the security solution had to reject the file, despite the author’s good intentions.
The same goes for applications or data files that don’t stick to a set format. For example, exploits often require crafted, malformed input to trigger unexpected behavior in the software that processes this input. That is why most QA procedures include testing with randomized or unexpected input, to protect against such attacks.
Malformed documents and executables can trigger heuristics designed to detect obfuscated malware and exploits.
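The randomized-input testing mentioned above can be as simple as the following sketch, where `parse_record` and its toy length-prefixed format stand in for real input-handling code. The idea is that the parser must either succeed or reject bad input cleanly, never crash or misbehave.

```python
import random

def parse_record(data: bytes) -> bytes:
    """Toy length-prefixed format: the first byte gives the payload length."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload

def fuzz(iterations: int = 1000, seed: int = 0) -> None:
    """Feed the parser random blobs; only a clean ValueError is acceptable."""
    rng = random.Random(seed)
    for _ in range(iterations):
        blob = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        try:
            parse_record(blob)   # well-formed input: fine
        except ValueError:
            pass                 # malformed input rejected cleanly: fine
        # any other exception propagates and fails the test run
```

Production QA would use a coverage-guided fuzzer rather than a hand-rolled loop, but even this much catches the class of crash that makes heuristics treat a program as exploit-prone.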
What to do?
Make a best effort to write secure code, and react to constructive feedback.
Proneness to false positives in legitimate software can be mitigated by the following preventive measures:
1. Digitally sign the binaries.
2. Avoid polymorphic packers.
3. Be conservative in requesting Android permissions and using stealth or persistence techniques.
4. For commercial software, make your software available publicly on trustworthy sites, such as official app stores. Avoid dubious affiliate programs.
Like this story? Recommend it by hitting the heart button so others on Medium see it, and follow Threat Intel on Medium for more great content.