Introducing FIDO: Automated Security Incident Response
We’re excited to announce the open source release of FIDO (Fully Integrated Defense Operation — apologies to the FIDO Alliance for acronym collision), our system for automatically analyzing security events and responding to security incidents.
The typical process for investigating security-related alerts is labor intensive and largely manual. To make the situation more difficult, as attacks increase in number and diversity, there is an increasing array of detection systems deployed and generating even more alerts for security teams to investigate.
Netflix, like all organizations, has a finite amount of resources to combat this phenomenon, so we built FIDO to help. FIDO is an orchestration layer that automates the incident response process by evaluating, assessing and responding to malware and other detected threats.
The idea for FIDO came from a simple proof of concept a number of years ago. Our process for handling alerts from one of our network-based malware systems was to have a help desk ticket created and assigned to a desktop engineer for follow-up — typically a scan of the impacted system or perhaps a re-image of the hard drive. The time from alert generation to resolution of these tickets spanned from days to over a week. Our help desk system had an API, so we had a hypothesis that we could cut down resolution time by automating the alert-to-ticket process. The simple system we built to ingest the alerts and open the tickets cut the resolution time to a few hours, and we knew we were onto something — thus FIDO was born. This section describes FIDO’s operation, and the following diagram provides an overview of FIDO’s architecture.
FIDO’s operation begins with the receipt of an event via one of FIDO’s detectors. Detectors are off the shelf security products (e.g. firewalls, IDS, anti-malware systems) or custom systems that detect malicious activities or threats. Detectors generate alerts or messages that FIDO ingests for further processing. FIDO provides a number of ways to ingest events, including via API (the preferred method), SQL database, log file, and email. FIDO supports a variety of detectors currently (e.g. Cyphort, ProtectWise, CarbonBlack/Bit9) with more planned or under development.
Analysis and Enrichment
The next phase of FIDO operation involves deeper analysis of the event and enrichment of the event data with both internal and external data sources. Raw security events often have little associated context, and this phase of operation is designed to supplement the raw event data with supporting information to enable more accurate and informed decision making.
The first component of this phase is analysis of the event’s target — typically a computer and/or user (but potentially any targeted resource). Is the machine a Windows host or a Linux server? Is it in the PCI zone? Does the system have security software installed and the latest patches? Is the targeted user a Domain Administrator? An executive? Having answers to these questions allows us to better evaluate the threat and determine what actions need to be taken (and with what urgency). To gather this data, FIDO queries various internal data sources — currently supported are Active Directory, LANDesk, and JAMF, with other sources under consideration.
In addition to querying internal sources, FIDO consults external threat feeds for information relevant to the event under analysis. The use of threat feeds help FIDO determine whether a generated event may be a false positive or how serious and pervasive the issue may be. Another way to think of this step is ‘never trust, always verify.’ A generated alert is simply raw data — it must be enriched, evaluated, and corroborated before actioning. FIDO supports several threats feeds, including ThreatGrid and VirusTotal, with additional feeds under consideration.
Correlation and Scoring
Once internal and external data has been gathered about a given event and its target(s), FIDO seeks to correlate the information with other data it has seen and score the event to facilitate ultimate disposition. The correlation component serves several functions — first — have multiple detectors identified this same issue? If so, it could potentially be a more serious threat. Second — has one of your detectors already blocked or remediated the issue (for example — a network-based malware detector identifies an issue, and a separate host-based system repels the same item)? If the event has already been addressed by one of your controls, FIDO may simply provide a notification that requires no further action. The following image gives a sense of how the various scoring components work together.
Scoring is multi-dimensional and highly customizable in FIDO. Essentially, what scoring allows you to do is tune FIDO’s response to the threat and your own organization’s unique requirements. FIDO implements separate scoring for the threat, the machine, and the user, and rolls the separate scores into a total score. Scoring allows you to treat PCI systems different than lab systems, customer service representatives different than engineers, and new event sources different than event sources with which you have more experience (and perhaps trust). Scoring leads into the last phase of FIDO’s operation — Notification and Enforcement.
Notification and Enforcement
In this phase, FIDO determines and executes a next action based on the ingested event, collected data, and calculated scores. This action may simply be an email to the security team with details or storing the information for later retrieval and analysis. Or, FIDO may implement more complex and proactive measures such as disabling an account, ending a VPN session, or disabling a network port. Importantly, the vast majority of enforcement logic in FIDO has been Netflix-specific. For this reason, we’ve removed most of this logic and code from the current OSS version of FIDO. We will re-implement this functionality in the OSS version when we are better able to provide the end-user reasonable and scalable control over enforcement customization and actions.
Open Items & Future Plans
Netflix has been using FIDO for a bit over 4 years, and while it is meeting our requirements well, we have a number of features and improvements planned. On the user interface side, we are planning for an administrative UI with dashboards and assistance for enforcement configuration. Additional external integrations planned include PAN, OpenDNS, and SentinelOne. We’re also working on improvements around correlation and host detection. And, because it’s now OSS, you are welcome to suggest and submit your own improvements!
(Netflix’s FIDO — Fully Integrated Defense Operation — is not a part of or service of the FIDO Alliance)
— Rob Fry, Brooks Evans, Jason Chan
Originally published at techblog.netflix.com on May 4, 2015.