ROC’ing The Data Science of Cyber Security

--

As you may know, we are developing the Data Science of Cyber Security course, and a core part of it is the investigation of machine learning. Within cybersecurity, we are increasingly swamped with data, in many different formats and from many different sources. Detecting a data breach often involves searching through terabytes of log data to find a trace of an intruder scanning the network (the reconnaissance phase), dropping malware onto a site (the delivery phase), running a malicious script to install a backdoor (the installation phase), calling back to a master control network (the command and control phase), or even transferring files (the action phase).

But if an organisation has good defences in place, it can detect a threat at an early phase and make plans to stop it from progressing to the action phase. Many organisations are thus moving to Security Operations Centres for 24x7 monitoring of their data infrastructure. Unfortunately, security analysts can be swamped by the number of alerts and alarms being generated, or become desensitized by too many false alarms. We therefore increasingly use machine learning to classify our inputs, and so aid the analysts in making reasoned decisions. We must thus understand how we classify data, and how we discover our thresholds. This article…
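To make the idea of discovering a threshold a little more concrete, here is a minimal Python sketch. It is not taken from the course material: the scores and labels are made-up toy values, and the helper names `rates_at` and `roc_points` are hypothetical. It assumes a classifier that gives each event a score between 0 and 1, and shows how moving the alert threshold trades detected threats (true positive rate) against false alarms (false positive rate), which is the trade-off an ROC curve plots.

```python
# Toy anomaly scores (hypothetical values, for illustration only).
# Label 1 = malicious event, 0 = benign event.
scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 1, 0]


def rates_at(threshold, scores, labels):
    """Return (true positive rate, false positive rate) when every
    event with score >= threshold is raised as an alert."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def roc_points(scores, labels, thresholds):
    """Return one (threshold, TPR, FPR) tuple per candidate threshold."""
    return [(t, *rates_at(t, scores, labels)) for t in thresholds]


for t, tpr, fpr in roc_points(scores, labels, [0.9, 0.5, 0.15]):
    print(f"threshold={t:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

With these toy values, a high threshold raises few false alarms but misses most threats, while a low threshold catches every threat at the cost of many more false alarms for the analysts to work through.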

--

Prof Bill Buchanan OBE FRSE
ASecuritySite: When Bob Met Alice

Professor of Cryptography. Serial innovator. Believer in fairness, justice & freedom. Based in Edinburgh. Old World Breaker. New World Creator. Building trust.