Stacey Franklin Jones on Open Source Intelligence and How it is Used
Both US and international intelligence agencies have appeared repeatedly in the news recently, from stories on accused Russian agent Maria Butina to headlines on foreign interference in US elections and the creation of “deepfake” political news content spread via social media. If you want to better understand how intelligence agents gather the information that undergirds these stories, you might want to get familiar with open source intelligence (aka OSINT): what it is and how it’s used. Today, Stacey Franklin Jones, Senior Scientist at O Analytics, is here to give us an overview of open source intelligence.
What is Open Source Intelligence?
Stacey Franklin Jones first notes that the term may be used in a few related but slightly different ways. In simplest terms, the phrase open source intelligence refers to data or information of “significance” that is publicly available. Its “significance” is in the eye of those who are going to process it. It may be data from open data sets and other resources, including media and news reports, information posted online via social media, government data and reports, academic publications and journals, and/or corporate information. So, think of it as intelligence available in the public space.
Stacey Franklin Jones also points out that the phrase open source intelligence is commonly used to describe the methods and techniques for collecting publicly available data, for example by intelligence officers and the military, and extracting or deriving something of value or particular interest from it. You may also hear terms associated with open source intelligence like “overt” collection, which is done openly, or “covert” collection, which is done in secret. This publicly available data can come in many formats and may be found in text, video, images, audio, or other content. Stacey Franklin Jones notes that while some information may be more difficult to find, like unpublished papers or old documents, there is an abundance of legally accessible public information.
A third related definition is the more official one from the Office of the U.S. Director of National Intelligence: “Open-Source Intelligence (OSINT) is intelligence produced from publicly available information that is collected, exploited, and disseminated in a timely manner to an appropriate audience for the purpose of addressing a specific intelligence requirement.” This definition addresses the information that results from gathering data from public sources and applying certain extraction and interpretation methods.
So, the term open source intelligence may be used to refer to the original data, the approach to gathering it and converting it into information of particular interest, and/or the resulting “intelligence”. In all cases, publicly available data is the original input source.
How is Open Source Intelligence Used?
To use open source data in support of intelligence operations, it must be discovered, collected, and analyzed. One challenge with collecting and using open source data is the vast amount of information that is now available, which makes it increasingly difficult to identify the specific data sets that matter. Stacey Franklin Jones explains that by using machine learning and automated data analysis tools, intelligence professionals can identify patterns, trends, and relationships in key open source data, making it easier to use. According to research organization RAND, “Rapid advances in machine learning and natural language processing are changing the efficiency of these methods for sorting, translating, and analyzing data for intelligence purposes.”
Here’s a simple example of how online tools can be used constructively to collect and analyze open source data. Google is one of the top search platforms in the world, connecting users with billions of pages of content. If your Google search results seem tough to sort through or your search terms are not specific enough, there is a technique that can help. It is commonly called “Google dorking” (or using “Google dorks”). Rather than a separate tool, it means adding Google’s advanced search operators to a query so that the results are narrowed to a precise target, which makes collecting open source data on the internet more efficient. For example, Stacey Franklin Jones points out that you can restrict a search to terms that appear only in PDF files, or only on a particular website rather than across all websites, which may greatly reduce the number of results and better target specific data sources. Another example is the set of tools used by social media data gatherers that look for the same username across hundreds of social media sites, which can make collecting content across platforms easier and less time consuming.
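To make the search-operator idea concrete, here is a minimal Python sketch of how such a targeted query could be assembled. The `filetype:` and `site:` operators and the minus sign for exclusion are standard Google search syntax; the function names `build_query` and `search_url`, and the example search terms and domains, are hypothetical and chosen only for illustration. The sketch only builds a query string and a URL to open in a browser; it does not fetch or scrape any results.

```python
from urllib.parse import urlencode


def build_query(terms, filetype=None, site=None, exclude_sites=()):
    """Combine search terms with Google advanced search operators.

    terms         -- the words or phrase to look for
    filetype      -- restrict results to one file type, e.g. "pdf"
    site          -- restrict results to one website or domain
    exclude_sites -- websites or domains to leave out of the results
    """
    # Quote multi-word terms so they are matched as an exact phrase.
    parts = [f'"{terms}"' if " " in terms else terms]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    for excluded in exclude_sites:
        parts.append(f"-site:{excluded}")
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL for the query (to open in a browser)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Illustrative only: look for an exact phrase, in PDF files, on one domain.
    query = build_query("open source intelligence", filetype="pdf", site="rand.org")
    print(query)
    print(search_url(query))
```

Pasting the printed query into the Google search box has the same effect as opening the generated URL. Swapping `site` for `exclude_sites` would instead remove a domain from the results rather than limit the search to it.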