From Dissident to Detective: On the Way to ShmagunGPT

Slava Solodkiy
Mar 26, 2024

OSINT Skills Made Alexey Navalny and His Team Popular

It was the skills in Open Source Intelligence (OSINT) that helped Alexey Navalny and his team gain popularity. Most of his AML and anti-corruption investigations were based on open data. One of their most striking investigations, “He Is Not Dimon to You,” has garnered 46 million views. It details how friends, classmates, and trusted persons of Dmitry Medvedev own non-profit organizations that receive generous donations from oligarchs and state loans. The investigation triggered protests across Russia, and online shopping orders helped prove the connection between Medvedev and the man registered as the owner of his non-profit organizations. Shirts and sneakers ordered under the name and address of the formal owner eventually ended up with Medvedev, who appeared in them publicly without issue.

Their investigations have often put them at odds with powerful state actors, leading to legal challenges and personal risks. The work of Grozev, Shmagun, and Dobrokhotov exemplifies the critical role of investigative journalism in uncovering truth and holding the powerful to account. As for compliance, KYC, and AML, I would recommend that regulators, banks, and fintechs learn from them (as the CIA does) rather than from conferences and desk research by major consulting firms.

Christo Grozev, Olesya Shmagun, and Roman Dobrokhotov are well-known investigative journalists and researchers renowned for their work in exposing various illicit activities, money laundering and government malfeasance, often involving high-profile cases and sensitive political matters.

Christo Grozev is known for his association with Bellingcat, an international collective of researchers, investigators, and citizen journalists that relies on open-source and social media investigation. Grozev has been instrumental in investigations into the poisonings of Sergei Skripal and Alexei Navalny, the downing of Malaysia Airlines flight MH17, and other notable cases involving Russia.

Like Grozev, Olesya Shmagun has contributed to uncovering corrupt practices and money laundering. Her work, much like that of her peers, involves meticulous research and the use of open-source intelligence (OSINT) techniques.

Roman Dobrokhotov is the editor-in-chief of The Insider, known for his involvement in major investigative efforts alongside Bellingcat. Dobrokhotov has faced significant legal and political pressure within Russia, including police raids and being targeted by defamation lawsuits, as a result of his investigative work. His efforts have contributed to revealing the actions of Russian intelligence and military services in various international incidents.

Roman Dobrokhotov (read the full article on Wired) has become a notable figure in exposing the clandestine operations of Moscow’s GRU military intelligence agency. His journey from a protester challenging Kremlin narratives to a fearless investigative journalist is marked by his crucial role in uncovering the identities and activities of Russia’s most covert military spies and assassins, including their involvement in high-profile cases like the attempted assassination of Sergei Skripal with a nerve agent. Dobrokhotov’s work not only exemplifies journalistic bravery but also underscores the vital importance of independent OSINT media in challenging state-sponsored narratives and uncovering the truth via a “follow the money” approach.

Recently, Grozev, who won an Oscar for the documentary film about Navalny, has been focused (together with Dobrokhotov) on investigating the activities of Jan Marsalek from Wirecard and is preparing a documentary film about him. Following the unexpected death of Navalny, Christo has temporarily concentrated, along with other independent investigators, on collecting and analyzing data related to the death of the opposition figure, known for his investigations into corruption and money laundering, and those involved in it.

Christo Grozev is a Bulgarian investigative journalist, media expert, and media investor. He is the lead investigator at The Insider and previously worked with Bellingcat. He is one of the main authors of the investigation into the involvement of Russian FSB officers in the poisoning of Alexey Navalny, and a winner of the European Press Prize and an Emmy Award for his investigative journalism. Around 2014, he started doing investigative journalism with Bellingcat: “I’m doing something I’m good at, finding things others miss, using my knowledge of Russia, the neighboring countries, including Ukraine, working with people in these countries, and being concerned about their governments (both in Russia and Ukraine) deceiving citizens. I do this voluntarily, spending my own funds on investigations.”

In 2019, Grozev (together with Roman Dobrokhotov and Daniel Romein) was awarded the European Press Prize for Investigative Journalism and the London Press Club Prize for Digital Journalism. In 2021, Bellingcat and CNN received an Emmy Award in the category “Outstanding Investigative Report in News” for Christo Grozev’s investigation. Bellingcat has also published reports critical of NATO over what it described as illegal weapons sales to Saudi Arabia for the civil war in Yemen, and has analyzed alleged Turkish and Greek crimes during the migrant crisis. Grozev has also been accused of organizing journalist Roman Dobrokhotov’s escape abroad.

Grozev lived in Vienna for 20 years until 2023, where he was under police protection because of his exposés about Russia. In February 2023, he stated that he had moved to the USA after Austrian authorities told him they could no longer guarantee his safety. In August 2020, Grozev said in an interview with Deutsche Welle that the poisoning of Alexey Navalny resembled the attempts on Emelyan Gebrev and Sergei Skripal, in which Russian special services are suspected.

Responding to a question about informants, Grozev said: “We work only with sources who understand the risk they are taking.” He also denied any connection with the CIA, noting that Western intelligence had not even reached the information published by independent journalists. Grozev has bet on crowdsourcing: now anyone can join the work of investigative journalists, comparing data from flight tables with information about the time and place of high-profile poisonings and strange deaths. Several matches were immediately found.
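The crowdsourced matching described above can be pictured with a small sketch. It assumes, purely for illustration, that travel records and incidents have already been reduced to a city and a date; the `Trip` and `Incident` types, the `find_matches` function, and the two-day margin are my own hypothetical choices, not anything Grozev’s team has published.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Trip:
    traveler: str      # placeholder label, not a real person
    city: str
    arrival: date
    departure: date


@dataclass(frozen=True)
class Incident:
    name: str          # placeholder label for a poisoning or unexplained death
    city: str
    day: date


def find_matches(trips, incidents, margin_days=2):
    """Return (traveler, incident) pairs where a trip overlaps an incident.

    A trip matches when it is in the same city and the incident date falls
    between arrival and departure, widened by margin_days on both sides.
    """
    margin = timedelta(days=margin_days)
    matches = []
    for incident in incidents:
        for trip in trips:
            same_city = trip.city.lower() == incident.city.lower()
            in_window = trip.arrival - margin <= incident.day <= trip.departure + margin
            if same_city and in_window:
                matches.append((trip.traveler, incident.name))
    return matches
```

A match found this way is only a lead for a human investigator to verify, not evidence in itself.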

Olesya Shmagun (Princeton; she graduated from the Faculty of Journalism at Moscow State University in 2012 and continued her studies in graduate school) is a Pulitzer Prize winner for the Panama Papers investigation, a co-founder of the publication “Important Stories”, a staff member of the Organized Crime and Corruption Reporting Project (OCCRP), and a four-time winner of the monthly journalism prize “Redkollegiya”. In April 2017, as part of the International Consortium of Investigative Journalists, she shared the Pulitzer Prize for Explanatory Reporting with some 300 other journalists for the Panama Papers investigation. In 2023, she graduated from the Princeton School of Public and International Affairs (formerly the Woodrow Wilson School) with a master’s degree in public policy.

Recently, Olesya and I were chatting about Nansen.ID and… ShmagunGPT, and I really think a tool for Enhanced Due Diligence (EDD), inspired by Olesya’s investigative magic, is exactly what we need, especially in the worlds of banking and fintech. I threw an idea at Olesya about creating a digital identity solution for opposition figures or maybe even a digital bank for those in exile… With Olesya’s incredible knack for digging into money laundering schemes, imagine digitizing her expertise to become a nemesis for money launderers everywhere with something like ShmagunGPT.

I dream of Nansen.ID as a business with a heart, channeling profits into the hands-on investigative work of journalists like Olesya, Christo, and Roman. Their investigative work provides insanely useful data for compliance in banks and fintechs — at the very least.

From a regulator’s perspective, KYC is less about knowing your customer and more about understanding where their money’s from, how they got it, and where it’s headed. I’ve been around the block with bank compliance, and Olesya’s battled against the baddies, uncovering corruption and laundering schemes. We’re basically enriching traditional data with fresh, unconventional insights. Plus, imagine if we built a backend sort of like ShmagunGPT, training a neural network based on Olesya’s investigative methods. It’d be like an automated sidekick for other investigators and compliance officers.

Filling out the same personal info over and over for every new bank account, insurance policy, mobile plan, flight, hotel stay, apartment lease, and more — isn’t it exhausting? That’s where the last bit about “convenience” comes in (i.e., no need to re-answer if you’ve already addressed a question; digital ID will auto-fill the existing answer). That’s for the end users.

For banks and other entities, it’s crucial to grasp that the slickest KYC process at onboarding won’t shield you from fraud and scammers: only 20% get caught at the get-go, while the other 80% are nabbed based on their transactions later on. To catch these guys later, you need to “cast a net” at onboarding so that any anomalies in behavior can be spotted more swiftly, allowing a quick rewind to pinpoint accomplices. The real slick criminals layer their operations with legit transactions by innocent folks — no system will red-flag them at onboarding. But setting up the system to notice oddities sooner or swiftly backtrack to find connections? Totally doable.
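To make the “quick rewind” idea concrete, here is a minimal sketch, assuming transactions are available as simple (sender, receiver) pairs. It walks outward from a flagged account and lists every account reachable within a few hops. The function name `linked_accounts` and the `max_hops` cut-off are illustrative assumptions; a real monitoring system would also weigh amounts, timing, and direction.

```python
from collections import defaultdict, deque


def linked_accounts(transactions, flagged, max_hops=2):
    """Breadth-first search over an undirected transaction graph.

    transactions: iterable of (sender, receiver) account IDs.
    Returns {account: hop distance} for accounts within max_hops of the
    flagged account, excluding the flagged account itself.
    """
    graph = defaultdict(set)
    for sender, receiver in transactions:
        graph[sender].add(receiver)
        graph[receiver].add(sender)

    hops = {flagged: 0}
    queue = deque([flagged])
    while queue:
        account = queue.popleft()
        if hops[account] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[account]:
            if neighbor not in hops:
                hops[neighbor] = hops[account] + 1
                queue.append(neighbor)

    del hops[flagged]
    return hops
```

The richer the data collected at onboarding, the more edges this graph has, which is the practical meaning of “casting a net” early.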

Banks, fintechs, and insurers pay for this. But who really benefits? End users, especially those who’ve been denied accounts or visas. It’s a boon for the whole regulatory ecosystem, from visa centers and telecoms to hotels and airlines, even extending to online election services. In essence, it’s a trade-off:

1. I get that my nationality (or additionally, my industry affiliation) blocks me from certain benefits and creates hurdles I’d rather not have; I want to enjoy those benefits.

2. I know you don’t see me as the bad guy; you’re just covering your bases because you can’t tell us apart. So, my “payment” is becoming more open and transparent with you.

3. You accept this “payment,” allowing you a closer look into my life, with the agreement that if someone linked to me steps out of line, they get cut off from the network of benefits.

It’s a way of saying, “I’m cool, let me in,” while also ensuring everyone plays by the rules. As in Ancient Greece: exile from the polis (“collective responsibility” in action) as the main possible “punishment”.

The CIA has announced a new strategy for working with open-source information, aiming to expand and enhance the collection and analysis of data amidst the ever-growing information stream. The document, presented by the Office of the Director of National Intelligence (ODNI) and the CIA, discusses the development of methods for collecting, creating, and delivering intelligence from open sources (OSINT) until 2026. Special attention is given to the potential of artificial intelligence and machine learning in improving the processing of open data, as well as the risks associated with verifying the authenticity and reliability of information.

As part of the strategy, ODNI has enlisted leading cybersecurity expert Jason Barrett to implement key directions. His task is to integrate innovations into OSINT work based on the CIA’s experience in this field over the last year. The CIA has also developed AI technology, similar to ChatGPT, for selecting relevant information from the vast amount of available data. This new tool automates the OSINT processing workflow, highlighting key data for analysis. Senator Mark Warner, the chair of the U.S. Senate Intelligence Committee, emphasized the importance of such tools, noting that the traditional view of prioritizing covert information collection is giving way to the recognition of the importance and effectiveness of using open data.

Open Source Intelligence (OSINT) involves searching and analyzing public information to ultimately gain new knowledge. Essentially, OSINT investigators primarily work with data that has already been published by someone at some point. States and corporations possess a vast amount of information, part of which can be found online or obtained upon request. However, the path to this data often lies through websites that are invisible to search engines, through cumbersome databases, little-known archives, and clunky interfaces. The investigator’s skill lies in finding information, analyzing it, and making it understandable to a broad audience.

OSINT emerged in the 20th century as a military technology. One of the first entities specialized in such investigations was the Research and Analysis Branch of the American Office of Strategic Services, the precursor to the CIA. Today, open-source data intelligence methods are used by intelligence and government employees, as well as professional investigators and journalists.

OSINT primarily involves working with open data, but investigative teams sometimes use non-public sources. For instance, the investigations into the poisonings of Alexey Navalny and of Sergei and Yulia Skripal are based on mobile operators’ billing data (information about incoming and outgoing calls, SMS, internet traffic), passenger lists of trains and airplanes, leaked databases of commercial companies, and other data. Such information can be purchased through Telegram channels or on the darknet.

Other open services used by OSINT investigators can be divided into groups:

  • Maps and satellite images: not only the popular Google or Yandex but also Bing and OpenStreetMap (OSM). The latter works on a Wikipedia-like principle, in which users can add and mark objects on the map themselves. For OSM there is also the Overpass Turbo app, which lets you download coordinates for specific objects on the map, like all stores of a certain retail chain or all drinking water fountains in a city.
  • Services that allow searching by photo, known not only to investigators but also to ordinary people. You upload a photo of a person, and the site shows you their social media page, and sometimes even friends they preferred to hide. Many of these platforms, such as PimEyes or Search4Faces, are paid but offer limited free functionality. There are also services that provide information by phone number or car license plate.
  • Commercial company registries, which reveal a company’s founding date, authorized capital, legal address, and the people associated with it.
  • Vehicle movement services. The popular site Flightradar collects flight numbers, the starting and ending points of routes, registration number, country of registration, and other data about flights. Similar services exist for tracking sea vessels.
  • Services for searching removed information. Resources like the Wayback Machine let you find and view old versions of websites in case they have stopped working or their data has been removed.
  • Metadata analysis systems, which can extract from files of various formats the date, time, and device that created a specific document. They can also collect information about entire websites: when and by whom a domain was registered and which other domains are associated with that site. Services like who.is and DomainTools allow for this.
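Two of the services above can be queried programmatically, so here is a rough sketch of what such requests look like. It only builds the request URLs and makes no network calls. The endpoints are the public Overpass API interpreter and the Wayback Machine availability endpoint as I understand them; the helper names, the `searchArea` label, and the example tag are my own assumptions, so check the current documentation before relying on them.

```python
from urllib.parse import urlencode

OVERPASS_ENDPOINT = "https://overpass-api.de/api/interpreter"
WAYBACK_ENDPOINT = "https://archive.org/wayback/available"


def overpass_query(city, key, value):
    """Build an Overpass QL query for all nodes tagged key=value in a named area."""
    return (
        "[out:json][timeout:25];"
        f'area["name"="{city}"]->.searchArea;'
        f'node["{key}"="{value}"](area.searchArea);'
        "out body;"
    )


def overpass_url(city, key, value):
    """URL that asks the Overpass API to run the query above."""
    return OVERPASS_ENDPOINT + "?" + urlencode({"data": overpass_query(city, key, value)})


def wayback_url(page, timestamp=None):
    """URL that asks the Wayback Machine for the closest archived snapshot.

    timestamp is an optional YYYYMMDD string.
    """
    params = {"url": page}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_ENDPOINT + "?" + urlencode(params)
```

For example, `overpass_url("Berlin", "amenity", "drinking_water")` corresponds to the drinking fountain case mentioned above, and `wayback_url("example.com", "20200101")` looks for a snapshot near 1 January 2020.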

For example, in “OSINT Techniques for Sensitive Documents That Have Escaped Into The Clear Web,” Christina Lekati highlights a common vulnerability among organizations: sensitive documents inadvertently exposed online. Lekati notes that participants frequently discover documents posing significant risks to their organizations on the clear web, often due to employee error or oversight. She emphasizes the importance of proactive searches to identify and manage these documents before threat actors exploit them.

The article offers a tutorial on advanced search queries that use special characters and operators to narrow a search to documents related to a specific organization. Lekati gives practical advice on Google Dorking, a technique that uses special search strings to find sensitive information efficiently, and suggests several ready-to-use queries with operators that target specific file types such as PDFs, PowerPoints, and Excel files. Documents like contracts, internal processes, and admin credentials can be a goldmine of information, and she warns that they are exposed to competitors, the media, and other entities. She encourages creativity in conducting OSINT checks, underscores how easy it is to mitigate such risks by removing or managing exposed documents, and advocates for OSINT as a defensive discipline that lets organizations act proactively against potential security breaches.
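As a rough illustration of what such a query looks like, the sketch below assembles a search string from the widely used `site:` and `filetype:` operators, quoted phrases, and `-site:` exclusions. These are not Lekati’s exact queries; the `build_dork` helper and the example domain are hypothetical, and this kind of search should only be run against your own organization or with authorization.

```python
def build_dork(domain, filetypes=(), phrases=(), exclude_sites=()):
    """Assemble a search-engine query restricted to one domain.

    filetypes:     extensions to look for, e.g. ["pdf", "xlsx"]
    phrases:       exact phrases that must appear, e.g. ["confidential"]
    exclude_sites: subdomains to leave out of the results
    """
    parts = [f"site:{domain}"]
    if filetypes:
        group = " OR ".join(f"filetype:{ft}" for ft in filetypes)
        # Parentheses keep the OR group separate from the other terms.
        parts.append(f"({group})" if len(filetypes) > 1 else group)
    parts.extend(f'"{phrase}"' for phrase in phrases)
    parts.extend(f"-site:{site}" for site in exclude_sites)
    return " ".join(parts)
```

Calling `build_dork("example.com", ["pdf", "xlsx"], ["confidential"])` yields a string you can paste into a search engine to check whether internal documents on that domain have been indexed.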

In 2015, 13-year-old Justin created a Twitter account under the nickname Intel Crab and invented a fake persona of a teenager from Donetsk. He collected videos, photographs, and quotes from people in the war-torn Donbas to post on his account. When Justin realized he had become popular — with thousands of followers — he stopped pretending to be a boy from Donetsk.

Justin decided to take a more serious approach. He began analyzing and verifying the information he gathered, as well as recreating context with additional tools like plane tracking services and satellite images. Now 20 years old, Justin regularly finds photos and videos from event locations, opens maps, and checks whether the specified geolocation matches what is visible in the images. He publishes his findings: for example, he has tracked changes in the amount of equipment at Russian military bases using satellite images, and he has monitored photos posted by Kadyrovites in Zaporizhzhia on VKontakte and Telegram and published their locations. Justin now has nearly 309,000 followers. This year, he is graduating from the University of Alabama and saving money to go to Ukraine to see the country not just on a monitor screen.

Perhaps the most famous open-source investigation team is Bellingcat. Its founder, Eliot Higgins, has been writing about the use of banned weapons and violations of humanitarian law in Syria since 2012 on the Brown Moses Blog. In 2014, he assembled a team and began investigating war crimes in Ukraine.

One of Bellingcat’s most notable works is the investigation of the downing of Malaysia Airlines flight MH17 in Donetsk Oblast in July 2014. Journalists established that the missile that downed the plane was Russian and launched from territory controlled by Russian authorities. Using photos and videos by eyewitnesses who captured the Buk missile system in various locations, investigators tracked its movement from Russia to Ukraine. They reconstructed its route thanks to a cargo platform photographed in various places in Russia and Ukraine. Initially, the platform carrying the system had four Buk missiles, but the day after the plane’s downing, only three were visible, and it was headed back towards Russia. Even the shadows cast by objects in photos and videos were important: using the SunCalc program, journalists calculated the approximate time of filming. Another significant detail was the smoke trail left by the missile. Using it, Bellingcat identified the missile’s launch site on satellite images and eyewitness recordings.
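The geometry behind shadow-based timing is simple enough to sketch. Assuming flat ground and an object of known height, the length of its shadow gives the sun’s elevation, and the direction of the shadow gives the sun’s azimuth; an investigator can then look up, in a tool like SunCalc, when the sun was in that position at that place and date. The function names below are my own, and this is only the first step of the reasoning, not a reconstruction of Bellingcat’s actual workflow.

```python
import math


def sun_elevation_deg(object_height_m, shadow_length_m):
    """Sun elevation above the horizon, from a vertical object and its shadow.

    Assumes level ground: tan(elevation) = height / shadow length.
    """
    if object_height_m <= 0 or shadow_length_m <= 0:
        raise ValueError("height and shadow length must be positive")
    return math.degrees(math.atan2(object_height_m, shadow_length_m))


def sun_azimuth_deg(shadow_bearing_deg):
    """Compass bearing of the sun, given the bearing the shadow points to.

    The sun is directly opposite the direction in which the shadow falls.
    """
    return (shadow_bearing_deg + 180.0) % 360.0
```

For instance, a 2 m pole casting a 2 m shadow implies the sun is about 45 degrees above the horizon, and a shadow pointing due east (90 degrees) means the sun is due west (270 degrees).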

The MH17 case brought popularity to both Bellingcat and the OSINT method itself. Media began to reference data from investigative teams more frequently, and some newsrooms established their own data and OSINT departments. The spread of the internet allowed OSINT methods to extend beyond military intelligence and the professional community of investigators, becoming a new form of digital activism.

Artificial intelligence could prevent errors and inaccuracies caused by the human factor. OSINT blogs regularly suggest that AI could take over several tasks at once, such as determining the location of a shot or distinguishing between tanks and IFVs in satellite images. However, current software still struggles with these tasks. The military uses more advanced AI developments: their algorithms can recognize enemy troops in satellite images, predict the course of hypersonic missiles, and even autonomously attack enemy targets. (Read about how artificial intelligence learned to wage war.)

However, investigators can indeed have an impact on the world. The results of work based on open data are sometimes considered by courts. In 2018, the prosecutor of the International Criminal Court (ICC) issued an arrest warrant for Libyan General Mahmoud Werfalli, who carried out public executions. The ICC based its evidence on the analysis and geolocations done by the Bellingcat team. The Hague Court, which considered the case of the MH17 crash in Ukraine, also cited materials from Bellingcat investigations. The International Investigative Group on War Crimes in Ukraine, initiated by Eurojust, requested materials from a joint investigation by “Important Stories”, OCCRP, and Der Spiegel on the supply of microelectronics and drones to Russia bypassing sanctions.

However, investigators want their findings to be used more actively. The Conflict Intelligence Team is currently working with other investigative projects to propose amendments to legislation in the EU and the US. Investigators want their conclusions to carry greater weight in criminal investigations. Only those who are inconvenienced by these investigations express outright distrust of OSINT researchers’ materials.

The barrier to entry in OSINT is low — only internet access and free time are needed. No special education is required — there is no university or training program to graduate from and receive a diploma as an OSINT investigator (although private courses are available). Investigators themselves say that the main qualities needed for this work are patience and attentiveness. “90% of the time, we sift through a huge amount of material, photos, and videos. It’s very tedious and hard work,” says Ruslan Leviev in an interview with Kit.

If you’re already engaged in cyber investigations or want to gain knowledge in this field, I recommend applying for GIJN’s free online course (I’ve already applied). The course topics include: Basics of Digital Investigations, Threat Landscape: Malware and Spyware, DNS: Websites and Infrastructure, Investigating Disinformation and Trolling, Network Analysis. Instructors: Craig Silverman, ProPublica reporter; Jane Lytvynenko, independent journalist (Guardian, BuzzFeed News, Joan Shorenstein Center at Harvard); Etienne “tek” Maynier, Amnesty Tech Lab staff; Luis Assardo, Reporters Without Borders staff and independent researcher. The course starts on April 29 and will run every Monday and Thursday for 6 weeks.

We review and practice other popular OSINT tools:

  • Maigret is a tool for collecting and analyzing data from various social platforms based on a username. It offers capabilities for information gathering, user activity analysis, and connection finding, with flexible settings that let you choose which sites to check and save results to files. Maigret supports over 3000 sites for username searches and is simple to install and use.
  • Mr.Holmes is a project aimed at gathering information from open sources about social networks, phone numbers, domains, and IP addresses using Google Dorks. It can be installed on Linux as well as on Termux and Windows. The tool has a very nice feature of maintaining a local database.
  • Holehe is a powerful tool for detecting accounts registered with a given email. Holehe checks whether an email address is attached to accounts on various platforms, including Twitter, Instagram, Imgur, and over 120 other sites. The tool is very simple to install and use.
  • GHunt is a powerful and versatile OSINT tool designed for gathering information about users through their Gmail addresses. It provides access to the owner’s name, identifiers, and active Google services such as YouTube, Photos, Maps, and others. You can also get information about possible locations, Google documents, scheduled meetings in the calendar, and much more.
  • H8Mail is a tool that looks up a specified email address in breach databases and returns a set of possible leaked passwords. If the user reuses the same passwords, such a leak exposes not only the email but all their other accounts as well. It is a very decent tool for checking an address against the databases of various conditionally free services for leaked email passwords.
  • DarkGPT offers advanced capabilities for working with leaked databases and differs significantly from earlier ChatGPT-based tools such as OSINVGPT, PentestGPT, and others. Recently introduced by the Spanish pentester known as “luijait”, it uses GPT-4-200K for data leak analysis, giving users not only access to information but also tools for analyzing it. The tool is said to ensure secure access to leaked databases, and its command-line interface is meant to keep it accessible even for beginners in OSINT.
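To give a feel for what username-search tools in this list do under the hood, here is a deliberately simplified sketch. It is not Maigret’s actual implementation: the site templates are made-up placeholders, the `check_username` function is hypothetical, and the HTTP lookup is passed in as a parameter so the logic can be tried without touching the network. Real tools use per-site detection rules, because a plain “200 OK” is often unreliable.

```python
from typing import Callable, Dict

# Hypothetical URL templates; real tools ship thousands of verified entries.
SITE_TEMPLATES = {
    "example-social": "https://social.example/{username}",
    "example-forum": "https://forum.example/u/{username}",
}


def check_username(
    username: str,
    status_of: Callable[[str], int],
    templates: Dict[str, str] = SITE_TEMPLATES,
) -> Dict[str, str]:
    """Return {site: profile URL} for sites where the profile page seems to exist.

    status_of is any function that takes a URL and returns an HTTP status code.
    """
    found = {}
    for site, template in templates.items():
        url = template.format(username=username)
        if status_of(url) == 200:
            found[site] = url
    return found
```

In practice `status_of` would wrap an HTTP client with rate limiting and error handling; for a dry run you can pass a stub that returns fixed status codes.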
