What’s new in TI Mindmap | feb 2024

Series of periodic articles on the developments of TI Mindmap.

Antonio Formato
9 min readFeb 29, 2024

Article co-authored with Oleksiy Meletskiy.

Introducing TI Mindmap: Simplifying Complex Information

Navigating through lengthy blog posts, threat intelligence articles, or write-ups can be daunting, especially for cyber threat intelligence teams aiming to extract key insights efficiently. Enter TI Mindmap, a tool accessible through the Streamlit app platform. With just a URL as input, this service harnesses the power of OpenAI to transform cumbersome content into concise, actionable summaries. But it doesn’t stop there. Utilizing sophisticated algorithms, TI Mindmap goes beyond mere text reduction, providing users with insightful encapsulations of crucial points and themes.

TI Mindmap is a tool developed using Large Language Models (LLMs). The app operates on a ‘Bring Your Own (OpenAI) Key’ model, allowing users to leverage their own OpenAI keys for personalized and efficient information processing.

This tool aims to streamline the data analysis process, enabling teams to focus more on strategic decision-making and less on the cumbersome task of data mining.

Series of periodic articles on the developments of TI Mindmap.

We’re excited to kick off a series of regular updates focused on shedding light on the newest features, improvements, and upcoming ideas within TI Mindmap. Additionally, we’ll discuss upcoming ideas and solicit feedback from the community, ensuring that TI Mindmap continues to meet the diverse needs of its users.

https://ti-mindmap-gpt.streamlit.app/

https://github.com/format81/TI-Mindmap-GPT

App schema

TI Mindmap app building blocks

New Features

  • Extract adversary tactics, techniques, and procedures
  • Tactics, techniques and procedures by execution time
  • Tactics, techniques and procedures timeline
  • AI Chat on your article
  • Mermaid live editor integration
  • PDF report
  • Tweet Mindmap

Extract adversary tactics, techniques, and procedures

The dedicated function is designed to extract Mitre Tactics and Techniques (TTPs) from a given text, following prompts and guidelines provided to the user.

How it works:

  • Prompt Definition: The function begins by defining prompts for both the system and the user. The system prompt sets the context for the AI. The user prompt instructs the user to extract TTPs from the text provided, following certain guidelines such as referencing the ATT&CK Matrix for Enterprise and producing a table with specified columns.
  • LLM API Call: it is made with a set of messages with a one-shot prompting approach. One-shot prompting entails providing an example in the user prompt to guide the LLM towards the desired generated response.
  • The function then returns the content of the response received from the API call, which includes the extracted TTPs, formatted as specified
TTPs table

Tactics, techniques and procedures by execution time

The dedicated python function serves to generate a list of Tactics, Techniques, and Procedures (TTPs) provide a list of TTPs order by execution time:

How it works:

  • Prompt Definition: The system prompt establishes the framework for the AI, showcasing its proficiency across cybersecurity, threat intelligence, and Mitre attack strategies. Meanwhile, the user prompt guides users in generating a comprehensive list of Tactics, Techniques, and Procedures (TTPs) derived from the given text and TTP table. These TTPs are arranged chronologically by execution time and each line is formatted to include both the Tactic and Subtactic alongside their respective IDs.
  • LLM API Call: it is executed using a series of messages employing a one-shot prompting technique. While the one-shot prompt is currently hard-coded into the code, there are future plans to enhance its flexibility by allowing users to customize their own prompts through parameterization.
TTPs by execution time

Tactics, techniques and procedures timeline

The feature is designed to generate Mermaid code representing a timeline graph depicting the stages of a cyber attack, based on the Tactics, Techniques, and Procedures (TTPs) timeline provided.

How it works:

  • The application crafts a mermaid timeline through its “Tactics, Techniques, and Procedures by Execution Time” feature.
  • The timeline exclusively displays the extracted TTPs, providing a clear visual representation.
  • For users seeking to customize the timeline, a simple click on the “Open code in Mermaid.live” button opens up a live mermaid editor. Here, the compiled code is readily available for any desired modifications, offering a user-friendly and interactive editing experience.
TTPs graph timeline

AI Chat on your article

The chat functionality based on RAG (Retrieval Augmented Generation) on Threat Intelligence data integrated into TI Mindmap offers a dynamic tool for analysts. RAG architecture enhances the abilities of a Large Language Model (LLM) by allowing it to generate responses constrained by content sourced from vectorized documents. In this context, the analyst interacts with the article through a chat interface, utilizing RAG to retrieve relevant information directly from the document and generate tailored responses. This functionality enables analysts to quickly access and process critical insights from threat intelligence articles, streamlining their workflow and enhancing their analytical capabilities without needing to reference external data sources.

How it works:

  • The first step in answering questions from documents is to load the content. This is done with a scraping function.
  • The scraped text is divided into chunks of a predetermined size and optimized for chat purposes. Chunking is the process of breaking down large pieces of text into smaller segments.
  • Transforming segments into embeddings involves converting them into numerical representations known as embeddings. These embeddings encapsulate the semantic meaning of words, phrases, or sentences. The concept revolves around constructing vectors in a multi-dimensional space where the distance between vectors carries significance. Leveraging LangChain’s abstraction over FAISS, the chunks and embedding model are processed, resulting in the creation of vectors. The application then sends requests to the embeddings endpoint to retrieve these embedding vectors.
  • The embedding vector is processed in memory to create the knowledge base using FAISS to implement similarity search entirely in memory (no vector database is used, and the data persists only for the duration of the session).
  • After identifying the most similar chunks, the subsequent task involves formulating an answer to the question using a Language Model (LLM). This is where LangChain truly excels, handling all the intricate tasks effortlessly. It orchestrates the entire process seamlessly. To derive an answer, LangChain feeds the provided question and the most akin chunks, obtained from FAISS, as input to the LLM. The LLM utilizes this input to produce a text response pertinent to the question at hand. We leverage LangChain’s RetrievalQA chain to facilitate this process.
TI Mindmap chatbot architecture
AI Chat

Notes:

An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms. The embedding is an information dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space is correlated with semantic similarity between two inputs in the original format.

FAISS, which stands for Facebook AI Similarity Search, is a library for efficient similarity search and clustering of dense vectors. In the context of building a knowledge base for a chat app using RAG and embeddings, FAISS can be utilized to efficiently index and retrieve relevant information from a large corpus of text embeddings.

Langchain is a comprehensive framework for natural language processing (NLP) tasks, encompassing various libraries and tools. It offers solutions for tasks such as vector storage, question answering, callback functionalities, and integration with OpenAI and Azure OpenAI services.

Mermaid live editor integration

The ability to use mermaid live editor to edit the Mermaid code generated by the application has been added. Mermaid live editor allows you to:

  • Edit and preview flowcharts, sequence diagrams, gantt diagrams in real time.
  • Save the result as a svg file
  • Get a link to a viewer of the diagram so that you can share it with others.
  • Get a link to edit the diagram so that someone else can tweak it and send a new link back

This integration can be very useful in the following use cases:

  1. the user wants to manually edit the Mindmap
  2. The Large Language Model Outputs Mermaid code with syntax errors.

The latter scenario can occur for various reasons, although we are working to reduce the likelihood of it happening. The output of a large language model (LLM) can be incorrect or partially incorrect despite a good prompt due to limitations in its training data, understanding context, or inherent biases in the model itself.

How It Works:

  • When the Mindmap and/or the TTPs timeline are generated, a button is generated.
  • The functions for coding the Mermaid code are invoked according to the procedure documented here. In summary, the code is encoded in base64 and compressed, and then the mermaid.live URL is crafted.
  • Clicking the button opens a new tab in the browser with mermaid.live in edit mode.
Mindmap
Open in external mermaid editor
mermaid.live

PDF report

Introducing the PDF Report feature, a seamless extension of our platform designed to encapsulate the depth of cybersecurity articles into comprehensive, easily digestible PDF reports. This feature bridges the gap between detailed research and actionable insights, ensuring that the essence of complex analyses is never lost in translation.

How It Works:

  • Concise Summaries: Each PDF report begins with a short description and summary of the article, crafted to highlight key findings, insights, and implications for the cybersecurity landscape.
  • Visual Mindmap Integration: Below the summary, the report incorporates a mindmap. This visualization not only enriches the report with a graphical representation of the article’s core themes and connections but also enhances comprehension and retention of the material.
  • Accessible and Shareable: Designed with both the expert and the novice in mind, these PDF reports are perfect for academic, professional, or personal use. They can be easily shared with colleagues, used in presentations, or kept for reference, facilitating broader discussions and understanding of threat intelligence findings.

The PDF Report feature is more than just a summary; it’s a comprehensive tool designed to distill, visualize, and share the richness of threat intelligence research. Elevate your understanding and dissemination of cybersecurity knowledge with our intuitive, insightful PDF reports.

pdf report

Tweet Mindmap

Dive into the essence of complex threat intelligence with Tweet Mindmap, a feature designed to distill comprehensive articles into mindmaps. This tool not only simplifies the intricate web of information but also fosters community engagement and knowledge sharing.

How It Works:

  • Visual Digest: Tweet Mindmap transforms detailed threat intelligence articles into visually appealing mindmaps, capturing the core insights at a glance.
  • Interactive Engagement: In just 1–2 sentences, convey your perspective, highlight crucial findings, or raise awareness about emerging cyber threats.
  • Seamless Sharing: With each tweet, attach the mindmap to offer your followers a gateway to deeper understanding. This feature not only facilitates immediate engagement with the content but also encourages a ripple effect of knowledge dissemination across Twitter.
Generated tweet
Tweet

How to get involved

The project is open to external contributions. To collaborate, please check the GitHub repository: https://github.com/format81/TI-Mindmap-GPT/ .

If you find TI Mindmap useful, please consider starring the repository on GitHub.

Antonio Formato: LinkedinTwitter

Oleksiy Meletskiy: LinkedinTwitter

References:

--

--