How I set up my Personal Knowledge Management System

Daniel Ofosu
4 min read · May 25, 2024

Obsidian, GPT-4o, and Python scripts to the rescue

As someone who constantly juggles Apple Notes, screenshots, and various other loosely structured note-taking methods, I often couldn't find the notes and content I needed because everything was so disorganised. It was frustrating not to remember how I had worded something, where I had saved it, or in what format. I needed a simple, effective way to streamline my workflow and make my digital life more manageable. That's when a friend at work recommended Obsidian, and I decided to explore how I could turn it into my PKM system of choice. Here’s how I did it, step by step.

Automatically Tagging Markdown Files

The first issue I faced was how cumbersome it is to tag notes manually in Obsidian. As background, tags are incredibly useful for categorizing and quickly finding content. For example, a note about AI, LLMs, and image generation would carry those three tags and then be easy to find. Obsidian has a vast number of community-created plugins, but none of them seemed to offer what I needed in a simple form, so I took on the task of writing my own script. To simplify the process and make quick note-taking from my phone practical, I started by developing a function that automatically adds tags to each file.

The Code

I wrote a function that adds or updates the ‘tags’ metadata in my Obsidian Markdown notes, which live in iCloud, based on the content of each file.

To generate relevant tags based on the file content, I used OpenAI’s GPT-4o model. The model analyzes the content of the note and returns a comma-separated list of tags, which is then written into the note by the add_tags_to_md function.

import os

import openai  # uses the pre-1.0 interface of the openai package (openai.ChatCompletion)


def auto_tagging(api_key):
    """
    Automatically adds tags to all Markdown files in the current directory using OpenAI's API.

    Parameters:
    - api_key (str): The OpenAI API key.

    Returns:
    - tuple: (processed, skipped) lists of filenames.
    """
    openai.api_key = api_key

    processed, skipped = [], []
    # List the Markdown files once so the progress percentage is stable.
    md_files = [name for name in os.listdir(".") if name.endswith(".md")]

    for index, filename in enumerate(md_files, start=1):
        with open(filename, 'r', encoding='utf-8') as f:
            source_content = f.read()

        # Skip notes whose front matter (at the top of the file) already has tags.
        if source_content.startswith("---"):
            parts = source_content.split('---', 2)
            if len(parts) == 3:
                metadata_str = parts[1].strip()
                if any(line.startswith("tags:") for line in metadata_str.split("\n")):
                    skipped.append(filename)
                    continue

        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that formats tags for Obsidian notes."},
                {"role": "user", "content": f"Extract keywords for tags from the following content. In case the text is short please try to infer the meaning. When tagging abbreviations also add the word as a tag. Format them as a comma-separated list:\n\n{source_content}"}
            ]
        )
        tags_str = response.choices[0].message['content'].strip()

        # add_tags_to_md is defined elsewhere in the script.
        new_content = add_tags_to_md(source_content, tags_str)
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(new_content)

        processed.append(filename)
        print(f"PROGRESS: {index / len(md_files):.2%}")

    # Return the lists so the processed files can be passed on for organizing.
    return processed, skipped
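
The add_tags_to_md helper is called in the function above but is not shown in this post. Below is a minimal sketch of what such a helper could look like. It assumes that tags are stored as a single-line YAML list in the front matter at the top of the note. normalize_tags is a hypothetical helper added for the illustration, and the version in the full script may well differ.

```python
def normalize_tags(tags_str):
    """Split a comma-separated tag string into clean, de-duplicated tags."""
    tags = []
    for raw in tags_str.split(","):
        # Obsidian tags cannot contain spaces, so join words with hyphens
        # and drop any leading '#' the model may have added.
        tag = "-".join(raw.strip().lstrip("#").split())
        if tag and tag not in tags:
            tags.append(tag)
    return tags


def add_tags_to_md(source_content, tags_str):
    """Return the note text with a 'tags' entry in its YAML front matter."""
    tags_line = "tags: [" + ", ".join(normalize_tags(tags_str)) + "]"

    if source_content.startswith("---"):
        parts = source_content.split("---", 2)
        if len(parts) == 3:
            # Existing front matter: drop an old single-line tags entry
            # (multi-line tag lists are not handled in this sketch).
            kept = [line for line in parts[1].strip().split("\n")
                    if line and not line.startswith("tags:")]
            metadata = "\n".join(kept + [tags_line])
            return f"---\n{metadata}\n---{parts[2]}"

    # No front matter yet: create one above the original content.
    return f"---\n{tags_line}\n---\n{source_content}"
```

Calling add_tags_to_md("Hello", "ai, llm") on a note without front matter would produce a note that starts with a three-line front matter block containing tags: [ai, llm], followed by the original text.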

Restructuring into folders

With tagging automated, the next step was to organize the files into a descriptive, meaningful folder structure, so that I could also browse my notes manually. This involved creating new folders where relevant, at the discretion of GPT-4o, and moving each file to a suitable place in the folder structure.

Restructuring function


import os
import shutil

import openai  # uses the pre-1.0 interface of the openai package (openai.ChatCompletion)


def organize_files(api_key, processed_files):
    """
    Organizes the processed files into relevant folders based on GPT's reasoning.

    Parameters:
    - api_key (str): The OpenAI API key.
    - processed_files (list): List of processed files.
    """
    openai.api_key = api_key
    existing_folders = [d for d in os.listdir() if os.path.isdir(d)]

    for filename in processed_files:
        response = openai.ChatCompletion.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that organizes files into relevant folders."},
                {"role": "user", "content": f"Here is a list of existing folders: {existing_folders}. Suggest a relevant folder to place the following file. Provide the folder name in the format [[folder_name]]. Avoid creating too many new folders and use subfolders when appropriate.\n\nFilename: {filename}"}
            ]
        )
        folder_response = response.choices[0].message['content'].strip()

        # The reply should contain the folder as [[folder_name]]; leave the file in place if it does not.
        folder_name = ""
        if "[[" in folder_response:
            folder_name = folder_response.split("[[")[1].split("]]")[0].strip()
        if not folder_name:
            print(f"Skipped {filename}: no folder name in response")
            continue

        if folder_name not in existing_folders:
            os.makedirs(folder_name, exist_ok=True)
            existing_folders.append(folder_name)

        shutil.move(filename, os.path.join(folder_name, filename))
        print(f"Moved {filename} to {folder_name}")
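
Since the folder name comes from free-form model output, it may be worth parsing it a little more defensively. The sketch below is one possible way to do that with a regular expression. extract_folder_name and the "Unsorted" fallback folder are hypothetical names chosen for this example and are not part of the script above.

```python
import re

# Matches the first [[...]] group in the model's reply.
FOLDER_PATTERN = re.compile(r"\[\[(.+?)\]\]")


def extract_folder_name(reply, default="Unsorted"):
    """Pull the folder out of a '[[folder_name]]' reply, or return a default."""
    match = FOLDER_PATTERN.search(reply)
    if match is None:
        return default
    # Normalise separators and drop empty, '.' and '..' segments so the
    # resulting path cannot point outside the notes directory.
    segments = [s.strip() for s in match.group(1).replace("\\", "/").split("/")]
    safe = [s for s in segments if s and s not in (".", "..")]
    return "/".join(safe) if safe else default
```

The returned value can be passed to os.makedirs and os.path.join in the same way as folder_name in the function above, including when the model suggests a subfolder such as Tech/AI.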

Bringing It All Together

To automate the entire process, I combined these functions into a single, simple script that processes all Markdown files in the directory. The script reads each file, generates tags, and moves the file into the appropriate folder. It can be run manually or scheduled, either locally (e.g. with Automator on a MacBook) or in the cloud. The full script can be found on my GitHub.
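
The read, tag, and move flow described above can be sketched roughly as follows. This is an illustration rather than the script on GitHub: run_pipeline, tag_note, and choose_folder are hypothetical names, and the two LLM-backed steps are passed in as functions so that the flow can be tried offline with simple stand-ins.

```python
import os
import shutil


def run_pipeline(vault_dir, tag_note, choose_folder):
    """Tag every top-level .md file in vault_dir, then move it into a folder.

    tag_note(text) returns the tagged note text and choose_folder(filename)
    returns a folder name; both are supplied by the caller.
    """
    moved = {}
    for filename in sorted(os.listdir(vault_dir)):
        path = os.path.join(vault_dir, filename)
        if not (filename.endswith(".md") and os.path.isfile(path)):
            continue

        # Step 1: read the note and write it back with tags added.
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()
        with open(path, "w", encoding="utf-8") as f:
            f.write(tag_note(text))

        # Step 2: create the target folder if needed and move the note.
        target_dir = os.path.join(vault_dir, choose_folder(filename))
        os.makedirs(target_dir, exist_ok=True)
        shutil.move(path, os.path.join(target_dir, filename))
        moved[filename] = target_dir
    return moved
```

In a real run, tag_note and choose_folder would wrap the GPT-4o calls shown earlier; for a dry run they can be plain lambdas that return fixed values.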

Future Features

While the current solution has already greatly improved my note-taking workflow, there are several additional features I plan to implement in the near future:

  1. Cloud-native Running: Running the script in a cloud service such as AWS, to eliminate the step of using Automator on my MacBook to run the Python file, as this requires my MacBook to be open.
  2. Options for different LLMs: Adding the option to use LLM APIs other than OpenAI's, or even a locally running LLM.
  3. Summarize Links: Automatically summarize links placed in notes, so that when I look up articles and don’t have time to summarize them manually, I get a short summary as soon as I paste the link.
  4. Fetch Similar Links/Articles: Automatically fetch similar links or articles that could be useful and relevant to that note.
  5. Add to Reading List: Auto-add books mentioned in notes to my reading list on Goodreads.
  6. Add to Podcast/Video List: Auto-add podcasts or videos mentioned in notes to my listening or watching list in Apple Podcasts/Spotify, and possibly generate a summary or a reason why and when I should check them out.
  7. Fetch Screenshots from iCloud: Automatically fetch screenshots from iCloud, filter for those containing text (using OCR), and then summarize them in Obsidian.
  8. Auto-reminders or Calendar Notes: A feature that automatically adds reminders or calendar slots for things I have written about, e.g. if I wrote about a good book, it would suggest on a Sunday evening that I start reading it.
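
For the second item, one possible shape for swappable LLM backends is a small registry that maps a backend name to a function taking a prompt and returning the reply text. Everything in this sketch (BACKENDS, register_backend, generate_tags, and the "stub" backend) is hypothetical and only meant to show the idea; a real backend would call an LLM API or a local model inside its function.

```python
from typing import Callable, Dict

# Registry mapping a backend name to a function: prompt text -> reply text.
BACKENDS: Dict[str, Callable[[str], str]] = {}


def register_backend(name):
    """Decorator that registers a completion function under a backend name."""
    def decorator(func):
        BACKENDS[name] = func
        return func
    return decorator


@register_backend("stub")
def stub_backend(prompt):
    # Offline stand-in for testing; it ignores the prompt entirely.
    return "example-tag"


def generate_tags(content, backend="stub"):
    """Ask the chosen backend for a comma-separated tag string."""
    if backend not in BACKENDS:
        raise ValueError(f"Unknown backend: {backend}")
    return BACKENDS[backend](content)
```

With this layout, adding another provider would mean registering one more function, while the tagging code keeps calling generate_tags with a different backend name.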
