How do you think code documentation generation can change after LLMs?

Suman Saurabh
Dec 5, 2023


Today, I want to share the challenging journey of creating an Automatic Documentation Generator for repositories, a project that’s very close to my heart.

🔍 The Genesis

It all started with a frustration I kept noticing: the endless hours we all spend updating documentation for our projects.

We know the drill: code changes and the docs lag behind. I saw so many talented teams getting bogged down by this, with outdated info here and a mismatch there. We clearly needed a better way.

That’s when it hit me: GitHub Copilot and similar tools have already solved this problem at a smaller scale. What if we had a tool that could sync our documentation effortlessly with every code update? A tool that understands the rhythm of development and dances along with it.

That’s the seed from which Snorkell.ai germinated.

🛠️ The Building Blocks

The goal was clear. Develop a tool that:

  1. Integrates seamlessly with your repository
  2. Monitors code changes
  3. Updates code documentation automatically

The idea was to create a system that not only eases the workload but ensures accuracy and consistency in documentation. So, I rolled up my sleeves and plunged into the depths of LLMs. I was determined to build something that didn’t just parse code but really got it — translating it into documentation that was clear, correct and coherent.

🤖 AI at the Core

The core of our Automatic Documentation Generator is a fine-tuned GPT-3.5 model. It’s trained to parse code in repositories, identify critical components like classes and functions, and generate comprehensive, up-to-date documentation. The AI not only understands the syntax but also grasps the context, ensuring that the documentation is as informative as it is accurate.
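To make "identify critical components like classes and functions" concrete, here is a minimal sketch using Python's standard `ast` module. It is only an illustration of the idea, not how Snorkell.ai works internally: the `list_definitions` helper and the `SAMPLE` snippet are made up for this example, and it handles Python source only.

```python
import ast

# Hypothetical sample source, used only for illustration.
SAMPLE = '''
class Greeter:
    """Says hello."""

    def greet(self, name):
        return f"Hello, {name}"


def add(a, b):
    return a + b
'''


def list_definitions(source: str) -> list[tuple[str, str, bool]]:
    """Return (kind, name, has_docstring) for every class and function in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue  # skip everything that is not a definition
        found.append((kind, node.name, ast.get_docstring(node) is not None))
    return found


if __name__ == "__main__":
    for kind, name, documented in list_definitions(SAMPLE):
        status = "documented" if documented else "needs docs"
        print(f"{kind} {name}: {status}")
```

A list like this tells you which definitions exist and which ones still lack a docstring; those would be the natural candidates to hand to a model for documentation.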

🔄 Seamless Generation and Continuous Updating

When it came to the heart of our tool, I knew it had to run like a silent job in the background, diligently writing the documentation. So here is what the flow looks like:

  1. Background Operation: The tool is designed to work unobtrusively in the background.
  2. Code Merge Detection: Whenever new code is merged into the ‘main’ branch of your repository, Snorkell.ai instantly detects it. It specifically looks for modifications in functions and classes.
  3. Documentation Generation: Upon detecting these changes, Snorkell.ai automatically begins generating documentation for the modified code. This process is swift, precise and coherent, ensuring the documentation accurately reflects the latest code updates.
  4. Pull Request Creation: After generating the updated documentation, Snorkell.ai takes a proactive step. It creates a pull request containing the updated documentation. This pull request serves as a notification and allows for easy review and integration of the new documentation.
  5. Review and Merge: The developer can then review the automatically generated documentation in the pull request. Once satisfied, they can merge it, ensuring the documentation is always in sync with the latest code.
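The detection step in the flow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Snorkell.ai's actual implementation: the helper names (`is_push_to_main`, `definition_fingerprints`, `changed_definitions`) are hypothetical, the code covers Python files only, and it assumes you already have the old and new contents of a file as strings. The one external fact it relies on is that a GitHub push webhook payload carries the target branch in its `ref` field, for example `refs/heads/main`.

```python
import ast


def is_push_to_main(payload: dict) -> bool:
    """True when a GitHub push webhook payload targets the main branch."""
    return payload.get("ref") == "refs/heads/main"


def definition_fingerprints(source: str) -> dict[str, str]:
    """Map each class/function name to a dump of its syntax tree.

    ast.dump omits line numbers by default, so moving a function
    without editing it does not change its fingerprint.
    """
    prints = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            prints[node.name] = ast.dump(node)
    return prints


def changed_definitions(old_source: str, new_source: str) -> list[str]:
    """Names that are new, or whose code differs, in new_source."""
    old = definition_fingerprints(old_source)
    new = definition_fingerprints(new_source)
    return sorted(name for name, dump in new.items() if old.get(name) != dump)


if __name__ == "__main__":
    old_code = "def a():\n    return 1\n\ndef b():\n    return 2\n"
    new_code = "def a():\n    return 1\n\ndef b():\n    return 3\n\ndef c():\n    pass\n"
    if is_push_to_main({"ref": "refs/heads/main"}):
        print(changed_definitions(old_code, new_code))
```

Two limitations are worth keeping in mind. Definitions are keyed by bare name, so two methods with the same name in different classes would collide, and a change to a method also marks its enclosing class as changed. A real tool would need qualified names and support for more languages, but the shape of the step is the same: find what changed, then regenerate documentation only for that.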

Currently, Snorkell.ai is available only for projects hosted on GitHub.

🌟 The Outcome

The result is a tool that revolutionizes how teams manage documentation. It saves countless hours, reduces the margin of error, and ensures that every team member, new or old, has access to the latest project insights. It’s not just a tool; it’s a new way of approaching documentation in software development.

🙏 Gratitude and Looking Ahead

This journey wouldn’t have been possible without the feedback and support from the developer community. As I look ahead, I am excited to keep improving this tool and to see how it evolves and helps more teams in their software development endeavours.

I’m eager to hear your thoughts and experiences with documentation in software projects.

Reference Links:

  1. GitHub App: https://github.com/apps/snorkell-ai
  2. Website: https://www.snorkell.ai/
  3. Demo: https://youtu.be/rXMW1xAA-RU
  4. Some thoughtful Reddit feedback: https://www.reddit.com/r/Python/comments/180akfb/comment/ka4tukz/?utm_source=share&utm_medium=web2x&context=3

#SoftwareDevelopment #GitHub #Documentation #AI #MachineLearning #TechInnovation #OpenAI #ChatGPT #Python #Coding #Javascript #Typescript #Java
