Harnessing LLMs to manage my projects (Part 1)

Exploring using LLMs for personal project management, and how to go from a simple project idea to well specified tasks

Andrew Docherty
11 min read · Sep 24, 2024
(Cover image created with ChatGPT-4o)

I recently worked at a startup and found that, while I love the excitement of a new project, going from an exciting new idea to a well-thought-out plan with actionable tasks is not my forte.

I’m now up-skilling in Large Language Models (LLMs), and the speed at which they have advanced over the past few years is remarkable. While doing this, I naturally find myself asking: can LLMs help me overcome my biggest project management hurdles?

In this article, I’ll explain how I designed effective prompts to produce a structured project specification and break the project up into actionable tasks. Note that this is aimed at short-term personal research software projects; I’m not looking at the processes involved in team software development.

I have chosen to focus on my experiences with two well-known LLMs: Anthropic’s Claude.ai and Meta’s Llama 3.1. I chose Claude for its ethical AI stance and cloud-based API, while Llama 3.1 offers the advantage of running locally on my own hardware (using an Ollama server). I’ve also tested these prompts with other models such as ChatGPT with success, but have not tailored the prompts for other models.

Feasibility of LLMs for Project Breakdown

To gauge the feasibility of using LLMs for self-management, I came up with a quick prompt to test their capabilities. My goal is to give the LLM a simple project outline and have it break the project down into actionable tasks.

I started out with this straightforward prompt:

You are an assistant to a project manager for research software
development. Given the following project description, break down
the project into multiple tasks, arranged as a table including
columns "Task Name", "Size of Task", "Priority", and "Outcome."

Next let’s add a project outline (for this very project, naturally), as follows:

Project Outline:
Build a simple tool in Python that will help us in the project planning
process. Let's start with taking a project summary and feeding it to a
local Ollama server using LangChain, get structured output for a list
of tasks and potentially subtasks, and then upload this to Notion in
the correct format to fill out tasks in a project template.

Here are the outputs from Claude and Llama 3.1:

Output of example prompt from Claude
Output of example prompt from Llama 3.1

This simple prompt gets pretty close to what I was expecting. It’s enough to convince me that it’s worth spending more time fleshing out the concept and building prompts that can give a better project breakdown.

One key takeaway from this experiment is the importance of providing clear guidance to LLMs when generating tasks. While both LLMs generated useful breakdowns of the project outline, there are clear differences not just in the steps but in the outcomes of the project: for example, Claude wants us to write a user guide while Llama 3.1 has no documentation task at all. This is not just a function of the different LLMs; different calls to the same LLM will produce different tasks.

This isn’t unexpected, as the project is not well defined. Namely, do I want to build a utility to be used by myself, a proof-of-concept internal tool for a large company, or a product for a startup? What software quality level are we aiming for? Who are the users of the software? What is the minimum functionality needed in this project? Without these details, the models are giving us a single output sampled from a large space of possible scenarios.

In the next section, I’ll improve the prompt and project outline, and return the data in a more machine-readable form so that we can parse it and upload it to the task management system.

Improving the prompt

Now that I’ve found LLMs can break down projects in a useful way, let’s see how to improve the way we ask the model to do this task. This is prompt engineering: the art of crafting requests to get LLMs to do what you need effectively. There are many prompting strategies and tricks that claim to improve the output of LLMs. Before we discuss these and explore a better prompt, let’s start with a project overview that I can use to test the prompts.

Example Project Outline

I expanded the outline from the last section, removed the details of specific tools, and added more detail on the proof-of-concept nature of the project:

Overview:
Build a simple proof-of-concept tool to use LLMs to take a brief project outline, produce a detailed project description with a specified structure and finally break it down into tasks with enough information and clarity to be able to be useful. The project and tasks are then uploaded to a project database in Notion along with the relevant properties and a clear and comprehensive description for the project as well as each task. The project should have a simple UI where the project description is entered. The project does not need to support multiple users.
Goals:
* Design a system to generate a project description and tasks from a project outline
* Build a simple POC UI where a user can enter the project outline and see the task breakdown before uploading to Notion
* Evaluate the project breakdowns appropriately

Prompt design principles

To learn some of the basics of prompt engineering, I started with the documentation from the companies providing the AI models themselves, then read many general tutorials and blogs. I’ve collected some resources I found useful at the bottom of this post. I also found Anthropic’s Prompt Generator very useful: it generates a prompt based on best practices from a simple prompt description.

With what I’ve learned so far, I’m starting to structure my prompts using the following five sections (inspired by the great CO-STAR framework that Sheila Teo used to win Singapore’s GPT-4 Prompt Engineering competition):

  1. Give the role and context:
    Give the LLM a clear role and context for the task
  2. Explain the task:
    What do you want the LLM to do? Be clear and provide important factors in the task.
  3. Break down the task into steps:
    Give a clear step-by-step breakdown of the task to be done
  4. Specify the output format:
    Give examples of the output format along with inline instructions for what goes in each section
    For Claude, structure the input and output data using XML tags [*]. Other LLMs may work better with other formats.
  5. Reinforce important points:
    State or rephrase the most important aspects of the output. I found this important to get Llama 3.1 (and ChatGPT) to output valid XML, for example.
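As a quick illustration, the five sections above can be assembled mechanically. This is my own sketch, not anything from a library; the section strings are abbreviated placeholders standing in for the real prompt content:

```python
# Sketch: assembling a prompt from the five sections described above.
# The section strings below are abbreviated placeholders, not the full prompt.

def build_prompt(role: str, task: str, steps: list[str],
                 output_format: str, reinforcement: str) -> str:
    """Join the five prompt sections in order, separated by blank lines."""
    numbered_steps = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))
    return "\n\n".join([role, task, numbered_steps, output_format, reinforcement])

prompt = build_prompt(
    role=("You are a research project manager responsible for developing "
          "proof of concept software for internal use."),
    task=("Expand the project outline into a detailed, structured project "
          "description and break it down into specific tasks."),
    steps=["Expand the outline into a structured project description.",
           "Break the project down into specific, actionable tasks."],
    output_format=("Present your output in <expanded_description> and "
                   "<task_breakdown> XML tags."),
    reinforcement="Your response must be in valid XML.",
)
```

Keeping the sections as separate arguments makes it easy to swap one out (for example, a different reinforcement line per model) without touching the rest of the prompt.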

Writing my next prompt

Role and context
Let’s start by giving the model a role and some context for the task. Specifically, I want the project ideas implemented as proof-of-concept software; without this, the LLM is likely to produce objectives suited to developing a full product.

Also, we give it the project outline itself here (the variable in double curly braces will be replaced by the actual outline):

You are a research project manager responsible for developing proof of
concept software for internal use.

Here is the project outline you will be working with:

<project_outline>
{{PROJECT_OUTLINE}}
</project_outline>
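Substituting the outline into the template is a simple string replacement. A minimal sketch (the template here is truncated to the role-and-context section above):

```python
# Sketch: filling the {{PROJECT_OUTLINE}} placeholder in the prompt template.

PROMPT_TEMPLATE = """You are a research project manager responsible for developing proof of concept software for internal use.

Here is the project outline you will be working with:

<project_outline>
{{PROJECT_OUTLINE}}
</project_outline>"""

def fill_template(template: str, outline: str) -> str:
    """Replace the double-curly-brace placeholder with the actual outline text."""
    return template.replace("{{PROJECT_OUTLINE}}", outline.strip())

filled = fill_template(PROMPT_TEMPLATE, "Build a simple proof-of-concept tool...")
```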

Explain the task
Next, I gave the LLM a clear description of the task to be done. Note that there are two tasks in this prompt: first writing a more detailed project specification, and then breaking the project down into tasks. This may not give the best results, and in a future post I’ll break this into multiple prompts that each do just one of these tasks.

You are tasked with expanding a project outline into a detailed,
structured project description and breaking it down into specific
tasks. This will help in better understanding the project scope
and planning its execution.

Ensure that your expanded description is comprehensive, clear,
and aligned with the original project outline. The tasks should
be actionable.

Break down the task into steps
How do I want the LLM to go about breaking down tasks? This step was less clear to me, so I asked the LLM itself to provide a step-by-step breakdown of how to do the task and then edited the result.

First, expand this outline into a structured project description. 
Your goal is to create a comprehensive and well-structured project plan
that addresses all key aspects of the project.
Follow these guidelines:
1. Begin with an introduction that provides an overview of the project.
2. For each main point in the outline, create a detailed paragraph or
section.
3. Elaborate on the ideas, providing context, rationale, and any
relevant details.
4. Ensure logical flow and coherence between sections.
5. Conclude with a summary of the project's goals and expected outcomes.

Next, break down the project into specific tasks. For each task:
1. Provide a clear, actionable description.
2. Indicate which section of the project it relates to.
3. Estimate the time or effort required (e.g., hours, days, or complexity
level).
4. Identify any dependencies or prerequisites.

Specify the output format
Finally, I specify how the LLM should return the results. For Claude, XML formatting is recommended, as the model is trained to follow XML in its input and output. The XML format may not be appropriate for other LLMs, and other formatting could improve the results. I’ll explore this in another blog post, but to keep things simple for now I’m going to use XML formatting exclusively.

Present your output in the following format:

<expanded_description>
[Your expanded project description goes here]
</expanded_description>

<task_breakdown>
1. [Task description] - [Related section] - [Time/Effort estimate] - [Dependencies, if any]
2. [Task description] - [Related section] - [Time/Effort estimate] - [Dependencies, if any]

[Continue for all tasks...]
</task_breakdown>

Reinforce important points

Llama 3.1 often fails to return valid XML: the XML tags are often not closed, and the returned text contains Markdown formatting. Claude, by contrast, is optimized for XML output. By adding the following line, Llama 3.1 now returns valid XML.

Your response must be in valid XML.

However, there is no guarantee that the model will return valid XML. There are methods that constrain a model’s output to follow a formal grammar, which can guarantee well-formed XML syntax, but even then the output may not match the structure we expect.
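A cheap safeguard is to validate the response ourselves and retry on failure. Since the two top-level tags are siblings, I wrap the output in a dummy root element before parsing (my own sketch using the standard library):

```python
import xml.etree.ElementTree as ET

def is_well_formed(response: str) -> bool:
    """Check that the model output parses as XML.

    The response contains sibling top-level tags, so wrap it in a dummy
    root element before parsing. Note that unescaped characters such as
    a bare '&' in the prose will also cause a parse failure.
    """
    try:
        ET.fromstring(f"<root>{response}</root>")
        return True
    except ET.ParseError:
        return False
```

In practice this check can gate a simple retry loop: if `is_well_formed` returns False, re-send the request (possibly with the error appended to the prompt) up to some maximum number of attempts.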

Results

How do we do with this prompt? Here’s one output from Claude given the above prompt with our example project outline inserted.

<expanded_description>
Project: LLM-Powered Project Management Tool
Introduction:
This project aims to develop a proof-of-concept (POC) tool that leverages Large Language Models (LLMs) to streamline the project planning process. The tool will take a brief project outline as input and generate a detailed project description with a specified structure, along with a comprehensive task breakdown. The generated project information will then be automatically uploaded to a Notion database, creating a centralized and organized project management system.
System Design:
The core of the system will be built around an LLM, which will be responsible for interpreting the project outline and generating the detailed description and tasks. The system will be designed to understand project management concepts and produce structured output that aligns with best practices in project planning. The LLM will be fine-tuned or prompted to generate output in a consistent format that can be easily parsed and organized.
User Interface:
A simple, user-friendly interface will be developed to allow users to input their project outlines. The interface will display the generated project description and task breakdown, allowing users to review and potentially edit the information before uploading it to Notion. The UI will be designed with simplicity in mind, focusing on essential functionality for this POC stage.
Notion Integration:
The system will integrate with Notion's API to automatically create new project entries in a designated database. Each project entry will include the detailed description, individual tasks, and relevant properties such as estimated time/effort and dependencies. The integration will ensure that all information is properly formatted and organized within the Notion workspace.
Evaluation:
To assess the effectiveness of the POC tool, an evaluation process will be implemented. This may include comparing the LLM-generated project plans with manually created ones, gathering user feedback on the accuracy and usefulness of the output, and analyzing the time saved in the project planning process. The evaluation will help identify areas for improvement and determine the potential value of further developing the tool.
Conclusion:
This proof-of-concept tool aims to demonstrate the potential of using LLMs in project management, specifically in the crucial phase of project planning and task breakdown. By automating these processes and integrating with Notion, the tool has the potential to significantly streamline project initialization and organization, saving time and improving consistency in project management practices.
</expanded_description>
<task_breakdown>
1. Define LLM requirements and selection - System Design - 2 days - None
2. Design prompt engineering strategy - System Design - 3 days - Depends on #1
3. Implement LLM integration for project description generation - System Design - 5 days - Depends on #1, #2
4. Develop algorithm for task breakdown generation - System Design - 4 days - Depends on #3
5. Create simple user interface for project outline input - User Interface - 3 days - None
6. Implement result display functionality in UI - User Interface - 2 days - Depends on #5
7. Design Notion database structure for projects and tasks - Notion Integration - 1 day - None
8. Implement Notion API integration for data upload - Notion Integration - 3 days - Depends on #7
9. Develop data formatting and organization logic for Notion upload - Notion Integration - 2 days - Depends on #8
10. Design evaluation metrics and methodology - Evaluation - 2 days - None
11. Implement evaluation process - Evaluation - 3 days - Depends on #10
12. Conduct initial testing and debugging - All sections - 4 days - Depends on #1-#11
13. Gather user feedback and perform iterations - All sections - 5 days - Depends on #12
14. Document system architecture and usage instructions - All sections - 2 days - Depends on #13
15. Prepare final POC demonstration and presentation - All sections - 1 day - Depends on #14
</task_breakdown>

This is looking better: we have a detailed project specification and broken-down tasks. However, I think we can do better. Some issues I’d like to address are:

  • The sections in the project description change between runs of the LLM and between project outlines.
  • The scope of the project is still hard to control; for example, there are still tasks for documentation even though we have stated that it’s proof-of-concept software for a single user.
  • The tasks themselves have little detail on what is involved in them.
  • The output is not very structured; to upload the tasks to my project management system I want a more explicit structure.

Conclusions

LLMs can effectively generate detailed project descriptions from a simple project outline and break the project down into tasks. This will hopefully be valuable for me in future projects, getting me past staring at a blank page and giving me something to start with!

In the next blog post, I’ll delve deeper into prompt chaining and critique & refinement techniques to further improve the quality and accuracy of LLM-generated project plans. I’ll also explore ways to better structure the output so that it can be uploaded to my project management tool.
