Scuffed AI Presentation Generator (with code)

Published in MLPurdue, Feb 20, 2024

By Brian and Robert

Perhaps you have seen or heard of companies building AI tools that generate PowerPoint presentations for you, like Tome.app, Beautiful.ai, or Presentations.ai. This article will teach you how to create your own, at least until Google makes their own or something.

Preface: This is very scuffed. I’m sure there are more reliable/efficient ways to implement this, but the goal is to give you ideas on how to combine various ML tools to bring your own ideas to life. Also, I made this while doing an English assignment, so the prompts are tailored to my English task. As a result, if you customize it for your own task, you may have to change how you parse the text.

Prerequisites:

OpenAI API key (link)
Stability API key (link)
Enthusiasm!!!!!!!!!!
To best follow along, open and make a copy of this colab: link

Basic overview:

Enter information about your topic
Use LangChain + some LLM to come up with an answer, divide it into several slides, and generate the text for each slide along with a description of that text
The description is passed to Stability to generate an image
Use a PowerPoint Python library to put all that stuff together

Step 1: API Keys

import os

os.environ["STABILITY_KEY"] = "YOURKEY"
os.environ["OPENAI_API_KEY"] = "YOURKEY"

In the colab, insert your own API keys (instructions in prerequisites) and replace the “YOURKEY”s with them.
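
Forgetting to swap out a placeholder tends to produce a confusing authentication error much later in the notebook. As a minimal sketch (the helper name require_key is made up for illustration and is not in the colab), you could fail early with a clear message instead:

```python
import os


def require_key(name, placeholder="YOURKEY"):
    """Return the environment variable `name`, or raise if it is unset or still the placeholder."""
    value = os.environ.get(name, "")
    if not value or value == placeholder:
        raise RuntimeError(f"Set {name} to a real API key before running the notebook.")
    return value
```

Calling require_key("OPENAI_API_KEY") and require_key("STABILITY_KEY") right after the cell above would stop the notebook before any API call is made with a missing key.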

Step 2: Prompting for the Text of the Presentation

We will use LangChain, a framework that makes it easy to connect to and use large language models. It supports many different models, but for now we will use OpenAI models (which is why you need an OpenAI key). If you are interested in LangChain, you can learn more about it here: link

# Parentheses let Python join the adjacent string literals without
# fragile backslash line continuations.
task = ("Please read the assigned poem and create a presentation "
        "for the class. The presentation should take about 10-15 "
        "minutes and include the following: You must begin the "
        "presentation reading the poem aloud. Your slides must "
        "include but in no specific order: 1. an explanation "
        "of the title 2. identify an image and explain its "
        "significance to the poem 3. identify and explain three "
        "lines from the poem--image, meaning, 4. provide "
        "biographical information about the author--current "
        "position, 5. engage the class with a question that "
        "they must answer in the written form- the question "
        "may be posed 6. include any images, videos, or "
        "sound that you think will enhance our 7. answer the "
        "question: Why did Billy Collins choose to include this "
        "poem in the collection?")


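# Imports for the legacy LangChain API used in this article
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.9)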
assignment_prompt = PromptTemplate(
   input_variables=["task"],
   template="I have a presentation assignment to do the following " \
            "surrounded by quotes: '{task}'. Slide by slide, what " \
            "should the content and title of each slide be? When " \
            "saying a slide name, preface it with the word 'SLIDE' " \
            "and when stating each bullet point for the contents for " \
            "each slide, preface it with the word 'CONTENT'"
)
assignment = LLMChain(llm=llm, prompt=assignment_prompt)


answer = assignment.run(task)
answer = answer.split('\n')

Here, we create an assignment prompt using PromptTemplate. The task string holds the details of my task, and the template explains how the LLM should format its answer. Our “task” string is inserted where the template says {task}. Running assignment.run(task) gives us an output string that lays out the skeleton of our presentation. Namely, it decides the title and content of each slide in a fixed format: each slide name is prefaced with the word “SLIDE” and each bullet point with the word “CONTENT”. Notice that we are being very specific so we know exactly how to parse the answer.
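
To see roughly what the model receives after the substitution, here is a plain-Python stand-in. It uses str.format instead of LangChain, and short_task is a shortened, made-up task, so treat it as an illustration of the idea rather than what LangChain does internally:

```python
# Shortened copy of the template from the article
template = (
    "I have a presentation assignment to do the following "
    "surrounded by quotes: '{task}'. Slide by slide, what "
    "should the content and title of each slide be?"
)

# Hypothetical short task, used only for this demo
short_task = "Create a presentation about the assigned poem."

# Fill the {task} placeholder with the task text
prompt_text = template.format(task=short_task)
print(prompt_text)
```

Printing the filled-in prompt like this is a cheap way to catch missing spaces or stray quotes before spending API calls on it.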

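slide_content = dict()
cur_slide = ""
for line in answer:
  if 'SLIDE' in line:
    cur_slide = line
  if 'CONTENT' in line:
    # The LLM usually answers like "CONTENT: actual content", so strip the prefix
    line = line.replace('CONTENT:', '')
    # Append rather than overwrite, so a slide with several CONTENT bullets keeps all of them
    slide_content[cur_slide] = slide_content.get(cur_slide, '') + line
# Display the parsed result (last expression in a notebook cell)
slide_content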

The above code parses the LLM’s answer and displays the result. We get:

{'SLIDE 1: Title of Poem': ' Poem Title and author ',
 'SLIDE 2: Meaning of Title': ' Explanation of the title and its significance ',
 'SLIDE 3: Image ': ' Identify an image and explain its significance to the poem ',
 'SLIDE 4: Lines Explained ': ' Identify and explain three lines from the poem--image, meaning, ',
 'SLIDE 5: Author Biography ': ' Provide biographical information about the author--current position, ',
 'SLIDE 6: Question for Class ': ' Engage the class with a question that they must answer in the written form- the question may be posed ',
 'SLIDE 7: Enhance Presentation ': ' Include any images, videos, or sound that you think will enhance our ',
 'SLIDE 8: Theory on Collection ': ' Answer the question: Why did Billy Collins choose to include this poem in the collection?'}

This nicely organizes the instructions for each slide.
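
The substring checks work when the model behaves, but they are easy to trip up. As a sketch of a slightly more tolerant parser (parse_slides and the sample text are made up for illustration, not taken from the colab), a regex version can accept an optional colon, ignore leading bullet characters, and keep every CONTENT bullet in a list:

```python
import re


def parse_slides(text):
    """Map each 'SLIDE ...' heading to the list of its 'CONTENT' bullets."""
    slides = {}
    current = None
    for line in text.splitlines():
        content = re.search(r"CONTENT\s*:?\s*(.*)", line)
        heading = re.search(r"SLIDE\b.*", line)
        if content:
            # Ignore bullets that appear before any heading
            if current is not None:
                slides[current].append(content.group(1).strip())
        elif heading:
            current = heading.group(0).strip()
            slides.setdefault(current, [])
    return slides


# Made-up sample in the format the prompt asks for
sample = """SLIDE 1: Title of Poem
CONTENT: Poem title and author
SLIDE 2: Meaning of Title
- CONTENT: Explanation of the title
- CONTENT: Its significance"""

parsed = parse_slides(sample)
print(parsed)
```

Since the values here are lists, you would join them (for example with " ".join(bullets)) before passing them on as a single instruction string.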

create_content_prompt = PromptTemplate(
   input_variables=["background", "content"],
   template="Here is the background information: {background} Answer the following: {content}",
)
create_content = LLMChain(llm=llm, prompt=create_content_prompt)

Here, we create another PromptTemplate. We will give it background information and tell it to answer the following content/instruction, which was laid out before. For example, some instructions may be: “Answer the question: Why did Billy Collins choose to include this poem in the collection?” or “Provide biographical information about the author — current position”.

background = "The Bagel\n"\
           "I stopped to pick up the bagel\n"\
           "rolling away in the wind,\n"\
           "annoyed with myself\n"\
           "for having dropped it\n"\
           "as if it were a portent.\n"\
           "Faster and faster it rolled,\n"\
           "with me running after it\n"\
           "bent low, gritting my teeth,\n"\
           "and I found myself doubled over\n"\
           "and rolling down the street\n"\
           "head over heels, one complete somersault\n"\
           "after another like a bagel\n"\
           "and strangely happy with myself.\n"\
           "—David Ignatow"

For background, we will just pass in a string. Above, my background is a poem called The Bagel. Now, by providing both the background and content/instruction, we can finally get it to produce coherent slide information.

Step 3: Prompting for the Images of the Presentation

If you don’t insert some pretty images, you will get bad grades. As a result:

create_image_description_prompt = PromptTemplate(
   input_variables=["content"],
   template="Describe the significant image in the following: {content}",
)
create_image_description = LLMChain(llm=llm, prompt=create_image_description_prompt)

This is the prompt template for the image description. We pass in the text we generated for the current slide as the “content” and ask the LLM to describe a significant image. So, if a slide’s content is about a bagel, this prompt template might output something like “A bagel that is brown and circular and…” blah blah blah.

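import io
import os
import warnings

from PIL import Image
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation

# Stability client; it reads the key set in Step 1
stability_api = client.StabilityInference(key=os.environ["STABILITY_KEY"])

def create_image(image_description):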
   answers = stability_api.generate(
   prompt=image_description,
   seed=992446758, # If a seed is provided, the resulting generated image will be deterministic.
                   # What this means is that as long as all generation parameters remain the same, you can always recall the same image simply by generating it again.
                   # Note: This isn't quite the case for CLIP Guided generations, which we tackle in the CLIP Guidance documentation.
   steps=30, # Amount of inference steps performed on image generation. Defaults to 30.
   cfg_scale=8.0, # Influences how strongly your generation is guided to match your prompt.
                  # Setting this value higher increases the strength in which it tries to match your prompt.
                  # Defaults to 7.0 if not specified.
   width=512, # Generation width, defaults to 512 if not included.
   height=512, # Generation height, defaults to 512 if not included.
   samples=1, # Number of images to generate, defaults to 1 if not included.
   sampler=generation.SAMPLER_K_DPMPP_2M # Choose which sampler we want to denoise our generation with.
                                                # Defaults to k_dpmpp_2m if not specified. Clip Guidance only supports ancestral samplers.
                                                # (Available Samplers: ddim, plms, k_euler, k_euler_ancestral, k_heun, k_dpm_2, k_dpm_2_ancestral, k_dpmpp_2s_ancestral, k_lms, k_dpmpp_2m, k_dpmpp_sde)
   )

   img = None  # Stays None if the safety filter blocks the request
   for resp in answers:
       for artifact in resp.artifacts:
           if artifact.finish_reason == generation.FILTER:
               warnings.warn(
                   "Your request activated the API's safety filters and could not be processed. "
                   "Please modify the prompt and try again.")
           if artifact.type == generation.ARTIFACT_IMAGE:
               img = Image.open(io.BytesIO(artifact.binary))
   return img

We pass our image description into the function above, which uses Stability to turn the given text into an image. I copied the above code from somewhere. As you can see, you can change some parameters like width/height.

Step 4: Putting it All Together

from pptx import Presentation

# Creating powerpoint
prs = Presentation()
image_num = 0
for title, content in slide_content.items():
   # Generate content for that specific requested content
   content = create_content.run({"background":background, "content":content}).replace('\n', '')

   # Create an image description of the generated content to make
   image_description = create_image_description.run(content).replace('\n', '')
   print(image_description)

   # Create image from generated image description
   image = create_image(image_description)
   image_name = str(image_num) + '.png'
   image.save(image_name)

   # Add slide
   add_slide(title, image_name, content)
   image_num += 1

As you can see, slide_content.items() provides the general overview of the slides: what title to use and what each slide should be about. The “content = create_content.run()…” line takes that overview for the current slide and combines it with the background (the entire poem) to generate the appropriate content. Then, we create an image description from that new content and get a related image. Finally, we use the PowerPoint Python library to put all this together. The PowerPoint maker code is not that interesting, so I will not show it, although you can find it in the colab. It just hard-codes where to put the title, the content, and the image.

Step 5: Actually Using It

Once you run all the code, your powerpoint file will be created. Here are some examples:

The descriptions that generated the images above were:

“The significant image in this phrase is that of a person holding their head down, looking embarrassed. This image conveys the idea that someone may be feeling ashamed of something they have done and that this shame can lead to a sense of joy. This is because embarrassed feelings can often lead to a sense of accomplishment, as the person is able to recognize how far they have come and how much better they can do. This can lead to a feeling of pride and happiness about their progress.” and “The significant image in this passage is that of the bagel in Ignatow’s poem. It symbolizes the fragile nature of the human condition, as it is easily crushed or broken and yet still maintains its shape. It is also a metaphor for the journey of life and the struggles that come with it, as well as the hope of redemption and a possible renewal of spirit. Ignatow’s use of this image speaks to the power of poetry to convey meaning and emotion in a simple yet theatrical manner.”

If you want the presentation to have more words or use some different language, you can just edit the prompts in the prompt templates.

Something I learned through this project is that it didn’t really feel like “coding”, where you have full control. It's kind of like talking to a 5-year-old: you trust it both to come up with a good answer and to format that answer exactly as instructed, and when it gets the formatting wrong, your program can break. Pretty interesting!
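
One cheap way to cope with that is to check the output and simply ask again when it can't be parsed. This is a minimal sketch, not something from the colab: generate_until_parsable, fake_generate, and simple_parse are hypothetical names, and the fake replies stand in for real LLM calls.

```python
def generate_until_parsable(generate, parse, max_attempts=3):
    """Call generate() until parse(output) returns a non-empty result."""
    for _ in range(max_attempts):
        parsed = parse(generate())
        if parsed:
            return parsed
    raise ValueError(f"No usable output after {max_attempts} attempts")


# Fake LLM: the first reply ignores the requested format, the second follows it
replies = iter([
    "Sure! Here are some ideas.",
    "SLIDE 1: Title\nCONTENT: Poem title",
])


def fake_generate():
    return next(replies)


def simple_parse(text):
    """Keep only the lines that follow the SLIDE/CONTENT convention."""
    return [line for line in text.split("\n") if "SLIDE" in line or "CONTENT" in line]


result = generate_until_parsable(fake_generate, simple_parse)
print(result)
```

In the real pipeline, generate would wrap something like assignment.run(task) and parse would be the slide-parsing loop from Step 2. Keep in mind that each retry costs another API call.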