Batch summarize and translate docs via OpenAI or a local LLM for privacy: Introducing yet another Summarizer

Sean Ryan
4 min read · Jun 6, 2024

An obvious application of LLMs is summarizing and translating text, so there are already quite a few command-line tools (such as scottleibrand/gpt-summarizer) and websites (such as ZeroGPT). Still, as an engineer, it is interesting to “build your own”, and my own mrseanryan/gpt-summarizer (same name, unfortunately!) has some nice features:

File types:

  • Summarize text, markdown, HTML, PDF files

Summarization levels:

  • Summarize at different levels: short, long, and per-paragraph

Translation:

  • Translate to a target language

Data sources:

  • Batch summarize whole directories of files
  • Download a file via URL and summarize it
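
As a rough illustration of the batch case, here is a minimal Python sketch of collecting the files to summarize from a directory tree. It is not taken from gpt-summarizer itself: the function name `find_input_files` and the exact extension set are assumptions, chosen to match the file types listed above.

```python
from pathlib import Path

# Assumed extension set, based on the file types listed above
# (text, markdown, HTML, PDF). The real tool may use a different list.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".html", ".pdf"}


def find_input_files(root: Path) -> list[Path]:
    """Return every supported file under root (recursively), in a stable order."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Each returned path can then be passed to whatever summarization step you use, one file at a time.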

Private LLM:

  • Optionally use a locally hosted LLM, for maximum privacy and to prevent any loss of IP (intellectual property)

Cost savings:

  • Avoid re-summarizing a previously processed file
  • Calculate cost estimates (when using OpenAI)
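
To make the two cost-saving ideas concrete, here is a small Python sketch under stated assumptions. The helper names (`output_path_for`, `needs_summarizing`, `estimate_cost_usd`), the output naming scheme, and the "about 4 characters per token" heuristic are all illustrative and not taken from gpt-summarizer; the price per 1,000 tokens is a parameter you would supply from the current pricing page.

```python
from pathlib import Path


def output_path_for(input_path: Path, output_dir: Path) -> Path:
    """Hypothetical naming scheme: <input file name>.yaml.txt in output_dir."""
    return output_dir / (input_path.name + ".yaml.txt")


def needs_summarizing(input_path: Path, output_dir: Path) -> bool:
    """Skip a file when an output for it already exists."""
    return not output_path_for(input_path, output_dir).exists()


def estimate_cost_usd(
    text: str,
    usd_per_1k_tokens: float,
    chars_per_token: float = 4.0,  # rough heuristic for English text, an assumption
) -> float:
    """Estimate the cost of sending text, from an approximate token count."""
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens / 1000 * usd_per_1k_tokens
```

An existence check is the simplest possible cache. A more careful version could also compare modification times or content hashes, so that an edited input file is summarized again.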

Output files:

  • Output files in YAML format (as opposed to JSON): cheaper for an LLM to generate and easy for humans to read
  • Output files with a “.yaml.txt” file extension, for easy previewing and…

Sean Ryan

Versatile and creative full stack developer, tackling UI and data challenges to delight the user. Passionate about UX, clean architecture and machine learning.