LLMs in production — the missing discussion

Sowmya V.B.
3 min read · Apr 17, 2023
source: Pixabay.com

Last week, I attended a virtual conference organized by the MLOps Community on using Large Language Models (LLMs) in production, and read a comprehensive write-up by Chip Huyen on the same topic. This blog post contains some thoughts on aspects I missed seeing in both.

Chip Huyen highlighted three main challenges of productionizing LLM-based applications, listed below, and also discussed ways to address them:

  • Prompt evaluation, versioning, and optimization, and the relative stability of prompts
  • Cost and latency issues
  • Choosing between prompting, fine-tuning, and other methods

In the conference, the talks focused on three primary aspects:

  • Typical production pipelines and the issues you face (e.g., using prompting for a version 1, moving to fine-tuning for v2, and exploring other options as the product matures and data collection strategies are in place)
  • Tools to support LLMs in production (e.g., LangChain)
  • The importance of data quality (talks from Snorkel AI and Galileo)
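The prompting-first progression described in the talks can be sketched as a small abstraction: a prompt template plus a pluggable model call, so a v1 built on prompting can later be backed by a fine-tuned model without changing the rest of the pipeline. This is a hypothetical sketch, not code from the conference; `build_prompt`, `call_model`, and the ticket-classification task are all illustrative assumptions.

```python
# Hypothetical "version 1" prompting pipeline: prompt construction is
# separated from the model call, so fine-tuning can be swapped in later.

def build_prompt(ticket_text: str) -> str:
    # Illustrative classification prompt for a support-ticket task.
    return (
        "Classify the support ticket as 'billing', 'bug', or 'other'.\n"
        f"Ticket: {ticket_text}\nLabel:"
    )

def call_model(prompt: str) -> str:
    # Stub standing in for an LLM API call (or, in v2, a fine-tuned model).
    return "billing" if "invoice" in prompt else "other"

def classify(ticket_text: str) -> str:
    # The pipeline interface stays fixed across v1 (prompting) and v2.
    return call_model(build_prompt(ticket_text)).strip()

print(classify("I was charged twice on my invoice."))  # billing
```

Keeping the interface stable is what makes the v1-to-v2 transition cheap: only `call_model` changes when you move from prompting to a fine-tuned model.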

Yet, for LLMs or any other NLP approach in production, I think two other aspects, a clear evaluation strategy beyond cost and latency tradeoffs and a clear data privacy policy, are as important as…
