Run inference on any LLM with serverless in 15 minutes

Wing Lian
3 min read · Jun 4, 2023

Deploying large language models efficiently with RunPod’s serverless infrastructure

GitHub: https://github.com/OpenAccess-AI-Collective/servereless-runpod-ggml
Demo: https://huggingface.co/spaces/openaccess-ai-collective/ggml-runpod-ui
Arena: https://huggingface.co/spaces/openaccess-ai-collective/rlhf-arena

So you’ve built a language model and uploaded it to HuggingFace. Now what? Today I’m excited to share a practical way to deploy…
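To give a flavor of what a serverless deployment looks like from the client side, here is a minimal sketch of calling a RunPod serverless endpoint over HTTP. The endpoint ID, API key, and prompt payload shape are placeholders for illustration; the `/runsync` route is RunPod's synchronous serverless invocation endpoint, but your handler's exact input schema may differ.

```python
# Minimal sketch of invoking a RunPod serverless endpoint.
# Assumptions: "your-endpoint-id" and "YOUR_API_KEY" are placeholders,
# and the handler expects {"input": {"prompt": ...}} as its payload.
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"

def build_request(endpoint_id: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build the HTTP request for a synchronous serverless run."""
    url = f"{API_BASE}/{endpoint_id}/runsync"
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers)

# To actually run it (requires a live endpoint and valid key):
# req = build_request("your-endpoint-id", "YOUR_API_KEY", "Hello, world!")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["output"])
```

The request is kept separate from the network call so you can inspect the payload before sending it, and the only dependency is the standard library.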
