Victor May, "LLM Multi-GPU Batch Inference With Accelerate: An Implementation Walkthrough", Sep 10, 2023
Victor May, "Solving The Issue of Falcon Text Generation Never Stopping: How to make an overly chatty bird stop talking", Jul 26, 2023
Victor May, "Scalable Streaming of OpenAI Model Responses with FastAPI and asyncio: A tutorial", Jul 13, 2023