Yeonjoon Jung in SqueezeBits Team Blog
[vLLM vs TensorRT-LLM] #1. An Overall Evaluation
vLLM and TensorRT-LLM are two leading frameworks for efficiently serving Large Language Models (LLMs). vLLM is a fast, user-friendly…
Sep 30
Yeonjoon Jung in SqueezeBits Team Blog
[vLLM vs TensorRT-LLM] #2. Towards Optimal Batching for LLM Serving
In our previous article, we compared vLLM and TensorRT-LLM under default configurations and specific constraints, providing insights into…
Oct 11