In Byte-Sized AI, by Don Moon | [vLLM — Quantization] bitsandbytes: 8-bit Optimizers, LLM.int8(), | bitsandbytes (8-bit Optimizers, LLM.int8(), QLoRA) and k-bit Inference Scaling | 1d ago
In Level Up Coding, by Wenqi Glantz | 10+ Ways to Run Open-Source Models with LlamaIndex | LlamaIndex’s open-source model integration with Hugging Face, vLLM, Ollama, Llama.cpp, liteLLM, Replicate, Gradient, and more | Dec 19, 2023
Gautam Chutani | Multi-Modal RAG: A Practical Guide | Using vLLM to serve models for Multimodal Text Summarization, Table Processing, and Answer Synthesis | Sep 17
DynamWorks | VisionUX: A Novel Framework for Multi-Modal Video Analysis in Enterprise Applications | Abstract | 2d ago
In Byte-Sized AI, by Don Moon | LLM Inference Optimizations — Chunked Prefill and Decode-Maximal Batching | Overview of Chunked Prefill and Decode-Maximal Batching (Sarathi) | Aug 28
heping_LU | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | InternVL is a 6-billion-parameter (6B) visual-linguistic base model with 28% of the number of covariates, with the same powerful visual… | 5d ago
Naman Tripathi | Ollama vs VLLM: Which Tool Handles AI Models Better? | If you’re into AI and large language models (LLMs), you might have heard of Ollama and VLLM. Both are tools for working with LLMs, but they… | Jul 17