Pinned · Published in Data Science Collective · How to Use AWQ to Quantize LLMs · Using the llm-compressor Python library for Activation-Aware Weight Quantization (AWQ) · 2d ago · 1 response
Pinned · Published in AI Advances · How to Use SPLADE for Better LLM RAG Retrieval · Building an advanced RAG pipeline with sparse encoder retrieval models · May 25 · 3 responses
Pinned · Published in TDS Archive · How to Use Hybrid Search for Better LLM RAG Retrieval · Building an advanced local LLM RAG pipeline by combining dense embeddings with BM25 · Aug 11, 2024 · 5 responses
Pinned · Published in TDS Archive · How to Use Re-Ranking for Better LLM RAG Retrieval · Building an advanced local LLM RAG pipeline with two-step retrieval using open-source bi-encoders and cross-encoders · May 2, 2024 · 6 responses
Published in Data Science Collective · Understanding the KV-Cache In LLMs · How reusing attention calculations speeds up LLM inference · May 18 · 2 responses
Published in Data Science Collective · How to Build Your Own LLM Coding Assistant With Qwen2.5-Coder · Creating a local LLM chatbot with Qwen2.5-Coder-Instruct, vLLM, and Streamlit · May 4 · 5 responses
Published in Data Science Collective · How to Build a Local LLM Chatbot with CAG: Streamlit, vLLM, and Smart Context Caching · Retrieval-Augmented Generation (RAG) is a well-known technique for adding large amounts of external knowledge to an LLM. If the external… · Apr 4 · 4 responses
Published in AI Advances · Diffusion Explained: How AI Image Generators Work · A technical explanation of AI image generator models without the math · Mar 23 · 2 responses
Published in AI Advances · PDF to Markdown Document Conversion With Local LLMs · How to use local vision-language models (VLMs) for document parsing · Mar 11 · 8 responses
Published in AI Advances · Cultural Bias In LLMs · Exploring the impact of cultural values on AI responses and how language and role assignment can reduce bias · Mar 4 · 10 responses