Published in Towards AI
Deploy an in-house Vision Language Model to parse millions of documents: say goodbye to Gemini and…
How to build a Document Parsing Pipeline to process millions of documents using Qwen-2.5-VL, vLLM, and AWS Batch.
Apr 23 (18 responses)

Published in TDS Archive
How Did Open Food Facts Fix OCR-Extracted Ingredients Using Open-Source LLMs?
Delve into an end-to-end Machine Learning project to improve the quality of the Open Food Facts database.
Oct 6, 2024 (1 response)

DuckDB & Open Food Facts: the largest open food database in the palm of your hand 🦆🍊
Exploit the power of DuckDB to explore the largest open database in the food market.
Jul 21, 2024

Published in TDS Archive
Parse Your Invoices with LayoutLM and Label Studio
Fine-tune LayoutLM on your invoices with the Transformers library, Label Studio, and AWS S3.
Apr 16, 2024 (6 responses)

Published in TDS Archive
Scale your Machine Learning Projects with SOLID principles
How to write code that scales and accelerates your work as a data scientist or machine learning engineer.
Mar 12, 2024 (8 responses)

Published in TDS Archive
Build Machine Learning Pipelines with Airflow and Mlflow: Reservation Cancellation Forecasting
Learn how to create reproducible and ready-for-production Machine Learning pipelines through a Senior Machine Learning assignment.
Jan 12, 2024 (6 responses)

Published in TDS Archive
Building a Matching Tool to Help Start-Up Founders Find the Best Incubators: an End-to-End…
A project walkthrough to propose the best incubators for start-up founders, using Python, Pinecone, FastAPI, Pydantic, and Docker.
Nov 26, 2023

Published in Towards AI
Why are AI Products Doomed to Fail?
After one year of implementing AI features for various businesses, I share my perspective on the mistakes I see companies making with LLMs…
Nov 17, 2023 (41 responses)

Track your data with Data Version Control (DVC)
A data tracking tool that works along Git to make your Machine Learning projects reproducible.
Aug 27, 2023

Fine-tune your LLM with AWS Sagemaker: build the best D&D assistant with generative AI
A walkthrough on how to leverage Sagemaker to perform supervised fine-tuning on large language models.
Aug 20, 2023 (1 response)