Md Monsur ali, in Level Up Coding. mPLUG-DocOwl2: An OCR-Free Multi-page Document Understanding. How mPLUG-DocOwl2 revolutionizes document understanding with high-resolution compression, advanced AI, and unmatched efficiency. (2d ago)
Ferry Djaja. Document Parsing with OmniParser and GPT4o Vision. In this blog, we’ll explore how to leverage Microsoft’s OmniParser as input for GPT-4’s vision capabilities, optimizing the parsing of… (Nov 18)
Frederik vom Lehn, in Advanced Deep Learning. Current best practices with Gemini. Research based best practices with LLMs and VLLMs, such as constrained outputs like Json output and model few shot learning! (Nov 16)
Ferry Djaja. Extracting Line Items from a Document with GPT-4o: It’s not a straightforward task. In this tutorial, I’ll guide you through the process of accurately extracting line items from documents. Although I initially thought this… (Aug 31)
Ferry Djaja. Chat with PowerPoint Files Using LangGraph and GPT-4o. In this blog, I’ll guide you through creating a Python script that enables seamless conversation with your PowerPoint document (.PPTX)… (Nov 10)
Ferry Djaja. RAG with Complex PDF Structure. In this blog, I’ll outline how I developed a Retrieval Augmented Generation to analyze complex PDFs and answer questions. The process… (Sep 6)
Archangelmiko. Part 2: Implementing LayoutLMv3 from Scratch. In this second part of the series, we’ll walk through the essential steps to implement LayoutLMv3 from scratch. We’ll set up the model’s… (Nov 4)
Lavanya Siliveri. Oracle Integration-based fully Automated Invoice Processing and Approval management using Oracle AI… In this blog post, I would like to share how we can build a fully automated and secure Document analysis process using Oracle Cloud’s… (Feb 13)