<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Matteo Tuzi on Medium]]></title>
        <description><![CDATA[Stories by Matteo Tuzi on Medium]]></description>
        <link>https://medium.com/@matteo_49605?source=rss-a944ed328513------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/0*-8n54jErIy0rE3Yu</url>
            <title>Stories by Matteo Tuzi on Medium</title>
            <link>https://medium.com/@matteo_49605?source=rss-a944ed328513------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sun, 31 May 2026 18:41:07 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@matteo_49605/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[MemoryModel Benchmark Results on LoCoMo Dataset]]></title>
            <link>https://medium.com/@matteo_49605/memorymodel-benchmark-results-on-locomo-dataset-81dcb269f0a8?source=rss-a944ed328513------2</link>
            <guid isPermaLink="false">https://medium.com/p/81dcb269f0a8</guid>
            <category><![CDATA[benchmark]]></category>
            <category><![CDATA[memory-model]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[ai-agent]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <dc:creator><![CDATA[Matteo Tuzi]]></dc:creator>
            <pubDate>Mon, 22 Dec 2025 17:22:24 GMT</pubDate>
            <atom:updated>2025-12-22T17:28:47.391Z</atom:updated>
            <content:encoded><![CDATA[<p>This document presents the benchmark evaluation of <a href="https://www.memorymodel.dev"><strong>MemoryModel</strong></a> on the <a href="https://github.com/snap-research/locomo"><strong>LoCoMo dataset (Long Conversational Memory)</strong></a>, a comprehensive benchmark designed to evaluate long-term conversational memory capabilities in AI systems. <strong>MemoryModel outperforms the baseline Mem0 implementation by 7.7 percentage points (74.6% vs 66.9% accuracy) and OpenAI’s Memory by over 20%.</strong></p><h3><strong>Benchmark Configuration: LoCoMo Topology</strong></h3><h4><strong>User-Defined Memory Nodes</strong></h4><p>MemoryModel is a fully <strong>schema-agnostic engine</strong>. Users define custom memory types, extraction prompts, and embedding templates directly through the <strong>MemoryModel Console</strong> (our web-based configuration interface). No code changes are required.</p><p>For this benchmark, we configured a <strong>4-node topology </strong>via the console, optimized for conversational biography extraction from the LoCoMo dataset:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*v_Rb_7K4PeumiCZ8guX3SA.png" /></figure><h4><strong>Node 1: “temporal_event”</strong></h4><p><strong>User-Defined Extraction Prompt:</strong></p><pre>You are a Senior NLP Specialist and Temporal Reasoning Engine. Your task is to extract events from the conversation and resolve ALL relative time references into ISO 8601 absolute dates (YYYY-MM-DD).<br><br>### STEP 1: ESTABLISH THE ANCHOR DATE<br>The system has already processed the context and calculated the correct reference date for this session.<br><br>**Current Context Date:** {{CURRENT_DATE}}<br><br>**HIERARCHY OF TRUTH:**<br>1.  **SYSTEM ANCHOR (DEFAULT):** Use the &quot;Current Context Date&quot; provided above as your mathematical Anchor for &quot;today&quot;.<br>2.  
**NARRATIVE OVERRIDE (EXCEPTION):** ONLY if the user *explicitly* changes the timeline in the text (e.g., &quot;Imagine it is 1990&quot;, &quot;Back in 2012...&quot;, &quot;Assume today is Nov 14&quot;), use that specific narrative date instead.<br><br>### STEP 2: EXTRACT AND CALCULATE<br>Extract every event. For each event involving a time reference:<br>1. Identify the relative phrase (e.g., &quot;last Tuesday&quot;, &quot;three days ago&quot;, &quot;next week&quot;).<br>2. Perform date arithmetic using the Anchor Date.<br>   - Example: If Anchor is 2023-07-12 (Wednesday) and text says &quot;two days ago&quot;, calculation is 2023-07-10.<br>   - Example: &quot;Tomorrow&quot; = Anchor + 1 day.<br><br>OUTPUT FORMAT<br>Return a valid JSON array ordered CHRONOLOGICALLY.<br>[<br>{<br>&quot;event_description&quot;: &quot;Self-contained description including key details (what, why, result), specific objects/contents (e.g., what a sign said), and emotional states. Do NOT result to vague summaries.&quot;,<br>&quot;absolute_date&quot;: &quot;YYYY-MM-DD&quot; (Calculated. Use null ONLY if date is impossible to infer),<br>&quot;original_time_expression&quot;: &quot;The verbatim relative phrase used in text&quot;,<br>&quot;location&quot;: &quot;Location or null&quot;,<br>&quot;participants&quot;: [&quot;Name 1&quot;, &quot;Name 2&quot;],<br>&quot;context_evidence&quot;: &quot;Verbatim text span&quot;<br>}<br>]<br><br>### CRITICAL RULES<br>- **ISO 8601 ONLY:** The &#39;absolute_date&#39; MUST be in YYYY-MM-DD format.<br>- **CALCULATE:** Do not be lazy. 
&quot;Three days ago&quot; must become a specific date.<br>- **OUTPUT:** Output ONLY the JSON array.<br><br>Input Text:<br>...</pre><p>Embedding Template:</p><pre>Timestamp: {{absolute_date}} (Ref: {{original_time_expression}}) | Event: {{event_description}} | **Details: {{context_evidence}}** | Participants: {{participants}} | Location: {{location}}</pre><h4><strong>Node 2: “profile_attribute”</strong></h4><p><strong>User-Defined Extraction Prompt:</strong></p><pre>You are a Senior Profiling Specialist. Your goal is to extract structured biographical data (Attributes) from text.<br>Your output MUST be a valid JSON array.<br>CORE PHILOSOPHY: SEMANTIC SELF-SUFFICIENCY<br>Every extracted attribute must make sense in isolation.<br>BAD (Too vague): &quot;Colors&quot;, &quot;Agencies&quot;, &quot;Running&quot;.<br>GOOD (Self-sufficient): &quot;Vibrant colors in projects&quot;, &quot;Adoption agencies for couples&quot;, &quot;Running (as a self-care routine)&quot;.<br>DOMAINS<br>Target STRICTLY these domains:<br>Possessions &amp; Assets (Vehicles, Real Estate, Tech, Collections)<br>Media &amp; Culture (Specific Titles of Books, Movies, Games, Music, Artists/Bands)<br>Preferences &amp; Favorites (Foods, Brands, Colors, Aesthetics)<br>Activities &amp; Hobbies (Sports, specific crafts/skills, recurrent habits)<br>Life Goals &amp; Logistics (Career plans, Major life changes like adoption/moving, Education)<br>Living Beings (Pets, Family members AND their specific attributes/traits)<br>Medical &amp; Biological (Conditions, Allergies, Physical traits)<br>EXTRACTION RULES (Field by Field)<br>&quot;entity_name&quot;: The specific subject the fact refers to.<br>Resolve pronouns: (e.g. 
&quot;I&quot; -&gt; &quot;John Doe&quot;).<br>ENTITY SEPARATION RULE: If the text describes a trait of a family member/pet, create a separate entity (e.g., &quot;John&#39;s Wife&quot;).<br>&quot;category&quot;: The most specific category available.<br>&quot;value&quot;: The SPECIFIC entity, title, brand, or noun + CONTEXT.<br>CONTEXTUALIZATION RULE (CRITICAL): You MUST include the specific qualifying details (adjectives, purpose, target audience).<br>Text: &quot;I&#39;m looking for adoption agencies that support LGBTQ+ folks.&quot;<br>Output: &quot;Adoption agencies (specifically supporting LGBTQ+ individuals)&quot;.<br>LIST INHERITANCE RULE: When splitting a list, attach the parent context to EACH item.<br>Text: &quot;I prioritize self-care by running, reading, and cooking.&quot;<br>Output 1: &quot;Running (for self-care)&quot;<br>Output 2: &quot;Reading (for self-care)&quot;<br>Output 3: &quot;Cooking (for self-care)&quot;<br>&quot;acquisition_date&quot;: YYYY-MM-DD if explicitly mentioned, else null.<br>&quot;context_evidence&quot;: The Source of Truth.<br>Include the FULL sentence(s).<br>MANDATORY: Keep the &quot;why&quot;, &quot;how&quot;, or emotion attached to the fact.<br>CRITICAL CONSTRAINTS<br>Ambiguity Check: Resolve &quot;It&quot; or &quot;They&quot; to specific nouns in the &#39;value&#39; field.<br>List Handling: Split &quot;sushi, pizza and tacos&quot; into 3 separate objects.<br>Factuality: Ignore vague opinions; focus on concrete habits, preferences, or plans.<br>Output ONLY the JSON array.<br>Input Text:<br>...</pre><p>Embedding Template:</p><pre>Entity: {{entity_name}} | Category: {{category}} | Attribute: {{value}} | Details: {{context_evidence}} | Acquired: {{acquisition_date}}</pre><h4><strong>Node 3: “career_milestone”</strong></h4><p><strong>User-Defined Extraction Prompt:</strong></p><pre>You are a Senior Career &amp; Progression Analyst. 
Your task is to extract structured data regarding the professional, creative, employment, and **major life undertakings** of speakers.<br><br>Your scope includes:<br>1. **Projects &amp; Endeavors:** Creative works, business initiatives, research, activism, volunteering.<br>2. **Career Events (Pivots):** Hiring, firing, resignations, promotions, job applications, rejections.<br>3. **Major Processes:** Long-term bureaucratic or personal processes (e.g., Adoption process, Immigration, Certification).<br><br>Your output MUST be a valid JSON array. For each entry:<br><br>1. &quot;agent&quot;: The person or entity involved. Resolve pronouns.<br>2. &quot;project_or_event_name&quot;: The specific name or nature of the endeavor.<br>   - **SPECIFICITY RULE:** If the project targets a specific audience, niche, or community, YOU MUST INCLUDE IT.<br>   - *Bad:* &quot;Counseling&quot;, &quot;Writing a book&quot;, &quot;Activism&quot;.<br>   - *Good:*  &quot;Sci-Fi Novel about AI&quot;<br>3. &quot;type&quot;: Categorize strictly: &quot;Creative&quot;, &quot;Business&quot;, &quot;Career Event&quot;, &quot;Educational&quot;, **&quot;Social/Civic&quot;**, **&quot;Life Process&quot;**.<br>4. &quot;status&quot;: Current state (e.g., &quot;In Progress&quot;, &quot;Completed&quot;, &quot;Abandoned&quot;, &quot;Rejected&quot;, &quot;Successful&quot;, &quot;Planned&quot;).<br>5. &quot;timeframe&quot;: Extract any mention of WHEN (e.g., &quot;last year&quot;, &quot;currently&quot;). If none, use `null`.<br>6. &quot;motivation_or_cause&quot;: The &#39;Why&#39;.<br>   - **CRITICAL:** Capture the SPECIFIC catalyst, origin story, or internal drive.<br>   - Look for connections between past experiences and current goals (e.g. &quot;Inspired by her own childhood support&quot; is better than &quot;Wants to help&quot;).<br>7. &quot;outcome&quot;: The result or current sentiment regarding the outcome.<br>8. 
&quot;context_evidence&quot;: **The Source of Truth.**<br>   - Include the full sentence(s).<br>   - If the motivation/cause is mentioned in a sentence *before* or *after* the project mention, INCLUDE IT HERE to make the memory self-contained.<br><br>CRITICAL CONSTRAINTS:<br>- Capture PASSIVE events (getting fired, rejected) just as carefully as ACTIVE projects.<br>- Output ONLY the JSON array.<br><br>Input Text:<br>...</pre><p>Embedding Template:</p><pre>Agent: {{agent}} | Project: {{project_or_event_name}} ({{type}}) | Status: {{status}} | Motivation: {{motivation_or_cause}} | Details: {{context_evidence}} | Outcome: {{outcome}}</pre><h4><strong>Node 4: “social_connection”</strong></h4><p><strong>User-Defined Extraction Prompt:</strong></p><pre>You are a Social Graph Specialist. Your task is to extract interpersonal relationships between speakers and third parties mentioned in the text.<br>TARGET: Focus strictly on People-to-People connections (Family, Friends, Colleagues, Rivals).<br>IGNORE: People-to-Location connections (e.g., &quot;John is in Paris&quot;).<br>Your output MUST be a valid JSON array. For each relationship found:<br>&quot;primary_entity&quot;: The subject of the relationship. Resolve pronouns to names (e.g. &quot;She&quot; -&gt; &quot;Mary&quot;).<br>&quot;related_entity&quot;: The other person involved.<br>&quot;relationship_type&quot;: The specific social role (e.g., &quot;Friend&quot;, &quot;Brother&quot;, &quot;Employer&quot;, &quot;Mentor&quot;, &quot;Nemesis&quot;). Avoid generic terms like &quot;knows&quot; if a specific role is clear.<br>&quot;relationship_details&quot;: Extract factual attributes defining the bond, such as duration (e.g., &quot;for 20 years&quot;), origin (e.g., &quot;childhood friends&quot;), or status (e.g., &quot;long-distance&quot;, &quot;estranged&quot;). 
If no specific detail is mentioned, use null.<br>&quot;interaction_event&quot;: Briefly describe the dynamic action or activity occurring in this specific text (e.g., &quot;arguing over dinner&quot;, &quot;planning a trip&quot;).<br>&quot;sentiment_tone&quot;: The emotional quality of their interaction/relationship in this text. Select strictly from: [&quot;Positive&quot;, &quot;Negative&quot;, &quot;Neutral&quot;, &quot;Conflictual&quot;, &quot;Supportive&quot;].<br>&quot;context_evidence&quot;: The VERBATIM text snippet supporting this extraction.<br>CRITICAL CONSTRAINTS:<br>Output ONLY valid JSON.<br>If no social relationships are mentioned, return [].<br>Do not extract relationships involving objects or places.<br>Distinguish between what they ARE doing (interaction_event) and facts about their bond (relationship_details).<br>Input Text:<br>...</pre><p>Embedding Template:</p><pre>{{primary_entity}} is {{relationship_type}} of {{related_entity}} [Details: {{relationship_details}}] | Sentiment: {{sentiment_tone}} | Interaction: {{interaction_event}} | Evidence: {{context_evidence}}</pre><h3><strong>Results comparison with other systems:</strong></h3><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a3a39397e303e1d4500296fb37a2e9c9/href">https://medium.com/media/a3a39397e303e1d4500296fb37a2e9c9/href</a></iframe><h4><strong>Analysis of Results</strong></h4><p><strong>MemoryModel outperforms the baseline Mem0 implementation by 7.7 percentage points (74.6% vs 66.9% accuracy) and OpenAI’s Memory by over 20%.</strong></p><p>The performance gap stems primarily from our architectural divergence in handling temporal reasoning. While systems like Mem0 rely on the LLM to calculate dates at query time (runtime calculation), MemoryModel adopts a <strong>“Shift-Left” approach</strong>: we resolve relative time expressions (e.g., “three days ago”) into ISO 8601 absolute dates during the ingestion phase. 
This deterministic pre-computation eliminates the hallucination risks associated with query-time date arithmetic in LLMs.</p><h3><strong>Methodology</strong></h3><p><strong>Dataset</strong></p><ul><li><strong>Name:</strong> LoCoMo (Long Conversational Memory)</li><li><strong>Source:</strong> <a href="https://github.com/snap-research/locomo">snap-research/locomo</a></li><li><strong>Size: </strong>50 long conversations (~300 turns, ~9,000 tokens each)</li><li><strong>Sessions: </strong>Up to 35 sessions per conversation</li><li><strong>Questions:</strong> 1,986 questions for evaluation</li></ul><p><strong>Evaluation Metrics:</strong></p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/570a9d0f4b270f2af3fb4a81677955f3/href">https://medium.com/media/570a9d0f4b270f2af3fb4a81677955f3/href</a></iframe><p><strong>Question Categories:</strong></p><ul><li><strong>Single-Hop:</strong> Questions answerable from a single conversational turn/session</li><li><strong>Multi-Hop:</strong> Questions requiring synthesis across multiple sessions</li><li><strong>Temporal:</strong> Questions involving time-based reasoning and chronological awareness</li><li><strong>Open-Domain:</strong> Questions requiring external knowledge integration</li></ul><h3><strong>Implementation Details</strong></h3><p><strong>Key Differences from Mem0:</strong></p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/916f75cb20a17e1e13f0586bc6535fdb/href">https://medium.com/media/916f75cb20a17e1e13f0586bc6535fdb/href</a></iframe><p><strong>Architectural Approach to Temporal Reasoning:</strong></p><p>A key differentiator between MemoryModel and Mem0 lies in <strong>how temporal information is handled</strong>.</p><p><strong>Mem0’s Approach: Runtime Calculation</strong></p><p>Mem0 stores memories with <strong>relative time expressions </strong>intact (e.g., “last year”, “two months ago”). 
During answer generation, their benchmark prompt must perform complex temporal reasoning:</p><pre># INSTRUCTIONS (from Mem0 benchmark prompt):<br>5. If there is a question about time references (like &quot;last year&quot;, &quot;two months ago&quot;, <br>   etc.), calculate the actual date based on the memory timestamp.<br>6. Always convert relative time references to specific dates, months, or years.<br>   For example, convert &quot;last year&quot; to &quot;2022&quot; or &quot;two months ago&quot; to &quot;March 2023&quot; <br>   based on the memory timestamp.</pre><p>This approach requires:</p><ul><li>A <strong>400-word prompt</strong> with step-by-step reasoning instructions</li><li>The LLM to <strong>calculate dates at query time</strong> from relative expressions</li><li>Explicit handling of multi-speaker contexts and contradictory timestamps</li></ul><p><strong>MemoryModel’s Approach: Pre-Computed Temporal Indexing</strong></p><p><strong>MemoryModel</strong> resolves temporal references <strong>at ingestion time</strong>, not at query time:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RceIlA7Ww5_I77OJHonh5A.png" /><figcaption><strong>MemoryModel’s Approach: Pre-Computed Temporal Indexing</strong></figcaption></figure><h4><strong>Benefits of this architecture:</strong></h4><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/87df3b0b8cbd83ef6907210f5330e6b5/href">https://medium.com/media/87df3b0b8cbd83ef6907210f5330e6b5/href</a></iframe><p>This explains why our <strong>simpler answer generation prompt</strong> achieves <strong>higher accuracy</strong> (74.6% vs 66.9%):</p><p>The heavy lifting of temporal reasoning is done <strong>once during ingestion</strong> by the specialized `temporal_event` node, using NLP date parsing. 
The retrieval system then uses <strong>direct temporal range filters</strong> on pre-computed ISO dates, eliminating the need for runtime LLM calculations.</p><p>This approach embodies the “<strong>Shift-Left” principle</strong>: moving reasoning complexity from query-time (slow, expensive, non-deterministic) to ingestion-time (one-off, deterministic). Unlike rigid memory systems, MemoryModel allows developers to <strong>define extraction logic per-node through the console</strong>, enabling domain-specific optimizations without code deployment.</p><h4><strong>Memory Ingestion Pipeline:</strong></h4><p>The ingestion system processes content through a multi-node extraction architecture:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*HdsUw-L16nr2SXilA1fWHw.png" /><figcaption>MemoryModel LoCoMo Ingestion Pipeline</figcaption></figure><p><strong>Key Components:</strong></p><ul><li><strong>Extraction Engine</strong>: Dynamically loads user-defined schemas from the MemoryModel Console and runs them in parallel. 
For this benchmark, we configured 4 semantic definitions targeting biography extraction.</li><li><strong>Multi-Node Processing</strong>: Each node extracts typed structured memories using its user-defined prompt</li><li><strong>Rate Limiting</strong>: Built-in retry with exponential backoff for API resilience</li><li><strong>Multi-modal Support</strong>: Separate processing pipeline for visual memories with reference matching</li></ul><h3><strong>Retrieval Strategies:</strong></h3><p>The retrieval uses a <strong>hybrid multi-strategy orchestrator</strong>:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*T-ma2nJKw3xpY4drQ23hIQ.png" /><figcaption>MemoryModel Retrieval Strategies</figcaption></figure><h4><strong>Search Strategies:</strong></h4><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/bb34bf8ed712b17334171bd1cdabe1da/href">https://medium.com/media/bb34bf8ed712b17334171bd1cdabe1da/href</a></iframe><p><strong>Relevance Router</strong>: LLM-based semantic scoring to dynamically decide which memory nodes are most relevant to each query.</p><p><strong>Answer Generation Prompt</strong></p><p>The evaluation uses `gemini-2.5-flash` with temperature `0.0` for deterministic answers:</p><pre>You are a helper assistant answering questions based on a set of retrieved memory fragments.<br><br>Context:<br>${contextText}<br><br>Question: ${question}<br><br>Instructions:<br>1. Answer the question using ONLY the provided context.<br>2. **Inference Allowed:** You may perform reasonable logical inferences if strongly supported by the text.<br>3. **Safety:** If the answer is completely missing or cannot be reasonably inferred, strictly say &quot;I don&#39;t know&quot;.<br>4. 
**Style:** Be concise and direct.</pre><p><strong>LLM-as-Judge Evaluation</strong></p><p>We use a semantic judge following Mem0’s evaluation methodology:</p><pre>Role: You are an impartial semantic judge evaluating a Question Answering system.<br><br>Context:<br>- Question: &quot;${question}&quot;<br>- Ground Truth: &quot;${truthStr}&quot;<br>- Predicted Answer: &quot;${predStr}&quot;<br><br>Task: Determine if the Predicted Answer conveys the SAME meaning as the Ground Truth.<br><br>Evaluation Rules (Be Flexible):<br>1. **Dates:** Treat &quot;2023-05-07&quot;, &quot;May 7th, 2023&quot;, &quot;7/5/23&quot; as EQUIVALENT.<br>2. **Synonyms:** &quot;Happy&quot; == &quot;Joyful&quot;, &quot;Scared&quot; == &quot;Afraid&quot;.<br>3. **Verbosity:** If the Prediction is long but contains the correct answer, it is CORRECT.<br>4. **Lists:** If the Truth is a list, the Prediction must contain the key items.<br>5. **Negation:** Watch out for &quot;NOT&quot;. &quot;He went&quot; != &quot;He did not go&quot;.<br><br>Output: Respond ONLY with &quot;YES&quot; if correct, or &quot;NO&quot; if incorrect.</pre><p><strong>Fast-Pass Optimization:</strong></p><ul><li>String inclusion check before LLM judge (normalized, punctuation-stripped)</li><li>“I don’t know” trap detection to catch abstention failures</li></ul><h4><strong>Reproducibility</strong></h4><p>The benchmark scripts are open-source and available in the <a href="https://github.com/MatteoTuziMM/memory-model-benchmark">`benchmark/` folder</a>.</p><p><strong>Requirements</strong></p><ul><li>Node.js 18+</li><li>Your own MemoryModel API key</li><li>Your own Gemini API key (for evaluation)</li></ul><p><strong>Running the Benchmark</strong></p><pre># Set environment variables<br>export MEMORY_API_KEY=your_memorymodel_api_key<br>export GEMINI_API_KEY=your_gemini_api_key<br><br># Ingest the LoCoMo dataset<br>npx ts-node benchmark/benchmark_ingest.ts<br><br># Run evaluation<br>npx ts-node 
benchmark/benchmark_eval.ts</pre><h4><strong>References:</strong></h4><ul><li><strong>LoCoMo Dataset:</strong> <a href="https://github.com/snap-research/locomo">Navigating Long-Context Long-Form Conversations</a></li><li><strong>Mem0 Paper:</strong> <a href="https://arxiv.org/abs/2504.19413">Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory</a></li><li><strong>Mem0 Documentation:</strong> <a href="https://docs.mem0.ai">docs.mem0.ai</a></li></ul><h4><strong>Citation:</strong></h4><p>If you use MemoryModel in your research, please cite:</p><pre>@misc{memorymodel2025,<br>  title={MemoryModel: The first autonomous memory architecture},<br>  author = {Tuzi, Matteo},<br>  year={2025}<br>}</pre>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Inside the Architecture of a Self-Optimizing AI Memory System]]></title>
            <link>https://medium.com/@matteo_49605/inside-the-architecture-of-a-self-optimizing-ai-memory-system-0339bdfe1bb2?source=rss-a944ed328513------2</link>
            <guid isPermaLink="false">https://medium.com/p/0339bdfe1bb2</guid>
            <category><![CDATA[software-architecture]]></category>
            <category><![CDATA[system-design-concepts]]></category>
            <category><![CDATA[startup]]></category>
            <category><![CDATA[llm]]></category>
            <category><![CDATA[generative-ai-use-cases]]></category>
            <dc:creator><![CDATA[Matteo Tuzi]]></dc:creator>
            <pubDate>Thu, 04 Dec 2025 16:26:53 GMT</pubDate>
            <atom:updated>2025-12-05T09:48:32.173Z</atom:updated>
            <content:encoded><![CDATA[<h3>Introduction</h3><p>Over the past 18 months, I’ve dedicated myself to solving what I call the <strong>“AI Memory Rebus”</strong> — the fundamental challenge of making artificial intelligence truly remember and understand context the way humans do.</p><p>What started as frustration with AI forgetting conversations mid-dialogue evolved into a deep technical exploration: <em>Why do AI systems, despite their sophistication, fail so catastrophically at context retention?</em> The answer wasn’t in model architecture or parameter count — it was in how we <strong>store, retrieve, and synthesize</strong> memory itself.</p><p>Traditional approaches treat memory as a static database problem. Vector search finds similar text. Key-value stores retrieve facts. But human memory is <strong>adaptive, contextual, and self-organizing</strong>. We don’t search our memories with SQL queries — we fluidly shift between precise recall (“What was that invoice number?”) and exploratory thinking (“What movies do I like?”).</p><p><strong>Memory Model</strong> is the architecture that emerged from solving this rebus — a production system that brings adaptive intelligence to AI memory management. This document presents the high-level design of that system.</p><p>— Matteo Tuzi, Founder &amp; CTO</p><h3>Platform Overview</h3><p><strong>Memory Model</strong> is an enterprise-grade AI memory platform that solves the fundamental problem of context retention in LLM applications. 
Unlike simple vector databases, we provide <strong>adaptive intelligence</strong> that learns user behavior patterns and self-optimizes retrieval quality over time — all through a zero-code visual interface.</p><p><strong>Core Innovation:</strong></p><ul><li><strong>Adaptive Retrieval</strong>: System automatically adjusts search strategy based on query intent and user memory patterns</li><li><strong>Self-Optimization</strong>: Machine learning-powered parameter tuning eliminates manual configuration</li><li><strong>Meta-Cognitive Insights</strong>: Detects behavioral patterns humans would recognize but traditional search misses</li><li><strong>Zero-Code Platform</strong>: Complete system configuration through visual console with AI-powered automation</li></ul><h3>1. System Architecture — High Level</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*iadQXCZJLxSypp_omPDr_A.png" /><figcaption>System Architecture Overview</figcaption></figure><p><strong>Key Architectural Principles:</strong></p><ul><li><strong>Separation of Concerns</strong>: Clear boundaries between ingestion, retrieval, and optimization</li><li><strong>Distributed Processing</strong>: Async queues handle high-volume ingestion without blocking</li><li><strong>Dual-Write Pattern</strong>: Ensures consistency between metadata and vector stores</li><li><strong>Horizontal Scalability</strong>: Stateless API layer enables automatic scaling</li></ul><h3>2. Memory Ingestion Flow</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PcRv9SKKS1fOEMs_p1nQIQ.png" /><figcaption>Memory Ingestion Flow</figcaption></figure><p><strong>Innovation: Multi-Stage Semantic Enrichment</strong></p><p>Traditional systems embed raw text directly. 
Memory Model applies <strong>bidirectional semantic expansion</strong>:</p><ul><li><strong>For Memories</strong>: Enriches content with implicit semantics (e.g., “Blade Runner” → genre, themes, related concepts)</li><li><strong>For Queries</strong>: Expands with synonyms and related terms (e.g., “AI” → related terminology)</li><li><strong>Adaptive Mode Detection</strong>: System automatically determines which enrichment strategy to apply</li></ul><p><strong>Technical Approach:</strong></p><ul><li>LLM-powered context injection before embedding</li><li>Heuristic-based query vs memory detection</li><li>Maintains semantic coherence while improving recall</li></ul><p><strong>Business Impact:</strong></p><ul><li>40–60% improvement in retrieval relevance</li><li>Eliminates “terminology mismatch” problem (user says “AI”, system finds “machine learning”)</li></ul><h3>3. Adaptive Retrieval System</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ZDACzmXdc50MjQOPs9T5kg.png" /><figcaption>Adaptive Retrieval System</figcaption></figure><p><strong>Core Innovation: Centroid-Aware Adaptive Search</strong></p><p><strong>Problem:</strong> Static vector search fails to adapt to user-specific memory distributions. 
A query about “favorite movies” should search differently than “invoice #12345”.</p><p><strong>Our Solution:</strong></p><ul><li>System maintains a <strong>semantic centroid</strong> (center of mass) for each memory type</li><li>Compares query similarity to centroid vs similarity to top results</li><li><strong>Automatically decides</strong> between:<ul><li><strong>META Mode</strong>: Broad, exploratory search (returns 10+ results)</li><li><strong>SPECIFIC Mode</strong>: Precision-focused search (returns 2–3 exact matches)</li></ul></li></ul><p><strong>Mathematical Foundation:</strong></p><ul><li>Based on control theory decision functions</li><li>Configurable similarity thresholds with safety margins</li><li>Validated stable convergence properties</li></ul><p><strong>Why This Matters:</strong></p><ul><li>No manual configuration of “k” parameter per query</li><li>Naturally handles diverse query types (factual vs exploratory)</li><li>Improves relevance by 30–50% vs baseline semantic search</li></ul><h3>4. Self-Optimizing Intelligence</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gqJhx1FSbW2Meot8XrNuAA.png" /></figure><h3>Innovation 1: The Architect (Auto-Tuning System)</h3><p><strong>Problem:</strong> Static thresholds degrade as user behavior evolves. 
Manual tuning requires ML expertise.</p><p><strong>Our Approach:</strong></p><ul><li><strong>Telemetry-Driven</strong>: Logs every retrieval decision and outcome</li><li><strong>LLM Strategy Analysis</strong>: Detects patterns in what works/doesn’t work</li><li><strong>Controlled Application</strong>: Uses dampening functions from control theory to prevent oscillation</li><li><strong>Bounded Optimization</strong>: Hard constraints prevent system from suggesting invalid parameters</li></ul><p><strong>Technical Characteristics:</strong></p><ul><li>Convergence time: 7–10 days to optimal configuration</li><li>Stability: Provably stable (Lyapunov analysis)</li><li>Zero manual intervention required</li></ul><p><strong>Business Value:</strong></p><ul><li>Eliminates need for dedicated ML engineer to tune system</li><li>Continuous improvement as usage patterns evolve</li><li>15–20% accuracy improvements observed over 2 weeks</li></ul><h3>Innovation 2: The Dreamer (Meta-Cognitive Insights)</h3><p><strong>Problem:</strong> AI systems accumulate facts (“User bought milk”, “User searched keto recipes”) but fail to synthesize higher-order patterns humans recognize (“User transitioning to ketogenic diet”).</p><p><strong>Our Approach:</strong></p><ul><li><strong>Temporal Aggregation</strong>: Analyzes recent memories for cross-domain patterns</li><li><strong>LLM Pattern Detection</strong>: Identifies behavioral shifts, emerging interests, value changes</li><li><strong>Confidence Filtering</strong>: Only stores high-confidence insights (threshold: 0.70+)</li><li><strong>First-Class Storage</strong>: Insights become searchable memories themselves</li></ul><p><strong>Pattern Recognition:</strong></p><ul><li>Cross-domain synthesis (Shopping + Health → Lifestyle change)</li><li>Temporal trends (increasing/decreasing interest)</li><li>Entity-based linking (common themes across activities)</li></ul><p><strong>Business Applications:</strong></p><ul><li>Customer Support: Early churn risk 
detection</li><li>E-Commerce: Upsell opportunity identification</li><li>Health: Lifestyle change tracking</li></ul><h3>5. Security &amp; Compliance</h3><p><strong>Data Isolation:</strong></p><ul><li>Cryptographic tenant separation</li><li>Zero cross-user data access</li><li>Project-level access controls</li></ul><p><strong>Privacy Controls:</strong></p><ul><li>User-controlled meta-insight generation (can be disabled)</li><li>Data portability (full export capability)</li><li>Right to deletion (GDPR compliant)</li></ul><p><strong>Infrastructure Security:</strong></p><ul><li>Encrypted at rest and in transit</li><li>Regular security audits</li><li>SOC 2 Type II (in progress)</li></ul><h3>6. Platform Philosophy</h3><p><strong>Configuration Over Code</strong></p><ul><li>No programming required for 90% of use cases</li><li>AI wizard generates optimal configurations</li><li>Expert mode available for advanced users</li></ul><p><strong>Adaptation Over Static Rules</strong></p><ul><li>System learns and improves continuously</li><li>Eliminates manual re-tuning</li><li>Responds to evolving user behavior</li></ul><p><strong>Transparency Over Black Boxes</strong></p><ul><li>Every decision includes reasoning</li><li>Audit trails for compliance</li><li>Console shows optimization rationale</li></ul><h3>Conclusion</h3><p><strong>MemoryModel</strong> represents a fundamental evolution in AI memory systems:</p><ol><li><strong>From Manual to Automatic</strong>: Self-optimization eliminates the ML expertise requirement</li><li><strong>From Static to Adaptive</strong>: Centroid-aware search adapts to query intent automatically</li><li><strong>From Facts to Insights</strong>: Meta-cognitive layer synthesizes behavioral patterns</li><li><strong>From Code to Configuration</strong>: Zero-code platform enables rapid deployment</li></ol><p><strong>Perfect for:</strong></p><ul><li>Enterprise applications requiring production-grade memory</li><li>Teams without dedicated ML engineers</li><li>Use
cases where context quality directly impacts business metrics</li><li>Organizations requiring compliance and auditability</li></ul><p><strong>Technical Moat:</strong></p><ul><li>Proprietary adaptive search algorithms</li><li>Control theory-based optimization system</li><li>Multi-year head start on self-optimizing architecture</li></ul><h3>Closing Thoughts</h3><p>The “AI Memory Rebus” I set out to solve 18 months ago wasn’t just a technical puzzle; it was a question about what makes memory <strong>intelligent</strong> rather than just <strong>persistent</strong>. The answer, it turns out, required rethinking the entire stack: from how we capture context, to how we adapt retrieval strategies, to how we synthesize meta-cognitive insights.</p><p><strong>MemoryModel</strong> is the architecture that emerged from solving that rebus. But like human memory itself, it continues to evolve: learning from each deployment, adapting to new use cases, optimizing itself in production.</p><p>If you’re building AI systems that need to truly <em>remember</em>, I’d love to hear what problems you’re solving.</p><p>— Matteo</p>]]></content:encoded>
        </item>
    </channel>
</rss>