<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Junchen Pan on Medium]]></title>
        <description><![CDATA[Stories by Junchen Pan on Medium]]></description>
        <link>https://medium.com/@junchenp1018?source=rss-327084052e19------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/2*I7YlSsFa7HP8qOWkApdW4w.jpeg</url>
            <title>Stories by Junchen Pan on Medium</title>
            <link>https://medium.com/@junchenp1018?source=rss-327084052e19------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sat, 30 May 2026 06:34:06 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@junchenp1018/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[High Frequency of PyTest Unit Test Errors]]></title>
            <link>https://medium.com/@junchenp1018/high-frequency-of-pytest-unit-test-errors-b95dcf80e09b?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/b95dcf80e09b</guid>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Wed, 07 May 2025 08:23:45 GMT</pubDate>
            <atom:updated>2025-05-07T08:23:45.729Z</atom:updated>
            <content:encoded><![CDATA[<p># Summary of Common Unit Test Errors, Root Causes, and Golden Patterns</p><p>Based on our interactions and the provided guidelines, here’s a breakdown of frequent unit test issues:</p><p>## 1. Incorrect Mocking Strategy (High Frequency)</p><p>This is the most common category of errors encountered.</p><p>* **Error Type:** Patching the wrong target.<br> * **Root Cause:** `unittest.mock.patch` needs to target the name of the object *as it is looked up in the module under test*, not necessarily where it’s defined. If module `A` does `from B import C`, and you are testing module `A`, you need to patch `A.C` (or `module_A_path.C`), not `B.C`.<br> * **Golden Pattern:**<br> * Always identify where the object (class, function, method) is being imported and used by the code you are testing.<br> * Use `patch('module_under_test.object_to_mock', ...)` or `patch.object(ClassUsedInModuleUnderTest, 'method_to_mock', ...)`.<br> * Refer to `unit_test_guide.mdc` section “Patching Imported Objects/Functions” for examples.</p><p>* **Error Type:** Not using `AsyncMock` for `async` methods/functions.<br> * **Root Cause:** Using `MagicMock` or a regular function for an `async def` callable. When the code under test `await`s the result of calling the mock, it raises a `TypeError` because `MagicMock` instances are not awaitable by default.<br> * **Golden Pattern:**<br> * Always use `unittest.mock.AsyncMock` when mocking asynchronous functions or methods.<br> * Example: `with patch.object(MyClass, 'async_method', new_callable=AsyncMock) as mock_async_method: ...`<br> * Ensure your test function itself is marked with `@pytest.mark.asyncio` (provided by the `pytest-asyncio` plugin) if it directly calls or awaits async code.</p><p>* **Error Type:** Incorrect `return_value` or `side_effect` for mocks.<br> * **Root Cause:** The mock doesn’t return the expected type or structure, or a `side_effect` function is overly complex or has incorrect logic. 
This can also happen when a lambda is used for a side effect and doesn’t handle arguments correctly (e.g., missing `self` if it’s meant to mimic a method).<br> * **Golden Pattern:**<br> * Keep mock return values simple. For `AsyncMock`, use `AsyncMock(return_value=...)`.<br> * If a `side_effect` is needed, ensure it’s a simple function that accurately reflects the behavior needed for the test. Avoid complex logic within `side_effect` functions.<br> * For methods that return `self` (fluent interfaces), use `mock_obj.method_name.return_value = mock_obj` so that each chained call returns the same mock.</p><p>* **Error Type:** Overly complex mock setups.<br> * **Root Cause:** Trying to mock too many implementation details or creating intricate `side_effect` functions when simpler approaches would suffice.<br> * **Golden Pattern:**<br> * Focus on mocking the *interface* your code interacts with, not the internal workings of the dependency.<br> * Prefer `return_value` over `side_effect` when possible.<br> * Use fixtures (`@pytest.fixture`) to create reusable mock setups for common dependencies.</p><p>## 2. Assertion Mismatches (Medium Frequency)</p><p>These errors occur when the test’s expectations don’t align with the actual behavior of the code.</p><p>* **Error Type:** Outdated assertions or incorrect expected values.<br> * **Root Cause:** The code under test has changed (e.g., function signature, return value structure, logic), but the test assertions haven’t been updated to reflect these changes. 
A specific instance was asserting a detailed query string when the code was updated to use a generic `"*"` (as seen in `test_get_chart_by_id_behavior` from `unit_test_guide.mdc`).<br> * **Golden Pattern:**<br> * When refactoring code, always review and update corresponding unit tests.<br> * Make assertions precise but flexible enough to accommodate minor, non-functional changes if appropriate.<br> * Use descriptive variable names for expected values to improve readability.<br> * When asserting mock calls, use `mock_object.assert_called_once_with(...)` with the exact arguments and keyword arguments expected. `mock_object.call_args` can be used to inspect actual call arguments for debugging.</p><p>* **Error Type:** Asserting non-existent keys or attributes.<br> * **Root Cause:** The test tries to access a key in a dictionary (e.g., `kwargs['permissions']`) or an attribute of an object that isn’t present in the actual result or call arguments.<br> * **Golden Pattern:**<br> * Before asserting the value of a key/attribute, ensure it’s expected to be present in that specific test scenario.<br> * Use `assert 'key' in dictionary` or `hasattr(obj, 'attribute')` before asserting the value if its presence is conditional.<br> * Carefully check the actual arguments passed to mocks (e.g., using `mock_name.call_args`) to verify what is actually being sent.</p><p>## 3. Scope and Context Issues (Medium Frequency)</p><p>These issues often relate to how the code interacts with its environment or input parameters.</p><p>* **Error Type:** Misunderstanding of parameter handling or conditional logic.<br> * **Root Cause:** The test setup doesn’t correctly reflect the conditions under which certain parameters are passed or certain code paths are taken. 
For example, there was initial confusion about whether `permissions` should always be passed to `search_context` or only conditionally.<br> * **Golden Pattern:**<br> * Thoroughly understand the function/method signature and its conditional logic.<br> * Create separate tests for different code paths based on input parameters.<br> * In our “Fixing Unit Test in Zsh” interaction, we saw an issue where `PromptPlugin.build_augmented_prompt` expected a dictionary key named `"question"` but the test was providing `"query"`. This was a direct mismatch in expected input structure. The fix was to align the test’s input with the method’s expectation.</p><p>* **Error Type:** Input data structure mismatch.<br> * **Root Cause:** The test provides input data (e.g., a dictionary for arguments) that doesn’t match the structure expected by the function or method under test. This was the core issue in the `test_prompt_plugin.py` scenario where the key `query` was used instead of `question`.<br> * **Golden Pattern:**<br> * Double-check the expected input structure of the function/method being tested, especially if it accepts complex types like dictionaries or Pydantic models.<br> * Ensure the keys, types, and nesting of test data match the requirements.<br> * When dealing with Pydantic models or typed dictionaries, ensure your test data conforms to the schema.</p><p>## 4. 
Environment/Import Errors (Less Frequent but Critical)</p><p>* **Error Type:** `ImportError` or issues related to test environment setup.<br> * **Root Cause:** The test environment isn’t configured correctly, the Python path is wrong, or dependencies are missing.<br> * **Golden Pattern:**<br> * Ensure your test runner (e.g., `pytest`) is invoked from the correct directory (usually the project root) so that source code modules can be found.<br> * Use virtual environments to manage dependencies consistently.<br> * Ensure all necessary packages are installed in the test environment.</p><p>## General Golden Patterns for Robust Unit Tests:</p><p>* **AAA Pattern (Arrange, Act, Assert):** Structure your tests clearly.<br> * **Arrange:** Set up all preconditions, including mock configurations and input data.<br> * **Act:** Execute the single unit of code being tested.<br> * **Assert:** Verify the outcome (return values, mock calls, state changes).<br>* **Test One Thing at a Time:** Each test should ideally verify a single behavior or code path. This makes failures easier to diagnose.<br>* **Descriptive Test Names:** Name your tests clearly to indicate what they are testing (e.g., `test_method_returns_true_when_input_is_valid`).<br>* **Use Fixtures for Setup:** `@pytest.fixture` is excellent for creating reusable setup code (e.g., mock instances, test data).<br>* **Keep Tests Independent:** Tests should not depend on the order of execution or the state left by other tests.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[If nums versus If nums is not None in Python]]></title>
            <link>https://medium.com/@junchenp1018/if-nums-or-if-nums-is-not-none-in-python-e21a41b93664?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/e21a41b93664</guid>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Fri, 28 Mar 2025 08:14:52 GMT</pubDate>
            <atom:updated>2025-03-28T08:21:31.929Z</atom:updated>
            <content:encoded><![CDATA[<p># The Philosophy of Explicit Programming: A Case Study in Python Array Handling</p><p>When writing code, clarity and intent matter just as much as functionality. A subtle but common error in Python, particularly when handling arrays like those from NumPy, offers a perfect lens through which to explore this principle. Let’s dive into a real-world example, dissect the problem, and uncover why explicitness in programming is a virtue worth embracing.</p><h3>Root Cause</h3><p>The issue surfaces in a file called `azure_ai_search_memory.py` at line 165, where the following code resides:</p><p>```python<br>if embeddings and len(embeddings) &gt; 0:<br>```</p><p>At first glance, this looks like a reasonable check: ensure `embeddings` exists and has elements. But when `embeddings` is a multi-element NumPy array, this line raises an error. Why?</p><h3>Why This Error Happens</h3><p>The error stems from a fundamental mismatch between the code’s assumptions and how NumPy arrays behave. Let’s break it down:</p><p>1. Array Boolean Context<br>- The variable `embeddings` is likely a NumPy array, and the code attempts to evaluate it directly in a boolean context (`if embeddings`).<br>- There is no single obvious way to interpret a multi-element array as `True` or `False`. Should it mean “all elements are non-zero”? “At least one element exists”? Something else entirely?<br>- Because of this ambiguity, NumPy raises an exception rather than guess the programmer’s intent.</p><p>2. Python’s Boolean Evaluation<br>- When an object is used in a boolean context (e.g., `if some_object`), Python calls its `__bool__` method (or falls back to `__len__` for containers) to determine truthiness.<br>- NumPy arrays, however, deliberately refuse this conversion when they hold more than one element. This is a safety feature: it prevents accidental, unclear operations on multi-element data structures.<br>- The result? A `ValueError` (“The truth value of an array with more than one element is ambiguous”), forcing the developer to clarify their logic.</p><p>3. 
Common with NumPy Arrays<br>- This isn’t a rare edge case — it’s a frequent stumbling block when working with NumPy.<br>- Unlike Python lists, which evaluate as `False` when empty and `True` when non-empty, NumPy arrays demand explicit handling.<br>- For example, you might use `embeddings.any()` (are any elements true?) or `embeddings.all()` (are all elements true?), but a bare `if embeddings` won’t cut it.</p><h3>Solution</h3><p>To fix this, we need to rewrite the condition to be explicit about what we’re checking. Here’s the transformation:</p><p>```python<br># Before (problematic)<br>if embeddings and len(embeddings) &gt; 0:</p><p># After (correct)<br>if embeddings is not None and len(embeddings) &gt; 0:<br>```</p><p><strong>Why This Works</strong><br>- `embeddings is not None` explicitly checks that the variable is not `None`, avoiding any attempt to evaluate the array as a boolean.<br>- `len(embeddings) &gt; 0` then safely checks if the array has elements, assuming it’s a valid object with a length (for a NumPy array, `len` returns the size of the first dimension).<br>- Together, these conditions make the intent crystal clear: “Ensure `embeddings` exists and isn’t empty.”</p><h3>Takeaways</h3><p>This small error reveals a broader truth about programming: **explicitness often leads to better code**. Here are the key lessons:</p><p>- **Avoid Ambiguity**: Don’t rely on implicit boolean conversions for complex objects like arrays. If you mean “not None,” say `is not None`. If you mean “non-empty,” check the length explicitly.<br>- **Embrace Python’s Guardrails**: Features like NumPy’s refusal to auto-convert to boolean aren’t bugs — they’re protections. They push us to think harder about what we want.<br>- **Clarity Over Convenience**: Writing `if nums` might feel concise, but `if nums is not None` leaves no room for misinterpretation.</p><p>In the end, this isn’t just about fixing a bug — it’s about adopting a mindset. 
By being forced to articulate our intentions, we write code that’s not only functional but also self-documenting and robust. And that’s a philosophy worth coding by.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[You only need to know these for Clickhouse]]></title>
            <link>https://medium.com/@junchenp1018/you-only-need-to-know-these-for-clickhouse-b2d941d34540?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/b2d941d34540</guid>
            <category><![CDATA[clickhouse]]></category>
            <category><![CDATA[big-data]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Mon, 13 Jan 2025 03:27:51 GMT</pubDate>
            <atom:updated>2025-01-13T03:41:00.307Z</atom:updated>
            <content:encoded><![CDATA[<h3>ClickHouse Overview</h3><p>ClickHouse is a high-performance column-oriented database management system (DBMS) developed to handle real-time analytics with low latency. Its design excels in scenarios involving large volumes of read- and write-intensive operations.</p><h3>ClickHouse Architecture Overview</h3><h4>Massively Parallel Processing (MPP)</h4><ul><li><strong>Architecture Model:</strong> ClickHouse uses an MPP (Massively Parallel Processing) architecture with a <strong>shared-nothing</strong> design.</li><li><strong>Decentralized Resource Usage:</strong> Each ClickHouse node operates independently, utilizing its own local resources (memory and storage).</li><li><strong>Parallel Processing:</strong> Nodes collaborate by processing tasks in parallel, without sharing state directly, which maximizes performance and fault tolerance.</li></ul><p>Note:</p><ul><li>Imbalances in how data is distributed across shards can lead to uneven load across nodes, impacting query performance.</li></ul><h3>Columnar Storage in ClickHouse</h3><ul><li>Data is stored and processed by columns instead of rows.</li><li>Operations occur on arrays or chunks of columnar data, enabling <strong>vectorized execution</strong>.</li><li>Vectorized execution minimizes the cost of data processing by working on batches (vectors), improving efficiency for analytical queries.</li></ul><h3>Shard, Replica, and Cluster in ClickHouse</h3><ol><li><strong>Shard:</strong></li></ol><ul><li>A shard is a horizontal partition of a table’s data.</li><li>Data is distributed across multiple shards, each located on different nodes.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/497/1*so3xrs0RZUeoj2mF95sTCA.png" /></figure><p><strong>2. 
Replica:</strong></p><ul><li>A replica is a copy of a shard.</li><li>Replicas ensure high availability and fault tolerance by allowing operations to continue even if one replica fails.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/863/1*QMN-SDgeB0TsgotYGWmUwA.png" /></figure><p><strong>3. Cluster:</strong></p><ul><li>A cluster in ClickHouse is a group of nodes organized by the MPP model.</li><li>Each cluster consists of multiple shards, with each shard having one or more replicas.</li></ul><p><strong>4. KV Store:</strong></p><p>ZooKeeper is used to store the shard and replica metadata.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/898/1*lXzUroPIeT7dOEMvaV6l3w.png" /></figure><h3>Application</h3><p>ClickHouse plays a critical role in the application backend as a data warehouse with three types of clusters:</p><ol><li><strong>Query Cluster:</strong></li></ol><ul><li>Stores six months of behavior data, approximately 30 billion records per day, distributed across 100+ nodes.</li><li>Primarily supports query-heavy workloads.</li></ul><p><strong>2. Write Cluster:</strong></p><ul><li>Temporarily stores the latest day’s data (&lt;1TB) and updates the query cluster every 30 minutes.</li><li>Reduces the query cluster’s load for better performance.</li></ul><p><strong>3. Data Durability and Recovery:</strong></p><ul><li>Uses persistent volumes (PV) to ensure data safety.</li><li>If a node or PV fails, a new node and volume are created and populated with replicated data from another node.</li></ul><h3>High Availability (HA) in ClickHouse</h3><ol><li><strong>Current Setup:</strong></li></ol><ul><li>Single query cluster with replicas for intra-cluster disaster recovery.</li></ul><p><strong>2. 
Future Plan:</strong></p><ul><li>Build duplicate query clusters across different regions and cosmic clusters to ensure high availability and cross-region disaster recovery.</li><li>A multi-cluster setup mitigates risks associated with individual cluster failures.</li></ul><p>ClickHouse’s distributed architecture, combined with its columnar data processing, makes it a powerhouse for real-time and analytics-intensive applications.</p><h3>FAQ</h3><ul><li>Q: What are the differences between sharding, partitioning, and indexing?</li><li>A: <strong>Sharding</strong>: Distributes data across <strong>multiple nodes</strong> (horizontal scaling). <strong>Partitioning</strong>: Splits data <strong>within a single node</strong> (logical grouping). <strong>Indexing</strong>: Speeds up <strong>data retrieval</strong> by creating pointers to specific rows or values.</li></ul>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Building Modern AI Applications with LLMs: Challenges and Solutions Part1]]></title>
            <link>https://medium.com/@junchenp1018/building-modern-ai-applications-with-llms-challenges-and-solutions-part1-12f158c0af17?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/12f158c0af17</guid>
            <category><![CDATA[application]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[llm]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Sun, 01 Dec 2024 03:21:40 GMT</pubDate>
            <atom:updated>2024-12-01T03:21:40.975Z</atom:updated>
            <content:encoded><![CDATA[<p>In this blog, I’ll share my experience building modern AI applications using Large Language Models (LLMs). We’ll explore the challenges and solutions encountered while deploying a real production service.</p><h3>Background</h3><p>Our project began with two primary goals:</p><ol><li><strong>SQL Copilot:</strong> Automating SQL syntax refactoring across database languages and providing explanations for database functions using AI.</li><li><strong>Data Copilot:</strong> Translating natural language into SQL queries (NL2SQL), a problem that has been researched for over two decades but remains challenging in practical applications.</li></ol><p>The project stems from our existing <strong>data platform</strong>, where many product managers and designers lack familiarity with various database query languages. It’s challenging for these non-technical stakeholders to develop data-savvy skills quickly. Enabling them to interact naturally with the platform can bridge this gap and empower decision-making without requiring advanced technical expertise.</p><h3>Challenges</h3><h4>SQL Copilot</h4><p>This project focused on setting up a robust infrastructure for AI services. 
While conceptually straightforward, building a scalable and reliable Retrieval-Augmented Generation (RAG) system required addressing:</p><ul><li><strong>Embedding and Search Integration:</strong> Selecting a high-quality embedding model (Azure OpenAI) and a search system that supports vector search (Azure Cognitive Search).</li><li><strong>Document Chunking:</strong> Optimizing the division of database documentation for efficient search and retrieval.</li><li><strong>Prompt Design:</strong> Ensuring concise and relevant prompts for consistent and accurate outputs.</li></ul><h4>Data Copilot</h4><p>The challenges here were more nuanced and multifaceted:</p><ul><li><strong>Ambiguity in Natural Language:</strong> Words and phrases have multiple meanings that vary with context, making interpretation non-trivial.</li><li><strong>Database Complexity:</strong></li><li><strong>Abbreviations and Inconsistencies:</strong> Column names and values often lack standardized formats, making semantic interpretation difficult.</li><li><strong>Redundancy:</strong> Redundant data inflates token costs and increases hallucinations in generated SQL.</li><li><strong>Performance Issues:</strong> Balancing retrieval recall and computational efficiency was essential for production-grade performance.</li></ul><h3>Phases</h3><p>The evolving speed of LLMs and their surrounding ecosystem is astonishing. The quality of responses continues to have significant room for improvement, making it impractical to define a fixed methodology or final goal upfront. To tackle this, we broke the project into multiple phases, allowing us to adapt to the rapid changes in methodology and technology while ensuring steady progress.</p><h4>Phase 1 — SQL Copilot</h4><p>This phase focused on delivering SQL Copilot, leveraging its relative simplicity to establish the foundational AI infrastructure. RAG (Retrieval-Augmented Generation) was implemented as the backbone, enabling efficient query processing and retrieval. 
This phase ensured the infrastructure could support modern AI applications effectively.</p><h4>Phase 2 — Data Copilot</h4><p>Building on the established infrastructure, Phase 2 tackled the more complex challenge of Data Copilot. Here, we explored advanced techniques like semantic parsing and model fine-tuning to address NL2SQL problems. The focus was on adapting methodologies as they evolved, leveraging the modular design from Phase 1 for rapid iteration.</p><h3>Implementations</h3><h4>SQL Copilot</h4><p>Overview</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/683/1*GDAK59vXEuFfv9aEjj2VWQ.png" /></figure><p>These are the questions we needed to resolve:</p><ul><li>Selection of embedding and chat service provider — Azure OpenAI</li><li>Search service that supports vector search — Azure AI Search</li><li>Data preparation and chunking — Deciding chunk size</li><li>Prompt representation and format — Deciding the instruction and retrieved-docs format</li><li>Evaluation — RAGAS</li></ul><p>For the application to be mature, we also need to consider the following (our language is Python and our service stack is on Azure):</p><ul><li>Deployment Infra — Azure K8s</li><li>Security — Enable MSI auth with Azure AD</li><li>Data persistence — Azure CosmosDB</li><li>Web Server Framework — FastAPI</li><li>LLM Orchestration framework — Semantic Kernel</li><li>Observability — OpenTelemetry + Azure Monitor</li></ul><h3>Technical Implementation</h3><h4>SQL Copilot</h4><p>SQL Copilot was designed as a <strong>RAG-based system</strong>. Key components include:</p><p><strong>1. Embedding and Search System:</strong></p><ul><li><strong>Embedding Model:</strong> Azure OpenAI’s embedding model (text-embedding-ada-002).</li><li><strong>Vector Search:</strong> Azure AI Search with native support for semantic search and cosine similarity as the distance metric.</li></ul><p><strong>2. 
Document Chunking:</strong><br>Large documents were split into smaller, contextually relevant pieces using a <strong>recursive markdown splitter</strong>:</p><ul><li><strong>Step 1:</strong> Split by structural elements like headers and paragraphs.</li><li><strong>Step 2:</strong> Merge smaller units recursively until a defined chunk size is met.</li><li><strong>Optimization:</strong> Empirically determined the optimal chunk size by evaluating retrieval metrics (precision, recall, and relevancy).</li></ul><p><strong>3. Prompt Engineering:</strong></p><ul><li>Initially used simple instruction-based templates with retrieved context.</li><li>Later, introduced <strong>function calling</strong> to improve flexibility. The LLM dynamically determines when to query additional context or external tools, reducing confusion from low-quality retrievals.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*58LUPZuNPAyNT6zwxLEJzw.png" /></figure><p><strong>4. Evaluation Framework:</strong></p><ul><li>Retrieval Metrics: Precision, recall, and relevancy of retrieved data.</li><li>Generation Metrics: Faithfulness and relevancy of the LLM’s response.</li><li>Ground Truth Comparison: F1 score, semantic similarity, and aspect critique to assess completeness and clarity.</li></ul><h4>Data Copilot (NL2SQL)</h4><p>This is a complex topic that we will discuss in Part 2.</p><h3>Evaluation Metrics</h3><p>The metrics below apply mainly to SQL Copilot (the RAG system); the NL2SQL metrics will be covered in Part 2.</p><h4>Retrieval Evaluation</h4><ul><li><strong>Contextual Precision:</strong> Measures how effectively the system ranks relevant context higher than irrelevant data.</li><li><strong>Contextual Recall:</strong> Assesses the system’s ability to capture all pertinent information.</li><li><strong>Relevancy:</strong> Ensures retrieved context remains focused without noise.</li></ul><h4>Generation Evaluation</h4><ul><li><strong>Answer Relevancy:</strong> Assesses whether 
generated SQL addresses user intent accurately.</li><li><strong>Faithfulness:</strong> Checks the factual accuracy of generated SQL against retrieved context.</li></ul><h4>Ground Truth Comparison</h4><ul><li><strong>F1 Score:</strong> Evaluates correctness by comparing generated SQL with ground-truth queries.</li><li><strong>Semantic Similarity:</strong> Analyzes how closely generated queries align with intent.</li><li><strong>Aspect Critique:</strong> Considers relevance, completeness, and clarity.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*413nwjXjfSBQhir135RBbA.png" /></figure><h3>AI Infrastructure</h3><p>A robust AI infrastructure is crucial for building scalable and efficient applications. Our focus was on creating a foundation that supports observability, data persistence, and seamless integration with AI services.</p><h4>Deployment Infrastructure</h4><p>We deployed our application on <strong>Azure Kubernetes Service (AKS)</strong> to ensure scalability and fault tolerance.</p><p><strong>Technical Benefits:</strong></p><ul><li><strong>Horizontal Scaling:</strong> Easily accommodates high request loads by automatically scaling resources.</li><li><strong>Auto-Recovery:</strong> Supports resilient application recovery in case of failures.</li><li><strong>Container Orchestration:</strong> Simplifies management of containerized services, ensuring efficient resource utilization.</li></ul><h4>Web Server Framework</h4><p>We chose <strong>FastAPI</strong> as our web server framework, leveraging its asynchronous capabilities for handling the orchestration-heavy nature of LLM applications.</p><p><strong>Technical Benefits:</strong></p><ul><li><strong>Uvicorn-Based:</strong> Runs on Uvicorn, where each worker process drives a single-threaded event loop that queues pending tasks, ideal for I/O-bound asynchronous work like calling external services.</li><li><strong>Efficient Orchestration:</strong> Handles LLM workflows that require communication with various APIs and services 
seamlessly.</li><li><strong>Ease of Development:</strong> Provides a developer-friendly environment with built-in support for modern Python features.</li></ul><h4>Application Framework</h4><p>We adopted <strong>Semantic Kernel</strong> as the core application framework for organizing LLM functions and plugins.</p><p><strong>Technical Benefits:</strong></p><ul><li><strong>Modularity:</strong> Enables clear separation of concerns, making functions and plugins reusable and organized.</li><li><strong>Orchestration:</strong> Facilitates efficient coordination of multiple LLM capabilities.</li><li><strong>Scalability:</strong> Supports the addition of new functions and plugins without disrupting the existing architecture.</li></ul><h4>Logging</h4><p>To enhance observability, we prioritized logging application activities in containerized Linux environments. This approach ensures comprehensive debugging context, correlates Q&amp;A sessions with user feedback, and maintains standardized telemetry conventions.</p><p><strong>Technical Selection:</strong></p><ul><li><strong>Framework:</strong> OpenTelemetry</li></ul><p><strong>Reasons:</strong></p><ul><li><strong>Standardization:</strong> Provides consistent telemetry conventions for logs, traces, and spans.</li><li><strong>Comprehensive Observability:</strong> Offers detailed views of operations within a trace, helping understand the end-to-end flow of requests.</li><li><strong>Integration with Azure Monitor:</strong> Supports seamless integration with Azure Monitor for storage, search, and visualization of telemetry data.</li><li><strong>Agent-Based Approach:</strong> Uses an intermediary agent to collect and forward telemetry data, enhancing reliability and efficiency.</li></ul><h4>Data Persistence</h4><p>The database design centers on storing chat messages, including user questions, AI responses, and feedback. 
This supports retrieving past interactions, improving user experience, and serving as a training dataset for AI models.</p><p><strong>Technical Selection:</strong></p><ul><li><strong>Database:</strong> CosmosDB/NoSQL</li></ul><p><strong>Reasons:</strong></p><ul><li><strong>Native Vector Support:</strong> Essential for AI tasks involving embeddings.</li><li><strong>Azure AI Search Integration:</strong> Facilitates efficient retrieval and search capabilities.</li><li><strong>Document-Based Design:</strong> Optimized for read-heavy operations, improving performance.</li><li><strong>Horizontal Scaling and Schema-Free:</strong> Ideal for managing large datasets with flexible structures.</li></ul><p>This modular infrastructure underpins the AI services, ensuring adaptability for future requirements and rapid iterations.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[What is Hybrid Search on Azure AI Search?]]></title>
            <link>https://medium.com/@junchenp1018/what-is-hybrid-search-on-azure-ai-search-a3f78d43b4d7?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/a3f78d43b4d7</guid>
            <category><![CDATA[azure-ai-search]]></category>
            <category><![CDATA[semantic-search]]></category>
            <category><![CDATA[search]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Sat, 30 Nov 2024 06:24:32 GMT</pubDate>
            <atom:updated>2024-11-30T06:24:32.732Z</atom:updated>
            <content:encoded><![CDATA[<p>In this post, we’ll explore the concept of <strong>Hybrid Search</strong> in Azure AI Search. Hybrid search combines full-text search, semantic (vector) search, and the Reciprocal Rank Fusion (RRF) algorithm to deliver a unified and more accurate result set.</p><h3>What is Hybrid Search?</h3><p>Hybrid search in Azure AI Search merges the results of full-text search and semantic search using the RRF algorithm. The goal is to leverage both keyword-based and meaning-based search techniques for better retrieval performance.</p><h3>Full-Text Search</h3><p><strong>Full-text search</strong> is a traditional, keyword-based search method. It breaks down content into terms using language-specific analysis, and ranks documents based on keyword relevance using the <strong>BM25</strong> scoring algorithm.</p><h4>Why BM25?</h4><p>BM25 estimates relevance by combining <strong>term frequency (TF)</strong> and <strong>inverse document frequency (IDF)</strong>. Take a query such as “endangered Bengal tiger”:</p><ul><li><strong>TF</strong> measures how often each query term (“endangered,” “Bengal,” “tiger”) appears in a document.</li><li><strong>IDF</strong> measures how rare each query term is across the entire document collection; rarer terms are given more weight.</li></ul><p>As a result, BM25 excels at <strong>lexical matching</strong> and provides <strong>token importance weights</strong>, making it highly effective for keyword-based searches.</p><h4><strong>BM25’s score varies with:</strong></h4><ul><li>The number of occurrences of the search term in each document. 
-&gt; Lexical matching</li><li>The rarity of the term across the entire collection (IDF). -&gt; Gives token importance weights</li><li>The document length and how it affects term frequency. -&gt; Normalizes term frequency so that long documents are not unfairly favored</li></ul><h3>Semantic Search</h3><p><strong>Semantic search</strong> takes a different approach by understanding the meaning behind the query rather than relying solely on keyword matches. It converts the query into a vector representation using models like OpenAI’s <strong>text-embedding-ada-002</strong>, and then performs a vector search to find documents with similar meanings. This method relies on <strong>cosine similarity</strong> or <strong>dot product</strong> to measure how closely a document’s vector matches the query.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/505/1*IkJqhC1mW4_C706AyNxT9A.png" /></figure><p>Unlike full-text search, which focuses on exact lexical matches, semantic search seeks to match the <strong>meaning</strong> behind the query, complementing the limitations of keyword-based search.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/709/1*EDM5EWfXGIrTMzSM9VLATw.png" /></figure><h3>Combining Results</h3><p>Semantic search complements exact lexical matching: it searches for closeness of meaning, which full-text search cannot do. Combining the two result sets can therefore significantly enlarge the pool of matching candidates.</p><p>But simply concatenating the results of both search methods is not enough, because their scores are not directly comparable. <strong>Reciprocal Rank Fusion (RRF)</strong> is an algorithm used to merge the rankings from both searches. 
RRF recalculates scores based on the rank position of each document in each result set.</p><p>For example, if a document ranks 1st in one result set and 3rd in another, RRF calculates the score for that document as follows, where k is a constant that dampens the influence of any single rank position:</p><ul><li>From the first search: 1 / (1 + k)</li><li>From the second search: 1 / (3 + k)</li></ul><p>The scores from both searches are summed to create a unified score, here 1 / (1 + k) + 1 / (3 + k), emphasizing documents that appear earlier in the lists.</p><h3>Final Step: Semantic Re-ranking</h3><p>After the initial retrieval phase using BM25, semantic search, and RRF, <strong>semantic ranking</strong> is applied as a secondary step. This re-ranks the top results based on their semantic relevance, ensuring the most meaningful documents appear first.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=a3f78d43b4d7" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Service Observability with Open Telemetry [Code Included]]]></title>
            <link>https://medium.com/@junchenp1018/service-observability-with-open-telemetry-code-included-a7435eceae41?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/a7435eceae41</guid>
            <category><![CDATA[opentelemetry]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[observability]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Fri, 29 Nov 2024 03:45:42 GMT</pubDate>
            <atom:updated>2024-11-29T03:45:42.901Z</atom:updated>
            <content:encoded><![CDATA[<h3>Intro</h3><p>Service telemetry is an essential aspect of maintaining high availability and stability for any application, but it becomes even more critical for data analytics platforms. This article explores the methodology of implementing service telemetry using OpenTelemetry and Geneva, focusing on the general approach rather than specific products.</p><h3>Goal</h3><p>Become ready for modern observability, eventually covering its three pillars: logs, metrics, and traces.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*U9ryfJkGSZYMsIJkiGK73Q.png" /></figure><h3>Collection Architecture</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OuY08PndoFLzT1TLnhPO3Q.png" /></figure><p>The agent-based approach is friendlier to the client application, because many features unrelated to emitting telemetry itself are moved into the agent.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OY750ilxB4kP7XBWGSCvcw.png" /><figcaption>How it looks in K8s</figcaption></figure><h4>In more detail</h4><p><strong>For Log and Span</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9krPZPGQS4byCA65xbsyEw.png" /></figure><p><strong>For Metrics</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*cFHxpxEkK0Yz33eYpaWKGg.png" /></figure><h3>OpenTelemetry Data Model</h3><p><strong>For logs</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/411/1*SBuGPYAtAjoP02dY1QnJCw.png" /></figure><p><strong>For metrics</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*N6heRyaxe88Mk3Txq-rqPw.png" /></figure><h3>Telemetry Flow</h3><p>The application uses the OpenTelemetry SDK for its language to emit telemetry through an exporter to an agent; you can add a processor in between to process each log record. 
The agent is then responsible for sending the telemetry to the monitoring service backend.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/454/1*TYOqNxpacOWXXE6_ZgfTdA.png" /></figure><h3>Implementation</h3><p><strong>For Log</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*IvyrhGZE9c2bM27qcm7MHg.png" /></figure><p><strong>For Span</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1I-0zAiDnKkPbq6brKPxJw.png" /></figure><p><strong>For Metrics</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VbY45YFizMwwK0gP1ywTdg.png" /></figure><p>Logs and spans can share the same agent; only the initialization code differs. Metrics usually need a separate agent dedicated to aggregating metrics.</p><h4>Example on how to model the instrumentation</h4><p>Logs and spans: know your component, create a span for each major stage, and log both information and errors. The goal is to be able to infer the internal state from the logs alone.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Ep-S9bP5MwbruiB_mXF5vw.png" /></figure><p>Metrics: produce the key indicators that you want to monitor or be alerted on.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/605/1*s_CGB1D_d8zop89rMSrB-w.png" /></figure><p>Now you are equipped with modern observability and can see the health of your service.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=a7435eceae41" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Using Locust to do performance test on azure cosmosDB NoSQL database]]></title>
            <link>https://medium.com/@junchenp1018/using-locust-to-do-performance-test-on-azure-cosmosdb-nosql-database-cbb0c466ee42?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/cbb0c466ee42</guid>
            <category><![CDATA[locust]]></category>
            <category><![CDATA[performance-testing]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[cosmosdb]]></category>
            <category><![CDATA[load-testing]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Tue, 03 Sep 2024 03:30:33 GMT</pubDate>
            <atom:updated>2024-09-03T03:30:33.306Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/608/1*ZTOxSI6yr82OReP0dwR4LA.png" /><figcaption>Azure CosmosDB</figcaption></figure><h3>Context:</h3><p>I have a copilot chat service that needs to persist its messages, and CosmosDB NoSQL was chosen for this task. The stack is Python.</p><h3>Problem:</h3><p>In theory, CosmosDB should be very performant on reads and writes thanks to its “unlimited” scaling. Still, it is important to test it hands-on and see the practical relationship between the provisioned request units (RUs) and the actual performance.</p><h3>Expectations:</h3><p>Queries and writes should complete in under one second against a database with 500K+ items and 30 concurrent users whose behavior resembles normal user actions.</p><h3>Methodology:</h3><p>Because a container’s performance can be tuned, we use two containers for comparison. One is the “bench container”, configured with the bare minimum; the other is the “optimized container”, where we apply as many best practices as possible. 
Finally, with the same provisioned throughput and the same workload, we observe the performance for our targeted scenarios.</p><h3>Setup:</h3><ol><li>Create the resource, database, and containers (bench container and optimized container)</li><li>Ingest enough sample data into the containers</li><li>Run Locust to mimic 30 concurrent users</li></ol><h3>Details</h3><p>I will skip #1, as there are already many tutorials on how to do it.</p><p>For #2, I used a stored procedure for fast insertions on the CosmosDB server side.</p><pre>// Inserts up to params.limit random chat items into the current partition and<br>// returns the number of items still to be created.<br>function generateDataAndInsert(params) {<br>    var context = getContext();<br>    var collection = context.getCollection();<br>    var collectionLink = collection.getSelfLink();<br><br>    var numberOfItems = params.numberOfItems;<br>    var userId = params.userId;<br>    var results = params.results || [];<br>    var limit = params.limit;<br><br>    function getRandomFloat(min, max) {<br>        return Math.random() * (max - min) + min;<br>    }<br><br>    function getRandomString(n) {<br>        var text = &quot;&quot;;<br>        var possible = &quot;ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789&quot;;<br>        for (var i = 0; i &lt; n; i++) {<br>            text += possible.charAt(Math.floor(Math.random() * possible.length));<br>        }<br>        return text;<br>    }<br><br>    function getRandomInt(min, max) {<br>        min = Math.ceil(min);<br>        max = Math.floor(max);<br>        return Math.floor(Math.random() * (max - min + 1)) + min;<br>    }<br><br>    // Returns the current time (not actually random)<br>    function getRandomTimestamp() {<br>        return new Date().toISOString();<br>    }<br><br>    function getRandomEmbeddings() {<br>        var embeddings = [];<br>        for (var i = 0; i &lt; 1536; i++) {<br>            embeddings.push(getRandomFloat(-0.001, 0.01));<br>        }<br>        return embeddings;<br>    }<br><br>    function getRandomDateInDays() {<br>        var date = new Date();<br>        var randomDays = getRandomInt(0, 365);<br>        
date.setUTCDate(date.getUTCDate() - randomDays);<br>        return date.toISOString().substring(0, 10);<br>    }<br><br>    tryCreate(numberOfItems, userId, results, limit, callback);<br><br>    function tryCreate(count, userId, results, limit, callback) {<br><br>        // Stop when everything is created or this call&#39;s batch limit is used up<br>        if (count === 0 || limit === 0) {<br>            context.getResponse().setBody(count);<br>            return;<br>        }<br><br>        var item = {<br>            &quot;chat_id&quot;: getRandomInt(0, 100000000),<br>            &quot;user_id&quot;: userId,<br>            &quot;question&quot;: getRandomString(20),<br>            &quot;answers&quot;: [{&quot;answer&quot;: getRandomString(20), &quot;top_docs&quot;: [getRandomString(5), getRandomString(5)], &quot;condensed_question&quot; : getRandomString(20), &quot;condensed_question_embeddings&quot; : getRandomEmbeddings(), &quot;answer_embeddings&quot; : getRandomEmbeddings(), &quot;feedback&quot; : {&quot;rating&quot; : getRandomInt(0, 6), &quot;content&quot; : getRandomString(20), &quot;timestamp&quot; : getRandomTimestamp()}}],<br>            &quot;created_day&quot;: getRandomDateInDays(),<br>            &quot;chat_created_day&quot;: getRandomDateInDays(),<br>        };<br>        var isAccepted = collection.createDocument(collectionLink, item, {}, callback);<br>        // Not accepted means the time budget is nearly spent: report what is left<br>        if (!isAccepted) getContext().getResponse().setBody(count);<br>    }<br><br>    function callback(err, documentCreated) {<br>        // On failure, report the remaining count and stop instead of continuing<br>        if (err) {<br>            getContext().getResponse().setBody(numberOfItems);<br>            return;<br>        }<br><br>        results.push(documentCreated.id);<br>        numberOfItems--;<br>        limit--;<br>        tryCreate(numberOfItems, userId, results, limit, callback);<br><br>    }<br>}</pre><p>After you create it on the container, you can execute it from your machine. 
Note that a stored procedure can only run within a single logical partition.</p><pre># count is how many items you eventually want to create; limit controls how<br># many are processed per call. Since there is a fixed 5-second timeout when<br># calling a stored procedure, keep limit small enough to avoid a timeout.<br>from time import sleep<br><br><br>def execute_stored_procedure(container, count, user_id, limit):<br>    &quot;&quot;&quot;Execute the stored procedure and return the remaining item count.&quot;&quot;&quot;<br>    return container.scripts.execute_stored_procedure(<br>        sproc=&quot;generateDataAndInsert&quot;,<br>        params=[{&quot;numberOfItems&quot;: count, &quot;userId&quot;: user_id, &quot;limit&quot;: limit}],<br>        partition_key=user_id,<br>        enable_script_logging=True,<br>    )<br><br># Use try/except so the loop resumes with the remaining count after an error<br># (container and user_id are assumed to be defined earlier)<br>count = 10000<br>while count &gt; 0:<br>    try:<br>        count = execute_stored_procedure(container, count, user_id, 5)<br>        print(f&quot;Remaining count: {count} for user_id: {user_id}&quot;)<br>    except Exception as e:<br>        print(&quot;Exception occurred, sleeping for 2 seconds...&quot;)<br>        print(f&quot;Error: {e}&quot;)<br>        sleep(2)</pre><p>For #3, I used Locust to mimic user behavior. 
A few takeaways:</p><ul><li>Implement a class that inherits from Locust’s User</li><li>Create a method for each scenario you want to test</li><li>Decorate each method with @task(n), where n is a weight that balances the scenarios to mimic your real user usage</li><li>Customize the fired events if you need to. In my case, I wanted to observe the actual time spent in the container, excluding time spent on the network, so I extracted the database operation time from the response headers and fired it to my Locust host.</li></ul><pre>import json<br>import os<br>import random<br>from datetime import datetime, timezone<br>from typing import Callable<br><br>from azure.core.credentials import TokenCredential<br>from azure.cosmos import CosmosClient<br>from azure.identity import (<br>    CertificateCredential,<br>    DefaultAzureCredential,<br>    get_bearer_token_provider,<br>)<br>from azure.keyvault.secrets import SecretClient<br>from datasets import load_dataset<br>from dotenv import load_dotenv<br>from locust import User, between, events, tag, task<br>from semantic_kernel.connectors.ai.open_ai import AzureTextEmbedding<br><br>class Mongouser(User):<br>    wait_time = between(2, 5)<br><br>    def on_start(self):<br>        credential = CredentialManager(ConfigLoader()).get_credential()<br>        client = CosmosClient(<br>            url=&quot;https://nezha-wukong-nosql-local-wus2.documents.azure.com&quot;,<br>            credential=credential,<br>        )<br><br>        # Select your database and container<br>        database_name = &quot;WukongDB&quot;<br>        container_name = &quot;OptimizedContainer&quot;<br>        database = client.get_database_client(database_name)<br>        self.container = database.get_container_client(container_name)<br>        self.user_id = random.randint(0, 50)<br>        self.chat_id = []<br>        self.item_ids = []<br>        self.dataset = load_dataset(&quot;squad&quot;, split=&quot;train[:5000]&quot;).train_test_split(<br>            test_size=0.1<br>        
)[&quot;train&quot;]<br>        ad_token_provider = CredentialManager(ConfigLoader()).get_bearer_token_provider(<br>            credential, &quot;https://cognitiveservices.azure.com/.default&quot;<br>        )<br>        AOAI_API_BASE_WUS = ConfigLoader().load(&quot;AOAI_API_BASE_WUS&quot;)<br>        self.embedding_model = AzureTextEmbedding(<br>            deployment_name=&quot;text-embedding&quot;,<br>            ad_token_provider=ad_token_provider,<br>            endpoint=AOAI_API_BASE_WUS,<br>        )<br>        with open(&quot;./src/wukong/data_providers/cosmosdb/data.json&quot;) as f:<br>            self.documents = json.load(f)<br><br>    def success_fire(self, request_type, name):<br>        events.request.fire(<br>            request_type=request_type,<br>            name=name,<br>            response_time=float(<br>                self.container.client_connection.last_response_headers[<br>                    &quot;x-ms-request-duration-ms&quot;<br>                ]<br>            ),<br>            response_length=int(<br>                self.container.client_connection.last_response_headers[&quot;Content-Length&quot;]<br>            ),<br>        )<br><br>    def exception_fire(self, request_type, name, exception):<br>        events.request.fire(<br>            request_type=request_type,<br>            name=name,<br>            response_time=float(<br>                self.container.client_connection.last_response_headers[<br>                    &quot;x-ms-request-duration-ms&quot;<br>                ]<br>            ),<br>            response_length=int(<br>                self.container.client_connection.last_response_headers[&quot;Content-Length&quot;]<br>            ),<br>            exception=exception,<br>        )<br><br>    @task(1)<br>    @tag(&quot;GET&quot;)<br>    def fetch_test(self):<br>        if len(self.chat_id) == 0:<br>            return<br>        chat_id = random.choice(self.chat_id)<br><br>        try:<br>            document = 
list(self.container.query_items(<br>                query=&quot;SELECT TOP @n * FROM c WHERE c.chat_id = @chat_id ORDER BY c.timestamp DESC&quot;,<br>                parameters=[<br>                    {&quot;name&quot;: &quot;@chat_id&quot;, &quot;value&quot;: chat_id},<br>                    {&quot;name&quot;: &quot;@n&quot;, &quot;value&quot;: 10},<br>                ],<br>                enable_cross_partition_query=False,<br>                populate_query_metrics=True,<br>                partition_key=self.user_id,<br>            ))  # query_items is lazy; list() forces the request to actually run<br><br>            self.success_fire(<br>                &quot;GET&quot;, &quot;Retrieve all messages from a user&#39;s specific chat&quot;<br>            )<br>        except Exception as e:<br>            self.exception_fire(<br>                &quot;GET&quot;, &quot;Retrieve all messages from a user&#39;s specific chat&quot;, e<br>            )<br><br>    @task(4)<br>    @tag(&quot;POST&quot;)<br>    def post_test(self):<br>        document = random.choice(self.documents)<br>        chat_id = random.randint(0, 100)<br>        document[&quot;user_id&quot;] = self.user_id<br>        document[&quot;chat_id&quot;] = chat_id<br>        document[&quot;message_id&quot;] = random.randint(0, 999)<br><br>        try:<br>            res = self.container.create_item(<br>                body=document,<br>                enable_automatic_id_generation=True,<br>                populate_query_metrics=True,<br>            )<br>            print(<br>                f&quot;Document created successfully: {res[&#39;id&#39;]} for partition key {self.user_id}&quot;<br>            )<br>            if chat_id not in self.chat_id and res[&quot;id&quot;] is not None:<br>                self.chat_id.append(chat_id)<br>            if res[&quot;id&quot;] is not None:<br>                self.item_ids.append(res[&quot;id&quot;])<br>            self.success_fire(&quot;POST&quot;, &quot;Create a new message in a user&#39;s specific chat&quot;)<br>        except 
Exception as e:<br>            self.exception_fire(<br>                &quot;POST&quot;, &quot;Create a new message in a user&#39;s specific chat&quot;, e<br>            )<br><br>    @task(1)<br>    @tag(&quot;PUT&quot;)<br>    def put_test(self):<br>        if len(self.item_ids) == 0:<br>            return<br>        item_id = random.choice(self.item_ids)<br>        partitionKey = self.user_id<br><br>        try:<br>            # Retrieve the document<br>            document = self.container.patch_item(<br>                item_id,<br>                partition_key=partitionKey,<br>                patch_operations=[<br>                    {&quot;op&quot;: &quot;add&quot;, &quot;path&quot;: &quot;/feedback_rating&quot;, &quot;value&quot;: 4},<br>                    {<br>                        &quot;op&quot;: &quot;add&quot;,<br>                        &quot;path&quot;: &quot;/feedback_content&quot;,<br>                        &quot;value&quot;: &quot;Updated feedback content&quot;,<br>                    },<br>                    {<br>                        &quot;op&quot;: &quot;add&quot;,<br>                        &quot;path&quot;: &quot;/feedback_timestamp&quot;,<br>                        &quot;value&quot;: datetime.now(timezone.utc).isoformat(),<br>                    },<br>                ],<br>            )<br><br>            self.success_fire(&quot;PUT&quot;, &quot;Update feedback-related fields&quot;)<br>        except Exception as e:<br>            self.exception_fire(&quot;PUT&quot;, &quot;Update feedback-related fields&quot;, e)<br><br>    @task(2)<br>    @tag(&quot;SEARCH&quot;)<br>    def search_data_test(self):<br>        if len(self.item_ids) == 0:<br>            return<br>        documents = random.choice(self.documents)<br><br>        query_embed = documents[&quot;question_embeddings&quot;]<br><br>        try:<br>            # Define query<br>            query = (<br>                &quot;SELECT TOP 5 c.user_id, c.timestamp, c.chat_id, 
c.message_id, c.body &quot;<br>                &quot;FROM c &quot;<br>                &quot;WHERE VectorDistance(c.answer_embeddings, {0}) &gt; 0.5 &quot;<br>                &quot;ORDER BY VectorDistance(c.answer_embeddings, {0})&quot;<br>            ).format(query_embed)<br><br>            # Query items<br>            query_iterable = self.container.query_items(<br>                query=query,<br>                enable_cross_partition_query=True,<br>                populate_query_metrics=True,<br>                partition_key=self.user_id,<br>            )<br><br>            all_fetched_res = []<br>            for page in query_iterable.by_page():<br>                fetched_res = list(page)<br>                all_fetched_res.extend(fetched_res)<br><br>            for res in all_fetched_res:<br>                print(res)<br><br>            self.success_fire(&quot;SEARCH&quot;, &quot;Search for similar questions&quot;)<br>        except Exception as e:<br>            self.exception_fire(&quot;SEARCH&quot;, &quot;Search for similar questions&quot;, e)<br><br>class Singleton(type):<br>    &quot;&quot;&quot;<br>    This class is a metaclass that implements the singleton pattern.<br>    &quot;&quot;&quot;<br><br>    instance = None<br><br>    def __call__(cls, *args, **kwargs):<br>        if cls.instance is None:<br>            cls.instance = super(Singleton, cls).__call__(*args, **kwargs)<br>        return cls.instance<br><br>class ConfigLoader(metaclass=Singleton):<br>    &quot;&quot;&quot;<br>    This class is responsible for loading environment variables from Azure Key Vault.<br>    &quot;&quot;&quot;<br><br>    def __init__(self):<br>        load_dotenv()<br>        self.key_vault_name = os.environ.get(&quot;KEY_VAULT_NAME&quot;, &quot;&quot;)<br>        self.key_vault_prefixes = os.environ.get(&quot;KEY_VAULT_PREFIXES&quot;, &quot;&quot;).split(&quot;,&quot;)<br><br>        self.client = SecretClient(<br>            
vault_url=f&quot;https://{self.key_vault_name}.vault.azure.net&quot;,<br>            credential=DefaultAzureCredential(),<br>        )<br><br>    def load(self, key: str, default_value: str = None):<br>        env_value = os.environ.get(key)<br>        for key_vault_prefix in self.key_vault_prefixes:<br>            if (<br>                env_value<br>                and key_vault_prefix<br>                and env_value.lower().startswith(key_vault_prefix.lower())<br>                and self.key_vault_name<br>            ):<br>                return self.client.get_secret(env_value).value<br>        if env_value is None and default_value:<br>            return default_value<br>        return env_value<br><br>    @classmethod<br>    def reset_instance(cls):<br>        cls.instance = None<br><br>class CredentialManager:<br>    def __init__(self, config_loader: ConfigLoader):<br>        self.config_loader = config_loader<br><br>    def get_credential(self) -&gt; TokenCredential:<br>        if self.config_loader.load(&quot;ENVIRONMENT&quot;) == &quot;local&quot;:<br>            tenant_id = self.config_loader.load(&quot;AZURE_TENANT_ID&quot;)<br>            client_id = self.config_loader.load(&quot;AZURE_CLIENT_ID&quot;)<br>            certificate_path = self.config_loader.load(&quot;CERT_FILE_PATH&quot;)<br>            return CertificateCredential(<br>                tenant_id=tenant_id,<br>                client_id=client_id,<br>                certificate_path=certificate_path,<br>                send_certificate_chain=True,<br>            )<br>        return DefaultAzureCredential()<br><br>    def get_bearer_token_provider(<br>        self, credential: TokenCredential, scope: str<br>    ) -&gt; Callable[[], str]:<br>        ad_token_provider = get_bearer_token_provider(<br>            credential,<br>            scope,<br>        )<br>        return ad_token_provider<br><br>credential = CredentialManager(ConfigLoader()).get_credential()<br><br>ad_token_provider = 
CredentialManager(ConfigLoader()).get_bearer_token_provider(<br>    credential, &quot;https://cognitiveservices.azure.com/.default&quot;<br>)<br>AOAI_API_BASE_WUS = ConfigLoader().load(&quot;AOAI_API_BASE_WUS&quot;)<br>embedding_model = AzureTextEmbedding(<br>    deployment_name=&quot;text-embedding&quot;,<br>    ad_token_provider=ad_token_provider,<br>    endpoint=AOAI_API_BASE_WUS,<br>)</pre><p>Finally, you get clear and insightful charts showing the performance:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/886/1*5VxGQEj8z5biwe0woYQvzQ.png" /><figcaption>For Bench Container</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/845/1*wFZPRSLx8l3bK1vo3roYQA.png" /><figcaption>For Optimized Container</figcaption></figure><p>I couldn’t find a formal guide on this, so this write-up grew out of my own needs. Please feel free to share your thoughts on how database performance tests should be carried out. Cheers!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=cbb0c466ee42" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[In-depth share for deploying Azure Function App with python stack]]></title>
            <link>https://medium.com/@junchenp1018/in-depth-share-for-deploying-azure-function-app-with-python-stack-267c48a76a24?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/267c48a76a24</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[azure-function-app]]></category>
            <category><![CDATA[embedding-model]]></category>
            <category><![CDATA[linux]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Fri, 19 Apr 2024 05:41:43 GMT</pubDate>
            <atom:updated>2024-09-09T09:07:45.227Z</atom:updated>
            <content:encoded><![CDATA[<h3>Target Audience:</h3><ul><li>You want to deploy an Azure Function app through Azure Pipelines with automatic CI/CD.</li><li>You have an extra component, such as a transformer model, to download for your function app.</li><li>You are using Python for your Azure Function app.</li><li>You have hit this problem: the zip deployment succeeds, but no function is shown in the Azure Function app portal.</li></ul><p>The Azure Function app is a Platform-as-a-Service (PaaS) offering that provides serverless compute for running event-driven code without the need for explicit provisioning or infrastructure management. In my scenario, I aim to host a simple service that loads an open source embedding model and provides vectorizing functionality for a list of texts.</p><p>For this project, I’ve chosen to use Python 3.10 because it works with the sentence-transformers package needed for downloading and saving the model. As for the virtual machine image, I&#39;ve opted for ubuntu-latest. To achieve continuous delivery, I rely on Azure DevOps agents, utilizing Azure Pipelines for CI/CD.</p><p>One challenge I encountered is downloading the model, which can be over 10 gigabytes in size. To avoid storing such a massive file in Git, I’ve explored two options: first, having the Azure Pipeline download it during the initial stage, or second, implementing a deployment post-hook so that after successful deployment, the Azure Function app triggers a script to download the model.</p><p>Although I extensively researched online resources, I couldn’t find any solutions for the second approach. Therefore, I decided to proceed with the first option, downloading the model during the pipeline process and then packaging all files into a zip deployment for the Azure Function app.</p><p>During the development and deployment phases of the function app, I encountered considerable confusion due to scattered settings and documentation. 
It took me two days to finally get the function to work. The challenge with Microsoft technology is that documentation and resources are scattered, so finding your way often requires trial and error.</p><p>To bring more order to the process, I aim to explain what each configuration does, rather than having you memorize configurations as if they were magic spells.</p><h4>Downloading model</h4><p>With the sentence_transformers package, you can easily download an open-source model from Hugging Face.</p><pre>import os<br><br>model_parent_dir_name = &quot;models&quot;<br>default_embedding_model = &quot;Your Model&quot;<br>model_env_var = &quot;SENTENCE_TRANSFORMERS_TEXT_EMBEDDING_MODEL&quot;<br><br><br>def main():<br>    model_parent_dir = os.path.join(os.getcwd(), model_parent_dir_name)<br>    # Check if the directory exists<br>    if not os.path.exists(model_parent_dir):<br>        # If it doesn&#39;t exist, create it<br>        os.makedirs(model_parent_dir)<br><br>    # Check if model is already downloaded<br>    model_name = os.getenv(model_env_var, default_embedding_model)<br>    model_dir = os.path.join(model_parent_dir, model_name)<br>    print(f&quot;Model directory: {model_dir}&quot;)<br>    if os.path.exists(model_dir):<br>        print(f&quot;Model {model_name} already downloaded&quot;)<br>        return<br><br>    # Initialize and download the model<br>    print(f&quot;Downloading {model_name}...&quot;)<br>    from sentence_transformers import SentenceTransformer<br><br>    model = SentenceTransformer(model_name)<br>    model.save(model_dir)<br><br><br>if __name__ == &quot;__main__&quot;:<br>    main()</pre><blockquote>sentence_transformers might not work for all models. Check the model card on Hugging Face to see which package supports downloading it.</blockquote><h4><strong>Azure function app code development</strong></h4><p>I chose VS Code with the Azure Functions extension for my development. 
You can first create the function and trigger template through the extension, then add your logic in your function_app.py.</p><blockquote>The Azure Function app is a subproject inside my project, which uses the function app service, so you can see I have a nested folder for organizing the structure. <strong>The key idea is to put your function_app.py and host.json under the workspace of your Azure Function app</strong>.</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/529/1*Tn_Fzu2JoTY93zvWbbRR8A.png" /><figcaption>My Project Structure</figcaption></figure><pre>import json<br>import logging<br>import os<br><br>import azure.functions as func<br>from sentence_transformers import SentenceTransformer<br><br>app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)<br><br>@app.function_name(name=&quot;GetTextEmbedding&quot;)<br>@app.route(route=&quot;embed&quot;)<br>def GetTextEmbedding(req: func.HttpRequest) -&gt; func.HttpResponse:<br>    logging.info(&quot;Python HTTP trigger function processed a request.&quot;)<br><br>    input_text = []<br>    try:<br>        req_body = req.get_json()<br>        if &quot;text&quot; in req_body:<br>            input_text.append(req_body[&quot;text&quot;])<br>        elif &quot;values&quot; in req_body:<br>            for value in req_body[&quot;values&quot;]:<br>                input_text.append(value[&quot;data&quot;][&quot;text&quot;])<br>    except Exception:<br>        # logging.exception takes the message first and records the active traceback itself<br>        logging.exception(&quot;Could not get text from request body&quot;)<br><br>    if len(input_text) == 0:<br>        return func.HttpResponse(&quot;No input text found&quot;, status_code=400)<br><br>    model = os.getenv(<br>        &quot;SENTENCE_TRANSFORMERS_TEXT_EMBEDDING_MODEL&quot;,<br>        &quot;mixedbread-ai/mxbai-embed-large-v1&quot;,<br>    )<br>    <br>    model_path = os.path.join(os.getcwd(), &quot;models&quot;, model)<br>    # Load the model from the local models directory (this runs on every request)<br>    model = 
SentenceTransformer(model_path)<br><br>    embeddings = model.encode(input_text)<br>    response = {<br>        &quot;values&quot;: [<br>            {&quot;recordId&quot;: i, &quot;data&quot;: {&quot;vector&quot;: embedding.tolist()}}<br>            for i, embedding in enumerate(embeddings)<br>        ]<br>    }<br><br>    return func.HttpResponse(<br>        body=json.dumps(response), mimetype=&quot;application/json&quot;, status_code=200<br>    )</pre><p>The code itself is straightforward: in the generated template, you add your logic to the route handler.</p><h3>Deployment</h3><blockquote>Different service plans have different deployment limits. For example, the Consumption plan on Linux only supports run-from-package with an external package URL.</blockquote><p>You can use the AzureFunctionApp task in your pipeline to update your function app. It requires a zip file as the input package, so you need to zip the working directory of your content in the previous step.</p><p>When it comes to deployment methods, two options typically meet the needs:</p><ol><li>Zip Deployment + Remote Build</li><li>Zip Deployment + Run From Package</li></ol><p>You can use the AzureFunctionApp@2 task and specify your preferred deployment method in the configuration.</p><h4>Remote Build</h4><p>To enable remote build on Linux, you must set these application settings:</p><ul><li><a href="https://learn.microsoft.com/en-us/azure/azure-functions/functions-app-settings#enable_oryx_build">ENABLE_ORYX_BUILD=true</a></li><li><a href="https://learn.microsoft.com/en-us/azure/azure-functions/functions-app-settings#scm_do_build_during_deployment">SCM_DO_BUILD_DURING_DEPLOYMENT=true</a></li></ul><p>Note that remote build isn’t executed when an app is using “run-from-package.”</p><p>In my case, I 
don’t need to run things during the build phase, so I chose run_from_package with local reference to the deployed zip package.</p><h4><strong>Yml file for azure pipelines</strong></h4><blockquote>Pay attention to the working directory and the file path of your script, the working directory on your local might differ from the one on the agent.</blockquote><pre>trigger:<br>  branches:<br>    # Cannot have variables in triggers<br>    include:<br>      - main<br>  paths:<br>    include:<br>      - src/azure_functions/*<br>variables:<br>  # Azure service connection established during pipeline creation<br>  azureSubscription: &#39;----&#39;<br>  appName: &#39;---&#39;<br>  # Agent VM image name<br>  vmImageName: &#39;ubuntu-latest&#39;<br>  <br>jobs:<br>- job: BuildAndDeploy<br>  displayName: &#39;Build and Deploy Azure Function&#39;<br>  pool:<br>    vmImage: $(vmImageName)<br><br>  steps:<br>  - task: UsePythonVersion@0<br>    displayName: &quot;Set Python version to 3.10&quot;<br>    inputs:<br>      versionSpec: &#39;3.10&#39;<br>      architecture: &#39;x64&#39;<br><br>  - task: Bash@3<br>    displayName: &#39;Install pip requirements&#39;<br>    inputs:<br>      targetType: &#39;inline&#39;<br>      script: |<br>        python -m pip install --upgrade pip<br>        pip install --target=&quot;./.python_packages/lib/site-packages&quot; -r ./requirements.txt<br>      workingDirectory: &#39;$(Build.SourcesDirectory)/src/azure_functions/embedding_function&#39;<br>      <br>  - task: ShellScript@2<br>    displayName: &#39;Download Model&#39;<br>    inputs:<br>      scriptPath: &#39;./src/azure_functions/embedding_function/scripts/download_embedding_model.sh&#39;<br>      disableAutoCwd: true<br>      cwd: &#39;$(Build.SourcesDirectory)/src/azure_functions/embedding_function&#39;<br><br>  # Zip function contents<br>  - task: ArchiveFiles@2<br>    displayName: Zip function contents<br>    inputs:<br>      rootFolderOrFile: 
&#39;$(Build.SourcesDirectory)/src/azure_functions/embedding_function&#39;<br>      includeRootFolder: false<br>      archiveType: &#39;zip&#39;<br>      archiveFile: &#39;$(Build.ArtifactStagingDirectory)/$(Build.BuildId).zip&#39;<br>      replaceExistingArchive: true<br><br>  - task: AzureFunctionApp@2 # Add this at the end of your file<br>    inputs:<br>      azureSubscription: $(azureSubscription)<br>      appType: functionAppLinux # default is functionApp<br>      appName: $(appName)<br>      package: &#39;$(Build.ArtifactStagingDirectory)/$(Build.BuildId).zip&#39;<br>      deploymentMethod: runFromPackage</pre><ul><li>Build.SourcesDirectory: Represents the local path on the agent where your source code files are downloaded.</li><li>Build.ArtifactStagingDirectory: Denotes the local path on the agent where any artifacts are copied to before being pushed to their destination.</li></ul><p>Ensure Python packages are installed in the target environment according to the guidelines provided in the official documentation. Exercise caution when considering the use of virtual environments, as they may not behave as expected.</p><pre>#!/bin/bash<br># Custom Python package directory<br>python_packages_dir=&quot;./.python_packages/lib/site-packages&quot;<br><br># Add the custom directory to the Python path<br>export PYTHONPATH=&quot;$PYTHONPATH:$python_packages_dir&quot;<br># Run the Python script<br>python_script=&quot;./scripts/download_embedding_models.py&quot;  # Replace with the path to your Python script<br>if [ -f &quot;$python_script&quot; ]; then<br>    echo &quot;Running Python script: $python_script&quot;<br>    python &quot;$python_script&quot;<br>else<br>    echo &quot;Error: Python script not found at $python_script&quot;<br>    exit 1<br>fi</pre><p>For the shell script, remember to add the python dependencies path to python sys path. 
That way your dependencies can be imported.</p><blockquote>Downloading a model can take a huge amount of disk space, and disk space on the agent is limited to 10GB. Choose a smaller model if you hit an out-of-space error.</blockquote><pre>import os<br><br>model_parent_dir_name = &quot;models&quot;<br>default_embedding_model = &quot;mixedbread-ai/mxbai-embed-large-v1&quot;<br>model_env_var = &quot;SENTENCE_TRANSFORMERS_TEXT_EMBEDDING_MODEL&quot;<br><br><br>def main():<br>    model_parent_dir = os.path.join(os.getcwd(), model_parent_dir_name)<br>    # Check if the directory exists<br>    if not os.path.exists(model_parent_dir):<br>        # If it doesn&#39;t exist, create it<br>        os.makedirs(model_parent_dir)<br><br>    # Check if model is already downloaded<br>    model_name = os.getenv(model_env_var, default_embedding_model)<br>    model_dir = os.path.join(model_parent_dir, model_name)<br>    print(f&quot;Model directory: {model_dir}&quot;)<br>    if os.path.exists(model_dir):<br>        print(f&quot;Model {model_name} already downloaded&quot;)<br>        return<br><br>    # Initialize and download the model<br>    print(f&quot;Downloading {model_name}...&quot;)<br>    from sentence_transformers import SentenceTransformer<br><br>    model = SentenceTransformer(model_name)<br>    model.save(model_dir)<br><br><br>if __name__ == &quot;__main__&quot;:<br>    main()</pre><p>Your Python script should now be able to use the dependencies. As a final step, use zip deployment to deploy your Azure Function: archive your agent’s working directory into the staging directory, then reference the zip file in the publish task.</p><h4><strong>Configure app settings on azure function app portal</strong></h4><p>An Azure pipeline with the AzureFunctionApp@2 task will overwrite the app settings depending on your deployment method, so you don’t need to configure them yourself on the portal. 
Still, it is worth checking the settings after deployment.</p><h4><strong>Debug on the portal and make fixes</strong></h4><p>Diagnosing errors can be challenging without access to error logs. I’ve found the “Function App Down” or “Reporting” section in the <strong>Diagnose section</strong> to be extremely helpful when deploying an app. This section provides a summarized view of errors or exceptions encountered during deployment.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/918/1*eD6X6VoOox_egwMN4ocCdw.png" /></figure><p>Additionally, consider upgrading to a premium plan to <strong>enable Application Insights</strong>. This gives you access to detailed traces and logs for any issues that arise.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OsESbNjXVmsfGBlYR87qCQ.png" /><figcaption>Go to traces of application insights to know more</figcaption></figure><p>That’s all I’ve got. Feel free to post your thoughts and your own way of doing it in the comments; I’d love to hear them.</p><h3>Some Traps:</h3><ol><li>Don’t give your function app too long a name. That will fail the deployment without any warning!</li><li>Deploying through the portal interface is error-prone. <strong>Try deploying through a pipeline with Azure CLI</strong>, where you explicitly specify “deploymentMethod: runFromPackage” for a Python package, since it doesn’t need a build. This helps because some practices are enforced automatically, as in the example below:</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2zl6d4EIto2XVZuV7c7uPA.png" /></figure><p>3. Remember to install packages into the target directory: pip install <strong>--target=&quot;./.python_packages/lib/site-packages&quot;</strong> -r ./requirements.txt</p><p>4. A function app uses a storage account for hosting files and runtime triggers. When switching to identity-based connections, remember that only dedicated plans support them. 
The dynamically scaling Consumption plan does not, because dynamic scaling requires the file share to be enabled.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Differences in Javascript with throwing and return error]]></title>
            <link>https://medium.com/@junchenp1018/differences-in-javascript-with-throwing-and-return-error-d8c0e901c083?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/d8c0e901c083</guid>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Mon, 20 Apr 2020 02:54:21 GMT</pubDate>
            <atom:updated>2020-04-20T02:54:21.982Z</atom:updated>
            <content:encoded><![CDATA[<p>This post compares throwing an error with returning an error. In projects, I have found that different people handle errors in different ways, and when they work together, the mixture can itself produce real bugs. I want to take a step back and discuss the difference, so that people can handle their errors consciously.</p><p>First of all, let’s see what happens if we throw an error.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-ryL8M91DA5EAFZNZe0mew.png" /></figure><p>This stops the thread: we can see that the console statement never gets executed.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/896/1*e8nF128gtUBw9BdoOOzuBw.png" /></figure><p>Now if we add a try/catch, the thread keeps working and lets you handle the error in the catch block.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/834/1*hSqgw8UJqpMz37M6_dE0Qg.png" /></figure><p>Even if a nested function throws the error, the catch can handle it at the outer function level, and the thread still keeps working.</p><p>Okay, now let’s look at another style of error handling (the Golang way).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*yJU7D6nV-5YX5VMFBzGi7Q.png" /></figure><p>Here we handle the error by returning it instead of throwing it. You get finer-grained error messages, it is safer because it won’t accidentally stop the whole thread if you forget a try/catch, and it is more readable. But there is also a drawback: if we want to add a new error in a deeply nested call like inner10(), we have to return the error level by level, which is error-prone and unpleasant.</p><p>Better solutions? 
Yes. These two styles of error handling do not conflict with each other at all: we can choose the second style when we want finer-grained control of messages, and the first style when we just want to catch an error raised several function calls deep.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*a5Cvuv5FLXe468Qkg7OAvQ.png" /></figure><p>In this example, I don’t care about the error details beyond inner2: if any layer throws, I just catch it in inner1, and in main() we can clearly test result.error to see whether an error was returned.</p><p>Overall, the two ways of error handling give you flexibility and convenience in how you handle your errors. I am not claiming there is one best way to do error handling, because there isn’t. Find out what works in your case and what kind of errors you want to log or recover from.</p><p>Any thoughts on this are welcome; I always appreciate better ideas :)</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to test your producer and consumer with AWS sqs by Postman]]></title>
            <link>https://medium.com/@junchenp1018/how-to-test-your-producer-and-consumer-with-aws-sqs-by-postman-16a4d1b545b7?source=rss-327084052e19------2</link>
            <guid isPermaLink="false">https://medium.com/p/16a4d1b545b7</guid>
            <category><![CDATA[postman]]></category>
            <category><![CDATA[sqs]]></category>
            <category><![CDATA[testing]]></category>
            <dc:creator><![CDATA[Junchen Pan]]></dc:creator>
            <pubDate>Mon, 09 Mar 2020 04:31:02 GMT</pubDate>
            <atom:updated>2020-03-09T04:31:02.887Z</atom:updated>
            <content:encoded><![CDATA[<p>This is my very first story, written to contribute back, since I have benefited a lot from the amazing essays on Medium. At work, I was given a task to add some functionality to my consumer and producer, but I found it really frustrating to test the whole pipeline end to end: it is a microservice system, and I don’t really need to prepare all the test data for the other services. So I focused on testing just the part of the logic I had added.</p><p>After some googling, I couldn’t find a detailed tutorial on how to test this easily, which is exactly why I am writing this story.</p><blockquote>How does the SQS service actually work? It is fairly simple: the producer sends messages to the AWS queue, and the consumer (using either short polling or long polling) fetches them back for processing. <strong>Only after a message is successfully processed is it deleted from the queue.</strong></blockquote><p>First, you need four actions in Postman.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/472/1*5QSnmopBjKu3Xupxr0x_JQ.png" /><figcaption>four actions you need for testing</figcaption></figure><p>Let me explain why we need these four. The first two, as the names suggest, are the core actions for sending and fetching messages in the queue. The third one gets metadata about the queue for troubleshooting. The last one is also important because, as I mentioned earlier, <strong>only after a message is successfully processed is it deleted from the queue. 
</strong>This action purges all the messages in the queue, in case your message cannot be processed by the consumer and your consumer keeps receiving the same message.</p><blockquote>Here I only take sendMessage as an example. The other actions follow the same pattern, and you can check the AWS SQS docs to see how to use them.</blockquote><p>First, set up the AWS authorization:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*AeV7Qr6QxW0ctXsL-otiUw.png" /><figcaption>you need this in every action</figcaption></figure><p>Second, fill in the parameters:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*EobyA9COq9zrRrjjwVKsxg.png" /><figcaption>Here the action is sendMessage</figcaption></figure><p>The MessageBody holds the test data you want to send; you can define it in the pre-request script. MessageAttribute is the metadata your consumer uses to perform different tasks.</p><blockquote>Check the parameter keys carefully, since the HMAC signature verifies the integrity of the request data. More info: <a href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-api-request-authentication.html">https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-api-request-authentication.html</a></blockquote><p>Last, start your consumer and check the result there. You can debug it with the built-in VS Code debugger; remember to attach to the process by its PID. Happy debugging!</p>]]></content:encoded>
        </item>
    </channel>
</rss>