<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Siddu on Medium]]></title>
        <description><![CDATA[Stories by Siddu on Medium]]></description>
        <link>https://medium.com/@gsiddartha117?source=rss-44e67f88fa55------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/0*30Bi6gYRlpyRg_H-</url>
            <title>Stories by Siddu on Medium</title>
            <link>https://medium.com/@gsiddartha117?source=rss-44e67f88fa55------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sun, 24 May 2026 02:26:22 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@gsiddartha117/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[How I Built an AI System That Automatically Analyzes 10,000+ Customer Support Calls]]></title>
            <link>https://medium.com/@gsiddartha117/how-i-built-an-ai-system-that-automatically-analyzes-10-000-customer-support-calls-3e9d045bb0f7?source=rss-44e67f88fa55------2</link>
            <guid isPermaLink="false">https://medium.com/p/3e9d045bb0f7</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[nlp]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <category><![CDATA[aws]]></category>
            <dc:creator><![CDATA[Siddu]]></dc:creator>
            <pubDate>Thu, 21 May 2026 14:21:17 GMT</pubDate>
            <atom:updated>2026-05-21T14:21:17.254Z</atom:updated>
            <content:encoded><![CDATA[<p><strong>A deep dive into building a production-grade, AWS-native call analytics platform using NLP, sentiment analysis, and serverless architecture</strong></p><p><em>Published by Godena Siddartha | AI/ML Engineer | </em><a href="https://www.linkedin.com/in/siddartha-g-080b23271"><em>LinkedIn</em></a><em> | </em><a href="https://github.com/SiddarthaGodena-AI"><em>GitHub</em></a></p><h3>The Problem: Manual QA Is Broken at Scale</h3><p>Picture this: a mid-sized company handles hundreds of customer support calls every day. Their QA team manually listens to a random 5% sample, fills out spreadsheets, and tries to infer whether agents are performing well.</p><p>The flaws are obvious:</p><ul><li><strong>95% of calls go unreviewed</strong> — meaning quality issues stay hidden</li><li><strong>Human reviewers are inconsistent</strong> — scoring varies person to person</li><li><strong>Feedback is always delayed</strong> — problems get caught weeks after they happen</li><li><strong>It doesn’t scale</strong> — doubling call volume means doubling QA headcount</li></ul><p>I built AI Call Sentry to solve this. 
It’s a fully automated, AWS-native pipeline that processes <strong>every single call</strong>, scores it using NLP and sentiment analysis, and delivers structured insights without a human ever pressing play.</p><h3>The Architecture: End-to-End on AWS</h3><p>Here’s the high-level system design:</p><pre>Customer Call (Audio) → S3 Upload<br>         ↓<br>   S3 Event Trigger<br>         ↓<br>   AWS Lambda (Orchestrator)<br>         ↓<br>   Amazon Transcribe (Speaker-Aware Transcription)<br>         ↓<br>   Custom NLP Pipeline (FastAPI Service)<br>         ├── Sentiment Classification<br>         ├── Intent Detection<br>         ├── Tone Analysis<br>         └── Resolution Quality Scoring<br>         ↓<br>   Call Quality Score (0–100)<br>         ↓<br>   Structured Output (JSON) → Storage / Dashboard</pre><p>The key design principle: <strong>zero manual intervention</strong>. Once a call recording lands in S3, the entire pipeline runs automatically. No one needs to press a button.</p><h3>Step 1: Automated Ingestion with S3 + Lambda</h3><p>The entry point is dead simple: an S3 bucket configured with an event notification that fires a Lambda function on every .mp3 or .wav upload.</p><pre># Lambda handler, triggered on S3 upload<br>import os<br>import re<br>import urllib.parse<br>import boto3</pre><pre>def lambda_handler(event, context):<br>    transcribe_client = boto3.client(&#39;transcribe&#39;)<br>    <br>    # Extract file info from the S3 event; object keys arrive URL-encoded<br>    record = event[&#39;Records&#39;][0][&#39;s3&#39;]<br>    bucket = record[&#39;bucket&#39;][&#39;name&#39;]<br>    key = urllib.parse.unquote_plus(record[&#39;object&#39;][&#39;key&#39;])<br>    <br>    # Derive the media format from the extension (mp3 or wav)<br>    stem, ext = os.path.splitext(key)<br>    media_format = ext.lstrip(&#39;.&#39;).lower()<br>    <br>    # Transcribe job names allow only letters, digits, &#39;.&#39;, &#39;_&#39; and &#39;-&#39;<br>    job_name = re.sub(r&#39;[^0-9a-zA-Z._-]&#39;, &#39;_&#39;, stem)<br>    <br>    # Start speaker-aware transcription job<br>    transcribe_client.start_transcription_job(<br>        TranscriptionJobName=job_name,<br>        Media={&#39;MediaFileUri&#39;: f&#39;s3://{bucket}/{key}&#39;},<br>        MediaFormat=media_format,<br>        LanguageCode=&#39;en-US&#39;,<br>        Settings={<br>            &#39;ShowSpeakerLabels&#39;: True,<br>            &#39;MaxSpeakerLabels&#39;: 2  # Agent + Customer<br>        }<br>    )<br>    <br>    return {&#39;status&#39;: &#39;transcription_started&#39;, &#39;job&#39;: job_name}</pre><p>This pattern (S3 trigger → Lambda → Transcribe) is the backbone of the entire system. It’s serverless, scales with upload volume within the Lambda and Transcribe service quotas, and costs close to nothing when idle.</p><h3>Step 2: Speaker-Aware Transcription</h3><p>Standard transcription gives you a wall of text. That’s not useful for QA.</p><p>Amazon Transcribe’s ShowSpeakerLabels setting gives you <strong>diarized output</strong>: the transcript is segmented by speaker, so you can tell who said what. This is critical because:</p><ul><li>Agent tone and customer tone need to be analyzed <strong>separately</strong></li><li>Resolution detection requires knowing what the <strong>agent</strong> said last</li><li>Compliance checks target <strong>agent language</strong> specifically</li></ul><p>A simplified version of the diarized output looks like this:</p><pre>{<br>  &quot;speaker_labels&quot;: {<br>    &quot;segments&quot;: [<br>      {&quot;speaker_label&quot;: &quot;spk_0&quot;, &quot;start_time&quot;: &quot;0.0&quot;, &quot;end_time&quot;: &quot;5.3&quot;},<br>      {&quot;speaker_label&quot;: &quot;spk_1&quot;, &quot;start_time&quot;: &quot;5.4&quot;, &quot;end_time&quot;: &quot;12.1&quot;}<br>    ]<br>  },<br>  &quot;transcript&quot;: &quot;Hello, thank you for calling... [full text]&quot;<br>}</pre><p>We map spk_0 → Agent, spk_1 → Customer based on who speaks first, then separate their dialogue for downstream NLP.</p><h3>Step 3: The NLP Pipeline (FastAPI Service)</h3><p>The transcribed text goes into a custom FastAPI service that runs three NLP analyses in parallel:</p><h3>3a. 
Sentiment Classification</h3><p>We run sentiment analysis separately on agent turns and customer turns using a fine-tuned transformer model:</p><pre>from transformers import pipeline</pre><pre>sentiment_model = pipeline(<br>    &quot;sentiment-analysis&quot;,<br>    model=&quot;distilbert-base-uncased-finetuned-sst-2-english&quot;<br>)</pre><pre>def analyze_sentiment(text_segments):<br>    results = []<br>    for segment in text_segments:<br>        # Truncate by tokens (the model accepts at most 512), not by characters<br>        result = sentiment_model(segment[&#39;text&#39;], truncation=True, max_length=512)<br>        results.append({<br>            &#39;speaker&#39;: segment[&#39;speaker&#39;],<br>            &#39;sentiment&#39;: result[0][&#39;label&#39;],<br>            &#39;confidence&#39;: result[0][&#39;score&#39;],<br>            &#39;timestamp&#39;: segment[&#39;start_time&#39;]<br>        })<br>    return results</pre><p>This gives us a <strong>sentiment trajectory</strong> across the call: did the customer start frustrated and leave satisfied? Did the agent’s tone stay professional throughout?</p><p><strong>Achieved accuracy: 88–92% on our test set</strong> of labeled call recordings.</p><h3>3b. Tone &amp; Professionalism Scoring</h3><p>Beyond positive/negative sentiment, we classify agent tone across dimensions like:</p><ul><li>Empathy signals (“I understand”, “I apologize”)</li><li>Urgency language (positive: “I’ll resolve this right now”)</li><li>Negative signals (interruptions, dismissive phrasing)</li></ul><h3>3c. Resolution Detection</h3><p>We use keyword pattern matching + contextual NLP to determine whether the call ended in resolution. 
Key signals: confirmation language from the agent, absence of escalation keywords, and a positive customer response in the final 20% of the transcript.</p><h3>Step 4: The 0–100 Quality Score</h3><p>All three analysis outputs feed into a weighted scoring formula:</p><pre>def compute_call_score(sentiment_results, tone_results, resolution_result):<br>    weights = {<br>        &#39;agent_sentiment&#39;:    0.30,  # Was the agent positive and professional?<br>        &#39;customer_sentiment&#39;: 0.25,  # Did customer sentiment improve across the call?<br>        &#39;tone_quality&#39;:       0.25,  # Empathy, clarity, professionalism<br>        &#39;resolution&#39;:         0.20   # Was the issue resolved?<br>    }<br>    <br>    scores = {<br>        &#39;agent_sentiment&#39;:    score_agent_sentiment(sentiment_results),<br>        &#39;customer_sentiment&#39;: score_sentiment_trajectory(sentiment_results),<br>        &#39;tone_quality&#39;:       score_tone(tone_results),<br>        &#39;resolution&#39;:         100 if resolution_result[&#39;resolved&#39;] else 30<br>    }<br>    <br>    final_score = sum(weights[k] * scores[k] for k in weights)<br>    return round(final_score, 1)</pre><p>The output: a single, interpretable 0–100 score per call, plus a breakdown of what drove the score up or down. This is what QA managers actually see.</p><h3>The Results</h3><p>After deploying this system to production:</p><ul><li><strong>85% reduction</strong> in manual auditing effort: QA reviewers now only investigate flagged calls (score &lt; 60)</li><li><strong>10x faster</strong> turnaround: scores are available as soon as each recording is processed, versus 2-week manual review cycles</li><li><strong>100% call coverage</strong>, up from 5% random sampling</li><li><strong>88–92% accuracy</strong> on sentiment classification, validated against human-labeled samples</li></ul><p>The most powerful outcome: <strong>pattern detection became possible</strong>. 
When you analyze every call, you can find systemic issues: specific product complaints spiking on certain dates, agents consistently struggling with certain query types, regional tone differences. None of that was visible with 5% sampling.</p><h3>What I’d Do Differently</h3><p><strong>1. Replace rule-based resolution detection with a fine-tuned classifier.</strong> The keyword approach works but misses nuanced cases. A model fine-tuned on labeled resolved/unresolved transcripts should catch more of those cases.</p><p><strong>2. Add real-time streaming analysis.</strong> The current pipeline works on completed call recordings. Processing calls as they happen (using Amazon Kinesis and Amazon Transcribe streaming) would enable live coaching for agents.</p><p><strong>3. Build a dashboard.</strong> Right now the output is structured JSON. A Grafana or Streamlit dashboard showing score trends, sentiment heatmaps, and flagged call counts would make this immediately useful to non-technical QA managers.</p><h3>Try It Yourself</h3><p>The full project is on GitHub: <a href="https://github.com/SiddarthaGodena-AI/AI-Call-Sentry">AI-Call-Sentry</a></p><p>The repo includes the Lambda handler, FastAPI NLP service, scoring logic, and setup instructions for the AWS infrastructure.</p><p><em>If you found this useful, follow me on </em><a href="https://www.linkedin.com/in/siddartha-g-080b23271"><em>LinkedIn</em></a><em>. I write about building production AI systems, GenAI pipelines, and lessons from deploying ML at work.</em></p><p><em>Next post: How I built a voice-driven multi-agent troubleshooting system using LangChain, Gemini, and RAG, and why multi-agent routing changed everything about response quality.</em></p><p><strong>Tags:</strong> #MachineLearning #AWS #NLP #Python #ArtificialIntelligence #SentimentAnalysis #ProductionML #LLM #FastAPI #MLEngineer</p>]]></content:encoded>
        </item>
    </channel>
</rss>