<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Cherukuri sai on Medium]]></title>
        <description><![CDATA[Stories by Cherukuri sai on Medium]]></description>
        <link>https://medium.com/@cherukurisai?source=rss-64295cea6d86------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*ATgWyxhtOoC2DA5rEairBA@2x.jpeg</url>
            <title>Stories by Cherukuri sai on Medium</title>
            <link>https://medium.com/@cherukurisai?source=rss-64295cea6d86------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sun, 17 May 2026 02:48:41 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@cherukurisai/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[I Built an AI That Fixes Pipeline Failures Before Platform or DevSecOps Teams Get the Slack…]]></title>
            <link>https://pub.towardsai.net/i-built-an-ai-that-fixes-pipeline-failures-before-platform-or-devsecops-teams-gets-the-slack-82ff81114175?source=rss-64295cea6d86------2</link>
            <guid isPermaLink="false">https://medium.com/p/82ff81114175</guid>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[cloud-computing]]></category>
            <category><![CDATA[ai-agent]]></category>
            <category><![CDATA[ci-cd-pipeline]]></category>
            <category><![CDATA[devops]]></category>
            <dc:creator><![CDATA[Cherukuri sai]]></dc:creator>
            <pubDate>Sun, 22 Feb 2026 18:35:15 GMT</pubDate>
            <atom:updated>2026-02-23T06:15:37.921Z</atom:updated>
            <content:encoded><![CDATA[<h3><strong>🤖 I Built an AI That Fixes Pipeline Failures Before Platform or DevSecOps Teams Get the Slack Message</strong></h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2VljXKA1W0PSfeMdC5jUdg.jpeg" /></figure><p><strong>📉 The Slack Notification That’s Crushing Your Sprint Velocity</strong></p><p><strong>“Hey, the pipeline failed again. Can you check?”</strong></p><p><em>Whether it’s a 2 AM production call or the middle of business hours, your phone buzzes. You squint at Slack — another developer hasn’t even <strong>looked</strong> at the error logs. They just pinged you. Again. Meanwhile, your actual engineering tasks? Delayed. Your sprint commitments? Slipping.</em></p><p><strong>Sound familiar?</strong></p><p><em>If you’re on a Platform or DevSecOps team, this is your daily reality. Developers don’t read error logs most of the time. They screenshot red X’s and ask, “</em><strong><em>Can you fix this?</em></strong><em>” Meanwhile, you’re playing detective through </em><strong><em>terraform</em></strong><em> traces, </em><strong><em>kubectl</em></strong><em> describes, and </em><strong><em>npm</em></strong><em> error dumps.</em></p><h3><strong>🤔 “But wait, doesn’t my CI/CD tool have AI features now?”</strong></h3><p>Sure. Many platforms are adding AI capabilities. But here’s the problem:</p><p><em>❌ They don’t understand <strong>your</strong> organizational standards and policies</em></p><p><em>❌ You can’t customize them with your company’s best practices</em></p><p><em>❌ They’re generic — trained on public data, not your codebase patterns</em></p><p><em>❌ Vendor lock-in — you’re stuck with whatever they decide to build</em></p><p><em>❌ Can’t inject your known failure patterns and compliance rules</em></p><p><strong>💡 What if you could build something better? 
A pipeline that truly understands YOUR context and diagnoses itself?</strong></p><blockquote><strong>I Got Tired of Being a Human Log Parser</strong></blockquote><p>After the 47th “npm test failed, help!” message (where the error literally said “package.json not found”), I’d had enough.</p><p>So, I built something different: <strong>An AI-powered failure analyzer that tells developers EXACTLY what’s wrong and how to fix it — right in the pipeline output.</strong></p><blockquote>No Slack. No tickets. No context switching. Just instant, actionable answers.</blockquote><h3><strong>Here’s What It Looks Like in Action</strong></h3><p><strong>Example 1: NPM Test Failure<br>Before: </strong><em>The Typical Developer Experience</em></p><pre>❌ Run Tests Failed<br>Error: Process completed with exit code 1<br><br>npm ERR! code ENOENT<br>npm ERR! syscall open<br>npm ERR! path /home/runner/work/project/package.json<br>npm ERR! errno -2<br>npm ERR! enoent ENOENT: no such file or directory, open &#39;/home/runner/work/project/package.json&#39;<br>npm ERR! enoent This is related to npm not being able to find a file.</pre><blockquote><strong>Developer: “I’ll ask DevOps…”</strong> 😕</blockquote><p><strong>After: </strong><em>With AI Analysis</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ztWVbqtFxQccx6XFNvYj1A.png" /></figure><blockquote>Developer: <em>“Oh, I’ll fix that.”</em> ✅</blockquote><p><strong>Example 2: Helm Deployment Failure<br>Before: </strong><em>The Typical Developer Experience</em></p><pre>❌ Helm Install Failed<br>Error: timed out waiting for the condition</pre><blockquote>Developer: <em>“I’ll ask DevOps…”</em></blockquote><p><strong>After: </strong><em>With AI Analysis</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Fax5KWsPzh5YpiZw3Dyz6g.png" /></figure><blockquote>Developer: <em>“Got it, fixing now. 
And I’ll add those kubectl commands to my pipeline for next time.”</em> ✅</blockquote><p><strong>Example 3: Terraform Validation Failure<br>Before: </strong><em>The Typical Developer Experience</em></p><pre>❌ Terraform Validate Failed<br>Error: Error: Unsupported argument</pre><blockquote>Developer: <em>“I’ll ask DevOps…”</em></blockquote><p><strong>After: </strong><em>With AI Analysis</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*95c1rTUz6lsunBT7OdfntQ.png" /></figure><blockquote>Developer: <em>“Makes sense, changing it.”</em> ✅</blockquote><p><strong>Example 4: Git Authentication Failure (Rule-Based)<br>Before: </strong><em>The Typical Developer Experience</em></p><pre>❌ Clone Private Repo Failed<br>Error: Process completed with exit code 1<br><br>Cloning repository...<br>fatal: could not read Username for &#39;https://github.com&#39;: No such device or address<br>Permission denied (publickey).<br>fatal: Could not read from remote repository.</pre><blockquote>Developer: <em>“I’ll ask DevOps…”</em> 😕</blockquote><p><strong>After: </strong><em>With Rule-Based Analysis</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*DjP3iHu9Gqzbu4nBXZQqCg.png" /></figure><blockquote>Developer: <em>“Ah, permissions issue. Let me check with the Platform team — they manage our tokens.”</em> ✅</blockquote><p><strong>🔬 The Secret Sauce: Rule-Based + AI Hybrid Approach</strong></p><p>I didn’t want another chatbot. I wanted something that understands <strong><em>pipeline failures</em></strong> specifically.</p><p>So, I built a hybrid system that combines the best of both worlds:</p><p><strong>1. 
Rule-Based Analysis (Standards &amp; Policies)</strong></p><blockquote>Injected our organization’s coding standards and best practices</blockquote><blockquote>Pre-configured common failure patterns</blockquote><blockquote>Fast, deterministic checks for known issues</blockquote><blockquote>Enforces company-specific policies and compliance requirements</blockquote><p><strong>2. AI-Powered Deep Analysis (Custom Model)</strong></p><blockquote>When rules don’t match, custom AI model takes over</blockquote><blockquote>Custom prompts tuned for infrastructure and deployment errors</blockquote><blockquote>Understands context across multiple files and configurations</blockquote><blockquote>Learns from error patterns we haven’t seen before</blockquote><p><strong>3. Smart Context Capture</strong></p><blockquote>Not just “npm failed” but the actual error output</blockquote><blockquote>Pod descriptions for Kubernetes issues</blockquote><blockquote>Terraform validation errors with line numbers</blockquote><blockquote>Relevant configuration files (values.yaml, package.json, etc.)</blockquote><p><strong>4. 
Actionable Intelligence</strong></p><blockquote>“Change X to Y in file Z”</blockquote><blockquote>Not vague suggestions like “check your config”</blockquote><blockquote>Includes confidence level so devs know when to escalate</blockquote><p><strong>🏗️ System Architecture<br></strong>Here’s how it all flows together:</p><pre><br>┌─────────────────────────────────────┐<br>│      GitHub Actions Pipeline       │<br>│    (or any CI/CD platform)         │◄──── Developer pushes code<br>└───────────────┬─────────────────────┘<br>                │<br>                │ ❌ Pipeline Failure<br>                ▼<br>┌─────────────────────────────────────┐<br>│       Error Context Capture         │<br>│  • Build/test logs                  │<br>│  • Config files (values.yaml, etc)  │<br>│  • Pod status (Kubernetes)          │<br>│  • Terraform output                 │<br>└───────────────┬─────────────────────┘<br>                │<br>                ▼<br>┌─────────────────────────────────────────────────────┐<br>│            (Serverless)                             │<br>│                                                     │<br>│  ┌───────────────────────────────────────────┐     │<br>│  │      Rule-Based Analysis Engine           │     │<br>│  │  • Organizational Standards               │     │<br>│  │  • Common Failure Patterns                │     │<br>│  │  • Policy Compliance Checks               │     │<br>│  └───────────────┬───────────────────────────┘     │<br>│                  │                                  │<br>│     Known Issue? 
│                                  │<br>│         ✓        │         ✗ Unknown Issue          │<br>│         │        └──────────────┐                   │<br>│         │                       ▼                   │<br>│         │          ┌─────────────────────────┐      │<br>│         │          │   Custom AI Model       │      │<br>│         │          │  • Custom Prompts       │      │<br>│         │          │  • Context Analysis     │      │<br>│         │          │  • Root Cause Detection │      │<br>│         │          └─────────────────────────┘      │<br>│         │                       │                   │<br>│         └───────────┬───────────┘                   │<br>│                     ▼                               │<br>│  ┌─────────────────────────────────────────┐       │<br>│  │         Response Generation             │       │<br>│  │  • Root Cause Identified                │       │<br>│  │  • Affected File/Location               │       │<br>│  │  • Exact Fix Instructions               │       │<br>│  │  • Confidence Level                     │       │<br>│  └─────────────────────────────────────────┘       │<br>└───────────────┬─────────────────────────────────────┘<br>                │<br>                ▼<br>┌─────────────────────────────────────┐<br>│       Pipeline Output (UI)          │<br>│  🤖 AI FAILURE ANALYSIS             │<br>│  📊 Root Cause + File Location      │<br>│  🔧 Step-by-Step Fix                │<br>│  ✅ Confidence: High/Medium/Low     │<br>└───────────────┬─────────────────────┘<br>                │<br>                ▼<br>┌─────────────────────────────────────┐<br>│         Developer Action            │<br>│  • Reads clear explanation          │<br>│  • Applies fix immediately          │<br>│  • No DevOps interruption needed ✅ │<br>└─────────────────────────────────────┘<br></pre><p><strong>The Flow:</strong></p><p>1. <strong>Pipeline Fails</strong> → Error logs, config files, and context captured<br>2. 
<strong>Rule Engine</strong> → Checks against organizational standards and known patterns<br>3. <strong>AI Analysis</strong> → If rules don’t match, the custom AI model analyzes with tailored prompts<br>4. <strong>Response</strong> → Developer gets exact file location, root cause, and fix steps<br>5. <strong>Resolution</strong> → Developer applies fix or an auto-fix PR is created</p><blockquote><em>The entire process takes 2–5 seconds from failure to actionable recommendation.</em></blockquote><h3><strong>📊 The Impact: Measured in Hours Saved</strong></h3><p><strong>Before:<br></strong>- Average resolution time: 45 minutes<br>- 60% of issues: developers didn’t read logs<br>- Platform team: interrupt-driven firefighting</p><p><strong>After:<br></strong>- 80% of issues: self-service fixes<br>- Developers get answers in seconds<br>- Platform team: focused on actual platform work</p><p><strong>🛠️ You Can Build This Too<br></strong><em>The architecture is straightforward and cloud-agnostic:</em></p><p>✅ <strong>Serverless Function: </strong>(Azure Functions, AWS Lambda, or Google Cloud Functions)<br>✅ <strong>AI Model: </strong>(Azure OpenAI, AWS Bedrock, or Google Vertex AI — deployed as a custom model for this POC)<br>✅ <strong>CI/CD Integration:</strong> (Works with GitHub Actions, Harness, GitLab CI, Jenkins, Azure DevOps, or any pipeline)<br>✅ <strong>Multi-Stack Support:</strong> (npm, Docker, Kubernetes, Terraform, Helm, and any build/deployment tool)</p><blockquote>No vendor lock-in. Choose your cloud provider and AI service. The concept works across all major platforms.</blockquote><p><strong>🚀 The Future: Self-Healing Pipelines?</strong></p><p>Right now, it <strong><em>diagnoses</em></strong>.<br>Next step? <strong>Auto-fix pull requests.<br>Imagine:</strong></p><p>1. Pipeline fails on invalid Kubernetes <strong>CPU</strong> value<br>2. AI detects <strong>cpu: INVALID_VALUE</strong> in values.yaml<br>3. 
Bot creates PR: “<strong>Fix: Change cpu to 100m</strong>”<br>4. Developer reviews and merges<br>5. Pipeline passes</p><p>From failure to fix in 30 seconds. No human parsing logs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VUn-zOI6rQiX68V19kioGg.png" /></figure><h3><strong>💥 The Bottom Line: This Changes Everything</strong></h3><p>Every platform or DevOps team on the planet faces this problem. From startups to Fortune 500 companies, the cycle is the same:</p><blockquote>Developer breaks pipeline</blockquote><blockquote>DevOps drops everything to read logs</blockquote><blockquote>Manual explanation of obvious error</blockquote><blockquote>Repeat 50 times a day</blockquote><p><strong>I built this AI agent because I got tired of being a log-reading service.</strong></p><p>This isn’t just another automation script. It’s a <strong>fundamental shift</strong> in how we think about developer independence and platform or DevOps team efficiency.</p><p><strong>The Real Impact:</strong></p><blockquote>Your platform or DevOps team stops being interrupt-driven</blockquote><blockquote>Developers solve their own issues in seconds, not hours</blockquote><blockquote>Your organizational knowledge is baked into every analysis</blockquote><blockquote>You own the code — no vendor dictating features or pricing</blockquote><blockquote>It works across every tool in your stack</blockquote><p><strong>What took me 45 minutes to debug now takes developers 45 seconds to fix themselves.</strong></p><p>That’s not just time saved. That’s your platform engineers building the future instead of explaining the past.</p><p>This is what modern platform engineering looks like — <strong>systems that scale knowledge, not just infrastructure.</strong></p><h3><strong>💬 Want to stop being your team’s human error parser?</strong></h3><p>I’ve built the complete architecture — rule engine, custom AI integration, and deployment framework — that’s already saving platform teams hours every week.</p><p><strong>Let’s talk.</strong> Visit my portfolio to get in touch if you’re ready to:</p><blockquote>Cut your team’s pipeline debugging time by 80%</blockquote><blockquote>Give developers self-service failure diagnosis</blockquote><blockquote>Finally focus on building features instead of reading logs</blockquote><p><strong><em>Portfolio: </em></strong><a href="https://www.cherukurisai.com/"><strong><em>Click Here</em></strong></a></p><p>The code isn’t open source, but the conversation is. Let’s build something powerful for your team.</p><p><em>Have a platform engineering or DevOps story? Drop it in the comments — I’d love to hear how other teams are scaling their operations.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=82ff81114175" width="1" height="1" alt=""><hr><p><a href="https://pub.towardsai.net/i-built-an-ai-that-fixes-pipeline-failures-before-platform-or-devsecops-teams-gets-the-slack-82ff81114175">🤖 I Built an AI That Fixes Pipeline Failures Before Platform or DevSecOps Teams Get the Slack…</a> was originally published in <a href="https://pub.towardsai.net">Towards AI</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Beyond Migration: How We Engineered a Secure & Intelligent Delivery Platform with Harness CICD]]></title>
            <link>https://medium.com/devsecops-community/beyond-migration-how-we-engineered-a-secure-intelligent-delivery-platform-with-harness-cicd-6b994077dee4?source=rss-64295cea6d86------2</link>
            <guid isPermaLink="false">https://medium.com/p/6b994077dee4</guid>
            <category><![CDATA[devsecops]]></category>
            <category><![CDATA[harness]]></category>
            <category><![CDATA[architecture]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[cicd]]></category>
            <dc:creator><![CDATA[Cherukuri sai]]></dc:creator>
            <pubDate>Thu, 19 Feb 2026 05:44:39 GMT</pubDate>
            <atom:updated>2026-02-26T16:00:11.022Z</atom:updated>
            <content:encoded><![CDATA[<h4>Our Harness migration became the turning point — not because of the tool, but because of the architecture we built around it.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RBwhyps-gux00f4788Vf2g.png" /></figure><p><strong>TABLE OF CONTENTS</strong></p><ol><li>Introduction</li><li>Executive Results</li><li>Phase 1 — Redesigning Identity</li><li>Phase 2 — Delegate Architecture Redesign</li><li>Phase 3 — Deterministic Execution</li><li>Phase 4 — Governance as Code</li><li>Phase 5 — Immutable Artifact Lifecycle</li><li>Phase 6 — Progressive Delivery and Feature Flags</li><li>Capabilities Most Teams Never Operationalize</li><li>Migration vs Modernization</li><li>Conclusion</li></ol><h4>Introduction:</h4><p><strong>Most organizations treat CI/CD migration as a tooling upgrade.</strong></p><blockquote><strong>Replace Jenkins, TeamCity, GitHub Actions, etc.<br>Adopt Harness.<br>Recreate pipelines.</strong></blockquote><blockquote><em>But migration only upgrades tools.<br>Modernization upgrades architecture.</em></blockquote><p>When we moved to Harness, I knew that simply shifting pipelines would not reduce risk, improve reliability, or strengthen governance. 
Carrying forward our existing trust and execution model would only scale our weaknesses.</p><p>So instead of treating this as a CI/CD replacement, we approached it as <strong>Secure Delivery Platform Engineering</strong> — redesigning identity, governance, execution boundaries, artifact flow, and reliability as first‑class platform concerns.</p><p><strong>CI/CD is not automation.</strong></p><blockquote>It is a privileged control plane.<br>If engineered casually, it scales risk.<br>If engineered intentionally, it scales safety and velocity.</blockquote><h3>🔎 Executive Results</h3><ul><li>🔐 <strong>100% removal of static cloud credentials</strong> — 37 IAM keys eliminated with OIDC</li><li>📉 <strong>~40% reduction in pipeline inconsistencies</strong> through deterministic execution</li><li>🚫 <strong>Zero unapproved production deployments</strong> after policy‑as‑code enforcement</li><li>⚡ <strong>~30% throughput improvement</strong> with delegate segmentation + scaling</li><li>🛡 <strong>~50% reduction in deployment‑related risk</strong> using feature flags &amp; progressive delivery</li><li>📦 <strong>100% artifact traceability</strong> via build‑once, promote‑everywhere</li><li>📊 Stronger audit posture and reduced governance review overhead</li></ul><p><strong><em>These were not cosmetic improvements.<br>They were architectural corrections.</em></strong></p><h3>Phase 1 — Redesigning Identity, Not Just Pipelines</h3><p><strong>Our first challenge</strong>: credential sprawl.</p><ul><li>37 static IAM access keys across pipelines</li><li>Shared service accounts</li><li>Cross‑environment permissions</li></ul><p>We replaced static credentials with <strong>OIDC‑based role assumption</strong>:</p><ul><li>Pipelines assumed short‑lived scoped roles</li><li>Environment‑specific access</li><li>Long‑lived secrets eliminated</li></ul><p><strong>Impact:</strong></p><ul><li>Entire category of credential leakage risk removed</li><li>90% reduction in credential 
rotation</li><li>Stronger audit traceability</li></ul><p>CI/CD became <strong>identity‑aware execution infrastructure</strong>.</p><h3>Phase 2 — Treating Delegates as Privileged Control Plane Infrastructure</h3><p><strong>Delegates perform:</strong></p><ul><li>Infrastructure provisioning</li><li>Cluster operations</li><li>Secret access</li><li>Production deployments</li></ul><blockquote>They are not background agents.<br>They are privileged systems.</blockquote><p>We redesigned delegate architecture:</p><ul><li>Dedicated delegate groups per environment</li><li>Enforced delegate selectors in pipelines</li><li>Production delegates placed in private subnets</li><li>Restricted outbound egress</li></ul><p><strong>Impact:</strong></p><ul><li>Reduced cross‑environment execution risk</li><li>Contained blast radius</li><li>Clear execution boundaries</li></ul><p>Trust became <strong>intentional</strong>, not shared.</p><h3>Phase 3 — Deterministic Execution Using Containerized Toolchains</h3><p>Instead of manual tool installation on delegates, we built versioned CI images containing:</p><ul><li>Terraform, TFLint, Checkov</li><li>kubectl, Helm</li><li>AWS CLI, AZ CLI</li><li>OPA, Cosign</li><li>Internal validation scripts</li></ul><p>Pipelines executed inside these containers.</p><p><strong>Impact:</strong></p><ul><li>Zero delegate drift</li><li>~40% fewer pipeline inconsistencies</li><li>Easy tool upgrades via image versioning</li></ul><p>Tooling became <strong>deterministic and reproducible</strong>.</p><h3>Phase 4 — Governance as Code, Not Process</h3><p>Security guidance without enforcement is optional.</p><p>We enforced governance at platform level:</p><ul><li>Organization‑level reusable templates</li><li>Mandatory scanning and validation steps</li><li>Policy‑as‑code enforcement (OPA)</li><li>Approval logic encoded in pipelines</li><li>Registry restrictions and disallowed “latest” tags</li></ul><p><strong>Impact:</strong></p><ul><li>Zero bypassed production 
governance</li><li>Standardized patterns across teams</li><li>Faster compliance cycles</li></ul><p>Governance became <strong>automated, not manual</strong>.</p><h3>Phase 5 — Immutable Artifact Lifecycle</h3><p>We eliminated rebuild‑per‑environment patterns.</p><p>Instead:</p><ul><li>Build once</li><li>Sign artifact</li><li>Promote Dev → QA → Prod</li><li>Verify signatures before deploy</li></ul><p><strong>Impact:</strong></p><ul><li>100% artifact traceability</li><li>Less drift and fewer surprises</li><li>Strong rollback confidence</li></ul><p>Production became a <strong>promotion environment</strong>, not a rebuild environment.</p><h3>Phase 6 — Progressive Delivery &amp; Feature Flags</h3><p>The biggest risk reduction came from feature flags:</p><ul><li>Canary rollouts</li><li>Gradual traffic exposure</li><li>Instant rollback via flag toggle</li><li>Environment‑based flag policies</li></ul><p><strong>Impact:</strong></p><ul><li>~50% reduction in deployment incidents</li><li>Faster mitigation</li><li>Higher deployment frequency with lower risk</li></ul><p>Deployment and exposure were <strong>decoupled</strong>.</p><h3>Capabilities Most Teams Never Operationalize</h3><p>Most teams adopt Harness.<br>Few operationalize its full platform capabilities.</p><p>Here’s what we embedded:</p><h3>A. Git‑Based Pipeline Change Governance</h3><ul><li>PR‑based updates</li><li>No UI editing</li><li>Full traceability</li></ul><p>Pipelines became <strong>infrastructure‑as‑code</strong>.</p><h3>B. Monitoring‑Driven Automated Rollback</h3><ul><li>Canary vs baseline checks</li><li>Automated anomaly detection</li><li>Auto‑rollback triggers</li></ul><p>Deployments became <strong>self‑validating</strong>.</p><h3>C. Delegate Auto‑Scaling</h3><ul><li>Kubernetes‑based scaling</li><li>Elastic execution</li><li>Reduced idle costs</li></ul><p>CI/CD became <strong>elastic infrastructure</strong>.</p><h3>D. 
Error Budget–Aware Deployment Gating</h3><ul><li>SLO health checks</li><li>Deployment restrictions during instability</li></ul><p>Delivery became <strong>reliability‑aware</strong>.</p><h3>E. Chaos‑Validated Rollbacks</h3><ul><li>Rollback paths tested through chaos engineering</li></ul><p>Resilience became <strong>provable</strong>.</p><h3>F. Centralized Connector Governance</h3><ul><li>No team‑owned connectors</li><li>Centralized authentication patterns</li></ul><p>Credential sprawl dropped significantly.</p><h3>G. Developer Experience Uplift</h3><ul><li>Faster troubleshooting</li><li>Reusable templates</li><li>Safer experimentation</li><li>Predictable deployments</li></ul><p>Developers gained <strong>safe autonomy</strong>.</p><h3>Migration vs. Modernization</h3><blockquote><strong>Migration moves pipelines.</strong><br><strong>Modernization redesigns the delivery platform.</strong></blockquote><p>Modernization means:</p><ul><li>Identity redesign</li><li>Shared‑nothing execution boundaries</li><li>Governance as code</li><li>Deterministic toolchains</li><li>Immutable artifacts</li><li>Progressive delivery</li><li>Reliability‑aware deployment gates</li></ul><blockquote><strong>Many organizations migrate.<br>Few modernize.</strong></blockquote><h3>Conclusion:</h3><p>Harness did not modernize our ecosystem.<br><strong>Architectural intent did.</strong></p><p>By redesigning identity, segmentation, deterministic execution, governance, artifact flow, and reliability, we transformed CI/CD from automation into a <strong>secure delivery platform</strong>.</p><p>CI/CD is not just a pipeline.<br>It is a <strong>privileged control plane</strong>.</p><p>When engineered deliberately, it becomes the foundation of safe, scalable, high‑trust delivery.</p><p><strong>Tools don’t create maturity.<br>Architecture does.<br>Intent does.<br>Design does.</strong></p><blockquote><strong>Harness was the canvas.<br>Secure Delivery Platform Engineering was the 
art.</strong></blockquote><h3>DevSecOps — Community 🚀</h3><p><em>Thank you for being a part of the </em><a href="https://medium.com/devsecops-community/devopsin90days/home"><strong><em>DevSecOps — Community</em></strong></a><em>! Before you go:</em></p><ul><li>Be sure to <strong>clap</strong> 👏 and <strong>follow</strong> the Author</li><li>Follow: <a href="https://medium.com/devsecops-community/newsletters/devsecops"><strong>Newsletter</strong></a> | <a href="https://www.linkedin.com/groups/14547253/"><strong>LinkedIn Groups</strong></a></li><li>More content at <a href="https://medium.com/devsecops-community/devopsin90days/home"><strong>DevSecOps — Community</strong></a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=6b994077dee4" width="1" height="1" alt=""><hr><p><a href="https://medium.com/devsecops-community/beyond-migration-how-we-engineered-a-secure-intelligent-delivery-platform-with-harness-cicd-6b994077dee4">Beyond Migration: How We Engineered a Secure &amp; Intelligent Delivery Platform with Harness CICD</a> was originally published in <a href="https://medium.com/devsecops-community">devsecops-community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Part 2 — How to Build Your Own AI Agent (Cloud-Agnostic, Fully Automated, Enterprise-Ready)]]></title>
            <link>https://pub.towardsai.net/part-2-how-to-build-your-own-ai-agent-cloud-agnostic-fully-automated-enterprise-ready-ec3c749570ac?source=rss-64295cea6d86------2</link>
            <guid isPermaLink="false">https://medium.com/p/ec3c749570ac</guid>
            <category><![CDATA[ci-cd-pipeline]]></category>
            <category><![CDATA[cloud-computing]]></category>
            <category><![CDATA[ai-agent]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[terraform]]></category>
            <dc:creator><![CDATA[Cherukuri sai]]></dc:creator>
            <pubDate>Mon, 16 Feb 2026 09:16:20 GMT</pubDate>
            <atom:updated>2026-02-23T06:14:32.691Z</atom:updated>
            <content:encoded><![CDATA[<h4><em>From natural-language prompts → to Terraform module → to PR → to CI/CD → to validation</em></h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*p_M5C9s7tpRyHYx-hnwZHQ.jpeg" /></figure><p>Most AI content explains concepts.<br>This guide helps you <strong>build something real</strong> — a fully functioning AI agent that:</p><ul><li>Understands natural-language infrastructure requests</li><li>Generates <strong>complete Terraform modules</strong> (multi-file)</li><li>Enforces <strong>strict enterprise standards</strong></li><li>Auto-fixes issues via LLM reasoning</li><li>Creates a GitHub branch</li><li>Commits all files</li><li>Opens a pull request</li><li>Triggers GitHub Actions</li><li>Runs <strong>unit tests</strong></li><li>Runs <strong>Terraform init + validate + plan</strong></li><li>Works in <strong>any cloud</strong> (AWS, Azure, GCP, etc.)</li><li>Works in <strong>any pipeline</strong> (GitHub, Harness, GitLab, Jenkins, Azure DevOps, etc.)</li></ul><p>By the end of this article, you’ll have the blueprint of a <strong>Digital DevOps Engineer Agent</strong>.</p><h3>1️⃣ What Makes This Agent Different?</h3><p>It does <em>not</em> assume AWS or Azure or GCP.<br>If your prompt says:</p><blockquote>“Create AWS Lambda module” → Generates AWS Terraform</blockquote><blockquote>“Create Azure Storage Account module” → Generates Azure Terraform</blockquote><blockquote>“Create GCP Cloud Run module” → Generates GCP Terraform</blockquote><p>It adapts automatically because it generates <strong>pure Terraform (HCL)</strong>.</p><h3>✔ Cloud‑Agnostic</h3><p>Works with <strong>ANY Terraform provider</strong>, including:</p><ul><li>AWS</li><li>Azure</li><li>GCP</li><li>OCI</li><li>Cloudflare</li><li>Kubernetes</li><li>DigitalOcean</li><li>VMware vSphere</li><li>Proxmox</li><li>GitHub provider</li><li>And <em>every</em> provider on the Terraform registry</li></ul><h3>✔ CI/CD‑Agnostic</h3><p>Runs in:</p><ul><li>GitHub 
Actions</li><li>Harness</li><li>GitLab CI</li><li>Bitbucket Pipelines</li><li>Jenkins</li><li>Azure DevOps</li><li>CircleCI</li></ul><p>Anywhere Python + Terraform exist — the agent works.</p><h3>✔ Enterprise‑Grade Validation</h3><p>Your standards engine validates:</p><blockquote>Required tags</blockquote><blockquote>snake_case variable names</blockquote><blockquote>IAM least privilege</blockquote><blockquote>Secret detection</blockquote><blockquote>Deprecated syntax detection</blockquote><blockquote>VPC subnet structure</blockquote><blockquote>Provider best practices</blockquote><blockquote>Module reusability</blockquote><blockquote>Terraform version constraints</blockquote><p>And does this for <strong>every .tf file in the module</strong>.</p><h3>✔ Complete Multi‑File Generation</h3><p>Every request generates:</p><pre>main.tf<br>variables.tf<br>outputs.tf<br>providers.tf<br>README.md</pre><h3>2️⃣ Understanding AI Agents (Enterprise Context)</h3><p>An AI Agent is not just an LLM.<br> It is a system made of:</p><p>🔹 <strong>Brain (LLM):</strong> <em>Interprets the user’s request.</em><br>🔹 <strong>Memory (Context, RAG):</strong> <em>Holds standards, patterns, best practices.<br></em>🔹 <strong>Tools (Python, Filesystem, Terraform CLI, GitHub API):</strong> <em>Allows the agent to act, not just talk.<br></em>🔹 <strong>Reasoning Loop:</strong> <em>Plan → Generate → Validate → Fix → Loop<br></em>🔹 <strong>Policy Layer:</strong> <em>Your org’s security, naming, tagging, compliance.<br></em>🔹 <strong>Runtime Environment</strong>: <em>GitHub Actions, Pipelines, Local runner, Cloud VMs.</em></p><p>Together, they form a <strong>Digital DevOps Engineer</strong>.</p><h3>3️⃣ Prerequisites</h3><h3>✔ Skills</h3><ul><li>Python</li><li>Terraform</li><li>GitHub</li><li>CI/CD</li><li>Prompt engineering basics</li></ul><h3>✔ Installations</h3><p>Run:</p><pre>pip install openai langchain python-dotenv PyGithub<br>brew install terraform        # Mac<br>choco install terraform       # 
Windows</pre><h3>✔ Organization Inputs</h3><p>Prepare a standards file:<br><strong>standards.md</strong></p><pre>1. Tags required: environment, owner, cost_center.<br>2. Variables must use snake_case.<br>3. IAM must follow least privilege.<br>4. No hardcoded secrets.<br>5. Modules must be reusable.<br>6. VPC must include public + private subnets.</pre><p>This becomes your <strong>policy engine</strong>.</p><h3><strong>4️⃣ Architecture of What We’re Building</strong></h3><pre>User Provides Prompt<br>        ↓<br>┌───────────────────────────────────────────────────────┐<br>│  Agent Pipeline (agent_real.py)                       │<br>│  ────────────────────────────────────                 │<br>│  1. Plan → Break request into steps (LLM)            │<br>│  2. Generate → Create Terraform module files          │<br>│     • main.tf (resources)                             │<br>│     • variables.tf (inputs with validation)           │<br>│     • outputs.tf (outputs with descriptions)          │<br>│     • providers.tf (provider versions &amp; config)       │<br>│     • README.md (usage documentation)                 │<br>│  3. Validate → Check against standards                │<br>│     • Required tags (environment, owner, cost_center) │<br>│     • snake_case variables                            │<br>│     • No hardcoded secrets                            │<br>│     • Least-privilege IAM                             │<br>│     • No deprecated features                          │<br>│     • Current provider best practices                 │<br>│  4. Fix → Auto-correct issues (LLM or heuristics)     │<br>│  5. 
Loop → Repeat validate/fix until clean            │<br>└───────────────────────────────────────────────────────┘<br>        ↓<br>┌───────────────────────────────────────────────────────┐<br>│  GitHub Integration (agent.py)                        │<br>│  ─────────────────────────                            │<br>│  • Create feature branch: ai/&lt;slug&gt;-YYYYMMDDHHMMSS    │<br>│  • Commit all module files to modules/&lt;branch&gt;/       │<br>│  • Open Pull Request                                  │<br>└───────────────────────────────────────────────────────┘<br>        ↓<br>┌───────────────────────────────────────────────────────┐<br>│  CI/CD Workflows (.github/workflows/)                 │<br>│  ────────────────────────────────────                 │<br>│  • python-tests.yml → Run validation tests            │<br>│  • terraform.yml → init + validate + plan all modules │<br>└───────────────────────────────────────────────────────┘<br>        ↓<br>   PR Ready for Review</pre><h3><strong>5️⃣ Step 1: Build the Python Utility (Terraform Standards Validator)</strong></h3><p><strong>Create: terraform_standards.py<br></strong>This is <strong>your AI enforcer</strong>.</p><pre>import re<br><br>TERRAFORM_STANDARDS = &quot;&quot;&quot;<br>1. Required tags: environment, owner, cost_center.<br>2. IAM roles must have least-privilege policies.<br>3. Use snake_case for variables.<br>4. No hardcoded secrets allowed.<br>5. Modules must be reusable.<br>6. 
VPC must include public + private subnets.<br>&quot;&quot;&quot;<br><br><br>def validate(code: str) -&gt; str:<br>    issues = []<br><br>    # Check for tags presence anywhere<br>    if not re.search(r&quot;\btags\s*=\s*\{&quot;, code):<br>        issues.append(&quot;Missing required tags block (environment, owner, cost_center).&quot;)<br>    else:<br>        # Ensure required tag keys exist in any tags block<br>        tags_blocks = re.findall(r&quot;tags\s*=\s*\{([^}]*)\}&quot;, code, flags=re.S)<br>        for tb in tags_blocks:<br>            if &quot;environment&quot; not in tb or &quot;owner&quot; not in tb or &quot;cost_center&quot; not in tb:<br>                issues.append(&quot;Tags block missing one of environment/owner/cost_center.&quot;)<br>                break<br><br>    # Hardcoded secret detection (common patterns)<br>    if re.search(r&quot;(?i)aws_secret_access_key|aws_access_key_id|secret\s*=|password\s*=|passwd\s*=|\bSECRET_&quot;, code):<br>        issues.append(&quot;Hardcoded secret or credentials detected.&quot;)<br>    # Detect secrets in variable defaults or variable names (e.g., default = &quot;secret123&quot;)<br>    if re.search(r&quot;(?i)default\s*=\s*\&quot;.*(secret|password|passwd).*\&quot;&quot;, code) or re.search(r&quot;(?i)variable\s+\&quot;.*(password|secret|passwd).*\&quot;&quot;, code):<br>        issues.append(&quot;Hardcoded secret detected in variable default or name.&quot;)<br><br>    # Variable naming heuristic: flag variables with uppercase or camelCase<br>    if re.search(r&quot;variable\s+\&quot;.*([A-Z].*|[a-z]+[A-Z].*)\&quot;&quot;, code):<br>        issues.append(&quot;Variables should use snake_case (avoid CamelCase or uppercase).&quot;)<br><br>    # IAM least-privilege heuristic: look for wildcard resources or actions<br>    if re.search(r&quot;aws_iam_policy|aws_iam_role_policy&quot;, code):<br>        if re.search(r&#39;&quot;?Resource&quot;?\s*:\s*\[?\s*&quot;?\*&quot;?&#39;, code) or 
re.search(r&#39;&quot;?Action&quot;?\s*:\s*\[?\s*&quot;?.*\*.*&quot;?&#39;, code):<br>            issues.append(&quot;IAM policy uses wildcard Action or Resource; prefer least-privilege.&quot;)<br><br>    # IAM role existence but missing policy<br>    if &quot;aws_iam_role&quot; in code and not re.search(r&quot;aws_iam_policy|role_policy|policy\s*=&quot;, code):<br>        issues.append(&quot;IAM role present but no inline or attached policy found.&quot;)<br><br>    # VPC subnet check<br>    if re.search(r&quot;resource\s+\&quot;aws_vpc\&quot;&quot;, code) and not re.search(r&quot;resource\s+\&quot;aws_subnet\&quot;.*(public|private)|public_subnet|private_subnet&quot;, code, flags=re.S):<br>        issues.append(&quot;VPC must include both public and private subnets.&quot;)<br><br>    # Module reusability: prefer modules rather than repeating resources<br>    if re.search(r&quot;resource\s+\&quot;aws_vpc\&quot;.*resource\s+\&quot;aws_vpc\&quot;&quot;, code, flags=re.S):<br>        issues.append(&quot;Duplicate VPC resources detected; prefer reusable modules.&quot;)<br><br>    return &quot;OK&quot; if not issues else &quot;\n&quot;.join(issues)<br><br><br>if __name__ == &quot;__main__&quot;:<br>    # quick local smoke test<br>    sample = &#39;&#39;&#39;<br>resource &quot;aws_vpc&quot; &quot;main&quot; {<br>  cidr_block = &quot;10.0.0.0/16&quot;<br>}<br>&#39;&#39;&#39;<br>    print(validate(sample))</pre><h3><strong>6️⃣ Step 2: Build the Agent (Brain + Tools)</strong></h3><p><strong>This file does the magic:</strong></p><ul><li><em>Uses OpenAI (or Mock LLM for free testing)</em></li><li><em>Generates 5 Terraform files with </em><em>### FILE: markers</em></li><li><em>Parses multi-file output</em></li><li><em>Validates each </em><em>.tf file</em></li><li><em>Fixes issues using LLM or fallback 
heuristics</em></li><li><em>Repeats validation (max iterations)</em></li><li><em>Returns a clean, production-ready module</em></li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6ZZDlwD1ODbwKDICdtNNew.png" /></figure><p>Create: <strong>agent_real.py</strong></p><pre>import os<br>import re<br>import json<br>from dotenv import load_dotenv<br><br>load_dotenv()<br><br>from terraform_standards import validate, TERRAFORM_STANDARDS<br><br>OPENAI_MODEL = os.getenv(&quot;OPENAI_MODEL&quot;, &quot;gpt-4o-mini&quot;)<br><br><br>PROMPT_TEMPLATE = &quot;&quot;&quot;<br>You are an expert Terraform generator. Follow these org standards exactly:<br>{standards}<br><br>User request:<br>{request}<br><br>Produce a complete, production-ready, reusable Terraform module with proper enterprise structure.<br><br>Generate the following files with clear separators:<br><br>### FILE: main.tf<br>&lt;main resource definitions with proper tags blocks in resource blocks only&gt;<br><br>### FILE: variables.tf<br>&lt;all input variables with descriptions, types, defaults, and validation&gt;<br><br>### FILE: outputs.tf<br>&lt;all outputs with value and description only - NO tags, NO other arguments&gt;<br><br>### FILE: providers.tf<br>&lt;required provider versions, terraform version constraints, and provider configurations&gt;<br><br>### FILE: README.md<br>&lt;module documentation with description, usage example, inputs table, outputs table, requirements&gt;<br><br>CRITICAL Terraform Syntax Rules:<br>- Tags ONLY go inside resource blocks, NOT in output/variable/provider blocks<br>- Output blocks ONLY support: value, description, sensitive, depends_on<br>- Variable blocks ONLY support: type, description, default, validation, sensitive, nullable<br>- Provider blocks do NOT have tags - use default_tags in AWS provider if needed<br>- Always close ALL braces properly - verify each opening brace has a closing brace<br>- Use proper HCL syntax - check for missing commas, quotes, 
and braces<br><br>Resource-Specific Requirements:<br>- AWS Lambda: MUST specify exactly ONE of: filename (with default=&quot;lambda.zip&quot;), s3_bucket+s3_key, or image_uri<br>- For reusable Lambda modules, use s3_bucket + s3_key as required variables (most common pattern)<br>- IAM roles: must have assume_role_policy with proper JSON<br>- Security groups: must have at least one ingress or egress rule<br>- VPCs: should include both public and private subnets<br>- CloudWatch alarms: require comparison_operator, evaluation_periods, metric_name, namespace, period, statistic, threshold<br><br>Ensure:<br>- Include required tags: environment, owner, cost_center in RESOURCE blocks only<br>- Use snake_case for all variable names<br>- No hardcoded secrets or credentials<br>- Least-privilege IAM policies (avoid Resource = &quot;*&quot;)<br>- Proper variable validation and constraints<br>- Clear descriptions for all variables and outputs<br>- Module should be reusable across environments<br>- Include usage examples in README<br>- Use ONLY current, non-deprecated resource types and arguments<br>- Follow latest provider best practices (check documentation)<br>- Use required_providers block with version constraints<br>- Specify minimum terraform version in providers.tf<br>- Use terraform.workspace or variables for environment-specific values<br>- Avoid deprecated syntax (e.g., use for_each over count when appropriate)<br><br>IMPORTANT: Return ONLY raw Terraform code with ### FILE: separators.<br>DO NOT wrap code in markdown fences like ```hcl or ```terraform.<br>DO NOT include any code block markers or backticks.<br>Return clean, parseable Terraform code only.<br>&quot;&quot;&quot;<br><br><br>class MockLLM:<br>    @staticmethod<br>    def generate(prompt: str) -&gt; str:<br>        # Keep the previous deterministic example for local testing<br>        from agent import MockLLM as OldMock<br>        return OldMock.generate(prompt)<br><br><br>def call_openai(prompt: str) -&gt; 
str:<br>    try:<br>        from openai import OpenAI<br><br>        client = OpenAI(api_key=os.getenv(&quot;OPENAI_API_KEY&quot;))<br>        resp = client.chat.completions.create(<br>            model=OPENAI_MODEL,<br>            messages=[{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}],<br>            max_tokens=1500,<br>        )<br>        return resp.choices[0].message.content<br>    except Exception as e:<br>        print(f&quot;OpenAI call failed: {e}&quot;)<br>        return MockLLM.generate(prompt)<br><br><br>def llm(prompt: str) -&gt; str:<br>    if os.getenv(&quot;OPENAI_API_KEY&quot;):<br>        return call_openai(prompt)<br>    return MockLLM.generate(prompt)<br><br><br>def generate_tf(request: str) -&gt; dict:<br>    &quot;&quot;&quot;Generate Terraform module files. Returns dict of {filename: content}&quot;&quot;&quot;<br>    prompt = PROMPT_TEMPLATE.format(standards=TERRAFORM_STANDARDS, request=request)<br>    result = llm(prompt)<br>    return parse_multi_file_response(result)<br><br><br>def parse_multi_file_response(response: str) -&gt; dict:<br>    &quot;&quot;&quot;Parse LLM response with ### FILE: separators into dict of files&quot;&quot;&quot;<br>    files = {}<br>    pattern = r&quot;###\s*FILE:\s*([\w\.-]+)\s*\n(.*?)(?=###\s*FILE:|$)&quot;<br>    matches = re.findall(pattern, response, re.DOTALL | re.IGNORECASE)<br>    <br>    if matches:<br>        for filename, content in matches:<br>            # Strip markdown code fences if present<br>            cleaned_content = content.strip()<br>            # Remove opening code fence (```hcl, ```terraform, ```)<br>            cleaned_content = re.sub(r&#39;^```(?:hcl|terraform)?\s*\n&#39;, &#39;&#39;, cleaned_content)<br>            # Remove closing code fence<br>            cleaned_content = re.sub(r&#39;\n```\s*$&#39;, &#39;&#39;, cleaned_content)<br>            <br>            # Fix common LLM mistakes in outputs.tf<br>            if filename.strip() == 
&#39;outputs.tf&#39;:<br>                cleaned_content = fix_output_syntax(cleaned_content)<br>            <br>            files[filename.strip()] = cleaned_content.strip()<br>    else:<br>        # Fallback: treat entire response as main.tf<br>        content = response.strip()<br>        content = re.sub(r&#39;^```(?:hcl|terraform)?\s*\n&#39;, &#39;&#39;, content)<br>        content = re.sub(r&#39;\n```\s*$&#39;, &#39;&#39;, content)<br>        files[&quot;main.tf&quot;] = content.strip()<br>    <br>    return files<br><br><br>def fix_output_syntax(content: str) -&gt; str:<br>    &quot;&quot;&quot;Remove invalid arguments from output blocks (e.g., tags)&quot;&quot;&quot;<br>    # Remove tags blocks from output definitions<br>    # Pattern: find output blocks and remove tags = { ... } from them<br>    def remove_invalid_output_args(match):<br>        output_block = match.group(0)<br>        # Remove tags blocks<br>        output_block = re.sub(r&#39;\s*tags\s*=\s*\{[^}]*\}\s*&#39;, &#39;\n&#39;, output_block, flags=re.DOTALL)<br>        # Remove other invalid args (type, default, validation, etc.)<br>        output_block = re.sub(r&#39;\s*type\s*=\s*[^\n]+\n&#39;, &#39;\n&#39;, output_block)<br>        output_block = re.sub(r&#39;\s*default\s*=\s*[^\n]+\n&#39;, &#39;\n&#39;, output_block)<br>        return output_block<br>    <br>    # Match output blocks<br>    content = re.sub(<br>        r&#39;output\s+&quot;[^&quot;]+&quot;\s*\{[^}]*\}&#39;,<br>        remove_invalid_output_args,<br>        content,<br>        flags=re.DOTALL<br>    )<br>    <br>    return content<br><br><br>def run_terraform_validate(files: dict) -&gt; str:<br>    &quot;&quot;&quot;Run actual terraform validate on generated files&quot;&quot;&quot;<br>    import tempfile<br>    import subprocess<br>    import shutil<br>    from pathlib import Path<br>    <br>    # Create temp directory<br>    temp_dir = tempfile.mkdtemp(prefix=&quot;tf-validate-&quot;)<br>    try:<br>        # Write all 
files<br>        for filename, content in files.items():<br>            file_path = Path(temp_dir) / filename<br>            file_path.write_text(content)<br>        <br>        # Run terraform init<br>        init_result = subprocess.run(<br>            [&quot;terraform&quot;, &quot;init&quot;, &quot;-backend=false&quot;],<br>            cwd=temp_dir,<br>            capture_output=True,<br>            text=True,<br>            timeout=60<br>        )<br>        <br>        if init_result.returncode != 0:<br>            return f&quot;Terraform init failed:\n{init_result.stderr}&quot;<br>        <br>        # Run terraform validate<br>        validate_result = subprocess.run(<br>            [&quot;terraform&quot;, &quot;validate&quot;, &quot;-json&quot;],<br>            cwd=temp_dir,<br>            capture_output=True,<br>            text=True,<br>            timeout=30<br>        )<br>        <br>        if validate_result.returncode != 0:<br>            # Parse JSON output for better error messages<br>            try:<br>                import json<br>                result = json.loads(validate_result.stdout)<br>                if not result.get(&quot;valid&quot;, False):<br>                    errors = []<br>                    for diag in result.get(&quot;diagnostics&quot;, []):<br>                        severity = diag.get(&quot;severity&quot;, &quot;error&quot;)<br>                        summary = diag.get(&quot;summary&quot;, &quot;&quot;)<br>                        detail = diag.get(&quot;detail&quot;, &quot;&quot;)<br>                        errors.append(f&quot;{severity.upper()}: {summary}\n{detail}&quot;)<br>                    return &quot;\n&quot;.join(errors)<br>            except:<br>                pass<br>            return f&quot;Terraform validation failed:\n{validate_result.stderr}&quot;<br>        <br>        return &quot;OK&quot;<br>    <br>    except subprocess.TimeoutExpired:<br>        return &quot;Terraform validation timed 
out&quot;<br>    except Exception as e:<br>        return f&quot;Terraform validation error: {str(e)}&quot;<br>    finally:<br>        # Cleanup<br>        try:<br>            shutil.rmtree(temp_dir)<br>        except:<br>            pass<br><br><br>def validate_tf(files: dict) -&gt; str:<br>    &quot;&quot;&quot;Validate all .tf files in the module&quot;&quot;&quot;<br>    issues = []<br>    <br>    # First, validate individual files against standards<br>    for filename, content in files.items():<br>        if filename.endswith(&#39;.tf&#39;):<br>            result = validate(content)<br>            if result != &quot;OK&quot;:<br>                issues.append(f&quot;{filename}: {result}&quot;)<br>    <br>    # Cross-file validation: check for undefined variable references<br>    defined_vars = set()<br>    if &#39;variables.tf&#39; in files:<br>        var_matches = re.findall(r&#39;variable\s+&quot;([^&quot;]+)&quot;&#39;, files[&#39;variables.tf&#39;])<br>        defined_vars = set(var_matches)<br>    <br>    # Check all .tf files for var. references<br>    for filename, content in files.items():<br>        if filename.endswith(&#39;.tf&#39;):<br>            var_refs = re.findall(r&#39;var\.(\w+)&#39;, content)<br>            for var_ref in var_refs:<br>                if var_ref not in defined_vars:<br>                    issues.append(f&quot;{filename}: References undeclared variable &#39;var.{var_ref}&#39;&quot;)<br>    <br>    # Run actual terraform validate (most comprehensive check)<br>    tf_issues = run_terraform_validate(files)<br>    if tf_issues != &quot;OK&quot;:<br>        issues.append(f&quot;Terraform validation: {tf_issues}&quot;)<br>    <br>    return &quot;OK&quot; if not issues else &quot;\n&quot;.join(issues)<br><br><br>def fix_tf(files: dict, issues: str) -&gt; dict:<br>    &quot;&quot;&quot;Fix issues in Terraform files&quot;&quot;&quot;<br>    if os.getenv(&quot;OPENAI_API_KEY&quot;):<br>        # Use LLM to fix issues<br>        files_str = &quot;\n\n&quot;.join([f&quot;### FILE: {name}\n{content}&quot; for name, content in files.items()])<br>        fix_prompt = f&quot;&quot;&quot;The following Terraform module has validation issues that MUST be fixed:<br><br>ISSUES:<br>{issues}<br><br>CURRENT MODULE FILES:<br>{files_str}<br><br>FIX INSTRUCTIONS:<br>1. Fix ALL issues listed above<br>2. If a variable is referenced but not declared, add it to variables.tf with proper type and description<br>3. If tags are in output blocks, remove them (outputs only support: value, description, sensitive)<br>4. If provider uses undefined variables, add them to variables.tf<br>5. Ensure all braces are properly closed<br>6. 
Keep all existing working code intact<br><br>IMPORTANT: Return the COMPLETE corrected module with ### FILE: separators.<br>DO NOT use markdown code fences (no ```hcl or ```).<br>Return raw Terraform code only.<br>&quot;&quot;&quot;<br>        result = llm(fix_prompt)<br>        if result:<br>            return parse_multi_file_response(result)<br>    <br>    # Heuristic fallback: fix main.tf only<br>    import agent<br>    if &quot;main.tf&quot; in files:<br>        files[&quot;main.tf&quot;] = agent.fix_tf(files[&quot;main.tf&quot;], issues)<br>    return files<br><br><br>def full_pipeline(user_request: str, max_iterations: int = 3) -&gt; dict:<br>    &quot;&quot;&quot;Run the full agent pipeline. Returns dict of {filename: content}&quot;&quot;&quot;<br>    print(f&quot;\n Starting pipeline for: {user_request}&quot;)<br>    <br>    # Step 1: Generate initial code<br>    print(&quot;\n Generating Terraform module...&quot;)<br>    files = generate_tf(user_request)<br>    <br>    if not files:<br>        print(&quot; Failed to generate initial code&quot;)<br>        return {}<br>    <br>    print(f&quot; Generated {len(files)} files: {&#39;, &#39;.join(files.keys())}&quot;)<br>    <br>    # Step 2: Validate<br>    print(&quot;\n Validating module...&quot;)<br>    issues = validate_tf(files)<br>    <br>    if issues == &quot;OK&quot;:<br>        print(&quot; Validation passed!&quot;)<br>        return files<br>    <br>    print(f&quot;⚠  Validation issues found:\n{issues}&quot;)<br>    <br>    # Step 3: Auto-fix loop<br>    iter_count = 0<br>    while issues != &quot;OK&quot; and iter_count &lt; max_iterations:<br>        iter_count += 1<br>        print(f&quot;\n🔧 Auto-fix iteration {iter_count}/{max_iterations}...&quot;)<br>        <br>        fixed_files = fix_tf(files, issues)<br>        if fixed_files == files:<br>            print(&quot;  No changes made by fix attempt&quot;)<br>            break<br>        <br>        files = fixed_files<br>        issues = 
validate_tf(files)<br>        <br>        if issues == &quot;OK&quot;:<br>            print(f&quot; Validation passed after {iter_count} iteration(s)!&quot;)<br>        else:<br>            print(f&quot;  Still have issues:\n{issues}&quot;)<br>    <br>    if issues != &quot;OK&quot;:<br>        print(f&quot;\n Could not fix all issues after {max_iterations} iterations&quot;)<br>        print(&quot;Returning module with remaining issues.&quot;)<br>    <br>    return files<br><br><br>def create_pr(branch_name: str, files: dict, module_path: str = None) -&gt; str:<br>    &quot;&quot;&quot;Reuse the create_pr from agent.py to avoid duplication&quot;&quot;&quot;<br>    import agent<br>    return agent.create_pr(branch_name, files, module_path)<br><br><br>if __name__ == &quot;__main__&quot;:<br>    sample = &quot;Create an AWS VPC module with public and private subnets and a basic IAM role.&quot;<br>    print(full_pipeline(sample))</pre><blockquote><em>Note: Using OpenAI in the Example — But This Agent Works With Any LLM</em></blockquote><p><strong>Example swap: OpenAI → Gemini</strong></p><pre># Instead of OpenAI:<br>from openai import OpenAI<br>client = OpenAI()<br><br># Use Gemini:<br>import google.generativeai as genai<br>genai.configure(api_key=os.getenv(&quot;GEMINI_API_KEY&quot;))<br>model = genai.GenerativeModel(&quot;gemini-1.5-pro&quot;)<br><br>response = model.generate_content(prompt)</pre><h3>7️⃣ Step 3 — Local Build &amp; Testing (Optional but Highly Recommended)</h3><p><em>(Test the agent locally before sending code to GitHub)</em></p><p>Before integrating the agent into a CI/CD pipeline or letting it create PRs, you may want to test it locally. 
This step lets you:</p><ul><li>Validate the module generation</li><li>Run standards checks</li><li>See auto‑fixes happen in real‑time</li><li>Inspect generated files</li><li>Debug issues faster</li><li>Avoid unnecessary PR noise</li></ul><p>You can skip this and rely fully on GitHub Actions — <br><strong>but local testing gives you a faster feedback loop, especially during development.</strong></p><p><strong>Run the agent locally with:</strong></p><pre>python run_example.py --prompt &quot;Create an AWS Lambda with CloudWatch monitoring&quot;</pre><p>This produces:</p><ul><li>main.tf</li><li>variables.tf</li><li>outputs.tf</li><li>providers.tf</li><li>README.md</li></ul><p>All validated and auto‑fixed before being written to the out/&lt;branch-name&gt;/ directory.</p><h3>8️⃣ Step 4 — Project Folder Structure</h3><p>Before we move into CI/CD automation, here is the <strong>full directory structure</strong> of the Terraform Agent you just built.</p><p>This structure is intentionally modular, testable, and enterprise‑ready.</p><p>📁 <strong>Project Structure</strong></p><pre>terraform-agent/<br>├── agent_real.py                # Main AI agent (LLM + validation + fixes)<br>├── agent.py                     # PR creation + mock LLM utilities<br>├── terraform_standards.py       # Org standards + validation engine<br>├── run_example.py               # Local execution entrypoint<br>├── modules/                     # Auto-created PR modules<br>├── out/                         # Local generated modules (no PR)<br>├── tests/                       # Full unit test suite<br>│   ├── test_agent_pipeline.py<br>│   └── test_terraform_standards.py<br>└── .github/<br>    └── workflows/<br>        ├── python-tests.yml     # Unit tests run on every PR<br>        ├── terraform.yml        # Terraform validate + plan workflow<br>        └── e2e_pr.yml           # End-to-end PR generation workflow</pre><h3><strong>9️⃣ </strong>Step 5 — Running the Workflow with a Prompt (CI/CD Automation)</h3><p>Now that the codebase is structured 
correctly, you can run the entire agent and pipeline <strong>with a single prompt</strong> — either locally or directly inside GitHub.</p><p>You now have <strong>two ways</strong> to run the pipeline:</p><blockquote><strong>Option A:</strong> Run the Agent Locally (Fast Feedback Loop)</blockquote><p>If you want to test code generation before opening a PR:</p><pre>python run_example.py --prompt &quot;Create an Azure Storage Account module&quot;<br></pre><h4>This will:</h4><ol><li>Generate all module files</li><li>Validate them</li><li>Auto-fix issues</li><li>Write output to:</li></ol><pre>out/&lt;branch-name&gt;/</pre><p>5. (Optional) Create a PR if you pass --create-pr</p><pre>python run_example.py --prompt &quot;Create a GCP Cloud Run module&quot; --create-pr</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*AoIaKrmJ3FqmRLqHAoX1EQ.png" /></figure><blockquote><strong>Option B</strong>: Run the Entire Agent Inside GitHub Actions</blockquote><p>You can trigger the <strong>e2e_pr.yml</strong> workflow from the GitHub UI.</p><pre>name: E2E PR (manual)<br><br>on:<br>  workflow_dispatch:<br>    inputs:<br>      prompt:<br>        description: &#39;Terraform module prompt for the agent&#39;<br>        required: false<br>        default: &#39;Create a reusable Terraform module that creates a Lambda function with a health check and CloudWatch alarm&#39;<br>        type: string<br><br>permissions:<br>  contents: write<br>  pull-requests: write<br><br>jobs:<br>  e2e:<br>    runs-on: ubuntu-latest<br>    steps:<br>      - uses: actions/checkout@v3<br>      - name: Setup Python<br>        uses: actions/setup-python@v4<br>        with:<br>          python-version: &#39;3.10&#39;<br>      - name: Install deps<br>        run: |<br>          python -m pip install --upgrade pip<br>          pip install -r requirements.txt<br>      - name: Run example (create PR)<br>        env:<br>          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}<br>          # use GH_TOKEN if 
provided, otherwise fall back to the Actions-provided GITHUB_TOKEN<br>          GITHUB_TOKEN: ${{ secrets.GH_TOKEN || secrets.GITHUB_TOKEN }}<br>          GITHUB_REPO: ${{ github.repository }}<br>        run: |<br>          python run_example.py --prompt &quot;${{ github.event.inputs.prompt }}&quot; --create-pr</pre><p>1. Go to<br><strong>GitHub → Actions → E2E PR (manual)<br></strong>2. Click “Run Workflow”<br>3. Enter your natural-language Terraform request:<br>Example prompt:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gM53oWhh7-QQ_-hzohzzVA.png" /></figure><h3>4. Click Run Workflow</h3><p>This workflow will:</p><p>✔ Run the full agent<br>✔ Generate the module (main.tf, variables.tf, outputs.tf, providers.tf, README.md)<br>✔ Validate with your standards file<br>✔ Auto-fix any issues<br>✔ Create an ai/&lt;slug&gt;-timestamp branch<br>✔ Commit the module<br>✔ Open a Pull Request<br>✔ Trigger <strong>python-tests.yml</strong><br>✔ Trigger <strong>terraform.yml</strong><br>✔ Show terraform init/validate/plan output right in the pipeline</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Wstj4Xb4xJ79hBoslKWk9w.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qR6qJAE4eJl_sRsUKnC-Tw.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*WOsCwOo5edZjqPpvNisUCg.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*9wLbJEP1tAFqm6jQ67lTFw.png" /></figure><h3>What Happens Automatically After PR Creation?</h3><p>When the agent opens a PR, <strong>GitHub Actions takes over</strong>.</p><p><strong>✔ Workflow #1 — Unit Tests</strong></p><p>File: .github/workflows/python-tests.yml</p><pre>name: Python Tests<br><br>on:<br>  push:<br>    branches: [&quot;main&quot;, &quot;master&quot;]<br>  pull_request:<br><br>jobs:<br>  test:<br>    runs-on: ubuntu-latest<br>    steps:<br>      - uses: actions/checkout@v3<br>      - name: Setup Python<br>        uses: 
actions/setup-python@v4<br>        with:<br>          python-version: &#39;3.10&#39;<br>      - name: Install dependencies<br>        run: |<br>          python -m pip install --upgrade pip<br>          pip install -r requirements.txt<br>      - name: Run tests<br>        env:<br>          PYTHONPATH: ${{ github.workspace }}<br>        run: |<br>          pytest -q</pre><pre>requirements.txt<br><br>openai&gt;=1.0.0<br>PyGithub&gt;=1.59.0<br>python-dotenv&gt;=1.0.0<br>pytest&gt;=7.0.0</pre><p>This runs:</p><ul><li>Standards tests</li><li>Validation tests</li><li>Agent pipeline tests</li><li>Mock LLM &amp; real LLM behavior tests</li></ul><p><strong>✔ Workflow #2 — Terraform Validation</strong></p><p>File: .github/workflows/terraform.yml<br>This workflow:</p><pre>name: Terraform Plan<br><br>on:<br>  pull_request:<br><br>jobs:<br>  terraform:<br>    runs-on: ubuntu-latest<br><br>    steps:<br>    - uses: actions/checkout@v3<br><br>    - name: Setup Terraform<br>      uses: hashicorp/setup-terraform@v3<br>      with:<br>        terraform_wrapper: false<br><br>    - name: Find and Validate All Modules<br>      run: |<br>        echo &quot;🔍 Finding all Terraform modules...&quot;<br>        <br>        # Find all directories containing .tf files<br>        MODULE_DIRS=$(find . 
-type f -name &quot;*.tf&quot; -exec dirname {} \; | sort -u)<br>        <br>        if [ -z &quot;$MODULE_DIRS&quot; ]; then<br>          echo &quot;ℹ️  No Terraform modules found in this PR&quot;<br>          echo &quot;This is normal for PRs that don&#39;t include Terraform code&quot;<br>          exit 0<br>        fi<br>        <br>        echo &quot;Found modules:&quot;<br>        echo &quot;$MODULE_DIRS&quot;<br>        echo &quot;&quot;<br>        <br>        # Validate each module<br>        for dir in $MODULE_DIRS; do<br>          echo &quot;&quot;<br>          echo &quot;📂 Module: $dir&quot;<br>          cd &quot;$dir&quot;<br>          <br>          echo &quot;⚙️  Running terraform init...&quot;<br>          if terraform init -backend=false; then<br>            echo &quot;✅ Init successful&quot;<br>            <br>            echo &quot;📋 Running terraform validate...&quot;<br>            if terraform validate; then<br>              echo &quot;✅ Validation successful&quot;<br>            else<br>              echo &quot;❌ Validation failed&quot;<br>              exit 1<br>            fi<br>            <br>            echo &quot;📊 Running terraform plan...&quot;<br>            if terraform plan -input=false -out=tfplan 2&gt;&amp;1 | tee plan.log; then<br>              echo &quot;✅ Plan successful&quot;<br>              echo &quot;&quot;<br>              echo &quot;📄 Plan output:&quot;<br>              terraform show tfplan<br>            else<br>              PLAN_EXIT=$?<br>              echo &quot;⚠️  Plan failed (exit code: $PLAN_EXIT)&quot;<br>              <br>              # Check if it&#39;s just missing variables (expected for reusable modules)<br>              if grep -q &quot;No value for required variable&quot; plan.log; then<br>                echo &quot;&quot;<br>                echo &quot;ℹ️  This is a reusable module that requires input variables.&quot;<br>                echo &quot;This is EXPECTED behavior. 
The module syntax is valid.&quot;<br>                echo &quot;&quot;<br>                echo &quot;Missing variables:&quot;<br>                grep &quot;variable \&quot;&quot; plan.log | head -10<br>                echo &quot;&quot;<br>                echo &quot;✅ Module validation passed (plan failure due to missing vars is OK)&quot;<br>              else<br>                echo &quot;&quot;<br>                echo &quot;❌ Plan failed with actual errors:&quot;<br>                cat plan.log<br>                exit 1<br>              fi<br>            fi<br>          else<br>            echo &quot;❌ Init failed&quot;<br>            exit 1<br>          fi<br>          <br>          cd - &gt; /dev/null<br>        done<br>        echo &quot;&quot;<br>        echo &quot;✅ All modules validated successfully!&quot;</pre><ul><li>Discovers modules in the PR</li><li>Runs:</li></ul><pre>terraform init<br>terraform validate</pre><h3>Final Conclusion — You Just Built a Digital DevOps Engineer</h3><p>This is not just an AI demo.</p><p>You now have a <strong>fully operational, cloud‑agnostic, Terraform‑agnostic Agentic DevOps system</strong>, capable of:</p><ul><li>Understanding natural language</li><li>Generating Terraform modules</li><li>Enforcing standards</li><li>Auto-fixing code</li><li>Creating GitHub PRs</li><li>Passing unit tests</li><li>Running terraform validate + plan</li><li>Operating across AWS / Azure / GCP / any provider</li><li>Running in any pipeline</li></ul><p>This is the future of platform engineering:</p><ul><li>Consistent</li><li>Automated</li><li>Secure</li><li>Extensible</li><li>Agentic</li></ul><p>And you’ve built the <strong>first working version</strong>.</p><p>Now extend it:</p><ul><li>Add tfsec / tflint security scanning</li><li>Add Infracost cost intelligence</li><li>Add policy-as-code (OPA/Rego)</li><li>Add Slack/Jira approvals</li><li>Add RAG with your internal playbooks</li><li>Add multi-cloud capabilities</li><li>Add Terratest 
integration</li></ul><blockquote>Your agent is no longer theory — <strong>it’s an operational teammate.<br>Welcome to Agentic DevOps. 🚀</strong></blockquote><h3>🛠️ <strong>What You Can Build Next (Beyond Terraform)</strong></h3><p>The pattern you built in Part 2 is not limited to Terraform.</p><p><em>By swapping the generation prompt and your validation logic, you can build additional agents that safely accelerate development across your entire organization:</em></p><h3>🧩 Helm Agent</h3><p>Generate:</p><ul><li>Chart.yaml</li><li>values.yaml</li><li>templates</li><li>non‑privileged containers</li><li>resource limits</li><li>required annotations</li><li>org‑approved patterns</li></ul><h3>🧩 Kubernetes Manifest Agent</h3><p>Generate Deployments, Services, Ingress, HPA, RBAC with:</p><ul><li>Policy checks</li><li>OPA/Conftest validation</li><li>Security constraints</li><li>Label/annotation standards</li></ul><h3>🧩 CI/CD Pipeline Agent</h3><p>Generate GitHub Actions / GitLab CI / Jenkins pipelines using:</p><ul><li>Org‑standard workflows</li><li>Security gating</li><li>Approval flows</li></ul><h3>🧩 Policy‑as‑Code Agent</h3><p>Generate:</p><ul><li>OPA/Rego rules</li><li>Gatekeeper constraints</li><li>Governance or compliance templates</li></ul><h3>🧩 Any Config Agent</h3><ul><li>Dockerfiles</li><li>API Gateway configs</li><li>Monitoring dashboards</li><li>Secrets templates</li><li>CloudFormation</li><li>Kustomize</li></ul><blockquote>Everything becomes <strong>automatable</strong> with your organizational rules.</blockquote><blockquote>This means your teams can drastically reduce development time while staying <strong>secure</strong>, <strong>consistent</strong>, and <strong>aligned with internal standards</strong>.</blockquote><blockquote>Your Terraform Agent is simply the <em>first example</em> of what’s possible with Agentic DevOps.</blockquote><h3>🔜 What’s Coming in Part 3 — Automated Copilot PR Reviews</h3><p>Now that the agent can generate Terraform modules, validate 
them, auto‑fix issues, create PRs, and run full CI/CD checks, the next logical step is improving your <strong>review workflow</strong>.</p><hr><p><a href="https://pub.towardsai.net/part-2-how-to-build-your-own-ai-agent-cloud-agnostic-fully-automated-enterprise-ready-ec3c749570ac">📘 Part 2 — How to Build Your Own AI Agent: (Cloud-Agnostic, Fully Automated, Enterprise-Ready)</a> was originally published in <a href="https://pub.towardsai.net">Towards AI</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[ Gen AI vs Agentic AI vs Traditional AI]]></title>
            <link>https://medium.com/generative-ai-revolution-ai-native-transformation/gen-ai-vs-agentic-ai-vs-traditional-ai-051a5dc1e0f7?source=rss-64295cea6d86------2</link>
            <guid isPermaLink="false">https://medium.com/p/051a5dc1e0f7</guid>
            <category><![CDATA[ai-for-beginners]]></category>
            <category><![CDATA[agentic-ai]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[generative-ai-tools]]></category>
            <category><![CDATA[ai-agent]]></category>
            <dc:creator><![CDATA[Cherukuri sai]]></dc:creator>
            <pubDate>Mon, 16 Feb 2026 02:29:42 GMT</pubDate>
            <atom:updated>2026-02-16T09:31:31.472Z</atom:updated>
            <content:encoded><![CDATA[<blockquote><strong><em>Part 1: What They Are, How They Work, and When to Use Which</em></strong></blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/1*__UNfiUmCPNjh18tOZyfrQ.jpeg" /></figure><p>Artificial Intelligence is evolving so fast that even experts sometimes struggle to keep up. New terms appear every month: <strong>Gen AI</strong>, <strong>Agentic AI</strong>, <strong>Predictive AI</strong>, <strong>Foundational Models</strong>, and many more.<br>For someone starting their AI journey — or even someone already working in tech — it’s easy to feel overwhelmed.</p><p><strong><em>We hear terms like:</em></strong></p><ul><li>Generative AI</li><li>Agentic AI</li><li>Autonomous AI</li><li>AI Agents</li><li>LLMs</li><li>Machine Learning</li><li>RAG</li><li>Copilots</li></ul><blockquote>But most people don’t truly understand:<br> 👉 How they actually work<br> 👉 What makes them different<br> 👉 When to use which one<br> 👉 And what to learn first</blockquote><p><strong><em>Let’s simplify everything.</em></strong></p><h3>A. Predictive AI (Traditional Machine Learning)</h3><p>This is the AI most companies have been using for years. 
It answers very specific questions:</p><ul><li>Will the customer churn?</li><li>What is the credit risk?</li><li>What is the predicted demand?</li></ul><blockquote>Predictive AI is good at <strong>classification</strong>, <strong>regression</strong>, and <strong>pattern recognition</strong>, but it cannot generate new content or act on its own.</blockquote><p><strong>Best for:</strong> Finance, analytics, forecasting, retail, operations.</p><h4>How It Works</h4><ol><li><em>Collect labeled data</em></li><li><em>Train a model</em></li><li><em>Validate it</em></li><li><em>Deploy it</em></li><li><em>Model predicts output</em></li></ol><p><strong>It does not create new content.</strong><br><em>It only predicts based on patterns it learned.</em></p><h4>When to Use It</h4><p>✅ When you need prediction<br>✅ When you have structured data<br>✅ When outcomes are measurable</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/713/1*9uIX2aySNoTFOQ3UdP5vRw.png" /></figure><h3>B. Generative AI (Gen AI)</h3><p><em>This is what most people refer to when they say “AI” today.<br></em><strong>Gen AI creates new things:</strong></p><ul><li>Text</li><li>Images</li><li>Code</li><li>Reports</li><li>Designs</li></ul><p>Models like GPT, Llama, Claude, etc., are all examples of Gen AI.</p><p><strong>What Gen AI does well:</strong></p><ul><li>Summarizing</li><li>Explaining complex topics</li><li>Brainstorming ideas</li><li>Writing code</li><li>Drafting emails and documentation</li></ul><blockquote><strong>But note</strong>:<br>Gen AI is still <strong>reactive</strong> — it waits for your instructions. It doesn’t take initiative.</blockquote><p><strong>Best for:</strong> Creators, analysts, students, developers, business teams.</p><pre>Massive Data → Train Foundation Model → User Prompt → AI Processing → Generated Content</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/608/1*Wpc31uD6BbR2jfjoLW9nsA.png" /></figure><h3>C. 
Agentic AI (AI Agents)</h3><p>Agentic AI is the next big leap. Unlike Gen AI, which only responds to prompts, <strong>Agentic AI can take action</strong>.<br>Think of an AI intern or digital employee that can:</p><ul><li>Plan tasks</li><li>Make decisions</li><li>Execute steps autonomously</li><li>Use tools or software</li><li>Monitor progress</li><li>Correct itself</li></ul><p><strong>Examples:</strong></p><ul><li>An AI agent that books your flights</li><li>An agent that runs testing workflows</li><li>Agents that analyze documents, then update dashboards, then notify teams</li><li>Agents that manage customer support tickets end‑to‑end</li></ul><p>Agentic AI = <strong>Autonomous, goal-driven, action-taking AI</strong>.</p><blockquote>This is where the future is heading.</blockquote><p><strong>Best for:</strong> Automation, operations, DevOps, QA, business workflows, enterprise systems.</p><pre>User Goal → Agent Reasoning → Plan → Use Tools → Execute → Final Result</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/927/1*2K8vRoj5xVD95FKkA9uxXg.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/933/1*al84RhdJUiJNZVfj-jEimg.png" /></figure><h3>2. 
How to Know Which AI to Choose</h3><p>A common question people ask is:<br><strong>“Which AI should I choose when building a product or learning a new skill?”</strong><br> Here’s a simple rule of thumb.</p><blockquote><strong>•<em> Choose Predictive AI if</em>:<br></strong>You want <strong>numbers</strong>, <strong>probabilities</strong>, or <strong>forecasts</strong>.<br><strong>Examples</strong>: risk scoring, time-series forecasting, anomaly detection.</blockquote><pre>Historical Data → Feature Engineering → ML Model → Prediction Score → Dashboard / Alert</pre><blockquote><strong>• <em>Choose Gen AI if</em>:<br></strong>You want AI to <strong>generate content</strong> or provide <strong>knowledge-driven insights</strong>.<br><strong>Examples</strong>: customer replies, documentation, email drafting, coding help.</blockquote><pre>User Question → LLM → Knowledge Base (RAG) → Generated Response → User</pre><blockquote><strong>• <em>Choose Agentic AI if</em>:<br></strong>You want AI to <strong>take actions</strong>, not just respond.<br><strong>Examples</strong>: autonomous testing, workflow automation, CRM updates, financial reconciliation.</blockquote><pre>User Goal → Agent (LLM Brain) → Planning → Tool Usage (API / DB / Browser) → Execution → Result</pre><h3>3. 
How These AIs Actually Work (A Simple Breakdown)</h3><h3>Predictive AI (ML)</h3><ul><li>Learns patterns from structured data</li><li>Maps input → output</li><li>Doesn’t “understand” meaning</li><li>Cannot generate new content</li></ul><h3>Gen AI</h3><ul><li>Trained on massive text, code, or image datasets</li><li>Learns relationships between words, sentences, or pixels</li><li>Uses statistical patterns to generate new content</li><li>Can reason “as if” it understands context</li></ul><h3>Agentic AI</h3><ul><li>Uses Gen AI as a “brain”</li><li>Adds memory, tools, decision logic, and feedback loops</li><li>Can connect to apps, APIs, databases</li><li>Can plan, act, evaluate, and improve itself</li></ul><p>In short: <br><strong>Predictive AI = analysis</strong><br><strong>Gen AI = creation</strong><br><strong>Agentic AI = action</strong></p><h3>4. For Beginners: How Should You Start Learning?</h3><p>If you’re new to AI, don’t jump directly into advanced agent frameworks.<br><strong>Start with a foundation.</strong></p><blockquote><strong><em>Step 1: Understand the fundamentals</em></strong></blockquote><blockquote>What is ML?</blockquote><blockquote>What is Gen AI?</blockquote><blockquote>What problem is each model solving?</blockquote><blockquote><strong><em>Step 2: Learn to use Gen AI tools (hands-on)</em></strong></blockquote><blockquote>ChatGPT</blockquote><blockquote>Gemini</blockquote><blockquote>Claude</blockquote><blockquote>Llama</blockquote><blockquote>GitHub Copilot</blockquote><blockquote><strong><em>This builds intuition.</em></strong></blockquote><blockquote><strong><em>Step 3: Learn Prompt Engineering</em></strong></blockquote><blockquote>This helps you interact with AI systems effectively.</blockquote><blockquote><strong><em>Step 4: Learn Applied AI Skills</em></strong></blockquote><blockquote>Vector databases</blockquote><blockquote>RAG (Retrieval-Augmented Generation)</blockquote><blockquote>Embeddings</blockquote><blockquote>Model 
evaluation</blockquote><blockquote><strong><em>Step 5: Move into Agentic AI</em></strong></blockquote><blockquote>Once comfortable, explore:</blockquote><blockquote>LangGraph</blockquote><blockquote>AutoGen</blockquote><blockquote>CrewAI</blockquote><blockquote>OpenAI Agents</blockquote><blockquote>Microsoft Autogenics (when available)</blockquote><p><strong>This is where future jobs will be.</strong></p><h3>5. For Professionals: How to Decide What to Build</h3><p>If you’re already working with AI or building AI tools, use this strategy:<br><strong>Ask yourself these questions:</strong></p><ol><li>Do I just need insights? → <strong>Predictive AI</strong></li><li>Do I need content or explanation? → <strong>Gen AI</strong></li><li>Do I need automation and actions? → <strong>Agentic AI</strong></li><li>Do I need domain expertise embedded? → <strong>Fine-tuned models</strong></li><li>Do I need the AI to learn from company knowledge? → <strong>RAG system</strong></li></ol><p>This framework helps avoid confusion and prevents overengineering.</p><h3>Conclusion:</h3><p>AI is evolving faster than ever, but the truth is simple: not all AI is the same, and not every AI solves the same problem. Predictive AI helps you <em>analyze</em>, Generative AI helps you <em>create</em>, and Agentic AI helps you <em>act</em>. Once you understand these three pillars, the entire AI landscape becomes clearer, and choosing the right approach stops being confusing.</p><p>If you’re just starting, begin with the basics — learn how Gen AI works and how LLMs think. If you’re already in the field, focus on choosing the right AI based on the problem, not the hype. 
And if you’re building for the future, prepare for Agentic AI, because that’s where real automation, intelligence, and impact are heading.</p><p>In the next part, we’ll go deeper into the future of AI — <strong>how to actually build your own agent</strong>, how tools, memory, and reasoning loops work, and why understanding these systems will soon become as essential as learning to code.</p><p>The world is moving toward intelligent workflows and autonomous systems. With the right foundation, <strong>you won’t just follow that future — you’ll help build it</strong>.</p><hr><p><a href="https://medium.com/generative-ai-revolution-ai-native-transformation/gen-ai-vs-agentic-ai-vs-traditional-ai-051a5dc1e0f7">🚀 Gen AI vs Agentic AI vs Traditional AI</a> was originally published in <a href="https://medium.com/generative-ai-revolution-ai-native-transformation">Agentic AI &amp; GenAI Revolution</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The Terraform Blueprint (2026): How to Structure, Scale & Secure Your Infrastructure‑as‑Code]]></title>
            <link>https://awstip.com/the-terraform-blueprint-2026-how-to-structure-scale-secure-your-infrastructure-as-code-b35c9e637c80?source=rss-64295cea6d86------2</link>
            <guid isPermaLink="false">https://medium.com/p/b35c9e637c80</guid>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[cloud]]></category>
            <category><![CDATA[terraform]]></category>
            <category><![CDATA[infrastructure-as-code]]></category>
            <category><![CDATA[cloud-security]]></category>
            <dc:creator><![CDATA[Cherukuri sai]]></dc:creator>
            <pubDate>Sun, 15 Feb 2026 08:36:36 GMT</pubDate>
            <atom:updated>2026-02-23T19:34:38.933Z</atom:updated>
            <content:encoded><![CDATA[<p><em>By — Cloud | DevSecOps | SRE | Platform Engineering</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/800/1*ATssnvB95EYglqykqDSvMQ.avif" /></figure><h3>Table of Contents</h3><ol><li>Introduction</li><li>Repository Architecture</li><li>Module Design Principles</li><li>Environment &amp; State Structure</li><li>IaC Security &amp; Scanning</li><li>Secure Terraform Execution</li><li>Drift Detection &amp; Observability</li><li>Terraform Maturity Framework</li><li>Conclusion</li></ol><h3>1. Introduction</h3><blockquote>Terraform has evolved from a simple provisioning tool into the backbone of modern cloud infrastructure. Today, teams rely on it to manage thousands of resources across AWS, Azure, and GCP — yet scaling Terraform successfully is harder than most engineering leaders expect.</blockquote><blockquote>Misconfigured modules, unmanaged drift, weak pipelines, and security gaps often become hidden liabilities inside cloud environments. The good news? These problems are completely avoidable with the right structure, patterns, and guardrails.</blockquote><blockquote>This guide distills the <strong>core principles, best practices, and real‑world patterns</strong> that high‑performing cloud, DevOps, and SRE teams use to keep Terraform secure, predictable, and scalable.</blockquote><blockquote>From repository design to IaC scanning, CI/CD hardening, drift detection, and multi‑environment state strategy — this blueprint gives you everything you need to build Terraform the <em>right</em> way.</blockquote><blockquote>Whether you’re improving an existing Terraform setup or building a new foundation, this framework will help you deliver <strong>reliable IaC with confidence.</strong></blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*s1u1PZSf2wQXwDYoKJiepw.png" /></figure><h3>2. 
Repository Architecture for Scalable IaC</h3><blockquote><strong>2.1 Mono‑Repo<br>Strengths:</strong><br> • Centralized governance<br> • Consistent patterns<br> • Easier module management<br><strong>Weaknesses:</strong><br> • Can slow down independent teams</blockquote><blockquote><strong>2.2 Service‑Scoped Repos<br>Strengths:</strong><br> • Fast iteration per team<br> • Clear ownership boundaries<br><strong>Weaknesses:</strong><br> • Duplication<br> • Harder to enforce universal standards</blockquote><blockquote><strong>2.3 Hybrid (Recommended)<br></strong>The <strong>core‑infra → module‑registry → app‑repo</strong> model ensures:<br> • cross‑team consistency<br> • ability to scale<br> • module reusability<br> • minimal duplication</blockquote><pre>Example for Hybrid: <br>repo-root/<br> ├── core-infra/<br> ├── modules/<br> │    ├── vpc/<br> │    ├── eks/<br> │    ├── iam/<br> ├── application-services/<br> │    ├── service-a/<br> │    ├── service-b/</pre><h3>3. Module Design Principles That Prevent Chaos</h3><blockquote><strong>3.1 What a Good Module Looks Like<br></strong>A strong module is:<br> • small, composable, and reusable<br> • predictable (inputs/outputs documented)<br> • version‑pinned<br> • never environment‑specific</blockquote><pre>module &quot;storage&quot; {<br>  source = &quot;git::ssh://example.com/storage.git?ref=v1.3.0&quot;<br>  name   = var.name<br>  region = var.region<br>}</pre><blockquote><strong>3.2 Semantic Versioning</strong></blockquote><pre>MAJOR (2.0.0) → breaking changes<br>MINOR (1.1.0) → backward-compatible enhancements<br>PATCH (1.1.1) → bug fixes</pre><blockquote>Pin versions like this:</blockquote><pre>module &quot;vpc&quot; {<br>  source  = &quot;git::ssh://example.com/vpc.git?ref=v1.2.3&quot;<br>}</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/479/1*X-Y-YhlH4cQzAW3TGb-vUA.png" /></figure><h3>4. 
Environment Isolation &amp; State Structure</h3><blockquote>State mistakes are one of the fastest ways to break production.<br><strong>4.1 Isolate Everything<br></strong>Use a separate state per:<br> • dev<br> • qa<br> • staging<br> • prod<br>Never mix them.</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/535/1*Ybc1tLvdB2HAW3uI0ATJQA.png" /></figure><blockquote><strong>4.2 Backend Best Practices<br>AWS: S3 backend + native S3 state locking<br>Azure</strong>: Storage Account + blob locking<br><strong>GCP</strong>: GCS + lock management through CI pipeline</blockquote><pre>terraform {<br>  backend &quot;s3&quot; {<br>    bucket       = &quot;mycompany-terraform-prod&quot;<br>    key          = &quot;network/terraform.tfstate&quot;<br>    region       = &quot;us-east-1&quot;<br>    use_lockfile = true<br>  }<br>}</pre><h3>5. IaC Security, Scanning &amp; Policy Enforcement</h3><blockquote>This is one of the most important parts of the entire blueprint.<br><strong><em>5.1 Pre‑Commit Hooks (Local)</em><br></strong>Run automatically before committing:</blockquote><pre>terraform fmt<br>terraform validate<br>tflint<br>tfsec<br>checkov</pre><blockquote><strong><em>5.2 Security Scanners</em></strong></blockquote><blockquote><strong>Tfsec</strong><br> • IAM issues<br> • Network exposure<br> • Missing encryption</blockquote><pre>ERROR: aws-s3-enable-bucket-encryption<br>S3 Bucket encryption is not enabled.</pre><blockquote><strong>Checkov<br> </strong>• Compliance rules<br> • Cloud‑specific misconfigurations<br> • Data leakage prevention</blockquote><blockquote><strong><em>5.3 Policy as Code<br></em></strong>Recommended engines:<br> • <strong>OPA / Conftest</strong><br> • <strong>HashiCorp Sentinel<br></strong>Example: deny untagged resources</blockquote><pre>deny[msg] {<br>  input.resource.tags == {}<br>  msg = &quot;All resources must include tags.&quot;<br>}</pre><h3>6. 
Secure Terraform Execution</h3><blockquote>❌ Never Run Terraform Locally<br>Local terraform apply introduces:<br> • drift<br> • audit gaps<br> • privilege risks<br> • shadow infrastructure</blockquote><blockquote>✅ Always Use CI/CD Runners</blockquote><h3>6.1 Secure Execution Identity</h3><blockquote><strong>Best practices</strong>:<br> • Federated identities (OIDC)<br> • No static credentials<br> • Least‑privilege roles<br> • Short‑lived tokens</blockquote><blockquote><strong><em>6.2 Recommended Pipeline</em></strong></blockquote><pre><br><br>+-----------------------------+<br>|        Developer           |<br>|      Git Commit/PR         |<br>+-------------+--------------+<br>              |<br>              v<br>+-------------+--------------+<br>|      Harness CI Stage      |<br>|----------------------------|<br>| - Checkout                 |<br>| - Build &amp; Test             |<br>| - SAST/DAST                |<br>| - Build Docker Image       |<br>| - Push to Artifact Repo    |<br>+-------------+--------------+<br>              |<br>              v<br>+-------------+--------------+<br>| Harness CD: Terraform IaC  |<br>|----------------------------|<br>| terraform init             |<br>| terraform fmt/validate     |<br>| tflint / tfsec / checkov   |<br>| terraform plan             |<br>| Approval Step              |<br>| terraform apply            |<br>+-------------+--------------+<br>              |<br>              v<br>+-------------+--------------+<br>|  Harness Deploy Stage      |<br>|----------------------------|<br>| Helm / K8s |<br>| Health Checks              |<br>| Feature Flags (optional)   |<br>+-------------+--------------+<br>              |<br>              v<br>+-------------+--------------+<br>| Observability &amp; Governance |<br>|----------------------------|<br>| Logs, Metrics, Traces      |<br>| Drift Detection            |<br>| Notifications              |<br>+----------------------------+</pre><h3>7. 
Drift Detection &amp; Observability</h3><blockquote><strong><em>7.1 Automated Drift Checks<br></em></strong>Run daily/weekly:</blockquote><pre><br>terraform plan -detailed-exitcode<br>if [ $? -eq 2 ]; then<br>  echo &quot;Drift Detected!&quot;<br>fi<br></pre><blockquote><strong>Send output to</strong>:<br> • Slack<br> • Teams<br> • Jira<br> • GitHub issues</blockquote><figure><img alt="" src="https://cdn-images-1.medium.com/max/715/1*mDGvg4vkh2MLElRA0U1sHg.png" /></figure><blockquote><strong><em>7.2 Observability Alignment</em></strong><br>Integrate Terraform outputs with:<br> • Datadog<br> • Prometheus<br> • New Relic<br> • CloudWatch<br> • Azure Monitor<br> • GCP Monitoring</blockquote><blockquote>This enables cross‑visibility between configuration and runtime signals.</blockquote><h3>8. Terraform Maturity Framework</h3><blockquote>A simple view of Terraform maturity:<br><strong>Level 1 — Basic:</strong><br> • Local applies<br> • Zero scanning<br><strong>Level 2 — Standardized:</strong><br> • Structured repos<br> • Versioned modules<br><strong>Level 3 — Governed:</strong><br> • Scanning enforced<br> • Policy‑as‑code<br> • Drift detection<br><strong>Level 4 — Operational Excellence:</strong><br> • Multi‑cloud consistency<br> • IaC observability<br><strong>Level 5 — Intelligent Automation:</strong><br> • Predictive analysis<br> • AI‑assisted Terraform plans<br> • Automated remediation</blockquote><h3>9. Conclusion</h3><blockquote>A well‑designed Terraform ecosystem isn’t just about writing modules or running plans — it’s about building a foundation that teams can trust as they scale. When your IaC is structured, secure, reviewed, tested, and continuously validated through automation, it becomes a force multiplier for every cloud initiative that follows.</blockquote><blockquote>The patterns in this blueprint are not theoretical. 
They’re the practices that consistently separate stable, predictable cloud environments from ones held together by tribal knowledge and luck. Whether you’re modernizing legacy infrastructure or enabling a high‑velocity platform team, adopting these principles will help you eliminate drift, reduce risk, and accelerate delivery with confidence.</blockquote><blockquote>Infrastructure‑as‑Code should empower teams, not slow them down. With the right pipelines, security checks, and execution workflows in place, Terraform becomes a strategic advantage — unlocking a cloud environment that is reproducible, scalable, and aligned with the operational rigor today’s engineering landscape demands.</blockquote><blockquote>If you’re investing in Terraform today, invest in doing it right. Your future platform will thank you.</blockquote><p>If you found this guide helpful, follow me here on Medium for more deep‑dives on Cloud Architecture, DevSecOps, Terraform, Kubernetes, SRE practices, CI/CD pipelines, and Platform Engineering.</p><p>I publish hands‑on insights, real implementation patterns, and practical frameworks to help engineers and leaders build secure, scalable, and reliable cloud platforms.</p><p>➡️ Follow for more content like this.<br>➡️ Share this article if it helped you.</p><hr><p><a href="https://awstip.com/the-terraform-blueprint-2026-how-to-structure-scale-secure-your-infrastructure-as-code-b35c9e637c80">The Terraform Blueprint (2026): How to Structure, Scale &amp; Secure Your Infrastructure‑as‑Code</a> was originally published in <a href="https://awstip.com">AWS Tip</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>