Punit Vara | Building an LLM App: My Experience with Ollama, LLaMA 3, and ChatGPT | In my recent project, I set out to build a local large language model (LLM) app using Ollama and LLaMA 3. The goal was to create an application… | Jul 17
Punit Vara | Real Time Inference Architecture-1 | Here is a common real-time inference architecture you can use to deploy your inference workload. As an example I have used AWS SageMaker, but you can use… | Jul 4
Punit Vara in DevOps.dev | Learnings from building AWS Sagemaker Pipeline/Processing job | This is the place where I would like to note down my learnings, because I didn’t find enough AWS documentation that could have helped me solve… | Mar 14
Punit Vara in GoPenAI | Testing AWS Sagemaker Endpoint | This is a very short article showing how to test an AWS SageMaker endpoint once it has been deployed. | Mar 6
Punit Vara in GoPenAI | Scalable Python ML App Deployment: Best Practices | There are tons of tutorials on the internet for deploying Python ML applications with various backend frameworks such as Flask, FastAPI, Streamlit… | Feb 17
Punit Vara | Upstox Trading API — Demo | I got free access to the Upstox trading APIs and thought of playing with them. The documentation for setting them up is not quite straightforward. Putting… | Jan 15
Punit Vara | Demystifying Custom Comparators: A Guide to Sorting Algorithms | The goal is to put out a code snippet, written from scratch, to understand how a custom comparator works with various sorting algorithms. If you want… | Dec 12, 2023
Punit Vara in GoPenAI | Boosting API Performance: Compressing Payloads for FastAPI POST Endpoints | While working on an ML project, I wanted to squeeze out everything I could to optimise response time for real-time inferencing. Network latency… | Nov 25, 2023
Punit Vara | AWS development Setup: All things AWS | This article will always be a WIP. I am going to keep it as a quick reference for the common code snippets I use in my AWS development projects. | Nov 3, 2023
Punit Vara | Analysing API response time | I have been working on ML model deployment for quite some time. The application consumes the model in real time and hence requires online… | Oct 19, 2023