- AI-focused builder with a decade in data platforms, currently designing pragmatic, production-ready LLM systems: RAG copilots, tool-using agents, and evaluable AI services.
- Confident architecting scalable, cost-aware data and vector pipelines with tight feedback loops, observability, and CI/CD for rapid iteration.
- Technical range spans Python/TypeScript, Spark, and SQL through modern AI stacks (LangChain/LangGraph/LangStack), vector DBs (Pinecone/Weaviate/Qdrant/FAISS), and cloud-native serving.
- Hands-on across AWS, Azure, and GCP with MLOps patterns: tracing, evals, caching, and latency/cost controls.
- Strong communicator and systems thinker with domain grounding in retail and healthcare.
- Always learning, shipping, and refining: agent safety, prompt/eval regression, and data-centric retrieval patterns.
- 🔭 I’m currently working on AI productization: domain copilots, data-aware agents, and retrieval stacks wired to real KPIs.
- 🌱 I’m currently learning agentic orchestration (graphs), safe tool use, eval-driven development, and structured reasoning.
- 👯 I’m looking to collaborate on Agents || RAG Platforms || Applied AI.
- 📝 I regularly write about practical AI systems engineering and data-to-LLM workflows for blogs and social media.
- 💬 Ask me about agents, RAG, evals/observability, and shipping AI to production.
- 📫 How to reach me: niranjanagaram@gmail.com
- ⚡ Hobbies: playing guitar, singing, fitness, and stand-up comedy.
- LangChain, LangGraph, LangStack, CrewAI for agentic workflows and graph-based orchestration.
- RAG: Pinecone, Weaviate, Qdrant, FAISS with hybrid search and rerankers.
- Evals/Observability: Ragas, DeepEval, tracing, prompt regression, golden sets.
- Serving: vLLM/Ollama, Ray Serve, BentoML with caching/batching for latency/cost.
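The hybrid-search idea above can be sketched in a few lines: fuse a keyword score with a similarity score, then take the top results. This is a minimal, self-contained illustration with toy documents and stand-in scoring functions (real stacks would use BM25 plus dense embeddings from a vector DB); all names here are hypothetical.

```python
# Minimal hybrid-search sketch: fuse a keyword-overlap score with a
# similarity score, then return the top-k documents. Toy data and
# stand-in scoring only; production stacks replace these with
# BM25 and dense-embedding cosine similarity from a vector DB.
import math

DOCS = {
    "d1": "vector databases store embeddings for similarity search",
    "d2": "agents call tools and observe results in a loop",
    "d3": "hybrid search fuses keyword and vector scores",
}

def keyword_score(query: str, doc: str) -> float:
    # Fraction of query tokens that appear in the document.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def vector_score(query: str, doc: str) -> float:
    # Stand-in for embedding cosine similarity: normalized
    # character-bigram overlap, just to keep the sketch runnable.
    def bigrams(s: str) -> set:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}
    q, d = bigrams(query), bigrams(doc)
    return len(q & d) / math.sqrt(len(q) * len(d))

def hybrid_search(query: str, alpha: float = 0.5, top_k: int = 2):
    # Linear score fusion: alpha weights keyword vs. vector signal.
    scored = [
        (doc_id, alpha * keyword_score(query, text)
                 + (1 - alpha) * vector_score(query, text))
        for doc_id, text in DOCS.items()
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top_k]

results = hybrid_search("hybrid keyword and vector search")
```

A reranker would typically sit after `hybrid_search`, rescoring the fused top-k with a cross-encoder before the results reach the LLM.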


