AI Engineer · RAG · Agents · Evaluation

Building grounded AI systems that retrieve, reason, cite, and stay measurable.

I design production-grade GenAI workflows with Azure OpenAI, Azure AI Search, Pinecone, LangGraph, RAGAS, TruLens, FastAPI, Docker, Kubernetes, and observability built in from day one.

5K+ model runs evaluated
96% function-call success target
23% retrieval faithfulness lift
25% token overhead reduction

Featured AI Systems

Enterprise RAG Dispute Assistant

Hybrid Azure AI Search + Pinecone retrieval with semantic chunking, metadata filters, citation-first drafting, and evaluation feedback loops.
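One common way to merge a keyword result list with a vector result list is reciprocal rank fusion (RRF). The sketch below is illustrative only: the page does not say which fusion method this project uses, and the function name, the document IDs, and the constant `k=60` are assumptions chosen for the example. It uses plain Python lists in place of real Azure AI Search and Pinecone responses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists standing in for keyword and vector retrieval.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize BM25 scores against cosine similarities, which live on different scales.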

Repository

LangGraph Agentic Workflows

Role-based triage, research, drafting, critique, and clarification agents with policy lookup tools and backend validation.
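Backend validation of an agent's tool call can be as simple as checking the call against a declared schema before anything executes. The following is a minimal sketch using only the standard library, not LangGraph itself; the `lookup_policy` tool, its arguments, and the `validate_tool_call` helper are hypothetical names introduced for illustration.

```python
import json

# Hypothetical tool registry: tool name -> required argument names and types.
TOOL_SCHEMAS = {
    "lookup_policy": {"policy_id": str, "region": str},
}


def validate_tool_call(raw_call):
    """Return (ok, errors) for a JSON-encoded tool call emitted by an agent."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError as exc:
        return False, [f"invalid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return False, ["tool call must be a JSON object"]

    name = call.get("name")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        return False, [f"unknown tool: {name!r}"]

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return False, ["arguments must be a JSON object"]

    schema = TOOL_SCHEMAS[name]
    errors = []
    for field, expected_type in schema.items():
        if field not in args:
            errors.append(f"missing argument: {field}")
        elif not isinstance(args[field], expected_type):
            errors.append(f"{field} must be {expected_type.__name__}")
    for field in args:
        if field not in schema:
            errors.append(f"unexpected argument: {field}")
    return not errors, errors


good = '{"name": "lookup_policy", "arguments": {"policy_id": "P-1", "region": "EU"}}'
bad = '{"name": "lookup_policy", "arguments": {"policy_id": "P-1"}}'
print(validate_tool_call(good))
print(validate_tool_call(bad))
```

Returning the error list rather than raising lets a clarification or critique step feed the errors back to the model for a retry.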

Repository

RAG Evaluation Harness

Golden questions, LLM-as-Judge, RAGAS, TruLens, Recall@k, nDCG, citation coverage, faithfulness, and human review tracking.
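Two of the retrieval metrics named above, Recall@k and nDCG, are small enough to compute directly. This is a minimal sketch under stated assumptions: retrieved document IDs are unique, relevance labels come from a golden set, and the function names and sample data are illustrative rather than taken from the harness.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def ndcg_at_k(retrieved, gains, k):
    """Normalized discounted cumulative gain over the top k results.

    `gains` maps document ID -> graded relevance; missing IDs count as 0.
    """
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(position + 2)
        for position, doc_id in enumerate(retrieved[:k])
    )
    ideal_gains = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(position + 2) for position, g in enumerate(ideal_gains))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical golden labels for a single question.
retrieved = ["d1", "d2", "d3", "d4"]
gains = {"d1": 3, "d3": 2, "d5": 1}

recall = recall_at_k(retrieved, gains.keys(), k=3)
ndcg = ndcg_at_k(retrieved, gains, k=3)
print(f"Recall@3 = {recall:.3f}, nDCG@3 = {ndcg:.3f}")
```

Recall@k only asks whether relevant documents were found, while nDCG also rewards placing the most relevant ones first, so tracking both separates coverage problems from ordering problems.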

Repository

Production Pattern

My systems start with document ingestion and metadata strategy, then layer hybrid retrieval, agent orchestration, tool validation, answer drafting, and continuous evaluation.

  1. Airflow ingestion and redaction
  2. Semantic chunking with overlap
  3. Azure AI Search BM25 and Pinecone vector retrieval
  4. BERT re-ranking and freshness boosting
  5. LangGraph agent orchestration
  6. RAGAS, TruLens, and LLM-as-Judge evaluation
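Step 2 of the list above can be illustrated with a simplified, sentence-boundary chunker. This sketch is not true semantic chunking (it does not use embeddings to find topic shifts); it only packs whole sentences into chunks and repeats trailing sentences as overlap. The function name, the `max_words` budget, and the `overlap` parameter are assumptions for the example.

```python
import re


def chunk_sentences(text, max_words=50, overlap=1):
    """Pack whole sentences into chunks of roughly max_words words.

    The last `overlap` sentences of each chunk are repeated at the start
    of the next one so context is not lost at chunk boundaries. A chunk
    can exceed max_words when individual sentences are long.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        candidate = current + [sentence]
        word_count = sum(len(s.split()) for s in candidate)
        if current and word_count > max_words:
            # Close the current chunk and carry the overlap forward.
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap > 0 else []
            candidate = current + [sentence]
        current = candidate
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "A b c. D e f. G h i. J k l."
chunks = chunk_sentences(sample, max_words=6, overlap=1)
for chunk in chunks:
    print(chunk)
```

Splitting on sentence boundaries keeps each chunk readable on its own, which tends to matter for citation-first answers where a retrieved chunk is quoted back to the user.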

Stack

Python · FastAPI · LangChain · LangGraph · Azure OpenAI · Azure AI Search · Azure AI Foundry · Pinecone · RAGAS · TruLens · AWS SageMaker · Docker · Kubernetes · OpenTelemetry · Grafana