
Building production agentic AI systems.
I'm Hamad Khan — an AI/ML engineer shipping multi-agent backends, Agentic RAG, and observable LLM services. Peer-reviewed researcher in low-resource NLP.
Featured projects.
Production AI, client web work, peer-reviewed research, and open-source dev tools — filter by what you're here for.

Self-Enroll AI
Conversational enrolment assistant for higher-ed. Five role-scoped agents under a LangGraph supervisor — replacing a static contact form with a stateful, observable assistant.

Pension Review
Public pension-eligibility assistant. Natural-language input parsed by Gemini, district lookup served by Typesense, FastAPI returns the verdict with citations.

LawSync
Agentic RAG over Islamabad High Court rulings. Bounded ReAct loop, hybrid semantic-plus-fuzzy search, and a verifier pass enforcing citations on every answer. Private beta with a legal-research firm.

Pension Calculations
Public-facing calculator paired with the Pension Review backend. Clean, fast Vercel deployment.
Have something to build?
Production AI backends, RAG systems, custom chatbots, or full-stack web apps. One-off, ongoing, or agency collaborations.
Where I've shipped.
AI/ML Engineer
Shipping production AI backends across three flagship products: Self-Enroll AI, Pension Review, and LawSync (IHC Agentic RAG).
- Built a multi-agent FastAPI backend on LangGraph with five role-scoped agents (RBAC) under a supervisor graph, calling Gemini via google-genai with context caching.
- Shipped an Agentic RAG service over IHC judgments — bounded ReAct loop, hybrid semantic+fuzzy search, gemini-embedding-001 in Qdrant, and a verifier pass that emits numbered citations.
- Secured production APIs with API-key auth, slowapi rate limiting, and Pydantic validation; persisted LangGraph checkpoints in Redis (Upstash) and users in Firebase Auth + Firestore.
- Instrumented end-to-end tracing in OpenTelemetry + Grafana Cloud + Langfuse with component-level versioning and per-model cost tracking.
- Wrote pytest suites and offline eval datasets (regression, refusal, low-relevance) for the legal RAG agent. Shipped all services as Docker containers on Cloud Run.
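For readers curious what a "bounded ReAct loop" with a "verifier pass" means in practice, here is a minimal standard-library sketch. It is an illustration under stated assumptions, not the LawSync code: `Step`, `Trace`, `run_bounded_react`, `verify_citations`, the step limit of 3, and the stub policy and search functions are all hypothetical names and values chosen for this example. A real service would call an LLM and a vector store where the stubs are.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_STEPS = 3  # hypothetical bound; the real limit is not stated in the text


@dataclass
class Step:
    action: str  # "search" or "answer"
    text: str    # search query, or the final answer text


@dataclass
class Trace:
    passages: List[str] = field(default_factory=list)
    answer: Optional[str] = None
    steps_used: int = 0


def run_bounded_react(
    question: str,
    policy: Callable[[str, List[str]], Step],
    search: Callable[[str], List[str]],
    max_steps: int = MAX_STEPS,
) -> Trace:
    """Alternate reasoning (policy) and acting (search), stopping at max_steps."""
    trace = Trace()
    for step_no in range(1, max_steps + 1):
        trace.steps_used = step_no
        step = policy(question, trace.passages)
        if step.action == "answer":
            trace.answer = step.text
            break
        # Any non-answer action is treated as a search in this sketch.
        trace.passages.extend(search(step.text))
    return trace  # answer stays None if the bound was hit first


CITATION = re.compile(r"\[(\d+)\]")


def verify_citations(answer: str, passages: List[str]) -> bool:
    """Accept an answer only if every sentence cites an existing passage."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    if not sentences:
        return False
    for sentence in sentences:
        cited = [int(n) for n in CITATION.findall(sentence)]
        if not cited or any(n < 1 or n > len(passages) for n in cited):
            return False
    return True


# Stubs standing in for the LLM policy and the retrieval backend.
def stub_policy(question: str, passages: List[str]) -> Step:
    if not passages:
        return Step("search", question)
    return Step("answer", "The appeal was dismissed [1]. Costs were awarded [2].")


def stub_search(query: str) -> List[str]:
    return ["Passage on dismissal.", "Passage on costs."]


trace = run_bounded_react("What was the outcome?", stub_policy, stub_search)
accepted = trace.answer is not None and verify_citations(trace.answer, trace.passages)
```

The hard step limit keeps latency and cost predictable, and running the verifier as a separate pass means an uncited or mis-cited answer can be rejected or retried instead of being returned.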
AI/ML Engineer (Remote)
Built an AI-powered American College Test (ACT) preparation tool for US students.
- Led creation of an NLP dataset digitised from ACT test books — questions, answers, and rationales.
- Fine-tuned a language model with LangChain, OpenAI API, and vector databases.
- Engineered an interactive chatbot to let students review answers and practice new questions.
Production-grade AI/ML stack.
The tools I reach for to ship reliable, observable AI systems.
AI / LLM Engineering
LLM Models & APIs
Backend Engineering
Retrieval & Persistence
Cloud & Containers
Observability & Eval
Engineer and researcher.
I build production-grade backends for multi-agent systems and Agentic RAG — observable LLM services on FastAPI, Gemini, LangGraph, and Qdrant, deployed to Cloud Run with end-to-end tracing and per-prompt cost tracking.
Peer-reviewed researcher in low-resource NLP (PeerJ Computer Science, 2024) — currently applying for fully funded graduate programs.
Academic
BSc Software Engineering
3.63 / 4.0 — University of Malakand
Published
PeerJ Computer Science · 2024
Pashto NLG with transformers
Production
3+ years shipping AI backends
Multi-agent · RAG · LLM APIs
Stack
FastAPI · Gemini · LangGraph
Qdrant · Cloud Run · Langfuse
Peer-reviewed publications.
Pashto poetry generation: deep learning with pre-trained transformers for low-resource languages
Pashto Poetry Dataset (PPD) for Machine and Deep Learning Applications
Let's build something together.
AI/ML projects, research collaborations, or graduate opportunities — drop a line.
- Full-time & part-time positions
- Freelance & contract work
- Master's programs in AI / NLP / CS