Available for new AI engineering roles

// senior ai engineer

Shipping production LLM
and multi-agent systems.

I'm David Lukić — a Senior AI Engineer with 7+ years designing, shipping, and observing RAG, MCP, and multi-agent systems in production. From prototype to ~500K users, with evaluation, guardrails, and cost-latency tuning baked in.

Start a conversation Download CV View work →

years shipping

~500K

users served

100%

Upwork JSS

Top Rated

Freelancer

~/agents/ops.log

$ agent.route --provider=bedrock

› routed in 142ms

$ rag.query "compliance evidence"

› 12 chunks · groundedness 0.94

$ eval.run --suite=prod

› 248/250 passed · p95 1.8s

● observability langfuse + langsmith

10 agents · 500K users healthy

// 01 · about

What I do, and how I do it.

I'm a Senior AI Engineer and Senior Data Engineer with 7+ years shipping production LLM and autonomous agent systems alongside large-scale data platforms. I'm a Top Rated Upwork freelancer with a 100% Job Success Score.

I own the full lifecycle — prototype through scaled, monitored production — including evaluation, observability, and guardrails. Recent work: a multi-agent customer-support system serving ~500K users on AWS Bedrock and ECS; an internal hybrid RAG + MCP data platform unifying MySQL, S3, Drive, and Sheets behind a natural-language interface; a CrewAI-based multi-agent SEO content pipeline; and a compliance-report AI that grades client materials against regulatory requirements.

Fluent across major model providers, async Python, and containerized infrastructure (Docker, Kubernetes, ECS).

OpenAIAnthropicGeminiAWS Bedrock

// 02 · experience

Where I've been shipping.

Six roles across AI engineering, full-stack, and data — a track record of owning production systems end-to-end.

Senior AI Engineer

BetterCollective

Nov 2024 — Present

Promoted from Data Engineer / Full-Stack Developer (2022)
- Built an internal Model Context Protocol (MCP) server unifying MySQL, AWS S3, Google Drive, and Google Sheets behind a hybrid RAG + MCP interface. Analysts ask questions in natural language and get immediate SQL execution, default visualizations, and a custom Power BI–style dashboard builder — replacing a workflow that previously required a developer to write queries and build reports manually.
- Shipped the MCP platform as a Django application with a custom UI; stack: FastMCP, FastAPI, vector DB for hybrid retrieval.
- Designed and deployed a multi-agent SEO content system — ~8 autonomous CrewAI agents covering research, analysis, scraping, and writing — running the full content pipeline end-to-end (CrewAI, FastAPI, Docker, Pydantic).
- Built three production RAG systems: (1) a cross-source knowledge RAG spanning Google Drive, Slack, internal docs, and external conference and article content, with full source attribution for every answer; (2) a competitor content intelligence RAG for angle and gap analysis against competitor articles and site structures; (3) a historical content performance recommender indexing past articles against their metrics so writers can surface what worked for a given topic with evidence.
- Established evaluation and observability for the RAG and agent systems using LangFuse and LangSmith — tracking groundedness, answer quality, latency, and cost per query across OpenAI, Anthropic, and Gemini.
- Own architecture decisions, model selection, prompt design, guardrails, and cost/latency tuning.
- Partner with product, marketing, and data teams to turn vague problems into shipped AI features.
PythonFastAPIDjangoCrewAILangFuseLangSmithMCPVector DBOpenAIAnthropicGemini
Senior AI Engineer (Part-time)

ComplianceLabX

Jan 2025 — Present
- Build and ship a production AI application that generates compliance reports — ingests client materials, evaluates them against regulatory requirements, and returns a structured report flagging compliant and non-compliant areas with evidence.
- Designed the end-to-end pipeline from ingestion and retrieval through inference, evaluation, and monitoring — with a focus on reliability, low-latency inference, and auditable outputs.
- Implemented guardrails and evaluation harnesses for groundedness, citation accuracy, and hallucination rate, so every report claim is traceable to the source regulation.
- Optimize for cost and performance through batching, caching, vector search tuning, and async processing; established LLM observability across requests, tokens, and latency.
- Own architecture decisions, production debugging, and post-mortems across the stack.
RAGGuardrailsEvaluationVector SearchPythonAsync
Senior AI Engineer (Contract)

OnTheGoSystems

Oct 2025 — Apr 2026
- Built and shipped a multi-agent AI customer support system serving ~500,000 users in production. Roughly 10 autonomous agents handle refunds, how-to support, ticket triage, bug reports, and credit-related queries.
- Designed across multiple repositories (agent service, support backend, infrastructure) using FastAPI, Docker, and AWS ECS.
- Integrated multiple model providers — AWS Bedrock, OpenAI, Anthropic, Gemini — with routing logic balancing cost and latency per agent role.
- Owned the full pipeline: ingestion, retrieval, inference, tool use, guardrails, and LLM observability (tracing, latency, cost, quality metrics).
- Built evaluation harnesses over real production conversation data to catch regressions in agent behavior; iterated prompts and routing with product and backend teams.
CrewAIFastAPIDockerAWS ECSAWS BedrockOpenAIAnthropicGeminiMulti-Agent
Senior Data Engineer

index.dev

2024 — 2025
- Architected scalable ETL pipelines with Pydantic, dbt, and Airflow — 30% faster processing and 20% operational efficiency gain for enterprise clients.
- Tuned Postgres and DuckDB analytical workloads through query optimization and indexing strategies, accelerating decision-making by 20%.
- Enhanced Snowflake warehouse performance and reduced query costs on high-volume analytics.
- Built 10+ reproducible containerized development environments (Docker, Poetry), cutting onboarding time by 40%.
- Delivered interactive Power BI and Plotly Dash dashboards with cross-filtering, callbacks, and drill-downs.
- Enforced testing (pytest), type safety (mypy/pyright), and code quality (ruff/black) across production codebases.
AirflowdbtPydanticSnowflakePostgresDuckDBDockerPower BIPlotly Dash
Data Engineer / Full-Stack Developer

BetterCollective

2022 — Nov 2024

Promoted to Senior AI Engineer in Nov 2024
- Shipped 14 production Django + Plotly Dash applications for 50+ internal users, eliminating ~80% of manual reporting workflows.
- Designed LogAna, an automated log analysis platform processing 10M+ daily log entries — cut manual analysis from 8 hours to 30 minutes (400% improvement in processing speed).
- Engineered high-throughput data pipelines (pandas, Polars) processing millions of daily records with 99.9% uptime.
- Built ETL workflows across REST APIs, AWS S3, PostgreSQL, and DuckDB with sub-second query performance.
- Implemented asynchronous web scraping services with proxy rotation, rate limiting, and robust error handling.
- Managed AWS-hosted PostgreSQL with Redis caching and strategic indexing for fast, reliable data access.
DjangoPlotly DashpandasPolarsAWS S3PostgreSQLDuckDBRedis
AI & Data Engineering Consultant

Self-Employed — Upwork

2020 — Present

Top Rated Upwork Freelancer · 100% Job Success Score
- Architect and deploy production LLM applications — LangChain, LlamaIndex, RAG with Pinecone and PGVector — delivering document Q&A and semantic search for global clients.
- Build custom AI automation workflows integrating GPT-4, Claude, and open-source models; clients report ~60% operational efficiency gains.
- Design end-to-end ETL pipelines for lead generation and marketing analytics — 60% accuracy improvement and 70% processing time reduction.
- Founded knowtheprice.com.au — full-stack real-time price comparison platform (Next.js, Supabase, Vercel).
LangChainLlamaIndexPineconePGVectorGPT-4ClaudeNext.jsSupabase

// 03 · work

Selected production systems.

Five builds that represent how I think about production AI — end-to-end ownership, measurable outcomes, and evaluation baked in.

OnTheGoSystems · Oct 2025 — Apr 2026

Multi-Agent Production Support System

~10 autonomous agents serving roughly 500K users — handling refunds, how-to support, ticket triage, bug reports, and credit queries. Multi-provider routing balances cost and latency per agent role.

impact ~500K users

CrewAIAWS BedrockFastAPIDocker +4

BetterCollective · 2025

Hybrid RAG + MCP Data Platform

Internal Model Context Protocol server unifying MySQL, S3, Google Drive, and Sheets behind a natural-language interface. Analysts get instant SQL execution, default visualizations, and a custom Power BI–style dashboard builder.

impact Replaced dev-gated reporting

FastMCPFastAPIDjangoVector DB +2

BetterCollective · 2025

Multi-Agent SEO Content Pipeline

End-to-end CrewAI system with ~8 autonomous agents covering research, analysis, scraping, and writing — running the entire content pipeline without human hand-off.

impact ~8 CrewAI agents

CrewAIFastAPIDockerPydantic +1

ComplianceLabX · 2025 — Present

Compliance Report AI

Production AI grading client materials against regulatory requirements. Every finding is source-attributed; guardrails track groundedness, citation accuracy, and hallucination rate end-to-end.

impact Auditable outputs

RAGGuardrailsEvaluationPython +1

Founder — Independent · 2024 — Present

knowtheprice.com.au

Full-stack, real-time price comparison platform I built and shipped end-to-end. Next.js front-end, Supabase back-end, Vercel deploy — designed to stay fast and cheap at scale.

impact Founder project

Next.jsSupabaseVercelTypeScript +1

// 04 · stack

The toolbox.

What I reach for when designing production AI and data systems.

AI & LLM Engineering

LangChain LangGraph LlamaIndex CrewAI DSPy MCP (Model Context Protocol) FastMCP RAG Multi-Agent Systems Prompt Engineering LLM Evaluation Observability & Guardrails LangFuse LangSmith Pinecone Weaviate PGVector Hybrid / Semantic Search

LLM Providers

OpenAI API Anthropic Claude Google Gemini AWS Bedrock Open-source LLMs

Programming & Backend

Python SQL Bash FastAPI Django Flask Pydantic Async Python

Data Engineering

Apache Spark (PySpark) Airflow Dagster dbt Kafka Delta Lake ETL / ELT Data Modeling Data Warehousing pandas Polars NumPy Real-time Pipelines

Cloud & Infrastructure

AWS (Bedrock, SageMaker, Lambda, ECS, S3, Redshift) Azure Google Cloud Databricks Snowflake Vercel Supabase

MLOps & DevOps

Docker Kubernetes MLflow CI/CD (GitHub Actions) Poetry pytest mypy ruff / black Git Linux Web Scraping Celery

Databases

PostgreSQL DuckDB MongoDB MariaDB Redis S3 Delta Lake

Visualization & BI

Power BI Plotly Dash Streamlit Matplotlib Seaborn

// 05 · writing

Technical writing — in progress.

Distilled notes from shipping production LLM systems. Coming soon.

I'm drafting deep-dives on the lessons I've taken from running LLM systems in production. Topics on the list:

Designing production RAG: retrieval, reranking, and evaluation
Multi-agent orchestration with CrewAI — what works, what breaks
Model Context Protocol: patterns for real analyst workflows
LLM observability: groundedness, cost, and latency in one loop

→ Want early access? Reach out below.

// 06 · contact

Let's build something measurable.

If you're scaling AI infrastructure, shipping your first production LLM system, or just want to talk multi-agent design — I'd love to hear from you.

Book a 30-min call Email me Download CV

Also reachable at

$ curl davidlukic99.github.io

Shipping production LLM and multi-agent systems.

Senior AI Engineer

Senior AI Engineer (Part-time)

Senior AI Engineer (Contract)

Senior Data Engineer

Data Engineer / Full-Stack Developer

AI & Data Engineering Consultant