Work

Systems we've shipped.

Client names anonymized. Outcomes are real.

Series B SaaS — Legal tech

Contract review agent: 94% accuracy, 80% faster than human review

FDE-Agent · 12 weeks

Built a multi-agent pipeline that ingested contracts via API, extracted key clauses using structured LLM output, flagged risk areas against a custom rubric, and routed edge cases to human reviewers. Shipped with an eval harness of 500 labeled contracts.

Claude 3.5 Sonnet · LangGraph · PostgreSQL · Vercel
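
The routing step in this engagement (flag risk against a rubric, send edge cases to a human) can be illustrated with a minimal sketch. The threshold, the clause labels, and the names `ClauseFinding` and `route` are hypothetical stand-ins, not the rubric or code that was shipped.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; the real rubric is client-specific.
CONFIDENCE_FLOOR = 0.85
HIGH_RISK_CLAUSES = {"indemnification", "limitation_of_liability"}


@dataclass
class ClauseFinding:
    clause_type: str   # label produced by the structured extraction step
    text: str          # the clause text pulled from the contract
    confidence: float  # extraction confidence score in [0, 1]


def route(finding: ClauseFinding) -> str:
    """Decide what happens to one extracted clause."""
    # Low-confidence extractions are treated as edge cases for a human.
    if finding.confidence < CONFIDENCE_FLOOR:
        return "human_review"
    # Confident extractions of risky clause types are flagged in the report.
    if finding.clause_type in HIGH_RISK_CLAUSES:
        return "flagged"
    # Everything else passes through automatically.
    return "auto"
```

A labeled eval set like the 500-contract harness is what lets a threshold such as `CONFIDENCE_FLOOR` be tuned rather than guessed.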

Fortune 500 — Financial services

On-prem LLM deployment for internal analyst tooling — zero data leaves the perimeter

FDE-Sovereign · 20 weeks

Deployed Llama 3.1 70B on-premises behind the client firewall. Built a RAG pipeline over 10 years of internal research documents. Integrated with existing Bloomberg Terminal workflows. Full FedRAMP documentation package included in handoff.

Llama 3.1 70B · vLLM · pgvector · On-prem
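
The retrieval half of a RAG pipeline like this one comes down to ranking stored document vectors by similarity to a query vector. The sketch below shows that ranking in plain Python with cosine similarity. It is an illustration only: in a deployment such as this the ranking would typically run inside the vector store, and the names `cosine` and `top_k` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k=3):
    """docs is a list of (doc_id, vector); return the k most similar doc ids."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

The retrieved documents are then passed to the model as context, so no document text has to leave the perimeter.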

Series A — Healthcare AI

Eval framework that reduced clinical AI false positive rate by 34%

FDE-Eval · 6 weeks

Audited an existing clinical NLP system and found systematic bias in the training eval set. Rebuilt the evaluation pipeline with clinician-annotated ground truth. Implemented continuous monitoring that alerts on demographic performance drift.

HIPAA · Python · Weights & Biases · Custom evals
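
Monitoring for demographic performance drift can be sketched as a per-group false positive rate compared against a baseline. This is a minimal illustration under assumed inputs: the record format, the 0.05 tolerance, and the function names are hypothetical, not the monitoring that was deployed.

```python
from collections import defaultdict


def false_positive_rates(records):
    """records: iterable of (group, predicted_positive, actually_positive)."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted, actual in records:
        if not actual:  # only true negatives can produce false positives
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


def drift_alerts(baseline, current, tolerance=0.05):
    """Return groups whose false positive rate rose more than `tolerance`."""
    return sorted(
        group
        for group, rate in current.items()
        if group in baseline and rate - baseline[group] > tolerance
    )
```

Running this on each new batch of clinician-annotated cases gives a simple alert list of the groups that need a closer look.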

Growth-stage — Developer tools

Cut embedding pipeline cost by 70% while improving retrieval quality

FDE-Infrastructure · 8 weeks

Replaced a naive full-doc embedding approach with a hybrid chunking strategy and Cohere reranking. Migrated from a managed vector DB to self-hosted Weaviate. Implemented async batch embedding that reduced API costs from $18K/month to $5K/month.

Weaviate · Cohere · FastAPI · GCP
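
Async batch embedding, as used in this engagement, groups chunks into batches and sends a bounded number of requests concurrently. The sketch below shows the pattern with `asyncio`. The `fake_embed` function is a hypothetical stand-in for a real embedding API call, and the batch size and concurrency limit are assumed values.

```python
import asyncio


async def fake_embed(batch):
    """Stand-in for one embedding API request; returns one vector per text."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [[float(len(text))] for text in batch]


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def embed_all(chunks, embed=fake_embed, batch_size=64, max_concurrency=4):
    """Embed all chunks in batches, keeping the input order of the vectors."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch):
        async with semaphore:  # cap the number of in-flight requests
            return await embed(batch)

    # gather returns results in the order the batches were submitted.
    results = await asyncio.gather(*(run(b) for b in batched(chunks, batch_size)))
    return [vector for batch in results for vector in batch]
```

For example, `asyncio.run(embed_all(texts))` returns one vector per input text. Fewer, larger requests are where this pattern saves on per-call overhead.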

Your system next.

We scope every engagement in 2 days. No commitment required.

Start a project