I build systems that help people discover and understand information at scale. The through-line of my career is a simple belief: language models can supercharge how humans find, process, and act on the information that matters to them.
I'm the founder of Zipf AI, where we deploy agents to search and explore internal and external data sources continuously. Under the hood, this is a closed-loop system: agents search and crawl, an evaluation layer scores information gain, and a self-healing pipeline optimizes search queries and crawl targets using preference-based feedback.
Before Zipf, I worked on retrieval, extraction, and generation at Snowflake, where our team shipped Arctic-Embed, a family of open-source embedding models that matched the best proprietary alternatives while being trained on just two H100 nodes over a few weeks. At Neeva (acquired by Snowflake), I built LLM-powered retrieval pipelines for search. At Microsoft, I co-created MS MARCO, the benchmark that defined neural information retrieval (3,400+ citations, 10,000+ researchers).
I currently co-organize the TREC RAG Track (2024–2026), defining how retrieval-augmented generation systems should be evaluated. I previously co-organized the TREC Deep Learning Track (2018–2023) and the TREC Product Search Track (2023–2025).
The web contains the answers to most of humanity's questions, but finding the right information at the right time remains hard. Search engines return links. LLMs can reason but hallucinate. I believe the path forward is deploying agents that don't just answer queries but continuously search and explore data, understand what changed, and surface signals before you know to ask.
This means working across the full stack: evaluation methodology (how do you know your system is actually helping?), efficient inference (how do you make this affordable at scale?), model training (how do you teach models to reason over evidence rather than pattern-match?), and system design (how do you build agents that improve themselves over time?).
I'm based in the lower Hudson Valley, where I juggle AI research, renovations, hunting, and raising three incredible children.
PhD in Computer Science from UIUC (advised by ChengXiang Zhai), MS in Computational Linguistics from UW, BS in Computer Science from RPI. Born and raised in Mexico City.