Research

6,600+ citations, h-index 23. My research spans evaluation methodology, efficient inference, embedding models, and retrieval-augmented generation. The connecting thread is making information systems smarter, faster, and more trustworthy. Google Scholar.

Research Themes

Evaluation & Benchmarks

How do you know a search or RAG system is actually working? I've spent a decade building the evaluation infrastructure the field uses to answer this question.

Efficient Inference & Model Compression

Making language models fast and cheap enough to deploy at web scale. My work on sparsity, pruning, and distillation predates the current wave of interest in efficient LLM inference.

Embedding Models & Retrieval

Training and deploying embedding models that make retrieval work for everyone. All models are released under Apache 2.0 and have reached millions of monthly downloads.

RAG & Agentic Systems

From one-shot answers to agents that continuously search, explore, and accumulate understanding over time.

Community