Sahaj Upadhyay, Nandan Thakur, Ronak Pradeep, Nick Craswell, Daniel Campos, Jimmy Lin - Overview of the TREC 2025 Retrieval Augmented Generation (RAG) Track - TREC 2025
Ronak Pradeep, Nandan Thakur, Sahaj Upadhyay, Daniel Campos, Nick Craswell, Ian Soboroff, Jimmy Lin - The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models - SIGIR 2025
Sahaj Upadhyay, Ronak Pradeep, Nandan Thakur, Daniel Campos, Nick Craswell, Ian Soboroff, Jimmy Lin - A Large-Scale Study of Relevance Assessments with Large Language Models Using UMBRELA - SIGIR 2025
Nandan Thakur, Ronak Pradeep, Sahaj Upadhyay, Daniel Campos, Nick Craswell, Jimmy Lin - Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges
Nandan Thakur, Ronak Pradeep, Sahaj Upadhyay, Daniel Campos, Nick Craswell, Ian Soboroff, Jimmy Lin - Assessing Support for the TREC 2024 RAG Track - SIGIR 2025
Jaeseong Lee, seung-won hwang, Aurick Qiao, Daniel F Campos, Zhewei Yao, Yuxiong He - STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning - ACL 2025
Yunjae Lee, seung-won hwang, Daniel F Campos, Filip Gralinski, Zhewei Yao, Yuxiong He - Inference Scaling for Bridging Retrieval and Augmented Generation - NAACL 2025
Yunjae Lee, seung-won hwang, Daniel F Campos, Filip Gralinski, Zhewei Yao, Yuxiong He - CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation - NAACL 2025
Jaeseong Lee, seung-won hwang, Aurick Qiao, Daniel Campos, Zhewei Yao, Yuxiong He - TALE: Token-Adaptive Low-Rank KVCache Approximation with Reconstruction Elimination - TACL 2025
Keshav Huang, Tara Venkatesh, Utkarsh Dingankar, Antonio Mallia, Daniel Campos, Jimmy Jiao, Christopher Potts, Omar Khattab - ColBERT-serve: Efficient Multi-Stage Memory-Mapped Scoring - ECIR 2025
Michael J Ryan, Danmei Xu, Chris Nivera, Daniel Campos - EnronQA: Towards Personalized RAG over Private Documents
Gabriele Oliaro, Zhihao Jia, Daniel Campos, Aurick Qiao - SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications - NeurIPS 2025
Sahaj Upadhyay, Ronak Pradeep, Nandan Thakur, Daniel Campos, Nick Craswell, Ian Soboroff, Jimmy Lin - A Large-Scale Study of Relevance Assessments with Large Language Models: An Initial Look
Ronak Pradeep, Nandan Thakur, Sahel Sharifymoghaddam, Eric Zhang, Ryan Nguyen, Daniel Campos, Nick Craswell, Jimmy Lin - Ragnarok: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track - ECIR 2025
Hossein A. Rahmani, Nick Craswell, Emine Yilmaz, Bhaskar Mitra, Daniel Campos - Synthetic Test Collections for Retrieval Evaluation - SIGIR 2024
Luke Merrick, Danmei Xu, Gaurav Nuti, Daniel Campos - Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
Puxuan Yu, Luke Merrick, Gaurav Nuti, Daniel Campos - Arctic-Embed 2.0: Multilingual Retrieval Without Compromise
Daniel Campos, Surya Kallumadi, Corby Rosset, Cheng Xiang Zhai, Alessandro Magnani - Overview of the TREC 2023 Product Product Search Track - TREC 2023
Efficient and Robust Web Scale Language Model Based Retrieval, Generation, and Understanding - University of Illinois Urbana-Champaign Computer Science Doctoral Thesis
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin - Overview of the TREC 2022 Deep Learning Track - TREC 2022
Daniel Campos, ChengXiang Zhai - To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency - SustaiNLP 2023 @ ACL 2023
Daniel Campos, Alessandro Magnani, ChengXiang Zhai - Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical Dual Encoders - SustaiNLP 2023 @ ACL 2023
Daniel Campos, ChengXiang Zhai - Dense Sparse Retrieval: Using Sparse Language Models for Inference Efficient Dense Retrieval
Daniel Campos, Alessandro Magnani, ChengXiang Zhai - Noise-Robust Dense Retrieval via Contrastive Alignment Post Training (CAPOT)
Daniel Campos, Alexandre Marques, Tuan Nguyen, Mark Kurtz, ChengXiang Zhai - oBERTa: Improving Sparse Transfer Learning via Improved Initialization, Distillation, and Pruning Regimes - SustaiNLP 2023 @ ACL 2023
Daniel Campos, Daniel Perry, Samir Joshi, Yashmeet Gambhir, Wei Du, Zhengzheng Xing, Aaron Colak - Compressing Cross-Lingual Multi-task Models at Qualtrics - IAAI-23
Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh - The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models - EMNLP 2022
Daniel Campos, Alexandre Marques, Tuan Nguyen, Mark Kurtz, ChengXiang Zhai - Sparse*BERT: Sparse Models are Robust - Sparsity in Neural Networks Workshop @ ICML 2022
Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, Emine Yilmaz - Fostering Coopetition While Plugging Leaks: The Design and Implementation of the MS MARCO Leaderboards - SIGIR 2022
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, and Jimmy Lin - Overview of the TREC 2021 Deep Learning Track - TREC 2021
Daniel Campos, Heng Ji - IMG2SMI: Translating Molecular Structure Images to Simplified Molecular-input Line-entry System
Daniel Campos - Curriculum Learning for Language Modeling
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen Voorhees and Ian Soboroff - TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime - SIGIR 2021
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin - MS MARCO: Benchmarking Ranking Models in the Large-Data Regime - SIGIR 2021
Jimmy Lin, Daniel Campos, Nick Craswell, Bhaskar Mitra, Emine Yilmaz - Significant Improvements over the State of the Art? A Case Study of the MS MARCO Document Ranking Leaderboard - SIGIR 2021
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos - Overview of the TREC 2020 Deep Learning Track - TREC 2020
Explorations in Curriculum Learning Methods for Training Language Models - University of Washington Computational Linguistics Master's Thesis
Nick Craswell, Daniel Campos, Bhaskar Mitra, Emine Yilmaz, Bodo Billerbeck - ORCAS: 18 Million Clicked Query-Document Pairs for Analyzing Search - CIKM 2020
Yaobo Liang, Nan Duan, et al., Daniel Campos, Rangan Majumder, Ming Zhou - XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation - EMNLP 2020
Emine Yilmaz, Nick Craswell, Bhaskar Mitra and Daniel Campos - On the Reliability of Test Collections for Evaluating Systems of Different Types - SIGIR 2020
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees - Overview of the TREC 2019 Deep Learning Track - TREC 2019
Corbin Rosset, Chenyan Xiong, Xia Song, Daniel Campos, Nick Craswell, Saurabh Tiwary and Paul Bennett - Leading Conversational Search by Suggesting Useful Questions - WWW 2020
Manling Li, Ying Lin, et al., Daniel Campos, Heng Ji, et al. - GAIA at SM-KBP 2020 - TAC 2020
Lee Xiong, Chuan Hu, Chenyan Xiong, Daniel Campos, Arnold Overwijk and Xiayu Huang - Open Domain Web Keyphrase Extraction Beyond Language Modeling - EMNLP 2019
Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, Tong Wang - MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Daniel Campos, Zoe Konrad - Experiments in Inferring Social Networks of Diffusion
University of Illinois Urbana-Champaign (UIUC) - PhD Computer Science 2023. Thesis: Efficient and Robust Web Scale Language Model Based Retrieval, Generation, and Understanding. Advisor: ChengXiang Zhai.
University of Washington - MS Computational Linguistics 2020. Thesis: Explorations in Curriculum Learning Methods for Training Language Models.
Rensselaer Polytechnic Institute - BS Computer Science 2014
Founder & CEO - Zipf AI — Oct 2025–Present
Senior Research Scientist, Tech Lead - Snowflake — May 2023–Oct 2025
Senior Research Scientist - Neeva (acquired by Snowflake) — Dec 2022–May 2023
Applied Scientist Consultant - Walmart Labs — June 2022–Dec 2022
Applied Scientist Consultant - Qualtrics — March 2022–June 2022
Research Scientist Consultant - Mendel AI — Oct 2021–March 2022
Research Scientist Consultant - Neural Magic — Oct 2020–March 2023
Teaching Assistant - UIUC (CS 510, CS 410, CS 124) — Jan 2021–May 2023
Research Assistant - UIUC Blender Lab — June 2020–Dec 2021
Senior PM / Applied Scientist - Microsoft Research & AI, Bing — Aug 2015–Oct 2020
Ripple X Fellow (2022)
Z Fellow (2022)
Gene Golub Fellowship at UIUC (2020–2021)
UIUC Summer Predoctoral Institute Fellow (2020)
RPI Business Model Competition 1st Place (2014)
Harvard iLab Cultural Entrepreneurship Challenge Finalist (2014)
Enhanced Searching Using Fine-Tuned Machine Learning Models - U.S. Patent 12,314,318 - Granted 2025
Enhanced Search Result Generation Using Multi-Document Summarization - U.S. Patent 12,561,375 - Granted 2026
Using a Multi-Task-Trained Neural Network to Guide Interaction with a Query-Processing System via Useful Suggestions - U.S. Patent 11,853,362 - Granted 2023
Keyphrase Extraction Beyond Language Modeling - U.S. Patent 11,657,223 - Granted 2023
Executing Queries with Hallucination Safeguards - U.S. Patent App. 19/034,022 - Filed 2026
NIST TREC RAG Track Co-organizer (2024–2026)
NIST TREC Product Search Track Principal Coordinator (2023–2025)
NIST TREC Deep Learning Track Co-organizer (2018–2023)
ACM SIGIR/SIGKDD Africa Summer School Invited Lecturer (2019, 2020)
Invited Talk: Benchmarking End to End Product Retrieval - SIGIR eCommerce Workshop 2023
Invited Talk: Making LLM Inference Affordable - LLMs in Production Conference 2023
Invited Lecture on Unstructured Pruning - UT Austin VITA Lab
Teaching Assistant & Guest Lecturer - UIUC CS 510: Advanced Information Retrieval & CS 410: Text Information Systems (2021–2023)