Tutorial on Retrieval-Enhanced Machine Learning: Synthesis and Opportunities

F. Diaz, A. Drozdov, T.-E. Kim, A. Salemi, H. Zamani
SIGIR, 2025
Retrieval-Enhanced Machine Learning (REML) refers to the use of information retrieval (IR) methods to support reasoning and inference in machine learning tasks. Although relatively recent, these approaches can substantially improve model performance. This includes improved generalization, knowledge grounding, scalability, freshness, attribution, interpretability, and on-device learning. To date, despite being influenced by work in the information retrieval community, REML research has predominantly been presented in natural language processing (NLP) conferences. Our tutorial addresses this disconnect by introducing core REML concepts and synthesizing the literature from various domains in machine learning (ML), including, but not limited to, NLP. What is unique to our approach is the use of consistent notations to provide researchers with a unified and expandable framework. The tutorial will be presented in lecture format based on an existing manuscript, with supporting materials and a comprehensive reading list available at a website. Building on the momentum of our successful workshop at SIGIR 2023 and our tutorial at SIGIR-AP 2024, this year's tutorial features updated content with an emphasis on retrieval technologies used across the broader ML community. We also highlight their role in emerging, future-facing applications such as language agents and evolving scenarios where the extensive body of knowledge from IR can provide critical insights and capabilities.

bibtex

@inproceedings{diaz:reml-tutorial-sigir2025,
  year = {2025},
  url = {https://doi.org/10.1145/3726302.3731695},
  title = {The Second Tutorial on Retrieval-Enhanced Machine Learning: Synthesis and Opportunities},
  series = {SIGIR '25},
  publisher = {Association for Computing Machinery},
  pages = {4130--4133},
  numpages = {4},
  location = {Padua, Italy},
  isbn = {9798400715921},
  doi = {10.1145/3726302.3731695},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  author = {Diaz, Fernando and Drozdov, Andrew and Kim, To Eun and Salemi, Alireza and Zamani, Hamed},
  address = {New York, NY, USA}
}