Preference-Based Evaluation

This research develops evaluation methods that move beyond traditional scalar metrics to directly model user preferences and system robustness. Key contributions include recall-paired preference (RPP), a metric-free evaluation method that computes a preference between two ranked lists while simulating multiple user subpopulations per query. This preference-based approach can be generalized to lexicographic evaluation for both precision (lexiprecision) and recall (lexirecall), which improves discriminative power and sensitivity compared to traditional metrics.
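
To make the idea of comparing ranked lists without a scalar metric more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from the publications below. It assumes binary relevance, that each simulated user subpopulation corresponds to a recall level (a user who stops after finding the i-th relevant item), and that relevant items missing from a ranking are placed just past its end. The function names `relevant_positions`, `rpp`, and `lexi_preference` are hypothetical.

```python
from typing import List, Sequence, Set


def relevant_positions(ranking: Sequence[str], relevant: Set[str]) -> List[int]:
    """Return the 1-based ranks of relevant items in ascending order.

    Assumption for this sketch: the ranking has no duplicates, and any
    relevant item that was not retrieved is placed just after the end.
    """
    positions = [i for i, doc in enumerate(ranking, start=1) if doc in relevant]
    missing = len(relevant) - len(positions)
    positions.extend([len(ranking) + 1] * missing)
    return positions


def sign(x: int) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rpp(ranking_a: Sequence[str], ranking_b: Sequence[str], relevant: Set[str]) -> float:
    """Average per-recall-level preference; positive values favor ranking_a."""
    pa = relevant_positions(ranking_a, relevant)
    pb = relevant_positions(ranking_b, relevant)
    if not pa:
        return 0.0  # no relevant items, so no preference can be expressed
    # At each recall level, the ranking that reaches it earlier wins.
    return sum(sign(b - a) for a, b in zip(pa, pb)) / len(pa)


def lexi_preference(
    ranking_a: Sequence[str],
    ranking_b: Sequence[str],
    relevant: Set[str],
    from_top: bool = True,
) -> int:
    """Lexicographic preference: 1 favors ranking_a, -1 ranking_b, 0 is a tie.

    from_top=True compares from the first relevant item (precision-oriented);
    from_top=False compares from the last relevant item (recall-oriented).
    """
    pa = relevant_positions(ranking_a, relevant)
    pb = relevant_positions(ranking_b, relevant)
    if not from_top:
        pa, pb = pa[::-1], pb[::-1]
    # The first recall level where the two rankings differ decides.
    for a, b in zip(pa, pb):
        if a != b:
            return sign(b - a)
    return 0


relevant = {"d1", "d2", "d3"}
run_a = ["d1", "x", "d2", "y", "d3"]  # relevant items at ranks 1, 3, 5
run_b = ["x", "d1", "d2", "d3", "y"]  # relevant items at ranks 2, 3, 4

print(rpp(run_a, run_b, relevant))                             # 0.0
print(lexi_preference(run_a, run_b, relevant, from_top=True))  # 1
print(lexi_preference(run_a, run_b, relevant, from_top=False)) # -1
```

In this toy example the averaged preference is a tie, while the two lexicographic orders disagree: comparing from the top favors `run_a`, which finds its first relevant item sooner, and comparing from the bottom favors `run_b`, which finds its last relevant item sooner.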

Publications

F. Diaz, M. D. Ekstrand, and B. Mitra
ACM Transactions on Recommender Systems, February 2025
F. Diaz
SIGIR-AP 2024
F. Diaz
EVIA 2023
F. Diaz and A. Ferraro
SIGIR 2022