Contextual and Dimensional Relevance Judgments for Reusable SERP-level Evaluation

P. Golbus, I. Zitouni, J. Kim, A. Hassan, F. Diaz
WWW 2014
Document-level relevance judgments are a major component in the calculation of effectiveness metrics. Collecting high-quality judgments is therefore a critical step in information retrieval evaluation. However, the nature of relevance judgment collection and the assumptions underlying it have not received much attention. In particular, relevance judgments are typically collected for each document in isolation, although users read each document in the context of other documents. In this work, we investigate the nature of relevance judgment collection. We collect relevance labels in both isolated and conditional settings, and ask for judgments along various dimensions of relevance as well as for overall relevance. We then compare relevance metrics based on the various types of judgments with other measures of quality, such as user preference. Our analyses illuminate how these settings for judgment collection affect the quality and characteristics of the judgments. We also find that metrics based on conditional judgments correlate more strongly with user preference than metrics based on isolated judgments.

bibtex

@inproceedings{golbus:www2014,
  year = {2014},
  url = {http://dx.doi.org/10.1145/2566486.2568015},
  title = {Contextual and Dimensional Relevance Judgments for Reusable SERP-level Evaluation},
  series = {WWW '14},
  publisher = {International World Wide Web Conferences Steering Committee},
  pages = {131--142},
  numpages = {12},
  location = {Seoul, Korea},
  isbn = {978-1-4503-2744-2},
  doi = {10.1145/2566486.2568015},
  booktitle = {Proceedings of the 23rd International Conference on World Wide Web},
  author = {Peter B. Golbus and Imed Zitouni and Jin Young Kim and Ahmed Hassan and Fernando Diaz},
  address = {Republic and Canton of Geneva, Switzerland},
  acmid = {2568015}
}