This paper evaluates the automatic creation of personal topic models using two language model-based clustering techniques. The results of these methods are compared with user-defined topic classes of web pages from personal web browsing histories from a 5-week period. The histories and topics were gathered during a naturalistic case study of the online information search and use behavior of two users. This paper further investigates the effectiveness of using display time and retention behaviors as implicit evidence for weighting documents during topic model creation. Results show that agglomerative techniques --- specifically, average-link clustering --- provide the most effective methodology for building topic models while ignoring topic evidence and implicit evidence.
bibtex
@inproceedings{kelly:users,
year = {2004},
title = {A User-Centered Approach to Evaluating Topic Models},
pages = {27--41},
booktitle = {26th European Conference on Information Retrieval Research},
author = {Diane Kelly and Fernando Diaz and Nicholas J. Belkin and James Allan}
}