Workshop on Reproducibility and Replication in Recommender Systems Evaluation

in conjunction with ACM RecSys 2013

Location: Hong Kong, China
Date: October 12, 2013


This workshop aims to gather researchers and practitioners interested in defining clear guidelines for experimental practice that allow fair comparisons with related work. The workshop will provide an informal setting for exchanging and discussing ideas and for sharing experiences and viewpoints. We seek to identify and better understand the current gaps in the implementation of recommender system evaluation methodologies, help lay directions for progress in addressing them, and foster the consolidation and convergence of experimental methods and practice. As a particular focus of interest, the workshop aims to identify the main challenges in reproducing and replicating prior research and to explore possible directions for overcoming them.

Specific questions that the workshop aims to address include the following:

  • How important are reproducibility and replication of experiments to the community?
  • What are the challenges for replication of evaluation in the RS field? How could we facilitate easier and more accurate comparison with prior work?
  • How can methods and metrics be more clearly and/or formally defined within specific tasks and contexts for which a recommender application is deployed?
  • What parts, if any, of an online experiment could be reproduced, and how?
  • How should the academic evaluation methodologies be described to improve their relevance, usefulness, and replicability for industrial settings?
  • What type of public resources (data sets, benchmarks) should be available, and how can they be built? Is it possible to have a generic framework for the evaluation (and replication) of recommender systems?
  • To what extent is it possible to reuse experimental methodologies across domains and/or businesses?
  • How do we envision the evaluation of recommender systems in the future and how does this affect the replicability of said systems?

Scope and topics

Papers explicitly dealing with replication of previously published experimental conditions/algorithms/metrics and the resulting analysis are encouraged. In particular, we seek discussions of the difficulties the authors encountered in this process, along with their limitations or successes in reproducing the original results.

Within the broader scope of recommender system evaluation, the papers presented and the discussions held at the workshop will address (though need not be limited to) the following topics:

  • Limitations and challenges of experimental reproducibility and replication
  • Reproducible experimental design
  • Replicability of algorithms
  • Standardization of metrics: definition and computation protocols
  • Evaluation software: frameworks, utilities, services
  • Reproducibility in user-centric studies
  • Datasets and benchmarks
  • Recommender software reuse
  • Replication of already published work
  • Reproducibility within and across domains and organizations
  • Reproducibility and replication guidelines


We invite the submission of papers reporting original research, studies, advances, or experiences in this area. Two submission types are accepted: long papers of up to 8 pages and short papers of up to 4 pages, both in the standard ACM SIG proceedings format. Paper submissions and reviews will be handled electronically.

Each paper will be evaluated by at least three reviewers from the Program Committee. The papers will be evaluated for their originality, significance of contribution, soundness, clarity, and overall quality. The interest of contributions will be assessed in terms of technical and scientific findings, contribution to the knowledge and understanding of the problem, methodological advancements, or applicative value. In addition, the papers will be evaluated on their reproducibility in the context of a standard recommender implementation, such as open source frameworks (e.g., LensKit, MyMediaLite, Mahout) and industry products (e.g., Gravity, Mendeley, Plista, Telefonica).

Submission instructions can be found on the Submissions page.