Mark Levy, Data Scientist at Mendeley

Offline evaluation of recommender systems: all pain and no gain?
A large-scale offline evaluation – with a big money prize attached – established recommender systems as a niche discipline worth researching, and one where robust and reproducible experiments would be easy. But since then critiques within academia have exposed shortcomings in the most appealingly objective evaluation metrics, war stories from the commercial front line have suggested that correlation between offline metrics and bottom-line gains in production may be non-existent, and several subsequent academic competitions have come under fierce criticism from both advisors and participants.
In this talk I will draw on practical experience at and Mendeley, as well as insights from others, to offer some opinions about offline evaluation of recommender systems: whether we still need it at all, what value we can hope to draw from it, how best to do it if we have to, and how to make the experience less painful than it is right now.


Mark Levy went to Cambridge University on a Maths scholarship but emerged with a degree in Music, and spent the first 15 years of his professional life as a classical musician before retraining in Computer Science. Mark then worked as a researcher at the Centre for Digital Music at Queen Mary, University of London, where he tried to teach computers to understand the verse-chorus structure of songs by analysing their audio content, and then to generate playlists automatically using models that learned jointly from audio information and social tags. In 2008 he joined the Music Information Retrieval team at, where he designed and developed several music recommendation services and created an automatic tagging system that generates mood and other interesting musical descriptions using a combination of audio signal analysis, crowdsourcing and machine learning, and which provided the basis for a new set of playlisting services. Earlier this year he left to join the data science team at Mendeley, where he is currently working on two EU-funded research projects that aim to provide new services to help scientists easily keep up with each other and with the very latest publications and opportunities in their field. He still plays very occasionally – if you listen carefully you may hear him in the final episode of The White Queen, currently airing on BBC One Television.
Find him at Mendeley, LinkedIn, Twitter, and his blogs on MIR and data science.