Home  >   Blog  >   What I Like About Microsoft Recommenders Repository


What I Like About Microsoft Recommenders Repository

  Support by donation   Gift a cup of coffee

I recently found Microsoft's Recommenders repository is particularly useful to understand common discussion points when it comes to recommender systems. The motivation and brief history of the repository can be found in their paper "Microsoft Recommenders: Tools to Accelerate Developing Recommender Systems," which were demonstrated at RecSys 2019 and WWW 2020.

What I like about the repository can be three fold:

  • High-quality, well-written Jupyter notebooks
  • Minimal functionality on its PyPI package
  • Consideration about non-accuracy metrics

High-quality, well-written Jupyter notebooks. Even though the repository contains an installable PyPI package recommenders (!), the most important part is a collection of well-written Jupyter notebooks that enable us to understand how to build recommender systems from data preparation and model training to evaluation and deployment.

microsoft-recommenders-pipeline Source: recommenders/examples at main · microsoft/recommenders · GitHub

Importantly, the notebooks are not just for a series of code snippets + inline comments (like most of the repositories do) but for providing detailed texts/references so we can use the contents as "tutorial." Moreover, as mentioned in the paper, integration tests use papermill for validating the notebooks.

Minimal functionality on its PyPI package. Basically, the recommenders package itself is a set of utility functions that are widely applicable to a variety of scenarios, which makes the repository surprisingly minimal and useful; to avoid reinventing the wheel, the implementation of recommendation algorithms largely relies on the other packages such as PySpark and Surprise, while some minor ones are implemented from scratch (e.g., Restricted Boltzmann Machine).

Consideration about non-accuracy metrics. When we evaluate recommender systems, I cannot emphasize the importance of non-accuracy metrics enough as I wrote in Recommender Diversity is NOT Inversion of Similarity. The Recommenders repository is doing a great job in this regard since there is a dedicated notebook for explaining coverage, novelty, diversity, and serendipity metrics. I hope the package and repository evolve more around these topics moving forward.

Overall, I have an impression that Microsoft Recommenders nicely summarizes a good chunk of techniques every recommendation problems are commonly interested in; if there is someone who is completely new to recommender systems but familiar with Python-based data science & machine learning ecosystem, I'd first recommend to take a look at this repository. One potential area of improvement I can think of is around operationalizing recommenders. Currently, the examples are highly optimized for Azure-based deployment, which makes sense as the repository is owned by Microsoft, but it would be great if they could generalize the insights in a more OSS way.



Recommender Systems Data Science & Analytics Machine Learning

  See also

Recommender Diversity is NOT Inversion of Similarity
Deploying Static Site to GitHub Pages via Travis CI
FluRS: A Python Library for Online Item Recommendation


Last updated: 2022-09-02

  Author: Takuya Kitazawa

Takuya Kitazawa is a freelance software developer, previously working at a Big Tech and Silicon Valley-based start-up company where he wore multiple hats as a full-stack software developer, machine learning engineer, data scientist, and product manager. At the intersection of technological and social aspects of data-driven applications, he is passionate about promoting the ethical use of information technologies through his mentoring, business consultation, and public engagement activities. See CV for more information, or contact at [email protected].

  Support by donation   Gift a cup of coffee


  • Opinions are my own and do not represent the views of organizations I am/was belonging to.
  • I am doing my best to ensure the accuracy and fair use of the information. However, there might be some errors, outdated information, or biased subjective statements because the main purpose of this blog is to jot down my personal thoughts as soon as possible before conducting an extensive investigation. Visitors understand the limitations and rely on any information at their own risk.
  • That said, if there is any issue with the content, please contact me so I can take the necessary action.