Recommender Diversity is NOT Inversion of Similarity


2022-02-27



In modern personalization systems, diversifying what is recommended to individual users is crucial not only for maximizing customer satisfaction and business metrics but also for incorporating proper ethics and fairness into applications. To discuss constructively what makes a recommendation good, quantifying the concept of diversity in the form of a metric is an important area of research and development.

I strongly believe the metrics discussed in academia are still far from capturing diversity in a true sense; diversity is not the opposite of similarity, and diversity and accuracy should not be treated as a simple trade-off.

How to measure diversity

To better understand how to formulate diversity and how the research domain has evolved, I surveyed several highly cited papers, including but not limited to:

Originally, in the early 2000s [1], people started looking at diversity as a simple inversion of similarity; accurate recommendations are generated by capturing similar users/items according to a certain metric $\mathrm{similarity}$, and diversity can be defined as its inversion/opposite, such as $1 - \mathrm{similarity}$ or $\frac{1}{\mathrm{similarity}}$ (i.e., "dissimilarity").

Taken literally, however, this means that recommending completely irrelevant items would be the best strategy, which is useless for a personalization system. Consequently, a need arose for more sophisticated ways to represent diversity as a numeric value. Subsequent studies thus proposed several improvements in this regard, such as aggregated diversity, Gini diversity, and entropy-based diversity.
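To make the problem with plain inversion concrete, here is a minimal sketch. Cosine similarity, the feature vectors, and the item names are all my own assumptions for illustration; none of them come from the cited papers.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def dissimilarity(a, b):
    """Diversity as a plain inversion of similarity: 1 - similarity."""
    return 1.0 - cosine_similarity(a, b)

# Hypothetical feature vectors (toy data, assumed for this example).
user_profile = [1.0, 0.0]
items = {
    "related_book": [0.9, 0.1],
    "loosely_related_book": [0.6, 0.4],
    "shampoo": [0.0, 1.0],
}

# Maximizing dissimilarity alone selects the least relevant item.
most_diverse = max(items, key=lambda name: dissimilarity(user_profile, items[name]))
print(most_diverse)  # shampoo
```

The item that is orthogonal to the user profile wins, which is exactly the degenerate behavior described above.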

The paper [3] highlights these metrics nicely. Let $U$ and $I$ be the sets of users and items, respectively, and $L_N(u)$ the list of top-$N$ recommended items for a user $u$. Aggregated diversity, the number of distinct items recommended to at least one user, can then be calculated as:

$\left| \bigcup\limits_{u \in U} L_N(u) \right|$
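As a rough sketch, this count is just the size of the union of all top-$N$ lists. The dictionary layout (`rec_lists` mapping a user to their list) and the toy data are assumptions for illustration.

```python
def aggregate_diversity(rec_lists):
    """Number of distinct items that appear in at least one user's top-N list."""
    distinct_items = set()
    for items in rec_lists.values():
        distinct_items.update(items)
    return len(distinct_items)

# Hypothetical top-2 lists for three users.
rec_lists = {
    "u1": ["a", "b"],
    "u2": ["a", "c"],
    "u3": ["a", "b"],
}
print(aggregate_diversity(rec_lists))  # 3 (items a, b, c)
```

Note that this metric ignores how often each item is recommended; item "a" dominating every list does not lower the score.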

Meanwhile, if we focus on individual items and on how many users each item is recommended to, diversity can be defined by an entropy-based formulation:

$-\sum_{j = 1}^{|I|} \left( \frac{\left|\{u \mid u \in U \wedge i_j \in L_N(u) \}\right|}{N |U|} \ln \left( \frac{\left|\{u \mid u \in U \wedge i_j \in L_N(u) \}\right|}{N |U|} \right) \right),$

where $i_j$ denotes the $j$-th item in the available item set $I$.
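The entropy formula can be sketched as follows. I am assuming that every user receives exactly $N$ items (so the shares sum to 1) and that items never recommended contribute zero, following the usual $0 \ln 0 = 0$ convention; the function and variable names are hypothetical.

```python
import math
from collections import Counter

def entropy_diversity(rec_lists, n):
    """Entropy of the distribution of recommendations over items.

    rec_lists: dict mapping each user to their top-n list.
    n: list length N, so n * len(rec_lists) is the total number of recommendations.
    """
    counts = Counter(item for items in rec_lists.values() for item in items)
    total = n * len(rec_lists)
    # Items with zero recommendations are absent from `counts` and add nothing.
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Recommendations spread over four items vs. concentrated on two (toy data).
spread = {"u1": ["a", "b"], "u2": ["c", "d"]}
concentrated = {"u1": ["a", "b"], "u2": ["a", "b"]}
print(entropy_diversity(spread, 2) > entropy_diversity(concentrated, 2))  # True
```

In the toy data, the spread case evaluates to $\ln 4$ and the concentrated case to $\ln 2$, so a higher value means recommendations are distributed more evenly across items.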

Moreover, the Gini index, which is normally used to measure the degree of inequality in an income distribution, can be applied to assess diversity in the context of recommender systems:

$2 \sum_{j = 1}^{|I|} \left( \frac{|I|+1-j}{|I|+1} \cdot \frac{\left|\{u \mid u \in U \wedge i_j \in L_N(u) \}\right|}{N |U|} \right)$
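Here is a hedged sketch of the Gini-based formula. The text above does not state how the items are indexed; I am assuming they are sorted in ascending order of recommendation count, so that the least recommended items receive the largest weights. Under that assumption the value is 1 when recommendations are spread evenly and shrinks as they concentrate on a few items. All names are hypothetical.

```python
from collections import Counter

def gini_diversity(rec_lists, all_items, n):
    """Gini-based diversity over the full item set.

    rec_lists: dict mapping each user to their top-n list.
    all_items: the available item set I, including never-recommended items.
    n: list length N.
    """
    counts = Counter(item for items in rec_lists.values() for item in items)
    total = n * len(rec_lists)
    # Assumption: index j follows ascending recommendation count.
    ordered = sorted(counts.get(item, 0) for item in all_items)
    num_items = len(ordered)
    return 2 * sum(
        (num_items + 1 - j) / (num_items + 1) * (c / total)
        for j, c in enumerate(ordered, start=1)
    )

all_items = ["a", "b", "c", "d"]
even = {"u1": ["a"], "u2": ["b"], "u3": ["c"], "u4": ["d"]}
skewed = {"u1": ["a"], "u2": ["a"], "u3": ["a"], "u4": ["a"]}
print(gini_diversity(even, all_items, 1) > gini_diversity(skewed, all_items, 1))  # True
```

With these toy lists the even case evaluates to 1 and the skewed case to 0.4, up to floating-point rounding.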

Accuracy comes first, and re-rank later

Once the metrics are defined, we can use them as objectives to steer the recommendation results in the desired direction. Importantly, how and when to apply diversity is key to balancing accuracy and diversity; as mentioned before, simply recommending random items could improve diversity, but it would obviously be annoying if, for example, an e-commerce site recommended shampoo while you were searching for a specific book.

A common approach to achieve the goal is to generate recommendations in two phases:

Generate a list of recommendations based on similarity metrics, so that the result reflects your preferences and historical behavior.
Post-process/re-rank the initial result based on a diversity metric to prioritize non-trivial results.

This can be called a "bounded" approach, according to [1], and all four of the papers I listed above similarly discuss the possibility of post-processing-based diversification. That is, the initial results from a recommender bound/limit the problem space to a reasonable range of candidates, and we then make minimal but significant modifications among them to improve diversity.
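The two phases can be sketched as below. To be clear, this is not the method of any of the cited papers: for the second phase I swapped in a greedy re-ranking in the style of maximal marginal relevance (MMR), and the `trade_off` parameter, the similarity function, and the book data are all assumptions made for illustration.

```python
def rerank_bounded(candidates, similarity, top_n, trade_off=0.5):
    """Greedy MMR-style re-ranking over a bounded candidate list.

    candidates: (item, relevance) pairs produced by the phase-1 recommender.
    similarity: function (item_a, item_b) -> value in [0, 1].
    trade_off: weight on relevance; 1.0 reproduces the plain relevance ranking.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < top_n:
        def score(item):
            # Penalize items that resemble something already selected.
            max_sim = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * remaining[item] - (1 - trade_off) * max_sim
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

# Phase 1 output (hypothetical): candidates already bounded to relevant books.
candidates = [("book_a", 0.9), ("book_a_sequel", 0.85), ("book_b", 0.7)]
series = {"book_a": "s1", "book_a_sequel": "s1", "book_b": "s2"}

def same_series_similarity(a, b):
    """Toy similarity: 1.0 for books in the same series, otherwise 0.0."""
    return 1.0 if series[a] == series[b] else 0.0

print(rerank_bounded(candidates, same_series_similarity, top_n=2))
# ['book_a', 'book_b']
```

Because the second phase only chooses among candidates that the first phase already judged relevant, the re-ranked list can swap the sequel for a different book, but it can never surface shampoo.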

Dissimilar enough, but not too far

The two-phase approach is aligned with potential solutions to echo chambers and filter bubbles; studies on these problems have discussed the importance of "focusing on boundary group of users/items" to make algorithmic recommendations less polarized/skewed.

In other words, something that is reasonably similar to the target but differs on many criteria plays a crucial role, and this is exactly what the bounded approach tries to model.

One interesting research direction in this context is the graph optimization-based recommender proposed in [4]. It should be noted that we are all creating large, complex networks on the internet as a result of numerous interactions, and hence finding a sweet spot in the connected graph sounds like a good strategy for dissecting the mutual relationships.

It shouldn't be "trade-off"

That said, I found that the studies on recommender diversification commonly assume there is a trade-off between accuracy and diversity; people are implicitly hypothesizing we cannot diversify without sacrificing accuracy. I personally hesitate to accept the assumption because it essentially makes the same statement as "diversity is a simple inversion/opposite of accuracy," which I rejected earlier.

As several social science studies have revealed, diversity does have practical, positive value, and, if we successfully diversify a resulting set of entities (e.g., users, items) in a true sense, accuracy should naturally increase.

Therefore, blending/harmonizing these two objectives more tightly is a highly important research topic in my opinion, and global optimization over such a multi-modal distribution is an interesting problem to tackle. Relevant discussions already appear in [3] and in the RecSys 2021 paper "Towards Unified Metrics for Accuracy and Diversity for Recommender Systems", for instance.

Author: Takuya Kitazawa

Takuya Kitazawa is a freelance software developer, previously working at a Big Tech and Silicon Valley-based start-up company where he wore multiple hats as a full-stack software developer, machine learning engineer, data scientist, and product manager. At the intersection of technological and social aspects of data-driven applications, he is passionate about promoting the ethical use of information technologies through his mentoring, business consultation, and public engagement activities. See CV for more information, or contact at [email protected].


Disclaimer

Opinions are my own and do not represent the views of organizations I belong or belonged to.
I am doing my best to ensure the accuracy and fair use of the information. However, there might be some errors, outdated information, or biased subjective statements because the main purpose of this blog is to jot down my personal thoughts as soon as possible before conducting an extensive investigation. Visitors understand the limitations and rely on any information at their own risk.
That said, if there is any issue with the content, please contact me so I can take the necessary action.