Recommender systems are among the most ubiquitous Machine Learning applications in the world today. However, the underlying ranking models are plagued by numerous biases that can severely limit the quality of the resulting recommendations. The problem of building unbiased rankers — also known as unbiased learning to rank, ULTR — remains one of the most important research problems within ML and is still far from being solved.
In this post, we’ll take a deep dive into one modeling approach that has recently enabled the industry to control biases very effectively and thus build vastly superior recommender systems: the two-tower model, where one tower learns relevance and another (shallow) tower learns biases.
While two-tower models have probably been used in the industry for several years, the first paper to formally introduce them to the broader ML community was Huawei’s 2019 PAL paper.
PAL (Huawei, 2019) — the OG two-tower model
Huawei’s paper PAL (“position-aware learning to rank”) considers the problem of position bias within the context of the Huawei app store.
Position bias has been observed over and over again in ranking models across the industry. It simply means that users are more likely to click on items that are shown first. This may be because they’re in a hurry, because they blindly trust the ranking algorithm, or for other reasons. Here’s a plot demonstrating position bias in Huawei’s data:
Position bias in Huawei’s app store. Items at the top positions get more clicks than those at the bottom positions.
Position bias is a problem because a click alone can’t tell us whether the user chose the first item because it was indeed the most relevant for them or merely because it was shown first. A recommender system should learn the former, not the latter.
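To make the confound concrete, here is a minimal simulation sketch. It is not taken from the PAL paper: the item names, the examination probabilities, and the relevance values are all made-up assumptions, and the click model (a user clicks only if they first look at the slot and then find the item relevant) is a simplification chosen for illustration.

```python
import random

# Hypothetical numbers, chosen only for illustration.
# Probability that a user even looks at a given position (0 = top slot).
EXAMINE_PROB = [0.9, 0.5, 0.2]
# Assumed true probability of a click once the user has looked at the item.
RELEVANCE = {"app_a": 0.3, "app_b": 0.6}


def simulate_ctr(item, position, n_impressions=100_000, seed=0):
    """Estimate the click-through rate of `item` when shown at `position`."""
    rng = random.Random(seed)
    clicks = 0
    for _ in range(n_impressions):
        # A click requires that the slot is examined AND the item appeals.
        examined = rng.random() < EXAMINE_PROB[position]
        if examined and rng.random() < RELEVANCE[item]:
            clicks += 1
    return clicks / n_impressions


# The less relevant app is shown first, the more relevant app last.
ctr_a_top = simulate_ctr("app_a", position=0)
ctr_b_bottom = simulate_ctr("app_b", position=2)
print(f"app_a at position 0: CTR = {ctr_a_top:.3f}")
print(f"app_b at position 2: CTR = {ctr_b_bottom:.3f}")
```

Under these assumed numbers, the less relevant app collects more clicks simply because it sits in the top slot. A ranker trained naively on such click logs would conclude that app_a is the better recommendation, which is exactly the mistake position bias induces.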