Do you remember the awkward moment when someone you had a good conversation with forgets your name? In this day and age we have a new standard, an expectation. When that expectation is not met, the feeling is not far off being asked “where do I know you from again?” by the lady or guy you spent the whole evening with at the pub last week. Awkward! (Well, I don’t actually go to the pub, but you get my gist.) We are in the era of personalization, and personalized content is popping up everywhere: Netflix, YouTube, Amazon, etc. Users demand personalized content, and businesses seek to meet that demand.

In recent years, many businesses have employed Machine Learning to develop effective recommender systems that help personalize the user’s experience. As with all things in life, this feat comes with its challenges. Evaluating the impact of a recommender engine is a major challenge, both while developing it and while enhancing it. Although we may be sure that a recommender system has a positive impact, we need to quantify that impact in order to communicate it effectively to stakeholders, or for when we want to enhance our system in the future.

After a long-winded introduction, I hereby present to you… Normalized Discounted Cumulative Gain (NDCG).

NDCG is a measure of ranking quality that is often used to measure the effectiveness of web search engine algorithms and related applications.

To understand the NDCG metric properly, we must first understand CG (Cumulative Gain) and DCG (Discounted Cumulative Gain), as well as the two assumptions we make when we use DCG and its related measures:

  1. Highly relevant documents are more useful when appearing earlier in the search engine results list.
  2. Highly relevant documents are more useful than marginally relevant documents, which are more useful than non-relevant documents.

(Source: Wikipedia)


Cumulative Gain (CG)

If every recommendation has a graded relevance score associated with it, CG is the sum of the graded relevance values of all results in a search result list. See Figure 1 for how we can express this mathematically.
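
To make the definition concrete, here is a minimal Python sketch of CG. It assumes the relevance scores arrive as a plain list of numbers in ranked order. The function name `cumulative_gain` and the optional cut-off `p` are my own illustrative choices, not taken from any particular library.

```python
from typing import Optional, Sequence


def cumulative_gain(relevances: Sequence[float], p: Optional[int] = None) -> float:
    """Sum the graded relevance scores of the first p results (all of them if p is None)."""
    if p is not None and p < 0:
        raise ValueError("p must be non-negative")
    top = relevances if p is None else relevances[:p]
    return float(sum(top))


# Hypothetical graded relevance scores for five recommendations, in ranked order.
scores = [3, 2, 3, 0, 1]

print(cumulative_gain(scores))       # 9.0
print(cumulative_gain(scores, p=3))  # 8.0
```

Notice that CG is a plain sum, so shuffling the same five scores into a different order gives exactly the same total. In other words, CG on its own says nothing about where the highly relevant results appear in the list, which is what the first assumption above cares about.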
