Normalized Discounted Cumulative Gain (NDCG)


Do you remember the awkward moment when someone you had a good conversation with forgets your name? In this day and age we have a new standard, a new expectation, and when that expectation is not met the feeling is not far off being asked “Where do I know you from again?” by the lady or guy you spent the whole evening with at the pub last week. Awkward! (Well, I don’t actually go to the pub, but you get my gist.) We are in the era of personalization, and personalized content is popping up everywhere: Netflix, YouTube, Amazon, and so on. Users demand personalized content, and businesses seek to meet that demand.


In recent years, many businesses have been employing Machine Learning to develop effective recommender systems that help personalize the user’s experience. As with all things in life, this feat comes with its challenges. Evaluating the impact of a recommender engine is a major challenge, both while developing it and while enhancing it. Although we may be sure that a recommender system has a positive impact, we still need to quantify that impact in order to communicate it effectively to stakeholders and to have a baseline when we want to enhance the system in the future.


After a long-winded introduction, I hereby present to you… Normalized Discounted Cumulative Gain (NDCG).


NDCG is a measure of ranking quality that is often used to measure the effectiveness of web search engine algorithms and related applications.

To understand the NDCG metric, we must first understand CG (Cumulative Gain) and DCG (Discounted Cumulative Gain), as well as the two assumptions that we make when we use DCG and its related measures:


Highly relevant documents are more useful when appearing earlier in the search engine results list.
Highly relevant documents are more useful than marginally relevant documents, which are more useful than non-relevant documents.

(Source: Wikipedia)

Cumulative Gain (CG)

If every recommendation has a graded relevance score associated with it, CG is the sum of graded relevance values of all results in a search result list — see Figure 1 for how we can express this mathematically.


Figure 1: Cumulative Gain mathematical expression: $CG_p = \sum_{i=1}^{p} rel_i$

This is the Cumulative Gain at a particular rank position p, where rel_i is the graded relevance of the result at position i. To demonstrate this in Python, let the variable setA hold the graded relevance scores of the response to a search query, so that each graded relevance score is associated with one returned document.


setA = [3, 1, 2, 3, 2, 0]
print(sum(setA))
# Output: 11

The problem with CG is that it does not take the ranking of the results into consideration when determining the usefulness of a result set. In other words, if we were to reorder the graded relevance scores in setA, we would not get any better insight into the usefulness of the result set, since the CG would be unchanged. See the code cell below for an example.


setB = sorted(setA, reverse=True)
print(f"setA: {setA}\tCG setA: {sum(setA)}\nsetB: {setB}\tCG setB: {sum(setB)}")
# Output:
# setA: [3, 1, 2, 3, 2, 0]	CG setA: 11
# setB: [3, 3, 2, 2, 1, 0]	CG setB: 11

setB is clearly a much more useful ordering than setA, because its most relevant results come first, but the CG measure says that the two are equally good.

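The order-blindness of CG can be checked exhaustively. The following is a minimal sketch, not part of the original post: the helper name cumulative_gain and its optional cut-off parameter p are assumptions introduced for illustration. It computes CG for every possible ordering of the six scores in setA and collects the distinct values.

```python
from itertools import permutations

def cumulative_gain(scores, p=None):
    """Sum the graded relevance scores of the first p results (all results if p is None)."""
    return sum(scores[:p])

set_a = [3, 1, 2, 3, 2, 0]

# Collect the CG of every possible ordering of the same six scores.
cg_values = {cumulative_gain(list(order)) for order in permutations(set_a)}
print(cg_values)  # {11}

# With a cut-off, CG changes only because different results fall inside the cut-off,
# not because of where they sit within it.
print(cumulative_gain(set_a, p=3), cumulative_gain(sorted(set_a, reverse=True), p=3))  # 6 8
```

Every one of the 720 orderings yields the same value, which is why CG over the full list cannot tell a good ranking from a bad one.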

Discounted Cumulative Gain (DCG)

To overcome this we introduce DCG. DCG penalizes highly relevant documents that appear lower in the search results by dividing each graded relevance value by the logarithm of the result’s position (see Figure 2).

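Figure 2 is not reproduced in this text. As a sketch consistent with the discountedCumulativeGain function defined in this section, which uses an exponential gain of 2^rel - 1, DCG at rank position p can be written in either of the two commonly used forms below. The second form, with the raw relevance as the gain, is the one that matches the CG definition most directly.

```latex
\mathrm{DCG}_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}
\qquad \text{or, with linear gain,} \qquad
\mathrm{DCG}_p = \sum_{i=1}^{p} \frac{rel_i}{\log_2(i + 1)}
```

In both forms the result at position 1 is not discounted at all, since log2(2) = 1, and the discount grows slowly as i increases.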

Below we have created a function called discountedCumulativeGain to calculate DCG for setA and setB. If this is an effective measurement, setB should have a higher DCG than setA since its results are more useful.


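import numpy as np

def discountedCumulativeGain(result):
    """Return the DCG of a ranked list of graded relevance scores."""
    dcg = []
    for idx, val in enumerate(result):
        # exponential gain: higher relevance grades are rewarded more strongly
        numerator = 2**val - 1
        # add 2 because Python is 0-indexed: position i = idx + 1, discount = log2(i + 1)
        denominator = np.log2(idx + 2)
        score = numerator / denominator
        dcg.append(score)
    return sum(dcg)

print(f"DCG setA: {discountedCumulativeGain(setA)}\nDCG setB: {discountedCumulativeGain(setB)}")
# Output:
# DCG setA: 13.306224081788834
# DCG setB: 14.595390756454924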

The DCG of setB is higher than that of setA, which aligns with our intuition that setB returned more useful results than setA.

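The choice of gain function affects the absolute DCG values but not the conclusion here. The sketch below is an illustration rather than the post's own code: the function name dcg and the exponential_gain flag are assumptions, and it uses only the standard library. It computes DCG with both the exponential gain used above and the plain relevance score as gain.

```python
import math

def dcg(scores, exponential_gain=True):
    """DCG of a ranked list of graded relevance scores.

    exponential_gain=True uses 2**rel - 1 as the gain;
    exponential_gain=False uses the raw relevance score.
    """
    total = 0.0
    # Positions are 1-based, so the discount at position i is log2(i + 1).
    for position, rel in enumerate(scores, start=1):
        gain = 2 ** rel - 1 if exponential_gain else rel
        total += gain / math.log2(position + 1)
    return total

set_a = [3, 1, 2, 3, 2, 0]
set_b = sorted(set_a, reverse=True)

for name, scores in [("setA", set_a), ("setB", set_b)]:
    print(name, round(dcg(scores), 4), round(dcg(scores, exponential_gain=False), 4))
```

With the exponential gain the values agree with the DCG results printed above, and under either gain the sorted list setB scores higher than setA.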

Normalized Discounted Cumulative Gain (NDCG)

An issue arises with DCG when we want to compare the search engine’s performance from one query to the next, because search result lists can vary in length depending on the query. Hence, by normalizing the cumulative gain at each position for a chosen value of p across queries, we arrive at NDCG. We do this by sorting all the relevant documents in the corpus by their relative relevance, which produces the maximum possible DCG through position p, also known as the Ideal Discounted Cumulative Gain (IDCG); see Figure 3. NDCG is then the ratio of DCG to IDCG.

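Figure 3 is not reproduced in this text. A sketch of the normalization in symbols, using the same exponential gain as the DCG function in this post, is given below. The symbol REL_p is an assumption borrowed from the usual Wikipedia notation: it denotes the list of relevant documents in the corpus, ordered by relevance, up to position p.

```latex
\mathrm{nDCG}_p = \frac{\mathrm{DCG}_p}{\mathrm{IDCG}_p},
\qquad
\mathrm{IDCG}_p = \sum_{i=1}^{|REL_p|} \frac{2^{rel_i} - 1}{\log_2(i + 1)}
```

When the returned ranking is already in the ideal order, the numerator equals the denominator and the score is 1.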

To compute this metric in Python, we create the function normalizedDiscountedCumulativeGain.


def normalizedDiscountedCumulativeGain(result, sorted_result):
    """Return DCG(result) / DCG(sorted_result), where sorted_result is the ideal ordering."""
    dcg = discountedCumulativeGain(result)
    idcg = discountedCumulativeGain(sorted_result)
    ndcg = dcg / idcg
    return ndcg

print(f"NDCG setA: {normalizedDiscountedCumulativeGain(setA, setB)}\nNDCG setB: {normalizedDiscountedCumulativeGain(setB, setB)}")
# Output:
# NDCG setA: 0.9116730277265138
# NDCG setB: 1.0

The ratios will always be in the range [0, 1], with 1 being a perfect score, meaning that the DCG is the same as the IDCG. Because the scores are normalized, the NDCG values can be averaged over all queries to obtain a measure of the average performance of a recommender system’s ranking algorithm.

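Averaging over queries can be sketched as follows. Everything here is illustrative: the function names dcg_at and ndcg_at, the three queries, and their relevance scores are hypothetical, and the ideal ranking is assumed to be the returned scores sorted in descending order, which only holds when every relevant document has been judged and returned.

```python
import math

def dcg_at(scores, p):
    """DCG over the first p results, using the 2**rel - 1 gain."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(scores[:p], start=1))

def ndcg_at(scores, p):
    """NDCG at cut-off p; the ideal ranking is the same scores sorted descending."""
    ideal = dcg_at(sorted(scores, reverse=True), p)
    # A query with no relevant results has IDCG = 0; report 0.0 rather than divide by zero.
    return dcg_at(scores, p) / ideal if ideal > 0 else 0.0

# Hypothetical relevance judgments for three queries with result lists of different lengths.
queries = {
    "query_1": [3, 1, 2, 3, 2, 0],
    "query_2": [0, 2, 1],
    "query_3": [3, 3, 2, 2],
}

p = 3
per_query = {name: ndcg_at(scores, p) for name, scores in queries.items()}
mean_ndcg = sum(per_query.values()) / len(per_query)
print(per_query)
print(f"mean NDCG@{p}: {mean_ndcg:.4f}")
```

Fixing the cut-off p makes the per-query scores comparable even though the underlying result lists have different lengths.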

Limitations of NDCG

(source: Wikipedia)

NDCG does not penalize bad documents in the results.
NDCG does not penalize missing documents in the results.
NDCG may not be suitable for measuring the performance of queries that often have several equally good results.
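The first two limitations above can be illustrated with a small sketch. The helper names dcg and ndcg are assumptions for this example, and the ideal ranking is built only from the returned scores, which is exactly the situation in which these limitations show up.

```python
import math

def dcg(scores):
    """DCG with the 2**rel - 1 gain over the whole list."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(scores, start=1))

def ndcg(scores):
    """NDCG where the ideal ranking is built only from the returned scores (an assumption)."""
    ideal = dcg(sorted(scores, reverse=True))
    return dcg(scores) / ideal if ideal > 0 else 0.0

# Bad documents: appending a non-relevant result leaves the score untouched.
print(ndcg([1, 1, 1]), ndcg([1, 1, 1, 0]))  # 1.0 1.0

# Missing documents: a shorter list of good results scores as well as a longer one.
print(ndcg([1, 1, 1]), ndcg([1, 1, 1, 1, 1]))  # 1.0 1.0
```

All four lists receive a perfect score, even though one contains a useless result and another returns fewer relevant documents than its counterpart.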

Wrap Up

The main difficulty that we face when using NDCG is that we often don’t know the ideal ordering of results when only partial relevance feedback is available. However, NDCG has proven to be an effective metric for evaluating ranking quality in various problems, for example the Personalized Web Search Challenge, the AirBnB New User Booking Challenge, and Personalize Expedia Hotel Searches (ICDM 2013), to name a few.


Thank you for reading to the end of this post. If you’d like to get in contact with me, I am most accessible on LinkedIn.
