Bias-Variance Tradeoff
The bias-variance tradeoff is one of the most important, yet most overlooked and misunderstood, topics in ML. So here we cover it as simply and briefly as possible.
Let’s start with the basics and see why it matters and how the concept is used. We want to keep this crisp, so we’ll use bullet points at times. By the end, you will know:
- Bias
- Variance
- Their relationship
- The importance of their tradeoff
- How to analyze the condition of a model and take the necessary steps
So what exactly does the bias-variance tradeoff have to do with performance?
You build a model. The model doesn’t perform well. You want to improve its performance but don’t know where to start.
A diagnosis is important because it pinpoints the areas for improvement. You need to clearly identify the components that are making the model poor.
Issue: Bad model performance
Focus area for the fix: Prediction error
Before jumping into the topic, keep this in mind:
Total Error = Bias^2 + Variance + Irreducible error
a. Total error = The prediction error that we are trying to minimize.
b. Bias = The difference between the model’s average prediction (averaged over different training sets) and the correct value. It enters the total error squared, as Bias^2.
c. Variance = The variability of the model’s prediction for a given data point, that is, how much the result for the same data point changes when the training data changes.
d. Irreducible error = The error inherent in the data itself, caused by noise in the data and other factors outside the model. It is just the way the data is, and no choice of model can reduce it.
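To make the decomposition concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and is not from the original article: the true function sin(x), the noise level `NOISE_SD`, and the two toy models (a constant "predict the mean" model and a 1-nearest-neighbour model). It retrains each model on many fresh training sets and measures Bias^2 and Variance of the prediction at a single point.

```python
import math
import random
import statistics

NOISE_SD = 0.3  # assumed noise level; NOISE_SD**2 plays the role of the irreducible error


def make_dataset(rng, n=30):
    """Draw a fresh noisy training set from the assumed true function sin(x)."""
    xs = [rng.uniform(0, math.pi) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0, NOISE_SD)) for x in xs]


def fit_mean(points):
    """Very simple model: always predict the average training label."""
    mean_y = statistics.fmean([y for _, y in points])
    return lambda x: mean_y


def fit_1nn(points):
    """Very flexible model: copy the label of the nearest training point."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def bias2_and_variance(fit, x0, rounds=500, seed=0):
    """Retrain on `rounds` different training sets and inspect the predictions at x0."""
    rng = random.Random(seed)
    preds = [fit(make_dataset(rng))(x0) for _ in range(rounds)]
    avg_pred = statistics.fmean(preds)
    bias2 = (avg_pred - math.sin(x0)) ** 2  # (average prediction - correct value)^2
    variance = statistics.pvariance(preds)  # spread of predictions across training sets
    return bias2, variance


x0 = math.pi / 2
for name, fit in [("mean model", fit_mean), ("1-NN model", fit_1nn)]:
    b2, var = bias2_and_variance(fit, x0)
    total = b2 + var + NOISE_SD ** 2
    print(f"{name}: bias^2={b2:.3f}  variance={var:.3f}  total={total:.3f}")
```

In this setup the mean model should come out with the larger Bias^2 and the 1-NN model with the larger Variance, while the irreducible term stays the same for both.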
Okay, these are the formal definitions. How do we visualize them and understand them in plain terms?
Goal: Low Bias, Low Variance
Let’s go through each of the possible combinations and understand it in practical terms with the above representation.
a. High Bias, High Variance: Worst case. Results are not close to the target (High Bias) and not even consistent in any direction (High Variance).
b. High Bias, Low Variance: Results are not close to the target (High Bias) but are consistent in one direction (Low Variance).
c. Low Bias, High Variance: Results are close to the target on average (Low Bias) but not consistent around the target (High Variance).
d. Low Bias, Low Variance: Best case. Results are close to the target (Low Bias) and consistent around the target (Low Variance).
Now the question is why it is a tradeoff. Why not simply go and get low bias and low variance? Because of the way bias and variance are related, each comes at the cost of the other: when you try to improve one, the other usually gets worse. It is like cooking. On a low flame, it takes forever. Turn the flame up, and the food starts burning. You have to find the point where both are balanced.
Ideal model: Learns the underlying patterns in the training data to just the right degree and produces a generalized model that also works on similar unseen data.
Overfitting: The model fits the training data too closely, tailoring itself to that specific data. As a result, it cannot cope with the variations that come with unseen data.
An overfitting model can be understood as a “frog in the well” that became too comfortable in its present surroundings (training data); its present understanding won’t help it survive in different surroundings (test data).
Underfitting: The model fits so loosely that it doesn’t even work on the training data. It oversimplified everything and so couldn’t learn the patterns. As a result, it cannot give correct answers.
An underfitting model is like a person who thinks he has learned a skill just by attending the intro session and picking up buzzwords, or who thinks he is a cricket player just because he knows how to hit a ball.
You can read the detailed explanation below:
https://medium.com/analytics-vidhya/understanding-how-machine-learning-is-just-like-the-human-learning-process-801a0bca3e56
The goal is to build a model that gives the right results most of the time.
Models that overfit have high variance, and models that underfit have high bias.
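The difference shows up when you compare error on the training data with error on unseen data. The following is a small sketch under assumed conditions (a sin(x) curve with noise, a constant "mean" model standing in for underfitting, and a 1-nearest-neighbour model standing in for overfitting). None of these choices come from the original article.

```python
import math
import random
import statistics


def make_dataset(rng, n=50, noise_sd=0.3):
    """Assumed toy data: noisy samples of sin(x)."""
    xs = [rng.uniform(0, math.pi) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0, noise_sd)) for x in xs]


def fit_mean(points):
    """Underfits: ignores x and always predicts the average label."""
    mean_y = statistics.fmean([y for _, y in points])
    return lambda x: mean_y


def fit_1nn(points):
    """Overfits: memorizes the training set and copies the nearest label."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(model, points):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in points])


rng = random.Random(1)
train, test = make_dataset(rng), make_dataset(rng)

errors = {}
for name, fit in [("mean", fit_mean), ("1nn", fit_1nn)]:
    model = fit(train)
    errors[name] = (mse(model, train), mse(model, test))
    print(f"{name}: train MSE={errors[name][0]:.3f}  test MSE={errors[name][1]:.3f}")
```

The memorizing model reproduces its own training labels, so its training error is essentially zero while its test error is not. The mean model is about equally wrong on both sets.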
What should I keep in mind to solve these problems in practice?
- Identify whether your model suffers from overfitting or underfitting. Compare the model’s training accuracy with its test accuracy for this.
- Once the issue is identified, take the measures below.
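The diagnosis step can be sketched as a small helper. The function name `diagnose` and the thresholds `target_acc` and `max_gap` are hypothetical values picked for illustration. Sensible cutoffs depend on your problem.

```python
def diagnose(train_acc, test_acc, target_acc=0.90, max_gap=0.05):
    """Return a list of suspected problems from train/test accuracy.

    target_acc and max_gap are illustrative thresholds, not standard values.
    """
    issues = []
    if train_acc < target_acc:
        # Poor even on the data it was trained on: too simple a model.
        issues.append("high bias (underfitting)")
    if train_acc - test_acc > max_gap:
        # Much worse on unseen data than on training data: too tailored a model.
        issues.append("high variance (overfitting)")
    return issues


print(diagnose(0.99, 0.80))  # ['high variance (overfitting)']
print(diagnose(0.70, 0.68))  # ['high bias (underfitting)']
print(diagnose(0.93, 0.91))  # []
```

A model can trigger both checks at once, which corresponds to the worst case of high bias and high variance.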
a. Problem: High Variance (solved the same way overfitting is solved)
Let’s look at each solution and how exactly it solves the issue.
Add more training data: What you have learned is too specific to the data you saw. Here’s more data to broaden your general understanding so that it is no longer data-specific.
Data augmentation: I don’t have much data. Let me modify the current data to create more variations and present them to you for a better understanding.
Reduce the complexity of the model: You have learned unnecessary stuff. These specific details are not required. Retain only what applies everywhere and let go of the rest to simplify.
Bagging (short for Bootstrap Aggregating): You give a different answer every time I change the training data a little. So let me randomly sample the data, with replacement, and give you all the samples. Train a separate predictor on each sample and collect all the different results. Then aggregate those results into one final answer, which will stay consistent.
Note: The different predictors should be as uncorrelated as possible so that they make “different errors”. (Don’t confuse the predictors with the model: we have one model made up of different predictors, each giving the result for a different sample.)
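As a rough illustration of bagging, the sketch below compares how much the prediction at one point varies across training sets for a single predictor versus a bagged team. The toy data (noisy sin(x)), the choice of 1-nearest-neighbour as the base predictor, and the parameter `n_bags` are all assumptions made for this example.

```python
import math
import random
import statistics


def make_dataset(rng, n=30, noise_sd=0.3):
    """Assumed toy data: noisy samples of sin(x)."""
    xs = [rng.uniform(0, math.pi) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0, noise_sd)) for x in xs]


def fit_1nn(points):
    """High-variance base predictor: copy the label of the nearest training point."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def fit_single(points, rng):
    """One predictor trained on the full training set (rng unused, kept for symmetry)."""
    return fit_1nn(points)


def fit_bagged_1nn(points, rng, n_bags=25):
    """Train one predictor per bootstrap sample, then average their answers."""
    models = [fit_1nn(rng.choices(points, k=len(points))) for _ in range(n_bags)]
    return lambda x: statistics.fmean([m(x) for m in models])


def prediction_variance(fit, x0, rounds=300, seed=0):
    """Variance of the prediction at x0 across many independent training sets."""
    rng = random.Random(seed)
    preds = [fit(make_dataset(rng), rng)(x0) for _ in range(rounds)]
    return statistics.pvariance(preds)


x0 = math.pi / 2
var_single = prediction_variance(fit_single, x0)
var_bagged = prediction_variance(fit_bagged_1nn, x0)
print(f"single predictor variance: {var_single:.3f}")
print(f"bagged predictor variance: {var_bagged:.3f}")
```

Here `rng.choices(points, k=len(points))` draws the bootstrap sample (same size as the data, with replacement). The bagged variance should come out clearly lower than the single-predictor variance in this setup.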
b. Problem: High Bias (solved the same way underfitting is solved)
Add features: You predicted that Person A won’t be able to repay the loan because he is old (a feature). You say this because Person B, who was also old, couldn’t repay his loan. But you also need to look at their annual income, past history, and so on (other features) before you decide.
Use a model with higher complexity: We need to replace you with someone who understands, better than you do, how the different parts of the data relate to each other and work together.
Boosting: I don’t trust you. You create several predictors and ask each of them to answer. We’ll ask each predictor about the logic it used to get its partially right answers. Whenever one gets some part right, we’ll add that logic to the rule. Each predictor will have its shortcomings, but together they will cover for one another. They’ll work as a team to finally create a well-fitting, complex rule.
Note: The weak learners in the team should be as uncorrelated as possible. Otherwise they would all get the same sections right, and the remaining sections would still be answered incorrectly.
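One way to picture this team of weak learners is a gradient-boosting-style loop for regression, which is only one of several boosting variants (AdaBoost is another). In the sketch below, each weak learner is a "stump" (a single threshold) fitted to whatever the team so far still gets wrong. The toy data, the helper names, and the `learning_rate` value are assumptions made for illustration.

```python
import math
import statistics

# Assumed toy data: 40 evenly spaced points on the noise-free curve sin(x).
XS = [i * 2 * math.pi / 39 for i in range(40)]
YS = [math.sin(x) for x in XS]


def fit_stump(xs, targets):
    """Weak learner: one threshold with a constant prediction on each side."""
    best = None
    for threshold in sorted(set(xs))[1:]:  # skip the smallest x so both sides are non-empty
        left = [t for x, t in zip(xs, targets) if x < threshold]
        right = [t for x, t in zip(xs, targets) if x >= threshold]
        left_mean, right_mean = statistics.fmean(left), statistics.fmean(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, split, low, high = best
    return lambda x: low if x < split else high


def boost(xs, ys, n_rounds, learning_rate=0.5):
    """Add stumps one at a time, each fitted to the current residuals."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # what the team still gets wrong
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


for rounds in (1, 5, 50):
    print(rounds, "stumps -> training MSE", round(mse(boost(XS, YS, rounds), XS, YS), 4))
```

A single stump is far too simple for a sine curve (high bias). As more stumps join the team, the training error keeps shrinking, which is the bias reduction described above.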
Hope this helped you understand the topic and gave you what you need to put the concept to use.
Let us know your feedback. Thanks for reading!
Sources:
Fig 1, Fig 2: http://scott.fortmann-roe.com/docs/BiasVariance.html
Translated from: https://medium.com/analytics-vidhya/bias-variance-tradeoff-a-quick-introduction-a4b55e56fa24