Bias-Variance Tradeoff: A Quick Introduction

The bias-variance tradeoff is one of the most important yet most overlooked and misunderstood topics in machine learning. Here we cover it as simply and briefly as possible.

Let’s start with the basics: why the tradeoff matters and how to use it. To keep things crisp, we’ll use bullet points at times. By the end, you will know:

  • Bias
  • Variance
  • Their relationship
  • The importance of their tradeoff
  • How to analyze the model’s condition and take the necessary steps

So what exactly does the bias-variance tradeoff have to do with performance?

You build a model. It doesn’t perform well. You want to improve its performance but don’t know where to start.

A diagnosis is important because it pinpoints the areas for improvement. You need to clearly identify the components that are making the model perform poorly.

Issue: Bad model performance

Focus area for the fix: Prediction error

Before jumping into the topic, keep this in mind:

Total Error = Bias^2 + Variance + Irreducible error

a. Total error = The prediction error that we are trying to minimize.

b. Bias error = The difference between the model’s average prediction (averaged over different training sets) and the correct value.

c. Variance = The variability of the model’s prediction for a given data point (how much the result for the same data point changes when the training data changes).

d. Irreducible error = The inherent error in the data, caused by the noise in its distribution and other factors the model cannot capture. It is simply the way the data is, and essentially nothing can be done about it.
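
To make these definitions concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the true function `true_f`, the noise level, the query point 0.9, and the two toy models (a constant predictor and a 1-nearest-neighbor predictor). It retrains each model on many fresh training sets and measures how far the average prediction is from the truth (bias squared) and how much the predictions vary (variance).

```python
import random
import statistics


def true_f(x):
    # Hypothetical "correct" relationship; in real problems this is unknown.
    return x * x


def make_training_set(rng, n=30, noise_sd=0.3):
    # Draw a fresh noisy training set from the same distribution.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def predict_constant(xs, ys, x0):
    # Very simple model: ignore x0 and always predict the training mean.
    return statistics.fmean(ys)


def predict_1nn(xs, ys, x0):
    # Very flexible model: copy the target of the closest training point.
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_variance(predict, x0, trials=500, seed=0):
    # Retrain on many training sets and collect the prediction at x0.
    rng = random.Random(seed)
    preds = [predict(*make_training_set(rng), x0) for _ in range(trials)]
    bias_sq = (statistics.fmean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    return bias_sq, variance


for name, model in [("constant", predict_constant), ("1-NN", predict_1nn)]:
    bias_sq, variance = bias_variance(model, x0=0.9)
    print(f"{name:8s} bias^2 = {bias_sq:.4f}  variance = {variance:.4f}")
```

Under these assumptions the constant predictor shows the larger bias and the 1-nearest-neighbor predictor the larger variance. The irreducible error does not appear in either number; it is the noise variance (`noise_sd ** 2`) that would be added if predictions were compared against noisy observations instead of `true_f`.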

Okay, those are the formal definitions. But how can we visualize them and understand them in plain terms?

Goal: Low Bias, Low Variance

Fig 1: Bias-Variance representation

Let’s go through each possible combination and understand it in practical terms using the representation above.

a. High Bias, High Variance: Worst case. Results are not close to the target (high bias) and not consistent in any direction (high variance).

b. High Bias, Low Variance: Results are not close to the target (high bias) but are consistent in one direction (low variance).

c. Low Bias, High Variance: Results are close to the target on average (low bias) but scattered around it (high variance).

d. Low Bias, Low Variance: Best case. Results are close to the target (low bias) and consistent around it (low variance).

Now the question is why this is a tradeoff. Why not simply aim for low bias and low variance? Because of the way bias and variance are related, each comes at the cost of the other: when you try to improve one, the other tends to get worse. It’s like cooking: on a low flame, it takes forever; turn the flame up, and the food starts burning. You have to find the point where both are balanced.
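
The tradeoff can be seen in a small experiment. The sketch below is illustrative only: the k-nearest-neighbor regressor, the function `true_f`, and all parameter values are assumptions chosen for the demonstration. A small `k` gives a flexible model, a large `k` a rigid one, and the script reports bias squared and variance at a single query point for each setting.

```python
import random
import statistics


def true_f(x):
    # Hypothetical ground truth used only for this demonstration.
    return x * x


def knn_predict(xs, ys, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return statistics.fmean(ys[i] for i in nearest)


def sweep(ks=(1, 5, 30), x0=0.9, n=30, noise_sd=0.3, trials=400, seed=1):
    # For each k, collect predictions at x0 across many training sets.
    rng = random.Random(seed)
    preds = {k: [] for k in ks}
    for _ in range(trials):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
        for k in ks:
            preds[k].append(knn_predict(xs, ys, x0, k))
    results = {}
    for k, p in preds.items():
        bias_sq = (statistics.fmean(p) - true_f(x0)) ** 2
        results[k] = (bias_sq, statistics.pvariance(p))
    return results


for k, (bias_sq, variance) in sweep().items():
    print(f"k = {k:2d}  bias^2 = {bias_sq:.4f}  variance = {variance:.4f}")
```

With these settings, moving from `k = 1` to `k = 30` (which averages every training point) lowers the variance and raises the bias; the balanced point lies somewhere in between.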

Fig. 2

Ideal model: Learns the underlying patterns in the training data just well enough and produces a generalized model that also works on similar unseen data.

Overfitting: The model fits the training data too closely, producing something tailored specifically to that data. As a result, it cannot cope with the variations that come with unseen data.

An overfitting model can be understood as a “frog in the well”: it has become too comfortable in its present scenario (the training data), and its present understanding won’t help it survive in different surroundings (the test data).

Underfitting: The model fits so loosely that it doesn’t even work on the training data. It oversimplified everything and so couldn’t learn the patterns. As a result, it cannot give correct answers.

An underfitting model is like a person who thinks they have learned a skill just by attending the intro session and picking up buzzwords, or who thinks they have become a cricket player just because they know how to hit a ball.
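
A quick way to see both failure modes is to compare the error on the training data with the error on unseen data. The following is a rough sketch, assuming a toy dataset and a k-nearest-neighbor regressor (neither comes from the article): `k = 1` memorizes the training set, while `k` equal to the training-set size predicts the same average everywhere.

```python
import random
import statistics


def true_f(x):
    # Hypothetical ground truth for the toy dataset.
    return x * x


def sample(rng, n, noise_sd=0.3):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def knn_predict(train_xs, train_ys, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x0))[:k]
    return statistics.fmean(train_ys[i] for i in nearest)


def mse(train, data, k):
    # Mean squared error of a k-NN model fit on `train`, evaluated on `data`.
    xs, ys = data
    return statistics.fmean(
        (knn_predict(*train, x, k) - y) ** 2 for x, y in zip(xs, ys)
    )


rng = random.Random(42)
train = sample(rng, 40)
test = sample(rng, 400)

for label, k in [("underfit, k = 40", 40), ("overfit,  k = 1", 1), ("balanced, k = 7", 7)]:
    print(f"{label:18s} train MSE = {mse(train, train, k):.3f}  "
          f"test MSE = {mse(train, test, k):.3f}")
```

The memorizing model has zero training error but a clearly worse test error, while the averaging model is poor on both sets; the intermediate `k` is an arbitrary choice that happens to sit between the two extremes.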

You can read a detailed explanation here:

https://medium.com/analytics-vidhya/understanding-how-machine-learning-is-just-like-the-human-learning-process-801a0bca3e56

The goal is to build a model that gives:

  • Right results most of the time.

Models that overfit have high variance, and models that underfit have high bias.

What should I keep in mind to solve these problems in practice?

  • Identify whether your model suffers from overfitting or underfitting. Use the model’s train and test accuracy for this: low accuracy on both suggests underfitting, while high training accuracy with much lower test accuracy suggests overfitting.

  • Once the issue is identified, take the following measures.
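
Before applying any fix, the identification step can be sketched as a simple rule of thumb. The function name `diagnose` and the thresholds `target_acc` and `max_gap` are hypothetical and should be tuned per problem; the sketch only encodes the usual reading that low training accuracy points to high bias and a large train-test gap points to high variance.

```python
def diagnose(train_acc, test_acc, target_acc=0.90, max_gap=0.05):
    """Rule-of-thumb diagnosis from train/test accuracy (thresholds are assumptions)."""
    issues = []
    if train_acc < target_acc:
        # The model cannot even fit the data it was trained on.
        issues.append("high bias (underfitting)")
    if train_acc - test_acc > max_gap:
        # The model does much worse on unseen data than on training data.
        issues.append("high variance (overfitting)")
    return issues or ["no obvious bias/variance problem"]


print(diagnose(train_acc=0.70, test_acc=0.68))
print(diagnose(train_acc=0.99, test_acc=0.80))
```

A model can trigger both checks at once, which corresponds to the worst case described earlier.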

a. Problem: High Variance (solved the same way overfitting is solved)

Let’s look at each solution and how exactly it addresses the issue.

  • Add more training data: What you have learned is too specific to the data you saw. Here is more data to broaden your general understanding so that it is no longer data-specific.

  • Data augmentation: I don’t have much data. Let me modify the current data to create more variations and present them to you for a better understanding.

  • Reduce the complexity of the model: You have learned unnecessary details. These specifics are not required. Retain only what applies everywhere and let go of the rest to simplify.

  • Bagging (short for Bootstrap Aggregating): You give a different answer every time I change the training data a little. Let me randomly sample the data and give you all the samples. You create a predictor for each sample, train on it, and collect all the different results. Then combine everything learned by aggregating the results and give me one final answer that stays consistent.

Note: The different predictors need to be as uncorrelated as possible so that they make “different errors”. (Don’t confuse predictors with the model: we have one model made up of different predictors, each giving results for a different sample.)
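
Here is a minimal bagging sketch, assuming a 1-nearest-neighbor base predictor and a synthetic dataset (both are illustrative choices, not from the article). Each predictor is trained on a bootstrap sample, drawn with replacement, and the final answer is the average of their predictions.

```python
import random
import statistics


def true_f(x):
    # Hypothetical ground truth for the synthetic data.
    return x * x


def predict_1nn(xs, ys, x0):
    # High-variance base predictor: copy the closest training target.
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def predict_bagged(xs, ys, x0, rng, n_predictors=25):
    # Train one 1-NN predictor per bootstrap sample and average the answers.
    n = len(xs)
    votes = []
    for _ in range(n_predictors):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        votes.append(predict_1nn([xs[i] for i in idx], [ys[i] for i in idx], x0))
    return statistics.fmean(votes)


def prediction_variance(use_bagging, x0=0.9, n=30, noise_sd=0.3, trials=300, seed=7):
    # Variance of the prediction at x0 across many different training sets.
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
        if use_bagging:
            preds.append(predict_bagged(xs, ys, x0, rng))
        else:
            preds.append(predict_1nn(xs, ys, x0))
    return statistics.pvariance(preds)


print("single 1-NN variance:", round(prediction_variance(False), 4))
print("bagged 1-NN variance:", round(prediction_variance(True), 4))
```

The random resampling is what keeps the individual predictors from being perfectly correlated, which is the condition the note above asks for; averaging identical predictors would not reduce the variance at all.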

b. Problem: High Bias (solved the same way underfitting is solved)

  • Add features: You predicted that Person A won’t be able to repay the loan because he is old (a feature). You say this because an old Person B couldn’t repay. But you also need to look at their annual income, repayment history, etc. (other features) before deciding.

  • Use a model with higher complexity: We need to replace you with someone who understands better than you how the different parts of the data relate and work together.

  • Boosting: I don’t trust you alone. Create several predictors and ask each to answer. We’ll ask each predictor about the logic it used to get its partially right answers. Whenever one gets some part right, we add that logic to the rule. Each will have its shortcomings, but together they cover for one another. They work as a team to finally build a well-fitting complex rule.

Note: The team of weak learners should be as uncorrelated as possible; otherwise they would all get the same sections right, and some sections would still be answered incorrectly.
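
As a rough illustration of the boosting idea, the sketch below implements least-squares gradient boosting with decision stumps. This is one common boosting variant, chosen here as an assumption since the article does not name a specific algorithm; the dataset, `learning_rate`, and helper names are likewise hypothetical. Each stump on its own underfits, and each new stump is fit to the errors left by the team so far.

```python
import math
import random
import statistics


def fit_stump(xs, targets):
    """Return (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_value, right_value = statistics.fmean(left), statistics.fmean(right)
        sse = (sum((t - left_value) ** 2 for t in left)
               + sum((t - right_value) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, rounds, learning_rate=0.5):
    # Start from the mean, then repeatedly fit a stump to the remaining errors.
    base = statistics.fmean(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def train_mse(model, xs, ys):
    return statistics.fmean((predict(model, x) - y) ** 2 for x, y in zip(xs, ys))


rng = random.Random(3)
xs = [i / 20 for i in range(-20, 21)]
ys = [math.sin(3 * x) + rng.gauss(0.0, 0.1) for x in xs]

for rounds in (1, 10, 50):
    print(f"{rounds:2d} stumps: train MSE = {train_mse(boost(xs, ys, rounds), xs, ys):.4f}")
```

The training error shrinks as stumps are added, which is the bias reduction described above. Adding too many rounds can eventually trade that bias for variance, so in practice the number of rounds is usually chosen on held-out data.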

We hope this helped you understand the topic and gave you what you need to put the concept to use.

Let us know your feedback. Thanks for reading!

Sources:

Fig 1, Fig 2: http://scott.fortmann-roe.com/docs/BiasVariance.html

Translated from: https://medium.com/analytics-vidhya/bias-variance-tradeoff-a-quick-introduction-a4b55e56fa24
