fastai study notes: 04_mnist_basics Questionnaire

1.How is a grayscale image represented on a computer? How about a color image?
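Grayscale image: a single channel, with each pixel value in the range 0-255.
Color image: three channels (RGB or HSV), each with values in the range 0-255.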
2.How are the files and folders in the MNIST_SAMPLE dataset structured? Why?
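The dataset is split into a training set (train) and a validation set (valid). Each contains a folder named 3 and a folder named 7, which hold the images of that handwritten digit, so the folder name serves as the label.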
3.Explain how the “pixel similarity” approach to classifying digits works.
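Average the pixel values over all the images of each digit to define an "ideal" 3 and an "ideal" 7. An image that is closer to the ideal 3 is classified as a 3, and one that is closer to the ideal 7 is classified as a 7.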
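The approach can be sketched in pure Python. This is only an illustration under stated assumptions: the 2×2 "images" are made up and stored as flat lists (the chapter works with 28×28 MNIST tensors), and the names `mean_image`, `l1_distance`, and `classify` are hypothetical helpers, not fastai functions.

```python
def mean_image(images):
    """Average a list of equally sized flat images pixel by pixel."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]

def l1_distance(a, b):
    """Mean absolute difference between two flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def classify(image, ideal_3, ideal_7):
    """Return 3 or 7, whichever ideal image is closer."""
    return 3 if l1_distance(image, ideal_3) < l1_distance(image, ideal_7) else 7

# Made-up training data: "3"s are bright in pixels 0 and 2, "7"s in pixels 1 and 3.
threes = [[0.9, 0.1, 0.8, 0.0], [1.0, 0.0, 0.9, 0.2]]
sevens = [[0.1, 0.9, 0.0, 1.0], [0.0, 0.8, 0.2, 0.9]]
ideal_3, ideal_7 = mean_image(threes), mean_image(sevens)

prediction = classify([0.8, 0.2, 0.7, 0.1], ideal_3, ideal_7)
```

Nothing is trained here: the "model" is just the two average images plus a distance function.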
4.What is a list comprehension? Create one now that selects odd numbers from a list and doubles them.
List Comprehensions:A list comprehension looks like this: new_list = [f(o) for o in a_list if o>0]. This will return every element of a_list that is greater than 0, after passing it to the function f. There are three parts here: the collection you are iterating over (a_list), an optional filter (if o>0), and something to do to each element (f(o)).
list1 = range(20)
list2 = [2*n for n in list1 if n%2 == 1]
5.What is a “rank-3 tensor”?
A rank-3 tensor is a tensor with three axes (dimensions). In this chapter, one is created by stacking all the images of a digit, each a rank-2 tensor, into a single three-dimensional tensor, so that the average intensity of each pixel position can be computed over all the images.
6.What is the difference between tensor rank and shape? How do you get the rank from the shape?
The length of a tensor’s shape is its rank.
Rank is the number of axes or dimensions in a tensor; shape is the size of each axis of a tensor.
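The relationship between shape and rank can be illustrated with nested Python lists. This is a sketch only: `shape_of` is a hypothetical helper that assumes a regular, non-empty nesting; with a PyTorch tensor you would read the shape from the tensor itself.

```python
def shape_of(nested):
    """Return the shape of a regularly nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

# Two "images", each with 3 rows and 4 columns.
stack = [[[0] * 4 for _ in range(3)] for _ in range(2)]
shape = shape_of(stack)   # size of each axis
rank = len(shape)         # number of axes
```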
7.What are RMSE and L1 norm?
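RMSE: root mean squared error, the square root of the mean of the squared differences (also called the L2 norm).
L1 norm: the mean of the absolute values of the differences (mean absolute difference).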
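A small pure-Python sketch of both measures, using made-up predictions and targets; the function names `l1_norm` and `rmse` are chosen here for illustration.

```python
import math

def l1_norm(a, b):
    """Mean absolute difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def rmse(a, b):
    """Root mean squared error."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

preds = [1.0, 2.0, 3.0, 4.0]
targets = [1.0, 2.0, 3.0, 8.0]   # only the last item is wrong, by 4

l1 = l1_norm(preds, targets)
l2 = rmse(preds, targets)
```

Because the differences are squared before averaging, RMSE penalizes the single large error more heavily than the L1 norm does.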
8.How can you apply a calculation on thousands of numbers at once, many thousands of times faster than a Python loop?
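Do the calculation with tensors (PyTorch tensors or NumPy arrays), so that the work runs in optimized compiled code or on the GPU instead of in a Python loop.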
9.Create a 3×3 tensor or array containing the numbers from 1 to 9. Double it. Select the bottom-right four numbers.
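from fastai.vision.all import *        # provides tensor()
data = tensor(range(1,10)).view(3,3)   # 3x3 tensor holding 1..9
data = data * 2                        # double every element
data[1:,1:]                            # bottom-right four numbers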
10.What is broadcasting?
When an operation is applied to two tensors of different rank, broadcasting automatically expands the tensor with the smaller rank to have the same size as the one with the larger rank. Broadcasting is an important capability that makes tensor code much easier to write.
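The effect of broadcasting can be imitated in pure Python. This sketch only reproduces the result for one case, a rank-1 row added to a rank-2 matrix; `add_row_to_each` is a hypothetical helper, and PyTorch achieves the same thing without a Python loop and without actually copying the smaller tensor.

```python
def add_row_to_each(matrix, row):
    """Add `row` to every row of `matrix`, as broadcasting a rank-1
    tensor against a rank-2 tensor would."""
    return [[m + r for m, r in zip(matrix_row, row)] for matrix_row in matrix]

matrix = [[1, 2, 3],
          [4, 5, 6]]
row = [10, 20, 30]

result = add_row_to_each(matrix, row)
```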
11.Are metrics generally calculated using the training set, or the validation set? Why?
validation set
As we’ve discussed, we want to calculate our metric over a validation set. This is so that we don’t inadvertently overfit—that is, train a model to work well only on our training data. This is not really a risk with the pixel similarity model we’re using here as a first try, since it has no trained components, but we’ll use a validation set anyway to follow normal practices and to be ready for our second try later.
12.What is SGD?
SGD stands for stochastic gradient descent, the mechanism for learning by updating weights automatically: the gradient of the loss is computed on a mini-batch of data, and the weights are stepped in the direction that lowers the loss.
13.Why does SGD use mini-batches?
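Computing the gradient over the whole dataset at every step is too expensive, while computing it on a single item gives an imprecise and unstable gradient. Computing it on a batch of items at a time balances the two, and mini-batches are also computed more efficiently on a GPU.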
14.What are the seven steps in SGD for machine learning?
(1)Initialize the weights.
(2)For each image, use these weights to predict whether it appears to be a 3 or a 7.
(3)Based on these predictions, calculate how good the model is (its loss).
(4)Calculate the gradient, which measures for each weight, how changing that weight would change the loss.
(5)Step (that is, change) all the weights based on that calculation.
(6)Go back to step 2, and repeat the process.
(7)Iterate until you decide to stop the training process (for instance, because the model is good enough or you don’t want to wait any longer).
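The seven steps above can be sketched on a toy problem. The assumptions are mine, not the chapter's: a made-up linear target y = 2x + 1 instead of images, a mean squared error loss, hand-derived gradients instead of PyTorch autograd, and the whole tiny dataset used at every step rather than mini-batches (so this is plain gradient descent, not the stochastic variant).

```python
import random

random.seed(0)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]              # made-up target: y = 2x + 1

w, b = random.random(), random.random()   # (1) initialize the weights
lr = 0.05                                 # learning rate

for step in range(2000):                  # (6) go back and repeat
    preds = [w * x + b for x in xs]       # (2) predict
    errors = [p - y for p, y in zip(preds, ys)]
    loss = sum(e * e for e in errors) / len(xs)                    # (3) loss
    grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)  # (4) gradient
    grad_b = sum(2 * e for e in errors) / len(xs)
    w -= lr * grad_w                      # (5) step the weights
    b -= lr * grad_b
    if loss < 1e-10:                      # (7) stop when good enough
        break
```

After the loop, `w` and `b` should be close to 2 and 1 respectively.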
15.How do we initialize the weights in a model?
Set the initial parameters to random values.
16.What is “loss”?
The loss is a function that measures how well the model performs; the smaller its value, the better the model.
17.Why can’t we always use a high learning rate?
A learning rate that is too high can make the loss oscillate, or even diverge, instead of decreasing steadily.
18.What is a “gradient”?
The gradients tell us how much we have to change each weight to make our model better. It is essentially a measure of how the loss function changes with changes of the weights of the model (the derivative).
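To make the idea of a derivative concrete, here is a sketch that approximates a gradient numerically. The loss function is made up, and `numerical_gradient` is a hypothetical helper using a central difference; it is not how PyTorch computes gradients.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Approximate df/dx at x with a central difference."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def loss(w):
    """Made-up loss with a single weight, minimized at w = 3."""
    return (w - 3) ** 2

grad = numerical_gradient(loss, 5.0)   # exact derivative is 2 * (5 - 3) = 4
```

The gradient is positive at w = 5, meaning that increasing w would increase the loss, so a descent step moves w in the opposite direction.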
19.Do you need to know how to calculate gradients yourself?
No. PyTorch can calculate them automatically.
20.Why can’t we use accuracy as a loss function?
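Accuracy only changes when a prediction changes from one class to the other. A small change to the model, such as a change in the confidence of a prediction, does not change the predicted class, so the gradient is 0 and the model cannot be optimized any further.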
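A sketch of this effect with a made-up one-weight model (prediction = w * x, thresholded at 0.5). The data, the functions `accuracy` and `mse_loss`, and the choice of mean squared error as the smooth loss are all assumptions for illustration.

```python
def accuracy(w, xs, ys):
    """Fraction of items where thresholding w*x at 0.5 matches the label."""
    return sum((w * x > 0.5) == bool(y) for x, y in zip(xs, ys)) / len(xs)

def mse_loss(w, xs, ys):
    """Mean squared error between w*x and the label."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [0.2, 0.4, 0.6, 0.8]
ys = [0, 0, 1, 1]

# Nudge the weight slightly and compare both measures.
acc_a, acc_b = accuracy(1.00, xs, ys), accuracy(1.01, xs, ys)
loss_a, loss_b = mse_loss(1.00, xs, ys), mse_loss(1.01, xs, ys)
```

The small change in the weight leaves the accuracy unchanged but does change the loss, which is why the loss can guide the weight update and the accuracy cannot.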

21.Draw the sigmoid function. What is special about its shape?
[Figure: plot of the sigmoid function] It is a smooth S-shaped curve whose output always lies between 0 and 1.
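The sigmoid function is 1 / (1 + e^(-x)). A minimal sketch that evaluates it at a few points instead of plotting it; this simple form is fine for moderate inputs but `math.exp` would overflow for very large negative x.

```python
import math

def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1 / (1 + math.exp(-x))

values = [sigmoid(x) for x in (-4, -2, 0, 2, 4)]
```

The values increase monotonically and pass through 0.5 at x = 0.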
22.What is the difference between a loss function and a metric?
The loss function is used to train the model; the metric is used to measure the model's performance.
23.What is the function to calculate new weights using a learning rate?
optimizer step function
24.What does the DataLoader class do?
The DataLoader class can take any Python collection and turn it into an iterator over many batches.
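The batching behavior can be imitated with a small generator. This is not the fastai DataLoader, only a sketch of the idea; `batches` and its parameters are hypothetical names.

```python
import random

def batches(collection, batch_size, shuffle=False, seed=None):
    """Yield successive lists of up to `batch_size` items."""
    items = list(collection)
    if shuffle:
        random.Random(seed).shuffle(items)   # optional reordering each pass
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

result = list(batches(range(15), batch_size=5))
```

With shuffling turned on, each batch would contain a different random subset of the items.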
25.Write pseudocode showing the basic steps taken in each epoch for SGD.
for x, y in dl:
    pred = model(x)
    loss = loss_func(pred, y)
    loss.backward()
    parameter -= parameter.grad * lr
26.Create a function that, if passed two arguments [1,2,3,4] and ‘abcd’, returns [(1, ‘a’), (2, ‘b’), (3, ‘c’), (4, ‘d’)]. What is special about that output data structure?
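def func(x, y): return list(zip(x, y))
The output is a list of tuples, each pairing one item with its counterpart. This is the same structure used for datasets in machine learning, where each tuple holds an input and its label.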
27.What does view do in PyTorch?
It changes the shape of a tensor without changing its contents (a reshape).
28.What are the “bias” parameters in a neural network? Why do we need them?
Without a bias, the output would always be 0 whenever the input is 0. The bias also makes the model more flexible.
29.What does the @ operator do in Python?
Matrix multiplication.
30.What does the backward method do?
It calculates the gradients.
31.Why do we have to zero the gradients?
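PyTorch adds the gradients of a variable to any gradients stored previously. If the training loop function is called several times without zeroing the gradients, the gradient of the current loss is added to the previously stored gradient value.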
32.What information do we have to pass to Learner?
The DataLoaders, the model, the optimization function, the loss function, and optionally the metrics to print.
33.Show Python or pseudocode for the basic steps of a training loop.
def train_epoch(model, lr, params):
    for xb, yb in dl:
        calc_grad(xb, yb, model)
        for p in params:
            p.data -= p.grad * lr
            p.grad.zero_()

for i in range(20):
    train_epoch(model, lr, params)
34.What is “ReLU”? Draw a plot of it for values from -2 to +2.
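ReLU (rectified linear unit): negative values are replaced with 0, and all other values are left unchanged.
[Figure: plot of ReLU for values from -2 to +2]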
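In place of the plot, a minimal sketch that evaluates ReLU on a few points from -2 to +2. This is a plain-Python stand-in, not the PyTorch implementation.

```python
def relu(x):
    """Replace negative values with 0 and leave the rest unchanged."""
    return max(0.0, x)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [relu(x) for x in xs]
```

The graph is flat at 0 for negative inputs and follows the line y = x for positive inputs.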
35.What is an “activation function”?
An activation function gives a neural network model its nonlinearity.
36.What’s the difference between F.relu and nn.ReLU?
F.relu is a Python function, whereas nn.ReLU is a PyTorch class (a module).
37.The universal approximation theorem shows that any function can be approximated as closely as needed using just one nonlinearity. So why do we normally use more?
There are practical performance benefits to using more than one nonlinearity. A deeper model can reach better performance with fewer parameters, faster training, and lower compute and memory requirements.
