fastai Study Notes: 02_production Questionnaire

1.Where do text models currently have a major deficiency?
Deep learning is currently not good at generating correct responses. We don’t currently have a reliable way to, for instance, combine a knowledge base of medical information with a deep learning model for generating medically correct natural language responses.
2.What are possible negative societal implications of text generation models?
It is so easy to create content that appears to a layman to be compelling, but actually is entirely incorrect.
Another concern is that context-appropriate, highly compelling responses on social media could be used at massive scale—thousands of times greater than any troll farm previously seen—to spread disinformation, create unrest, and encourage conflict.
3.In situations where a model might make mistakes, and those mistakes could be harmful, what is a good alternative to automating a process?
Human review: have people check the model's predictions before anything acts on them.
4.What kind of tabular data is deep learning particularly good at?
Natural language (book titles, reviews, etc.) and high-cardinality categorical columns.
5.What’s a key downside of directly using a deep learning model for recommendation systems?
Nearly all machine learning approaches have the downside that they only tell you what products a particular user might like, rather than what recommendations would be helpful for a user. Many kinds of recommendations for products a user might like may not be at all helpful—for instance, if the user is already familiar with the products, or if they are simply different packagings of products they have already purchased (such as a boxed set of novels, when they already have each of the items in that set).
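6.What are the steps of the Drivetrain Approach?
Define a clear objective, identify the levers you can pull to influence it, work out what data you can collect, and then build the models that determine the best actions for reaching the objective.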
7.How do the steps of the Drivetrain Approach map to a recommendation system?
The objective of a recommendation engine is to drive additional sales by surprising and delighting the customer with recommendations of items they would not have purchased without the recommendation. The lever is the ranking of the recommendations. New data must be collected to generate recommendations that will cause new sales. This will require conducting many randomized experiments in order to collect data about a wide range of recommendations for a wide range of customers. This is a step that few organizations take; but without it, you don’t have the information you need to actually optimize recommendations based on your true objective (more sales!).
8.Create an image recognition model using data you curate, and deploy it on the web.
I downloaded the Kaggle Dogs vs. Cats dataset and tried it out.
9.What is DataLoaders?
A fastai class that stores multiple DataLoader objects you pass to it, normally a train and a valid, although it’s possible to have as many as you like. The first two are made available as properties.
10.What four things do we need to tell fastai to create DataLoaders?
What kinds of data we are working with
How to get the list of items
How to label these items
How to create the validation set
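As a minimal sketch, the four pieces map onto a DataBlock roughly as follows. The helper name build_dataloaders, the folder layout (one sub-folder per class), and the specific values (20% validation, seed 42, 128-pixel images, batch size 64) are assumptions for illustration. fastai is imported inside the function so the snippet can be loaded even where fastai is not installed.

```python
from pathlib import Path


def build_dataloaders(image_root: Path, batch_size: int = 64):
    """Build DataLoaders from a folder with one sub-folder per class.

    The folder layout and this helper's name are assumptions for illustration.
    """
    # Imported lazily so the sketch can be loaded without fastai installed.
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize,
        get_image_files, parent_label,
    )

    block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),               # 1. kinds of data: image in, category out
        get_items=get_image_files,                        # 2. how to get the list of items
        get_y=parent_label,                               # 3. how to label: parent folder name
        splitter=RandomSplitter(valid_pct=0.2, seed=42),  # 4. how to create the validation set
        item_tfms=Resize(128),                            # make every image the same size
    )
    return block.dataloaders(image_root, bs=batch_size)
```

Calling build_dataloaders(Path("bears")) would then return an object whose train and valid properties hold the two DataLoader objects.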
11.What does the splitter parameter to DataBlock do?
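It specifies how to split the dataset into a training set and a validation set. For example, a random splitter takes a valid_pct argument giving the fraction of items to hold out.
12.How do we ensure a random split always gives the same validation set?
Set the random seed (the seed argument), so that the split is done the same way every time.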
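To see why a fixed seed gives a repeatable validation set, here is a small stand-alone sketch of a seeded random split. It uses Python's random module rather than fastai's own code, and the function name random_split is hypothetical; only the valid_pct and seed ideas come from the notes above.

```python
import random


def random_split(n_items, valid_pct=0.2, seed=None):
    """Return (train_indices, valid_indices) for n_items items."""
    rng = random.Random(seed)          # same seed -> same shuffle
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    cut = int(valid_pct * n_items)     # number of validation items
    return idxs[cut:], idxs[:cut]


train_a, valid_a = random_split(10, valid_pct=0.2, seed=42)
train_b, valid_b = random_split(10, valid_pct=0.2, seed=42)
print(valid_a == valid_b)  # True: identical seed, identical validation set
```

Without a seed, each call shuffles differently, so the model would be validated on different items from run to run and results would not be comparable.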
13.What letters are often used to signify the independent and dependent variables?
x independent
y dependent
14.What’s the difference between the crop, pad, and squish resize approaches? When might you choose one over the others?
crop is the default Resize() method, and it crops the images to fit a square shape of the size requested, using the full width or height. This can result in losing some important details. For instance, if we were trying to recognize the breed of dog or cat, we may end up cropping out a key part of the body or the face necessary to distinguish between similar breeds.
pad is an alternative Resize() method, which pads the matrix of the image’s pixels with zeros (which shows as black when viewing the images). If we pad the images then we have a whole lot of empty space, which is just wasted computation for our model, and results in a lower effective resolution for the part of the image we actually use.
squish is another alternative Resize() method, which can either squish or stretch the image. This can cause the image to take on an unrealistic shape, leading to a model that learns that things look different to how they actually are, which we would expect to result in lower accuracy.
Which resizing method to use therefore depends on the underlying problem and dataset. For example, if the features in the dataset images take up the whole image and cropping may result in loss of information, squishing or padding may be more useful.
A better alternative is RandomResizedCrop, which crops a randomly selected region of the image. In each epoch the model then sees a slightly different part of every image and learns accordingly.
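The trade-offs between crop, pad, and squish can be made concrete with a little arithmetic. The sketch below is a simplified illustration, not fastai's Resize implementation; the function name resize_plan and the example image size are assumptions.

```python
def resize_plan(width, height, size, method):
    """Describe how a width x height image is fitted into a size x size square.

    Simplified arithmetic for illustration; this is not fastai's Resize code.
    """
    if method == "squish":
        # Each axis is scaled independently, so the aspect ratio changes.
        return {"scaled": (size, size), "cropped": (0, 0), "padded": (0, 0)}
    if method == "crop":
        scale = size / min(width, height)   # shorter side fills the square
    elif method == "pad":
        scale = size / max(width, height)   # longer side fills the square
    else:
        raise ValueError(f"unknown method: {method!r}")
    scaled_w, scaled_h = round(width * scale), round(height * scale)
    return {
        "scaled": (scaled_w, scaled_h),
        # Pixels cut away along each axis (crop only).
        "cropped": (max(scaled_w - size, 0), max(scaled_h - size, 0)),
        # Empty pixels added along each axis (pad only).
        "padded": (max(size - scaled_w, 0), max(size - scaled_h, 0)),
    }


for method in ("crop", "pad", "squish"):
    print(method, resize_plan(400, 200, 128, method))
```

For a 400 x 200 image and a 128-pixel target, crop discards half of the scaled width, pad leaves half of the square empty, and squish keeps everything but distorts the shape.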
15.What is data augmentation? Why is it needed?
Data augmentation refers to creating random variations of our input data, such that they appear different but do not actually change the meaning of the data. It is needed because it lets the model see more varied examples than the dataset actually contains, which helps it generalize. Examples of common data augmentation techniques for images are rotation, flipping, perspective warping, brightness changes and contrast changes. For natural photo images such as the ones we are using here, a standard set of augmentations that we have found works pretty well is provided by the aug_transforms function. Because our images are now all the same size, we can apply these augmentations to an entire batch of them using the GPU, which will save a lot of time.
16.What is the difference between item_tfms and batch_tfms?
item_tfms are transformations applied to a single data sample x on the CPU. Resize() is a common one, because the mini-batch of input images to a CNN must have the same dimensions. Assuming the images are RGB with 3 channels, using Resize() as item_tfms makes sure they all have the same width and height.
batch_tfms are applied to batched data samples (aka individual samples that have been collated into a mini-batch) on the GPU. They are faster and more efficient than item_tfms. A good example of these are the ones provided by aug_transforms(). Inside are several batch-level augmentations that help many models.
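A minimal sketch of setting both kinds of transform on an existing DataBlock might look like this. The helper name with_augmentation and the values (224 pixels, min_scale=0.5) are assumptions for illustration, and fastai is imported inside the function so the snippet can be loaded without it installed.

```python
def with_augmentation(block, size=224):
    """Return a copy of a DataBlock with item- and batch-level transforms set."""
    # Imported lazily so the sketch can be loaded without fastai installed.
    from fastai.vision.all import RandomResizedCrop, aug_transforms

    return block.new(
        # Applied to each image individually, on the CPU.
        item_tfms=RandomResizedCrop(size, min_scale=0.5),
        # Applied to a whole mini-batch at once, on the GPU when one is available.
        batch_tfms=aug_transforms(),
    )
```

The item transform has to come first in practice: only once every image has the same size can the images be collated into a batch for the batch transforms to work on.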
17.What is a confusion matrix?
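A table that compares the model's predictions with the actual labels. Rows are the actual classes and columns are the predicted classes, so the diagonal counts correct predictions and every other cell counts a particular kind of mistake.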
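The idea can be shown in a few lines of plain Python. This is a sketch with made-up labels and predictions, not output from a trained model; the function name confusion_matrix is just a local helper.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


labels = ["black", "grizzly", "teddy"]
actual = ["black", "black", "grizzly", "teddy", "grizzly"]        # made-up data
predicted = ["black", "grizzly", "grizzly", "teddy", "grizzly"]   # made-up data
print(confusion_matrix(actual, predicted, labels))
# [[1, 1, 0], [0, 2, 0], [0, 0, 1]]
```

Here the single off-diagonal count (row "black", column "grizzly") is one black bear that was predicted to be a grizzly. In fastai, ClassificationInterpretation.from_learner(learn).plot_confusion_matrix() draws the same kind of table for a trained learner.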
18.What does export save?
Export saves both the architecture and the trained parameters of the neural network. It also saves how the DataLoaders are defined.
19.What is it called when we use a model for getting predictions, instead of training?
Inference
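As a rough sketch, inference with an exported model could look like the following. The helper name predict_image and both path arguments are hypothetical; the export file is assumed to have been written earlier by learn.export(). fastai is imported inside the function so the snippet can be loaded without it installed.

```python
def predict_image(export_path, image_path):
    """Load an exported learner and classify one image.

    Both paths are placeholders; export_path is assumed to point at a file
    written earlier by learn.export().
    """
    # Imported lazily so the sketch can be loaded without fastai installed.
    from fastai.vision.all import load_learner

    learn_inf = load_learner(export_path)                   # loads onto the CPU by default
    pred, pred_idx, probs = learn_inf.predict(image_path)   # label, class index, probabilities
    return str(pred), float(probs[pred_idx])
```

Because the export also stores how the DataLoaders were defined, the same resizing and labelling logic is applied at inference time without being restated here.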
20.What are IPython widgets?
IPython widgets are JavaScript and Python combined functionalities that let us build and interact with GUI components directly in a Jupyter notebook.
21.When might you want to use CPU for deployment? When might GPU be better?
GPUs are best for doing identical work in parallel. If you will be analyzing single pieces of data at a time (like a single image or single sentence), then CPUs may be more cost effective instead, especially with more market competition for CPU servers versus GPU servers. GPUs could be used if you collect user responses into a batch at a time, and perform inference on the batch. This may require the user to wait for model predictions. Additionally, there are many other complexities when it comes to GPU inference, like memory management and queuing of the batches.
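The batching idea can be sketched without any deep learning library. Below, a hypothetical helper named batches groups incoming requests into fixed-size batches, which is the shape of work a GPU handles well. This is only an illustration: a real service would also need something like a timeout, so that a lone request is not left waiting for its batch to fill.

```python
from itertools import islice


def batches(requests, batch_size):
    """Yield lists of up to batch_size requests, in arrival order."""
    it = iter(requests)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


for batch in batches(["img1", "img2", "img3", "img4", "img5"], batch_size=2):
    print(batch)  # each batch would be sent to the model in one call
```

The last batch here holds a single item, which shows the trade-off: either run a partly empty batch or make that user wait for more requests to arrive.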
22.What are the downsides of deploying your app to a server, instead of to a client (or edge) device such as a phone or PC?
Your application will require a network connection, and there will be some latency each time the model is called.
Also, if your application uses sensitive data then your users may be concerned about an approach which sends that data to a remote server, so sometimes privacy considerations will mean that you need to run the model on the edge device.
Managing the complexity and scaling the server can create additional overhead too, whereas if your model runs on the edge devices then each user is bringing their own compute resources, which leads to easier scaling with an increasing number of users.
23.What are three examples of problems that could occur when rolling out a bear warning system in practice?
Working with video data instead of images
Handling nighttime images, which may not appear in this dataset
Dealing with low-resolution camera images
Ensuring results are returned fast enough to be useful in practice
Recognizing bears in positions that are rarely seen in photos that people post online (for example from behind, partially covered by bushes, or when a long way away from the camera)
24.What is “out-of-domain data”?
Out-of-domain data is data that our model sees in production which is very different from what it saw during training. There isn’t really a complete technical solution to this problem; instead, we have to be careful about our approach to rolling out the technology.
25.What is “domain shift”?
One very common problem is domain shift, where the type of data that our model sees changes over time. For instance, an insurance company may use a deep learning model as part of its pricing and risk algorithm, but over time the types of customers that the company attracts, and the types of risks they represent, may change so much that the original training data is no longer relevant.
26.What are the three steps in the deployment process?
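Manual process: run the model in parallel with the existing process, with humans checking all predictions.
Limited scope deployment: deploy under careful human supervision, limited in time or geography.
Gradual expansion: widen the rollout with good reporting systems in place, and consider what could go wrong.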
