Neural Network on the Beer Dataset

Artificial neural networks (ANNs), usually simply called neural networks (NNs), are computing systems vaguely inspired by the biological neural networks that constitute animal brains.

An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron that receives a signal then processes it and can signal neurons connected to it. The “signal” at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer), to the last layer (the output layer), possibly after traversing the layers multiple times.
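As a minimal sketch of the description above, a single artificial neuron can be written as a non-linear function applied to the weighted sum of its inputs. The logistic activation, the name `neuron_output`, and the numeric values are assumptions chosen for illustration; they do not come from the beer model.

```r
# Logistic (sigmoid) activation: one common choice of non-linear function.
sigmoid <- function(z) 1 / (1 + exp(-z))

# Output of one artificial neuron: activation of the weighted input sum plus a bias.
neuron_output <- function(inputs, weights, bias) {
  sigmoid(sum(inputs * weights) + bias)
}

# Hypothetical signals and weights, for illustration only.
x <- c(0.5, -1.2, 3.0)
w <- c(0.8, 0.1, -0.4)
b <- 0.2
neuron_output(x, w, b)
```

Larger weights strengthen a connection's contribution to the sum, and the bias plays the role of the threshold mentioned above.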

Neural networks learn (or are trained) by processing examples, each of which contains a known “input” and “result,” forming probability-weighted associations between the two, which are stored within the data structure of the net itself. The training of a neural network from a given example is usually conducted by determining the difference between the processed output of the network (often a prediction) and a target output. This is the error. The network then adjusts its weighted associations according to a learning rule and using this error value. Successive adjustments will cause the neural network to produce output which is increasingly similar to the target output. After a sufficient number of these adjustments the training can be terminated based upon certain criteria. This is known as supervised learning.
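The loop below is a rough sketch of that idea for a single logistic neuron: compute the output, measure the error against the target, and adjust the weights a little in the direction that reduces it. The toy data (logical OR), the learning rate, and the number of steps are assumptions made for illustration, and plain gradient descent is used here rather than whatever learning rule a particular package applies.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# Toy training set (hypothetical): learn the logical OR of two inputs.
X <- matrix(c(0, 0,
              0, 1,
              1, 0,
              1, 1), ncol = 2, byrow = TRUE)
target <- c(0, 1, 1, 1)

w <- c(0, 0)   # one weight per input
b <- 0         # bias
rate <- 0.5    # learning rate (assumed value)

# Sum of squared errors between network output and target.
sse <- function(w, b) sum((as.vector(sigmoid(X %*% w + b)) - target)^2)
error_before <- sse(w, b)

for (step in 1:2000) {
  out  <- as.vector(sigmoid(X %*% w + b))   # processed output of the network
  err  <- out - target                      # difference from the target output
  grad <- err * out * (1 - out)             # gradient of 0.5 * err^2 w.r.t. the weighted sum
  w <- w - rate * as.vector(t(X) %*% grad)  # adjust the weights against the gradient
  b <- b - rate * sum(grad)
}

error_after <- sse(w, b)
c(before = error_before, after = error_after)
```

In practice training stops when a criterion is met, for example when the error or its gradient falls below a threshold, rather than after a fixed number of steps.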

Let’s work

Install Packages

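# neuralnet is added here because the model further down needs it
packages <- c("xts", "zoo", "PerformanceAnalytics", "GGally", "ggplot2", "ellipse", "plotly", "neuralnet")
# install only the packages that are not available yet
newpack <- packages[!(packages %in% installed.packages()[, "Package"])]
if (length(newpack)) install.packages(newpack)
# load everything without printing the result of lapply
invisible(lapply(packages, library, character.only = TRUE))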

Load dataset

beer <- read.csv("MyData.csv")
head(beer)
summary(beer)

    Clase               Color        BoilGravity        IBU              ABV
 Length:1000        Min.   : 1.99   Min.   : 1.0   Min.   :  0.00   Min.   : 2.390
 Class :character   1st Qu.: 5.83   1st Qu.:27.0   1st Qu.: 32.90   1st Qu.: 5.240
 Mode  :character   Median : 7.79   Median :33.0   Median : 47.90   Median : 5.990
                    Mean   :13.45   Mean   :33.8   Mean   : 51.97   Mean   : 6.093
                    3rd Qu.:12.57   3rd Qu.:39.0   3rd Qu.: 67.77   3rd Qu.: 6.810
                    Max.   :50.00   Max.   :90.0   Max.   :144.53   Max.   :10.380

Visualization of the Beer Data Set

The plots below show the pairwise relationships between the four numeric variables:

pairs(beer[2:5],
      main = "Craft Beer Data -- 5 types",
      pch = 21,
      # index the palette by class so that each beer type gets its own color
      bg = c("red", "green", "blue", "orange", "yellow")[as.integer(factor(beer$Clase))])
library(GGally)
pm <- ggpairs(beer,lower=list(combo=wrap("facethist",
binwidth=0.5)),title="Craft Beer", mapping=aes(color=Clase))
pm
library(PerformanceAnalytics)
chart.Correlation2 <- function (R, histogram = TRUE, method = NULL, ...)
{
  x = checkData(R, method = "matrix")
  if (is.null(method)) #modified
    method = 'pearson'

  use.method <- method #added
  panel.cor <- function(x, y, digits = 2, prefix = "",
                        use = "pairwise.complete.obs",
                        method = use.method, cex.cor, ...)
  { #modified
    usr <- par("usr")
    on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r <- cor(x, y, use = use, method = method)
    txt <- format(c(r, 0.123456789), digits = digits)[1]
    txt <- paste(prefix, txt, sep = "")
    if (missing(cex.cor))
      cex <- 0.8/strwidth(txt)
    test <- cor.test(as.numeric(x), as.numeric(y), method = method)
    Signif <- symnum(test$p.value, corr = FALSE, na = FALSE,
                     cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                     symbols = c("***","**", "*", ".", " "))
    text(0.5, 0.5, txt, cex = cex * (abs(r) + 0.3)/1.3)
    text(0.8, 0.8, Signif, cex = cex, col = 2)
  }
  f <- function(t)
  {
    dnorm(t, mean = mean(x), sd = sd.xts(x))
  }
  dotargs <- list(...)
  dotargs$method <- NULL
  rm(method)
  hist.panel = function(x, ... = NULL)
  {
    par(new = TRUE)
    hist(x, col = "light gray", probability = TRUE, axes = FALSE,
         main = "", breaks = "FD")
    lines(density(x, na.rm = TRUE), col = "red", lwd = 1)
    rug(x)
  }
  if (histogram)
    pairs(x, gap = 0, lower.panel = panel.smooth,
          upper.panel = panel.cor, diag.panel = hist.panel)
  else pairs(x, gap = 0, lower.panel = panel.smooth, upper.panel = panel.cor)
}
#if method option not set default is 'pearson'
chart.Correlation2(beer[,2:5], histogram=TRUE, pch="21")
library(plotly)
pm <- GGally::ggpairs(beer, aes(color = Clase), lower=list(combo=wrap("facethist",
binwidth=0.5)))
class(pm)
pm
[1] "gg"       "ggmatrix"

Setup and Train the Neural Network for Beer Data

A neural network emulates how the human brain works: it is a network of interconnected neurons that send stimulating signals to each other.

In the neural network model, each neuron is equivalent to a logistic regression unit: it applies a logistic function to a weighted sum of its inputs. Neurons are organized in multiple layers, where every neuron in layer i connects to every neuron in layer i+1 and to nothing else.
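A small sketch of that layered structure, assuming the 4-5-5 layout used later in this post (four input variables, five hidden neurons, five output classes). The weights are random placeholders rather than trained values, `layer_forward` is a hypothetical helper name, and the two input rows reuse the first two training observations that appear in the str(nn) output.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# Forward pass through one fully connected layer.
# W has one row for the bias plus one row per incoming neuron,
# and one column per neuron in the next layer.
layer_forward <- function(input, W, activation = sigmoid) {
  activation(cbind(1, input) %*% W)
}

set.seed(1)
n_in <- 4; n_hidden <- 5; n_out <- 5
W1 <- matrix(rnorm((n_in + 1) * n_hidden), n_in + 1, n_hidden)    # 5 x 5, random placeholder
W2 <- matrix(rnorm((n_hidden + 1) * n_out), n_hidden + 1, n_out)  # 6 x 5, random placeholder

# Two observations: IBU, ABV, Color, BoilGravity.
x <- matrix(c(62.3, 5.90,  5.61, 37,
              27.1, 5.07, 32.07, 25), nrow = 2, byrow = TRUE)

hidden <- layer_forward(x, W1)        # layer i feeds only layer i + 1
output <- layer_forward(hidden, W2)   # one column per output neuron
dim(output)
```

The two weight matrices have the same shapes (5 x 5 and 6 x 5) as the `weights` entry reported by str(nn).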

The tuning parameters of a neural network include the number of hidden layers, the number of neurons in each layer, and the learning rate.

There are no fixed rules for setting these parameters; the right values depend a lot on the problem domain. My default choice is to use a single hidden layer and set the number of neurons to be the same as the number of input variables. The number of neurons in the output layer depends on how many binary outputs need to be learned. In a classification problem, this is typically the number of possible values of the output category.
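To make the "one binary output per class" idea concrete, here is a small sketch that turns a label column into one logical indicator per class. The label vector is made up for illustration and only stands in for a column such as beer$Clase.

```r
# Hypothetical class labels, for illustration only.
labels <- c("IPA", "PORTER", "STOUT", "PALE", "IPA", "ALE")

# One logical indicator column per distinct class value.
classes <- sort(unique(labels))
indicators <- sapply(classes, function(cl) labels == cl)

ncol(indicators)   # number of output neurons needed
indicators
```

Each row has exactly one TRUE, so the number of columns is the number of neurons needed in the output layer.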

Learning happens via an iterative feedback mechanism in which the error of the output on the training data is used to adjust the corresponding input weights. This adjustment is propagated back to the previous layers, and the learning algorithm is known as back-propagation.
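The sketch below works through back-propagation on a tiny network (2 inputs, 2 hidden neurons, 1 output, no bias terms) and compares one computed gradient with a finite-difference estimate. The network size, the squared-error loss, and all values are assumptions for illustration; this is not the exact procedure neuralnet runs internally.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# Tiny network: 2 inputs -> 2 hidden neurons -> 1 output (bias omitted for brevity).
forward <- function(x, W1, W2) {
  h <- sigmoid(as.vector(x %*% W1))   # hidden activations
  y <- sigmoid(sum(h * W2))           # network output
  list(h = h, y = y)
}

loss <- function(x, target, W1, W2) 0.5 * (forward(x, W1, W2)$y - target)^2

# Back-propagation: push the output error back through each layer.
backprop <- function(x, target, W1, W2) {
  f <- forward(x, W1, W2)
  delta_out    <- (f$y - target) * f$y * (1 - f$y)    # error signal at the output
  grad_W2      <- delta_out * f$h
  delta_hidden <- delta_out * W2 * f$h * (1 - f$h)    # error propagated to the hidden layer
  grad_W1      <- outer(x, delta_hidden)
  list(W1 = grad_W1, W2 = grad_W2)
}

set.seed(42)
x <- c(0.3, -0.7); target <- 1        # hypothetical example
W1 <- matrix(rnorm(4), 2, 2)
W2 <- rnorm(2)

g <- backprop(x, target, W1, W2)

# Sanity check: finite-difference estimate for one weight.
eps <- 1e-6
W1_shift <- W1
W1_shift[1, 2] <- W1_shift[1, 2] + eps
numeric_grad <- (loss(x, target, W1_shift, W2) - loss(x, target, W1, W2)) / eps
c(backprop = g$W1[1, 2], finite_difference = numeric_grad)
```

The two numbers should agree to several decimal places, which is a cheap way to check a hand-written gradient.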

library(neuralnet)
beer <- beer %>%
  select("IBU", "ABV", "Color", "BoilGravity", "Clase")
head(beer)
# Binarize the categorical output
beer <- cbind(beer, beer$Clase == 'ALE')
beer <- cbind(beer, beer$Clase == 'IPA')
beer <- cbind(beer, beer$Clase == 'PALE')
beer <- cbind(beer, beer$Clase == 'STOUT')
beer <- cbind(beer, beer$Clase == 'PORTER')
names(beer)[6] <- 'ALE'
names(beer)[7] <- 'IPA'
names(beer)[8] <- 'PALE'
names(beer)[9] <- 'STOUT'
names(beer)[10] <- 'PORTER'
head(beer)
set.seed(101)
beer.train.idx <- sample(x = nrow(beer), size = nrow(beer)*0.5)
beer.train <- beer[beer.train.idx,]
beer.valid <- beer[-beer.train.idx,]

Visualization of the Neural Network on Beer Data

Here is the plot of the neural network we trained.

Neural networks are very good at learning non-linear functions, and multiple outputs can be learned at the same time. However, training time is relatively long, and training is susceptible to getting trapped in local minima. This can be mitigated by training several times and picking the best learned model.
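The "train several times and keep the best" idea can be sketched in base R with a made-up bumpy loss function standing in for the network's training error. The function, the number of restarts, and the use of optim are all assumptions for illustration. neuralnet itself accepts a rep argument for repeated training, which is what rep = "best" in the plot call refers to.

```r
# A bumpy one-dimensional loss with several local minima (made up for illustration).
loss <- function(w) sin(3 * w) + 0.1 * w^2

set.seed(7)
starts <- runif(10, min = -5, max = 5)   # ten random starting weights

# Run a local optimizer from each start; different runs may stop in different minima.
runs <- lapply(starts, function(w0) optim(w0, loss, method = "BFGS"))

errors <- sapply(runs, function(r) r$value)
best <- runs[[which.min(errors)]]        # keep the run with the lowest final error

round(sort(errors), 3)
best$par
```

Runs that end in a shallow local minimum show up as larger values in the sorted error list.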

nn <- neuralnet(ALE+IPA+PALE+STOUT+PORTER ~ IBU+ABV+Color+BoilGravity, data=beer.train, hidden=c(5))
plot(nn, rep = "best")

Result

beer.prediction <- compute(nn, beer.valid[-5:-10])
idx <- apply(beer.prediction$net.result, 1, which.max)
predicted <- c('ALE','IPA', 'PALE', 'STOUT', 'PORTER')[idx]
table(predicted, beer.valid$Clase)

predicted ALE IPA PALE PORTER STOUT
    ALE    17   3   12      0     0
    IPA     1 203   21      0     2
    PALE   29  26   84      1     0
    STOUT   0   4    0     30    67

The accuracy of the model is calculated as follows (this model never predicts PORTER, so that class contributes 0 correct predictions):

((17+203+84+0+67)/nrow(beer.valid))*100

74.2
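Summing the diagonal by hand is easy to get wrong when a class is never predicted and its row is missing from the table. A hedged alternative is to compare the label vectors directly. The vectors below are made up for illustration; in this post the corresponding objects would be `predicted` and `beer.valid$Clase`.

```r
# Hypothetical predicted and actual labels, for illustration only.
actual    <- c("ALE", "IPA", "IPA",  "PALE", "STOUT", "PORTER", "IPA", "PALE")
predicted <- c("ALE", "IPA", "PALE", "PALE", "STOUT", "STOUT",  "IPA", "IPA")

# Share of matching labels, as a percentage.
accuracy <- mean(predicted == actual) * 100
accuracy

# Same figure from the confusion matrix, with both factors forced onto the
# same levels so that the table is square and diag() is meaningful.
lv <- sort(unique(c(actual, predicted)))
cm <- table(factor(predicted, levels = lv), factor(actual, levels = lv))
sum(diag(cm)) / sum(cm) * 100
```

Both expressions give the same percentage, and neither depends on which rows happen to appear in the printed table.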

# nn$result.matrix
str(nn)

List of 14
$ call : language neuralnet(formula = ALE + IPA + PALE + STOUT + PORTER ~ IBU + ABV + Color + BoilGravity, data = beer.train, hidden = c(5))
$ response : logi [1:500, 1:5] FALSE FALSE FALSE FALSE FALSE FALSE ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:500] "841" "825" "430" "95" ...
.. ..$ : chr [1:5] "ALE" "IPA" "PALE" "STOUT" ...
$ covariate : num [1:500, 1:4] 62.3 27.1 39 72.3 67.8 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:500] "841" "825" "430" "95" ...
.. ..$ : chr [1:4] "IBU" "ABV" "Color" "BoilGravity"
$ model.list :List of 2
..$ response : chr [1:5] "ALE" "IPA" "PALE" "STOUT" ...
..$ variables: chr [1:4] "IBU" "ABV" "Color" "BoilGravity"
$ err.fct :function (x, y)
..- attr(*, "type")= chr "sse"
$ act.fct :function (x)
..- attr(*, "type")= chr "logistic"
$ linear.output : logi TRUE
$ data :'data.frame': 500 obs. of 10 variables:
..$ IBU : num [1:500] 62.3 27.1 39 72.3 67.8 ...
..$ ABV : num [1:500] 5.9 5.07 6.57 5.7 6.86 5.21 4.22 5.57 5.76 7.76 ...
..$ Color : num [1:500] 5.61 32.07 39.92 9.62 8.29 ...
..$ BoilGravity: int [1:500] 37 25 40 37 31 28 19 27 30 44 ...
..$ Clase : chr [1:500] "IPA" "PORTER" "STOUT" "PALE" ...
..$ ALE : logi [1:500] FALSE FALSE FALSE FALSE FALSE FALSE ...
..$ IPA : logi [1:500] TRUE FALSE FALSE FALSE TRUE FALSE ...
..$ PALE : logi [1:500] FALSE FALSE FALSE TRUE FALSE TRUE ...
..$ STOUT : logi [1:500] FALSE FALSE TRUE FALSE FALSE FALSE ...
..$ PORTER : logi [1:500] FALSE TRUE FALSE FALSE FALSE FALSE ...
$ exclude : NULL
$ net.result :List of 1
..$ : num [1:500, 1:5] 0.00942 0.01859 0.01845 0.00916 0.00478 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:500] "841" "825" "430" "95" ...
.. .. ..$ : NULL
$ weights :List of 1
..$ :List of 2
.. ..$ : num [1:5, 1:5] -10.8295 0.0944 0.9985 -0.1776 0.0445 ...
.. ..$ : num [1:6, 1:5] 0.0576 -0.058 -0.4324 0.4371 -0.0437 ...
$ generalized.weights:List of 1
..$ : num [1:500, 1:20] -0.08239 -0.000124 -0.000822 -0.082905 -0.093232 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:500] "841" "825" "430" "95" ...
.. .. ..$ : NULL
$ startweights :List of 1
..$ :List of 2
.. ..$ : num [1:5, 1:5] -0.5 1.832 -0.329 0.261 -1.112 ...
.. ..$ : num [1:6, 1:5] 0.341 1.107 0.689 0.471 -1.64 ...
$ result.matrix : num [1:58, 1] 8.02e+01 8.76e-03 7.37e+04 -1.08e+01 9.44e-02 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:58] "error" "reached.threshold" "steps" "Intercept.to.1layhid1" ...
.. ..$ : NULL
- attr(*, "class")= chr "nn"

beer.net <- neuralnet(ALE+IPA+PALE+STOUT+PORTER ~ IBU+ABV+Color+BoilGravity,
                      data=beer.train, hidden=c(5), err.fct = "ce",
                      linear.output = FALSE, lifesign = "minimal",
                      threshold = 0.1)

hidden: 5    thresh: 0.1    rep: 1/1    steps: 86036    error: 431.94881    time: 24.02 secs

plot(beer.net, rep="best")

Predicting Result

beer.prediction <- compute(beer.net, beer.valid[-5:-10])
idx <- apply(beer.prediction$net.result, 1, which.max)
predicted <- c('ALE','IPA', 'PALE', 'STOUT', 'PORTER')[idx]
table(predicted, beer.valid$Clase)

predicted ALE IPA PALE PORTER STOUT
   ALE     26   4    9      0     0
   IPA      0 197   30      1     3
   PALE    21  33   78      0     0
   PORTER   0   1    0     10     6
   STOUT    0   1    0     20    60

The accuracy of the model is calculated as follows:

((26+197+78+10+60)/nrow(beer.valid))*100

74.2

Conclusion

As you can see, both models reach the same accuracy on the validation set (74.2%)!

I hope this helps you with your own training.

Never give up!

See you on LinkedIn!

Translated from: https://medium.com/@zumaia/neural-network-on-beer-dataset-55d62a0e7c32
