Time Series Analysis with LSTM
Neural networks can be a hard concept to wrap your head around. I think this is mostly because they can be used for so many different things, such as classification, identification, or plain regression.
In this article, we will look at how easy it is to set up a simple LSTM model. All you need is your helpful friend Keras and an array of numbers to feed into it.
First thing we always do? Import!
import math
import pandas as pd
import numpy as np
# keras models
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
# scaler
from sklearn.preprocessing import MinMaxScaler
# analysis tool
from sklearn.metrics import mean_squared_error
Next we need a set of numbers to play with. There are a couple of ways to go about this, but the best way is to use data that actually has meaning, and for that I recommend going to Kaggle.
Once you have your .csv file downloaded, we need to load it into a dataframe and keep only the feature that holds the values we want to play with. Mine is the first column.
# usecols is zero-based, so [1] reads the column at position 1 of the file
df = pd.read_csv('test_df_w_timeshift.csv', usecols=[1])
dataset = df.values
# normalize dataset to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
We pull the values from the file and rescale them to the range 0 to 1 using the MinMaxScaler from sklearn.
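As a rough illustration of what the scaler is doing, the sketch below applies the min-max formula, (x - min) / (max - min), by hand with NumPy. The sample values and the helper names `min_max_scale` and `min_max_unscale` are made up for this example and are not part of the article's code; in practice the MinMaxScaler above does this for you.

```python
import numpy as np

# Made-up single-column data, shaped (n_samples, 1) like df.values above.
values = np.array([[10.0], [20.0], [15.0], [30.0]])

def min_max_scale(x):
    """Rescale each column to [0, 1]; also return the column min and max."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo), lo, hi

def min_max_unscale(scaled, lo, hi):
    """Undo min_max_scale, mapping [0, 1] back to the original units."""
    return scaled * (hi - lo) + lo

scaled, lo, hi = min_max_scale(values)
restored = min_max_unscale(scaled, lo, hi)

print(scaled.ravel())    # smallest value maps to 0, largest to 1
print(restored.ravel())  # back in the original units
```

The second helper mirrors the inverse transform that is applied to the predictions later on, so the errors can be read in the data's original units.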
The next thing to do is to separate the data into two groups. The first is a set of training data for our LSTM model to learn from; the second is held back for testing. I like to use about eighty percent of the data for training, but you can play with this number to see how much of the data is actually needed before the model gives you a satisfactory result.
#split into train and test sets
train_size = int(len(dataset) * 0.80)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
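To make the split concrete, here is a minimal sketch on a made-up ten-value series (the array `toy` and the names `split_at`, `toy_train` and `toy_test` are hypothetical). It uses the same slicing as above, so the rows stay in time order: the model trains on the earliest values and is tested on the most recent ones.

```python
import numpy as np

# Made-up series of 10 values, shaped (10, 1) like the scaled dataset.
toy = np.arange(10, dtype=float).reshape(-1, 1)

split_at = int(len(toy) * 0.80)  # number of rows used for training
toy_train, toy_test = toy[:split_at], toy[split_at:]

print(toy_train.ravel())  # the earliest 80% of the series
print(toy_test.ravel())   # the remaining, most recent 20%
```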
The next thing we do to the data is create a secondary feature, if you will, from that data: a look-back window. It sets how far back we wish to look, let's say three values, so that each input is the three preceding values and the target is the value that follows them.
This create_dataset method was written by Jason Brownlee. It is reproduced below, with a link to his LSTM time series tutorial in the comment above it. I recommend checking it out!
# https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
def create_dataset(dataset, lookback=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - lookback - 1):
        # input: a window of `lookback` consecutive values
        a = dataset[i: i + lookback, 0]
        dataX.append(a)
        # target: the value immediately after the window
        dataY.append(dataset[i + lookback, 0])
    return np.array(dataX), np.array(dataY)
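A small worked example may help show what the windows look like. The sketch below repeats the function so it runs on its own and applies it to a made-up series of the numbers 0 to 9 (the array `toy` is hypothetical).

```python
import numpy as np

def create_dataset(dataset, lookback=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - lookback - 1):
        dataX.append(dataset[i: i + lookback, 0])  # window of past values
        dataY.append(dataset[i + lookback, 0])     # value right after the window
    return np.array(dataX), np.array(dataY)

# Made-up series 0, 1, ..., 9 as a single column.
toy = np.arange(10, dtype=float).reshape(-1, 1)
X, Y = create_dataset(toy, lookback=3)

print(X.shape, Y.shape)  # number of windows, and one target per window
print(X[0], Y[0])        # first window and its target
print(X[-1], Y[-1])      # last window and its target
```

Note that because of the `- 1` in the loop bound, the final value of the series is never used as a target, so ten values with a look-back of three yield six samples.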
So we use the method above to build the look-back windows to be analyzed by the LSTM model, creating an ‘X’ and a ‘Y’ for both the training data and the testing data.
# build X (the previous look_back values) and Y (the value that follows)
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))
testX = np.reshape(testX, (testX.shape[0], testX.shape[1], 1))
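The reshape step only adds a trailing axis of length one, since there is a single feature per time step. A minimal sketch with two made-up windows (the array `windows` is hypothetical):

```python
import numpy as np

# Two made-up windows of three time steps each: shape (samples, time steps).
windows = np.array([[0.0, 0.1, 0.2],
                    [0.1, 0.2, 0.3]])

# Add the features axis: shape becomes (samples, time steps, features).
reshaped = np.reshape(windows, (windows.shape[0], windows.shape[1], 1))

print(windows.shape)
print(reshaped.shape)
```

The values themselves are unchanged; only the array's shape differs, which is what the LSTM layer expects as input.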
Once the data is all set up, we simply feed it to the model for training.
batch_size = 1
model = Sequential()
# two stacked stateful LSTM layers with 4 units each; the first passes its full sequence to the second
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True, return_sequences=True))
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# train one epoch at a time, resetting the LSTM state after each pass over the data
for i in range(5):
    model.fit(trainX, trainY, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
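To get a feel for how small this network is, the sketch below counts its trainable weights by hand. It assumes the standard LSTM parameterization with biases (four gates, each with an input kernel, a recurrent kernel and a bias), which is the Keras default; the helper names `lstm_params` and `dense_params` are made up for this example.

```python
def lstm_params(input_dim, units):
    """Weights in one LSTM layer: 4 gates x (input kernel + recurrent kernel + bias)."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Weights in one Dense layer: kernel plus bias."""
    return input_dim * units + units

first_lstm = lstm_params(input_dim=1, units=4)   # fed the single input feature
second_lstm = lstm_params(input_dim=4, units=4)  # fed the first layer's 4 outputs
output = dense_params(input_dim=4, units=1)

total = first_lstm + second_lstm + output
print(first_lstm, second_lstm, output, total)
```

Calling `model.summary()` on the model above should report the same per-layer counts if these assumptions hold.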
Once the model has made a set number of passes over the training data, in this case five, we can ask it to predict.
trainPredict = model.predict(trainX, batch_size=batch_size)
model.reset_states()
testPredict = model.predict(testX, batch_size=batch_size)
Don’t forget to reverse the scaling we applied at the beginning!
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
To see how well your model did, you can compute the root mean squared error (RMSE), the square root of the mean squared error, as below.
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
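For reference, the same quantity can be computed directly with NumPy. The sketch below uses made-up actual and predicted values, and the helper name `rmse` is hypothetical.

```python
import math
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return math.sqrt(np.mean((actual - predicted) ** 2))

# Made-up values: two exact predictions and one that is off by 2.
score = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(f"RMSE: {score:.4f}")
```

Because the predictions were inverse-transformed first, the score is in the data's original units. The same call pattern on `testY[0]` and `testPredict[:,0]` would give a score for the held-out data, which is usually the more telling number.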
If you wish to see it visually, feel free to plot it using matplotlib.
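import matplotlib.pyplot as plt
# assumption: the two plot arrays follow Jason Brownlee's tutorial, shifting predictions to line up with the original series
trainPredictPlot = np.full_like(dataset, np.nan)
trainPredictPlot[look_back:len(trainPredict) + look_back, :] = trainPredict
testPredictPlot = np.full_like(dataset, np.nan)
testPredictPlot[len(trainPredict) + (look_back * 2) + 1:len(dataset) - 1, :] = testPredict
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()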
Translated from: https://medium.com/@trevohearn/lstm-a-time-series-analysis-b90517fcac9e