- 🍨 本文为🔗365天深度学习训练营 中的学习记录博客
- 🍖 原作者:K同学啊
前言
- LSTM模型一直是一个很经典的模型,这个模型当然也很复杂,一般需要先学习RNN、GRU模型之后再学,GRU、LSTM的模型讲解将在这两天发布更新,其中:
-  - 深度学习基础–一文搞懂RNN
 
-  - 深度学习基础–GRU学习笔记(李沐《动手学习深度学习》)
 
- 这一篇:是基于LSTM模型火灾预测研究,讲述了如何构建时间数据、模型如何构建、pytorch中LSTM的API、动态调整学习率等=,最后用RMSE、R2做评估;
- 欢迎收藏 + 关注,本人将会持续更新
文章目录
- 1、导入数据与数据展示
- 1、导入库
- 2、导入数据
- 3、数据可视化
- 4、相关性分析(热力图展示)
- 5、特征提取
- 2、时间数据构建
- 1、数据标准化
- 2、构建时间数据集
- 3、划分数据集和加载数据集
- 1、数据划分
- 3、模型构建
- 4、模型训练
- 1、训练集函数
- 2、测试集函数
- 3、模型训练
- 5、结果展示
- 1、损失函数
- 2、预测展示
- 3、R2评估
1、导入数据与数据展示
1、导入库
import torch  
import torch.nn as nn 
import pandas as pd 
import numpy as np 
import seaborn as sns 
import matplotlib.pylab as plt # 设置分辨率
plt.rcParams['savefig.dpi'] = 500  # 图片分辨率
plt.rcParams['figure.dpi'] = 500 # 分辨率device = "cpu"device
'cpu'
2、导入数据
data_df = pd.read_csv('./woodpine2.csv')data_df.head()
| Time | Tem1 | CO 1 | Soot 1 | |
|---|---|---|---|---|
| 0 | 0.000 | 25.0 | 0.0 | 0.0 | 
| 1 | 0.228 | 25.0 | 0.0 | 0.0 | 
| 2 | 0.456 | 25.0 | 0.0 | 0.0 | 
| 3 | 0.685 | 25.0 | 0.0 | 0.0 | 
| 4 | 0.913 | 25.0 | 0.0 | 0.0 | 
数据位实验数据,数据是定时收集的:
- Time: 时间从 0.000 开始,每隔大约 0.228 的间隔递增。
- Tem1: 是温度(Temperature)的缩写,单位可能是摄氏度 (°C)。
- CO: 是指一氧化碳 (Carbon Monoxide) 的浓度。
- Soot: 是指烟炱或炭黑 (Soot) 的浓度。
# 数据信息查询
data_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5948 entries, 0 to 5947
Data columns (total 4 columns):#   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  0   Time    5948 non-null   float641   Tem1    5948 non-null   float642   CO 1    5948 non-null   float643   Soot 1  5948 non-null   float64
dtypes: float64(4)
memory usage: 186.0 KB
# 数据缺失值
data_df.isnull().sum()
Time      0
Tem1      0
CO 1      0
Soot 1    0
dtype: int64
3、数据可视化
时间是每隔固定时间收集的,故有用特征为:温度、CO、Soot
_, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(14, 3)) # constrained_layout=True  自动调整子图sns.lineplot(data=data_df['Tem1'], ax=ax[0])
sns.lineplot(data=data_df['CO 1'], ax=ax[1])
sns.lineplot(data=data_df['Soot 1'], ax=ax[2])
plt.show()
 
4、相关性分析(热力图展示)
columns = ['Tem1', 'CO 1', 'Soot 1']plt.figure(figsize=(8, 6))
sns.heatmap(data=data_df[columns].corr(), annot=True, fmt=".2f")
plt.show()
 
# 统计分析
data_df.describe()
| Time | Tem1 | CO 1 | Soot 1 | |
|---|---|---|---|---|
| count | 5948.000000 | 5948.000000 | 5948.000000 | 5948.000000 | 
| mean | 226.133238 | 152.534919 | 0.000035 | 0.000222 | 
| std | 96.601445 | 77.026019 | 0.000022 | 0.000144 | 
| min | 0.000000 | 25.000000 | 0.000000 | 0.000000 | 
| 25% | 151.000000 | 89.000000 | 0.000015 | 0.000093 | 
| 50% | 241.000000 | 145.000000 | 0.000034 | 0.000220 | 
| 75% | 310.000000 | 220.000000 | 0.000054 | 0.000348 | 
| max | 367.000000 | 307.000000 | 0.000080 | 0.000512 | 
当我看到相关性为1的时候,我也惊呆了,后面查看了统计量,还是没发现出来,但是看上面的可视化图展示,我信了,随着温度升高,CO化碳、Soot浓度一起升高,这个也符合火灾的场景,数据没啥问题。
5、特征提取
# 由于时间间隔一样,故这里去除
data = data_df.iloc[:, 1:]data.head(3)
| Tem1 | CO 1 | Soot 1 | |
|---|---|---|---|
| 0 | 25.0 | 0.0 | 0.0 | 
| 1 | 25.0 | 0.0 | 0.0 | 
| 2 | 25.0 | 0.0 | 0.0 | 
data.tail(3)
| Tem1 | CO 1 | Soot 1 | |
|---|---|---|---|
| 5945 | 292.0 | 0.000077 | 0.000491 | 
| 5946 | 291.0 | 0.000076 | 0.000489 | 
| 5947 | 290.0 | 0.000076 | 0.000487 | 
特征间数据差距较大,故需要做标准化
2、时间数据构建
1、数据标准化
from sklearn.preprocessing import MinMaxScalersc = MinMaxScaler()for col in ['Tem1', 'CO 1', 'Soot 1']:data[col] = sc.fit_transform(data[col].values.reshape(-1, 1))# 查看维度
data.shape
(5948, 3)
2、构建时间数据集
LSTM 模型期望输入数据的形状是 (样本数, 时间步长, 特征数),本文数据:
- 样本数:5948
- 时间步长:本文设置为8 - 即是:取特征每8行(Tem1, CO 1, Soot 1)为一个时间段,第9个时间段的Tem1为y(温度),火灾预测本质也是预测温度
 
- 特征数:3
width_x = 8
width_y = 1# 构建时间数据X, y(解释在上)
X, y = [], []# 设置开始构建数据位置
start_position = 0for _, _ in data.iterrows():in_end = start_position + width_xout_end = in_end + width_y if out_end < len(data):# 采集时间数据集X_ = np.array(data.iloc[start_position : in_end, :])y_ = np.array(data.iloc[in_end : out_end, 0])X.append(X_)y.append(y_)start_position += 1# 转化为数组
X = np.array(X)
# y也要构建出适合维度的变量
y = np.array(y).reshape(-1, 1, 1)X.shape, y.shape
((5939, 8, 3), (5939, 1, 1))
3、划分数据集和加载数据集
1、数据划分
# 取前5000个数据位训练集,后面为测试集
X_train = torch.tensor(np.array(X[:5000, ]), dtype=torch.float32)
X_test = torch.tensor(np.array(X[5000:, ]), dtype=torch.float32)y_train = torch.tensor(np.array(y[:5000, ]), dtype=torch.float32)
y_test = torch.tensor(np.array(y[5000:, ]), dtype=torch.float32)X_train.shape, y_train.shape 
(torch.Size([5000, 8, 3]), torch.Size([5000, 1, 1]))
数据集构建:
- TensorDataset 是 PyTorch 中的一个类,用于将两个或多个张量组合成一个数据集。每个样本由一个输入张量和一个目标张量组成(构建的数据集中,每一个输入对应一个输出)
from torch.utils.data import TensorDataset, DataLoaderbatch_size = 64train_dl = DataLoader(TensorDataset(X_train, y_train),batch_size=batch_size,shuffle=True)test_dl = DataLoader(TensorDataset(X_test, y_test),batch_size=batch_size,shuffle=False)
3、模型构建
nn.LSTM 的 API
*构造函数
torch.nn.LSTM(input_size, hidden_size, num_layers=1, bias=True, batch_first=False, dropout=0, bidirectional=False, proj_size=0)
- input_size(- int):每个时间步输入特征的数量。
- hidden_size(- int):LSTM 层中隐藏状态(h)的特征数。这也是 LSTM 输出的特征数量,除非指定了- proj_size。
- num_layers(- int, 可选):LSTM 层的数量。默认值为 1。
- bias(- bool, 可选):如果为- True,则使用偏置项;否则不使用。默认值为- True。
- batch_first(- bool, 可选):如果为- True,则输入和输出张量的形状为- (batch, seq, feature);否则为- (seq, batch, feature)。默认值为- False。
- dropout(- float, 可选):除了最后一层之外的所有 LSTM 层之后应用的 dropout 概率。如果- num_layers = 1,则不会应用 dropout。默认值为 0。
- bidirectional(- bool, 可选):如果为- True,则变为双向 LSTM。默认值为- False。
- proj_size(- int, 可选):如果大于 0,则 LSTM 会将隐藏状态投影到一个不同维度的空间。这减少了模型参数的数量,并且可以加速训练。默认值为 0,表示没有投影。
输入
- input(- tensor):形状为- (seq_len, batch, input_size)或者如果- batch_first=True则为- (batch, seq_len, input_size)。
- (h_0, c_0)(- tuple, 可选):包含两个张量- (h_0, c_0),分别代表初始的隐藏状态和细胞状态。它们的形状均为- (num_layers * num_directions, batch, hidden_size)。如果没有提供,那么所有状态都会被初始化为零。
其中:
- 单向 LSTM (bidirectional=False):此时 num_directions=1。LSTM 只按照时间序列的顺序从前向后处理数据,即从第一个时间步到最后一个时间步。
- 双向 LSTM (bidirectional=True):此时 num_directions=2。双向 LSTM 包含两个独立的 LSTM 层,一个按正常的时间顺序从前向后处理数据,另一个则反过来从后向前处理数据。这样做可以让模型同时捕捉到过去和未来的信息,对于某些任务(如自然语言处理中的语义理解)特别有用。
输出(两个)
- output(- tensor):包含了最后一个时间步的输出特征(- h_t)。如果- batch_first=True,则形状为- (batch, seq_len, num_directions * hidden_size);否则为- (seq_len, batch, num_directions * hidden_size)。注意,如果- proj_size > 0,则输出的最后一个维度将是- num_directions * proj_size。
- (h_n, c_n)(- tuple):包含两个张量- (h_n, c_n),分别代表所有时间步后的最终隐藏状态和细胞状态。它们的形状均为- (num_layers * num_directions, batch, hidden_size)。同样地,如果- proj_size > 0,则- h_n的最后一个维度将是- proj_size。
'''
模型采用两个lstm层:3->320:lstm->320:lstm(进一步提取时间特征)->1:linear
'''class model_lstm(nn.Module):def __init__(self):super().__init__()self.lstm1 = nn.LSTM(input_size=3, hidden_size=320, num_layers=1, batch_first=True)self.lstm2 = nn.LSTM(input_size=320, hidden_size=320, num_layers=1, batch_first=True)self.fc = nn.Linear(320, 1)def forward(self, x):out, hidden = self.lstm1(x)out, _ = self.lstm2(out)out = self.fc(out)   # 这个时候,输出维度(batch_size, sequence_length, output_size), 这里是(64, 8, 1)return out[:, -1, :].view(-1, 1, 1)  # 取最后一条数据  (64, 1, 1), 在pytorch中如果一个维度是1,可能会自动压缩,所以这里需要再次形状重塑model = model_lstm().to(device)
model
model_lstm((lstm1): LSTM(3, 320, batch_first=True)(lstm2): LSTM(320, 320, batch_first=True)(fc): Linear(in_features=320, out_features=1, bias=True)
)
# 先做测试
model(torch.rand(30, 8, 3)).shape
torch.Size([30, 1, 1])
4、模型训练
1、训练集函数
def train(train_dl, model, loss_fn, optimizer, lr_scheduler=None):size = len(train_dl.dataset)num_batchs = len(train_dl)train_loss = 0for X, y in train_dl:X, y = X.to(device), y.to(device)pred = model(X)loss = loss_fn(pred, y)optimizer.zero_grad()loss.backward()optimizer.step()train_loss += loss.item()if lr_scheduler is not None:lr_scheduler.step()print("learning rate = {:.5f}".format(optimizer.param_groups[0]['lr']), end="  ")train_loss /= num_batchsreturn train_loss
2、测试集函数
def test(test_dl, model, loss_fn):size = len(test_dl.dataset)num_batchs = len(test_dl)test_loss = 0with torch.no_grad():for X, y in test_dl:X, y = X.to(device), y.to(device)pred = model(X)loss = loss_fn(pred, y)test_loss += loss.item()test_loss /= num_batchsreturn test_loss
3、模型训练
# 设置超参数
loss_fn = nn.MSELoss()
lr = 1e-1
opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=1e-4) # weight_decay 实际上是在应用 L2 正则化(也称为权重衰减)epochs = 50# 动态调整学习率
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(opt, epochs, last_epoch=-1)train_loss = []
test_loss = []for epoch in range(epochs):model.train()epoch_train_loss = train(train_dl, model, loss_fn, opt, lr_scheduler)model.eval()epoch_test_loss = test(test_dl, model, loss_fn)train_loss.append(epoch_train_loss)test_loss.append(epoch_test_loss)template = ('Epoch:{:2d}, Train_loss:{:.5f}, Test_loss:{:.5f}')     print(template.format(epoch+1, epoch_train_loss,  epoch_test_loss))
learning rate = 0.09990  Epoch: 1, Train_loss:0.00320, Test_loss:0.00285
learning rate = 0.09961  Epoch: 2, Train_loss:0.00022, Test_loss:0.00084
learning rate = 0.09911  Epoch: 3, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09843  Epoch: 4, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09755  Epoch: 5, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.09649  Epoch: 6, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.09524  Epoch: 7, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09382  Epoch: 8, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09222  Epoch: 9, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09045  Epoch:10, Train_loss:0.00015, Test_loss:0.00066
learning rate = 0.08853  Epoch:11, Train_loss:0.00015, Test_loss:0.00077
learning rate = 0.08645  Epoch:12, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08423  Epoch:13, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08187  Epoch:14, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.07939  Epoch:15, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07679  Epoch:16, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.07409  Epoch:17, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07129  Epoch:18, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.06841  Epoch:19, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06545  Epoch:20, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06243  Epoch:21, Train_loss:0.00015, Test_loss:0.00069
learning rate = 0.05937  Epoch:22, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.05627  Epoch:23, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.05314  Epoch:24, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.05000  Epoch:25, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.04686  Epoch:26, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.04373  Epoch:27, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.04063  Epoch:28, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.03757  Epoch:29, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.03455  Epoch:30, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.03159  Epoch:31, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.02871  Epoch:32, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.02591  Epoch:33, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02321  Epoch:34, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02061  Epoch:35, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.01813  Epoch:36, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.01577  Epoch:37, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.01355  Epoch:38, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.01147  Epoch:39, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00955  Epoch:40, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00778  Epoch:41, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.00618  Epoch:42, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00476  Epoch:43, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00351  Epoch:44, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00245  Epoch:45, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00157  Epoch:46, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00089  Epoch:47, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00039  Epoch:48, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00010  Epoch:49, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00000  Epoch:50, Train_loss:0.00015, Test_loss:0.00063
5、结果展示
1、损失函数
import matplotlib.pyplot as plt 
from datetime import datetime 
current_time = datetime.now() # 获取当前时间 plt.figure(figsize=(5, 3),dpi=120)   
plt.plot(train_loss    , label='LSTM Training Loss') 
plt.plot(test_loss, label='LSTM Validation Loss')   
plt.title('Training and Validation Loss') 
plt.xlabel(current_time) # 打卡请带上时间戳,否则代码截图无效 
plt.legend() 
plt.show()
 
效果不错,收敛了
2、预测展示
predicted_y_lstm = sc.inverse_transform(model(X_test).detach().numpy().reshape(-1,1))                    # 测试集输入模型进行预测 
y_test_1         = sc.inverse_transform(y_test.reshape(-1,1)) 
y_test_one       = [i[0] for i in y_test_1] 
predicted_y_lstm_one = [i[0] for i in predicted_y_lstm]   
plt.figure(figsize=(5, 3),dpi=120) # 画出真实数据和预测数据的对比曲线 
plt.plot(y_test_one[:2000], color='red', label='real_temp') 
plt.plot(predicted_y_lstm_one[:2000], color='blue', label='prediction')   
plt.title('Title') 
plt.xlabel('X') 
plt.ylabel('Y') 
plt.legend() 
plt.show()
 
3、R2评估
from sklearn import metrics 
""" 
RMSE :均方根误差  ----->  对均方误差开方 
R2   :决定系数,可以简单理解为反映模型拟合优度的重要的统计量 
""" 
RMSE_lstm  = metrics.mean_squared_error(predicted_y_lstm_one, y_test_1)**0.5 
R2_lstm    = metrics.r2_score(predicted_y_lstm_one, y_test_1)   
print('均方根误差: %.5f' % RMSE_lstm) 
print('R2: %.5f' % R2_lstm)
均方根误差: 0.00001
R2: 0.82422
rmse、r2都不错,但是拟合度还可以再提高