(PyTorch) Combining TCN with RNN/LSTM/GRU for Time Series Forecasting

I. Introduction

I have already written a series of articles on time series forecasting with LSTM:

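An In-Depth Look at LSTM Inputs and Outputs in PyTorch (From the Input Tensor to the Linear Output)
Building an LSTM in PyTorch for Time Series Forecasting (Load Forecasting)
Building a Multi-Layer LSTM with LSTMCell in PyTorch for Time Series Forecasting
Building an LSTM in PyTorch for Multivariate Time Series Forecasting (Load Forecasting)
Building a Bidirectional LSTM in PyTorch for Time Series Forecasting (Load Forecasting)
Building an LSTM in PyTorch for Multivariate Multi-Step Time Series Forecasting (1): Direct Multi-Output
Building an LSTM in PyTorch for Multivariate Multi-Step Time Series Forecasting (2): Single-Step Rolling Forecasting
Building an LSTM in PyTorch for Multivariate Multi-Step Time Series Forecasting (3): Multi-Model Single-Step Forecasting
Building an LSTM in PyTorch for Multivariate Multi-Step Time Series Forecasting (4): Multi-Model Rolling Forecasting
Building an LSTM in PyTorch for Multivariate Multi-Step Time Series Forecasting (5): seq2seq
A Summary of Approaches to Multi-Step Time Series Forecasting with LSTM in PyTorch (Load Forecasting)
How to Predict Genuinely Future Values in PyTorch LSTM Time Series Forecasting
Building an LSTM in PyTorch for Multivariate-Input, Multivariate-Output Time Series Forecasting (Multi-Task Learning)
Building an ANN in PyTorch for Time Series Forecasting (Wind Speed Forecasting)
Building a CNN in PyTorch for Time Series Forecasting (Wind Speed Forecasting)
Building a CNN-LSTM Hybrid Model in PyTorch for Multivariate Multi-Step Time Series Forecasting (Load Forecasting)
Building a Transformer in PyTorch for Multivariate Multi-Step Time Series Forecasting (Load Forecasting)
Summary of the PyTorch Time Series Forecasting Series (How to Use the Code)
Building an LSTM in TensorFlow for Time Series Forecasting (Load Forecasting)
Building an LSTM in TensorFlow for Multivariate Time Series Forecasting (Load Forecasting)
Building a Bidirectional LSTM in TensorFlow for Time Series Forecasting (Load Forecasting)
Building an LSTM in TensorFlow for Multivariate Multi-Step Time Series Forecasting (1): Direct Multi-Output
Building an LSTM in TensorFlow for Multivariate Multi-Step Time Series Forecasting (2): Single-Step Rolling Forecasting
Building an LSTM in TensorFlow for Multivariate Multi-Step Time Series Forecasting (3): Multi-Model Single-Step Forecasting
Building an LSTM in TensorFlow for Multivariate Multi-Step Time Series Forecasting (4): Multi-Model Rolling Forecasting
Building an LSTM in TensorFlow for Multivariate Multi-Step Time Series Forecasting (5): seq2seq
Building an LSTM in TensorFlow for Multivariate-Input, Multivariate-Output Time Series Forecasting (Multi-Task Learning)
Building an ANN in TensorFlow for Time Series Forecasting (Wind Speed Forecasting)
Building a CNN in TensorFlow for Time Series Forecasting (Wind Speed Forecasting)
Building a CNN-LSTM Hybrid Model in TensorFlow for Multivariate Multi-Step Time Series Forecasting (Load Forecasting)
Building a Graph Neural Network with PyG for Multivariate-Input, Multivariate-Output Time Series Forecasting
Building GNN-LSTM and LSTM-GNN Models in PyTorch for Multivariate-Input, Multivariate-Output Time Series Forecasting
Building an STGCN with PyG Temporal for Multivariate-Input, Multivariate-Output Time Series Forecasting
Is Attention Really Effective in Time Series Forecasting? A Survey of 24 Attention Mechanisms for LSTM/RNN with a Performance Comparison
A Detailed Walkthrough of the Transformer Encoder and Decoder in Time Series Forecasting: Load Forecasting as an Example
(PyTorch) Combining TCN with RNN/LSTM/GRU for Time Series Forecasting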

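Temporal convolutional networks (TCN) and CNNs both extract features with convolutions. A conventional CNN usually applies convolutional layers to images, whereas a TCN applies 1D causal convolutions along the time axis of a sequence, so the output at time step t depends only on inputs at step t and earlier. A TCN enlarges its receptive field by combining two ideas: residual connections, which make very deep networks trainable, and dilated convolutions, whose dilation doubles at each level. Together they let the network capture much longer-range context.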
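To make the receptive-field argument concrete, the sketch below counts how many past time steps can influence one TCN output. It assumes the block layout used by the code in Section II (two dilated convolutions per residual block, with the dilation doubling from 1 at each level). The helper name `tcn_receptive_field` is hypothetical and not part of any library.

```python
def tcn_receptive_field(kernel_size: int, num_levels: int) -> int:
    """Number of input time steps that influence one output step.

    Assumes each level holds two causal convolutions with dilation 2**level,
    so each level widens the field by 2 * (kernel_size - 1) * 2**level.
    """
    field = 1  # an output step always sees its own time step
    for level in range(num_levels):
        dilation = 2 ** level
        field += 2 * (kernel_size - 1) * dilation
    return field


for levels in (1, 2, 3, 4):
    print(levels, tcn_receptive_field(kernel_size=2, num_levels=levels))
```

With kernel_size=2 the receptive field is 3, 7, 15 and 31 for one to four levels, so the number of levels is usually chosen so that the field covers the input window.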

The theory behind TCN is fairly simple and is not covered in detail here; the rest of this article goes straight to the code.

II. TCN

    import torch.nn as nn
    from torch.nn.utils import weight_norm


    class Chomp1d(nn.Module):
        def __init__(self, chomp_size):
            super(Chomp1d, self).__init__()
            self.chomp_size = chomp_size

        def forward(self, x):
            """Trim the extra padding from the right end of the time axis."""
            return x[:, :, :-self.chomp_size].contiguous()


    class TemporalBlock(nn.Module):
        def __init__(self, n_inputs, n_outputs, kernel_size, stride, dilation, padding, dropout=0.2):
            """
            Equivalent to one residual block.

            :param n_inputs: int, number of input channels
            :param n_outputs: int, number of output channels
            :param kernel_size: int, convolution kernel size
            :param stride: int, stride, usually 1
            :param dilation: int, dilation factor
            :param padding: int, amount of padding
            :param dropout: float, dropout rate
            """
            super(TemporalBlock, self).__init__()
            self.conv1 = weight_norm(nn.Conv1d(n_inputs, n_outputs, kernel_size,
                                               stride=stride, padding=padding, dilation=dilation))
            # After conv1 the size is (Batch, n_outputs, seq_len + padding)
            self.chomp1 = Chomp1d(padding)  # trim the extra padding so the output keeps seq_len time steps
            self.relu1 = nn.ReLU()
            self.dropout1 = nn.Dropout(dropout)

            self.conv2 = weight_norm(nn.Conv1d(n_outputs, n_outputs, kernel_size,
                                               stride=stride, padding=padding, dilation=dilation))
            self.chomp2 = Chomp1d(padding)  # trim the extra padding so the output keeps seq_len time steps
            self.relu2 = nn.ReLU()
            self.dropout2 = nn.Dropout(dropout)

            self.net = nn.Sequential(self.conv1, self.chomp1, self.relu1, self.dropout1,
                                     self.conv2, self.chomp2, self.relu2, self.dropout2)
            # A 1x1 convolution matches the channel count on the residual path when needed
            self.downsample = nn.Conv1d(n_inputs, n_outputs, 1) if n_inputs != n_outputs else None
            self.relu = nn.ReLU()
            self.init_weights()

        def init_weights(self):
            """Initialize the convolution weights."""
            self.conv1.weight.data.normal_(0, 0.01)
            self.conv2.weight.data.normal_(0, 0.01)
            if self.downsample is not None:
                self.downsample.weight.data.normal_(0, 0.01)

        def forward(self, x):
            """
            :param x: size of (Batch, input_channel, seq_len)
            :return: size of (Batch, output_channel, seq_len)
            """
            out = self.net(x)
            res = x if self.downsample is None else self.downsample(x)
            return self.relu(out + res)


    class TCN(nn.Module):
        def __init__(self, num_inputs, channels, kernel_size=2, dropout=0.2):
            """
            :param num_inputs: int, number of input channels
            :param channels: list, hidden channels per level, e.g. [25, 25, 25, 25]
                             means 4 hidden levels with 25 channels each
            :param kernel_size: int, convolution kernel size
            :param dropout: float, dropout rate
            """
            super(TCN, self).__init__()
            layers = []
            num_levels = len(channels)
            for i in range(num_levels):
                dilation_size = 2 ** i  # dilation factor: 1, 2, 4, 8, ...
                in_channels = num_inputs if i == 0 else channels[i - 1]  # input channels of this level
                out_channels = channels[i]  # output channels of this level
                layers += [TemporalBlock(in_channels, out_channels, kernel_size, stride=1,
                                         dilation=dilation_size,
                                         padding=(kernel_size - 1) * dilation_size, dropout=dropout)]

            self.network = nn.Sequential(*layers)

        def forward(self, x):
            """
            :param x: size of (Batch, input_channel, seq_len)
            :return: size of (Batch, output_channel, seq_len)
            """
            x = self.network(x)
            return x
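The padding-then-trim trick in TemporalBlock is what makes each convolution causal. The minimal sketch below reproduces it with a single nn.Conv1d and checks two things: the output keeps the original sequence length, and changing future inputs leaves earlier outputs untouched. The function name `causal_conv` and the tensor sizes are illustrative assumptions, not part of the model above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

kernel_size, dilation = 2, 4
padding = (kernel_size - 1) * dilation  # same rule as in the TCN constructor
conv = nn.Conv1d(1, 1, kernel_size, padding=padding, dilation=dilation)


def causal_conv(x):
    y = conv(x)                  # (batch, 1, seq_len + padding): both ends are padded
    return y[:, :, :-padding]    # drop the right end, as Chomp1d does


x = torch.randn(1, 1, 10)        # (batch, channel, seq_len)
y = causal_conv(x)

# Perturb only the "future" (time steps 6 and later) and recompute.
x_future_changed = x.clone()
x_future_changed[:, :, 6:] += 1.0
y_changed = causal_conv(x_future_changed)

same_length = y.shape[-1] == x.shape[-1]
past_unchanged = torch.allclose(y[:, :, :6], y_changed[:, :, :6])
print(same_length, past_unchanged)
```

Both checks should hold: the trimmed output at step t only combines the inputs at steps t and t - dilation.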

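As the code shows, the TCN takes input of shape (batch_size, input_channel, seq_len) and returns output of shape (batch_size, output_channel, seq_len). This is broadly similar to the models in the earlier articles. To get a prediction directly from the TCN, take the last time step of its output and pass it through an nn.Linear layer: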

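    # in __init__:
    self.fc = nn.Linear(channels[-1], output_size)
    # in forward, after x = self.network(x):
    x = x[:, :, -1]  # (batch_size, channels[-1]), last time step only
    x = self.fc(x)   # (batch_size, output_size)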
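Put together, the last-step readout might look like the sketch below. The wrapper class `LastStepHead` is a hypothetical name, and to keep the example self-contained a 1x1 nn.Conv1d stands in for the TCN; any module that maps (batch, channels_in, seq_len) to (batch, channels_out, seq_len) can be passed instead, including the TCN class above.

```python
import torch
import torch.nn as nn


class LastStepHead(nn.Module):
    """Wrap a (B, C_in, L) -> (B, C_out, L) backbone with a linear output layer."""

    def __init__(self, backbone: nn.Module, backbone_channels: int, output_size: int):
        super().__init__()
        self.backbone = backbone
        self.fc = nn.Linear(backbone_channels, output_size)

    def forward(self, x):
        x = self.backbone(x)  # (batch, backbone_channels, seq_len)
        x = x[:, :, -1]       # (batch, backbone_channels), last time step only
        return self.fc(x)     # (batch, output_size)


# Stand-in backbone for illustration only; swap in
# TCN(num_inputs=7, channels=[25, 25, 25, 25]) to use the real model.
backbone = nn.Conv1d(in_channels=7, out_channels=25, kernel_size=1)
model = LastStepHead(backbone, backbone_channels=25, output_size=1)

x = torch.randn(16, 7, 24)  # (batch, input_channel, seq_len)
pred = model(x)
print(pred.shape)  # torch.Size([16, 1])
```

Because the TCN is causal, the last time step is the only position whose receptive field reaches back over the whole recent window, which is why it is the natural one to read out.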

III. TCN-RNN/LSTM/GRU

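The TCN output has shape (batch_size, output_channel, seq_len). After swapping the last two dimensions it becomes (batch_size, seq_len, output_channel), which is exactly the input layout of an RNN-style model with batch_first=True. Passing the time series through a TCN first and then through an RNN is therefore a natural idea.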

3.1 TCN-RNN

The TCN-RNN model is built as follows:

    class TCN_RNN(nn.Module):
        def __init__(self):
            super(TCN_RNN, self).__init__()
            self.tcn = TCN(num_inputs=7, channels=[32, 32, 32])
            self.rnn = nn.RNN(input_size=32, hidden_size=64,
                              num_layers=2, batch_first=True)
            self.fc = nn.Linear(64, 1)

        def forward(self, x):
            x = x.permute(0, 2, 1)  # (batch, input_size, seq_len)
            x = self.tcn(x)         # (batch, hidden, seq_len)
            x = x.permute(0, 2, 1)  # (batch, seq_len, hidden)
            x, _ = self.rnn(x)      # (batch, seq_len, hidden)
            x = x[:, -1, :]         # last time step
            x = self.fc(x)          # (batch, output_size)
            return x

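Our input is constructed as (batch_size, seq_len, input_size), whereas the TCN expects (batch_size, input_channel, seq_len), so a permute is needed first. The TCN output is (batch_size, output_channel, seq_len), where output_channel is the last entry of channels=[32, 32, 32], i.e. 32.

The RNN then expects (batch_size, seq_len, output_channel), so a second permute is needed. Finally, the RNN output at the last time step is passed through an nn.Linear layer to produce the predictions for the batch.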
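The two permutes are easy to get wrong, so it can help to trace the tensor shape at every stage. In the sketch below a 1x1 nn.Conv1d stands in for TCN(num_inputs=7, channels=[32, 32, 32]) so that the example runs on its own; it has the same input and output layout. The batch size and window length are arbitrary assumptions.

```python
import torch
import torch.nn as nn

batch_size, seq_len, input_size = 8, 24, 7

# Stand-in with the same layout as the TCN: (B, 7, L) -> (B, 32, L).
tcn_stand_in = nn.Conv1d(input_size, 32, kernel_size=1)
rnn = nn.RNN(input_size=32, hidden_size=64, num_layers=2, batch_first=True)
fc = nn.Linear(64, 1)

x = torch.randn(batch_size, seq_len, input_size)
shapes = {"input": tuple(x.shape)}

x = x.permute(0, 2, 1)
shapes["after first permute"] = tuple(x.shape)
x = tcn_stand_in(x)
shapes["after TCN"] = tuple(x.shape)
x = x.permute(0, 2, 1)
shapes["after second permute"] = tuple(x.shape)
x, _ = rnn(x)
shapes["after RNN"] = tuple(x.shape)
x = fc(x[:, -1, :])
shapes["prediction"] = tuple(x.shape)

for name, shape in shapes.items():
    print(f"{name:>22}: {shape}")
```

The sequence length stays at 24 throughout, and only the feature dimension changes: 7, then 32, then 64, then 1.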

3.2 TCN-LSTM

Compared with TCN-RNN, TCN-LSTM simply swaps the recurrent layer:

    class TCN_LSTM(nn.Module):
        def __init__(self):
            super(TCN_LSTM, self).__init__()
            self.tcn = TCN(num_inputs=7, channels=[32, 32, 32])
            self.lstm = nn.LSTM(input_size=32, hidden_size=64,
                                num_layers=2, batch_first=True)
            self.fc = nn.Linear(64, 1)

        def forward(self, x):
            x = x.permute(0, 2, 1)  # (batch, input_size, seq_len)
            x = self.tcn(x)         # (batch, hidden, seq_len)
            x = x.permute(0, 2, 1)  # (batch, seq_len, hidden)
            x, _ = self.lstm(x)     # (batch, seq_len, hidden)
            x = x[:, -1, :]         # last time step
            x = self.fc(x)          # (batch, output_size)
            return x
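One detail the swap hides: the second return value of nn.LSTM is a tuple (h_n, c_n), whereas nn.RNN and nn.GRU return only h_n. Since `x, _ = ...` discards it either way, the forward method does not need to change. The sketch below also checks that, for a unidirectional LSTM with batch_first=True, slicing the last time step of the output gives the same tensor as the top layer's final hidden state. The tensor sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=2, batch_first=True)
x = torch.randn(8, 24, 32)      # (batch, seq_len, features)

output, (h_n, c_n) = lstm(x)    # output: (8, 24, 64); h_n and c_n: (2, 8, 64)

last_step = output[:, -1, :]    # what the models in this article feed to nn.Linear
top_hidden = h_n[-1]            # final hidden state of the last (top) layer
matches = torch.allclose(last_step, top_hidden, atol=1e-6)
print(last_step.shape, matches)
```

This equivalence would not hold for a bidirectional LSTM, where the backward direction finishes at the first time step rather than the last.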

3.3 TCN-GRU

TCN-GRU is analogous:

    class TCN_GRU(nn.Module):
        def __init__(self):
            super(TCN_GRU, self).__init__()
            self.tcn = TCN(num_inputs=7, channels=[32, 32, 32])
            self.gru = nn.GRU(input_size=32, hidden_size=64,
                              num_layers=2, batch_first=True)
            self.fc = nn.Linear(64, 1)

        def forward(self, x):
            x = x.permute(0, 2, 1)  # (batch, input_size, seq_len)
            x = self.tcn(x)         # (batch, hidden, seq_len)
            x = x.permute(0, 2, 1)  # (batch, seq_len, hidden)
            x, _ = self.gru(x)      # (batch, seq_len, hidden)
            x = x[:, -1, :]         # last time step
            x = self.fc(x)          # (batch, output_size)
            return x
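Since the three models differ only in the recurrent layer, they could also be folded into one class that selects the layer by name. The sketch below is one possible way to do that and is not the code used in this article: the class name `TCNRecurrent`, its arguments, and the 1x1 nn.Conv1d stand-in for the TCN are all assumptions made so the example runs on its own.

```python
import torch
import torch.nn as nn

# Map a short name to the corresponding PyTorch recurrent layer.
RECURRENT_LAYERS = {"rnn": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}


class TCNRecurrent(nn.Module):
    def __init__(self, tcn, tcn_channels, rnn_type="lstm",
                 hidden_size=64, num_layers=2, output_size=1):
        super().__init__()
        if rnn_type not in RECURRENT_LAYERS:
            raise ValueError(f"rnn_type must be one of {sorted(RECURRENT_LAYERS)}")
        self.tcn = tcn
        self.rnn = RECURRENT_LAYERS[rnn_type](
            input_size=tcn_channels, hidden_size=hidden_size,
            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = x.permute(0, 2, 1)      # (batch, input_size, seq_len)
        x = self.tcn(x)             # (batch, tcn_channels, seq_len)
        x = x.permute(0, 2, 1)      # (batch, seq_len, tcn_channels)
        x, _ = self.rnn(x)          # (batch, seq_len, hidden_size)
        return self.fc(x[:, -1, :]) # (batch, output_size)


x = torch.randn(4, 24, 7)           # (batch, seq_len, input_size)
output_shapes = {}
for rnn_type in RECURRENT_LAYERS:
    # Stand-in for TCN(num_inputs=7, channels=[32, 32, 32]).
    stand_in_tcn = nn.Conv1d(7, 32, kernel_size=1)
    model = TCNRecurrent(stand_in_tcn, tcn_channels=32, rnn_type=rnn_type)
    output_shapes[rnn_type] = tuple(model(x).shape)

print(output_shapes)
```

Passing the TCN in as an argument also removes the hard-coded num_inputs=7, so the same class can be reused for datasets with a different number of input variables.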