Contents
- 1 Data encoding
- 2 Building the network
- 3 Network configuration and training
- 4 Predicting results
- 5 Where it went wrong
References used while learning:
- Fizz Buzz in Tensorflow
- https://github.com/wmn7/ML_Practice/tree/master/2019_06_10
- Fizz Buzz in Pytorch
I need you to print the numbers from 1 to 100, except that if the number is divisible by 3 print “fizz”, if it’s divisible by 5 print “buzz”, and if it’s divisible by 15 print “fizzbuzz”.
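The programming problem itself is trivial, so let's try solving it with an MLP instead.
Approach: the training set is the integers 101 to 1023, each encoded as a fixed-length binary vector; the label is the class index (0 to 3) of the expected output.
Test set: 1 to 100.
Enough talk, show me the code.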
1 Data encoding
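import numpy as np
import torch
import torch.nn as nn
import torch.utils.data as Data


def binary_encode(i, num_digits):
    """Convert an input integer to its binary digits, least significant bit first.

    num_digits bits can represent 2**num_digits distinct values.

    :param i: integer to encode
    :param num_digits: number of bits in the encoding
    :return: numpy array of num_digits zeros and ones
    """
    return np.array([i >> d & 1 for d in range(num_digits)])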
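Encoding: integer-divide the number by $2^0, 2^1, 2^2, 2^3, \dots$ in turn, and bitwise-AND each quotient with 1.
m & 1 is 0 when m is even and 1 when m is odd.
`>> m` shifts right by m bits, which is integer division by $2^m$.
The first bit alone already encodes parity, and every number in range gets a distinct code.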
For example, encoding 101 with num_digits=10
gives 1 0 1 0 0 1 1 0 0 0.
Steps (integer division):
101 / 1 = 101, odd, 1
101 / 2 = 50, even, 0
101 / 4 = 25, odd, 1
101 / 8 = 12, even, 0
101 / 16 = 6, even, 0
101 / 32 = 3, odd, 1
101 / 64 = 1, odd, 1
101 / 128 = 0, even, 0
101 / 256 = 0, even, 0
101 / 512 = 0, even, 0
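The steps above can be reproduced in a few lines of plain Python. This is a minimal sketch rather than the article's code: `encode_bits` and `decode_bits` are hypothetical helper names, and the list-based version is assumed to mirror what `binary_encode` does with NumPy.

```python
def encode_bits(i, num_digits):
    """Binary digits of i, least significant bit first (hypothetical helper)."""
    return [(i >> d) & 1 for d in range(num_digits)]


def decode_bits(bits):
    """Inverse of encode_bits: rebuild the integer from its bits."""
    return sum(bit << d for d, bit in enumerate(bits))


code = encode_bits(101, 10)
print(code)               # [1, 0, 1, 0, 0, 1, 1, 0, 0, 0]
print(decode_bits(code))  # 101

# Right shift by d matches integer division by 2**d for non-negative integers.
assert all((101 >> d) == 101 // 2 ** d for d in range(10))
# Every integer in 0..1023 gets its own 10-bit code.
assert len({tuple(encode_bits(i, 10)) for i in range(2 ** 10)}) == 2 ** 10
```

The round trip through `decode_bits` is one way to check that no information is lost as long as the number fits in `num_digits` bits.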
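Labels: four classes, 0, 1, 2, 3.
def fizz_buzz_encode(i):
    """Convert the expected output for i into a class label.

    :param i: integer to label
    :return: 3 for fizzbuzz, 2 for buzz, 1 for fizz, 0 otherwise
    """
    if i % 15 == 0:  # fizzbuzz
        return 3
    elif i % 5 == 0:  # buzz
        return 2
    elif i % 3 == 0:  # fizz
        return 1
    else:
        return 0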
Set the encoding length; the dataset covers 101 to 1023.
NUM_DIGITS = 10
trX = np.array([binary_encode(i, NUM_DIGITS) for i in range(101, 2**NUM_DIGITS)])  # 101~1023
trY = np.array([fizz_buzz_encode(i) for i in range(101, 2**NUM_DIGITS)])

# print(len(trX), len(trY))  # 923 923
# print(trX[:5])
"""
[[1 0 1 0 0 1 1 0 0 0]
 [0 1 1 0 0 1 1 0 0 0]
 [1 1 1 0 0 1 1 0 0 0]
 [0 0 0 1 0 1 1 0 0 0]
 [1 0 0 1 0 1 1 0 0 0]]
"""
# print(trY[:5])  # [0 1 0 0 3]
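To see how unbalanced the four classes are, the labels of the training range can be tallied directly. This is a small sketch in plain Python; `label_of` is a hypothetical stand-in for `fizz_buzz_encode` so that the block runs on its own.

```python
from collections import Counter


def label_of(i):
    """Class index for i: 3 fizzbuzz, 2 buzz, 1 fizz, 0 otherwise (hypothetical helper)."""
    if i % 15 == 0:
        return 3
    if i % 5 == 0:
        return 2
    if i % 3 == 0:
        return 1
    return 0


# Same range as the training set: 101 up to and including 1023.
train_counts = Counter(label_of(i) for i in range(101, 2 ** 10))
print(sorted(train_counts.items()))  # [(0, 493), (1, 246), (2, 122), (3, 62)]

# Share of the majority class, i.e. the accuracy of always answering class 0.
majority_share = 100 * train_counts[0] / sum(train_counts.values())
print(round(majority_share, 2))  # 53.41
```

Under these counts, always answering class 0 already scores about 53.41% on the training range. The training log later in the article prints exactly 53.41% at epochs 3 and 4, which suggests (but does not prove) that the network was predicting class 0 for everything at that point.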
2 Building the network
Build a simple MLP.
class FizzBuzzModel(nn.Module):
    def __init__(self, in_features, out_classes, hidden_size, n_hidden_layers):
        super(FizzBuzzModel, self).__init__()
        layers = []
        for i in range(n_hidden_layers):
            layers.append(nn.Linear(hidden_size, hidden_size))
            # layers.append(nn.Dropout(0.5))
            layers.append(nn.BatchNorm1d(hidden_size))
            layers.append(nn.ReLU())
        self.inputLayer = nn.Linear(in_features, hidden_size)
        self.relu = nn.ReLU()
        self.layers = nn.Sequential(*layers)  # stack the repeated hidden blocks
        self.outputLayer = nn.Linear(hidden_size, out_classes)

    def forward(self, x):
        x = self.inputLayer(x)
        x = self.relu(x)
        x = self.layers(x)
        out = self.outputLayer(x)
        return out
Instantiate the network and inspect its structure.
# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# define the model
simpleModel = FizzBuzzModel(NUM_DIGITS, 4, 150, 3).to(device)
print(simpleModel)
"""
FizzBuzzModel(
  (inputLayer): Linear(in_features=10, out_features=150, bias=True)
  (relu): ReLU()
  (layers): Sequential(
    (0): Linear(in_features=150, out_features=150, bias=True)
    (1): BatchNorm1d(150, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): Linear(in_features=150, out_features=150, bias=True)
    (4): BatchNorm1d(150, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU()
    (6): Linear(in_features=150, out_features=150, bias=True)
    (7): BatchNorm1d(150, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (8): ReLU()
  )
  (outputLayer): Linear(in_features=150, out_features=4, bias=True)
)
"""
Input size 10, output size 4, hidden size 150, and the hidden block is repeated 3 times.
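For a sense of scale, the number of learnable parameters implied by these sizes can be worked out by hand. The sketch below is plain Python arithmetic, not a PyTorch call; `linear_params` and `batchnorm_params` are hypothetical helpers, and the count assumes the layer layout printed above (bias on every linear layer, affine batch norm).

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Learnable scale and shift of a batch-norm layer with affine=True."""
    return 2 * n_features


in_features, out_classes, hidden_size, n_hidden_layers = 10, 4, 150, 3

total = (
    linear_params(in_features, hidden_size)  # input layer
    + n_hidden_layers
    * (linear_params(hidden_size, hidden_size) + batchnorm_params(hidden_size))
    + linear_params(hidden_size, out_classes)  # output layer
)
print(total)  # 71104
```

With 923 training samples, that is roughly 77 parameters per sample, so the network has ample capacity to memorize the training set.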
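3 Network configuration and training
Define the hyperparameters, loss function, and optimizer, load the data, train, and print the training accuracy and loss.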
# Loss and optimizer
learning_rate = 0.05
criterion = nn.CrossEntropyLoss()
# optimizer = torch.optim.Adam(simpleModel.parameters(), lr=learning_rate)
optimizer = torch.optim.SGD(simpleModel.parameters(), lr=learning_rate)

# Train in mini-batches
FizzBuzzDataset = Data.TensorDataset(torch.from_numpy(trX).float().to(device),
                                     torch.from_numpy(trY).long().to(device))
loader = Data.DataLoader(dataset=FizzBuzzDataset,
                         batch_size=128*5,
                         shuffle=True)

# Train
simpleModel.train()
epochs = 3000
for epoch in range(1, epochs):
    for step, (batch_x, batch_y) in enumerate(loader):
        out = simpleModel(batch_x)      # forward pass
        loss = criterion(out, batch_y)  # compute the loss
        optimizer.zero_grad()           # zero the gradients
        loss.backward()                 # backpropagation
        optimizer.step()                # SGD update
    # Accuracy on the most recent batch
    correct = 0
    total = 0
    _, predicted = torch.max(out.data, 1)
    total += batch_y.size(0)
    correct += (predicted == batch_y).sum().item()
    acc = 100*correct/total
    print('Epoch : {:0>4d} | Loss : {:<6.4f} | Train Accuracy : {:<6.2f}%'.format(epoch, loss, acc))
"""
Epoch : 0001 | Loss : 1.5343 | Train Accuracy : 14.63 %
Epoch : 0002 | Loss : 1.9779 | Train Accuracy : 42.58 %
Epoch : 0003 | Loss : 2.4198 | Train Accuracy : 53.41 %
Epoch : 0004 | Loss : 1.7360 | Train Accuracy : 53.41 %
Epoch : 0005 | Loss : 1.3161 | Train Accuracy : 49.73 %
Epoch : 0006 | Loss : 1.4866 | Train Accuracy : 22.75 %
Epoch : 0007 | Loss : 1.3993 | Train Accuracy : 25.57 %
Epoch : 0008 | Loss : 1.2428 | Train Accuracy : 28.49 %
Epoch : 0009 | Loss : 1.1906 | Train Accuracy : 44.31 %
Epoch : 0010 | Loss : 1.1929 | Train Accuracy : 52.44 %
...
Epoch : 2990 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2991 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2992 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2993 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2994 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2995 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2996 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2997 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2998 | Loss : 0.0000 | Train Accuracy : 100.00%
Epoch : 2999 | Loss : 0.0000 | Train Accuracy : 100.00%
"""
Accuracy on the training set is fine and reaches 100%. Next, let's look at accuracy on the test set.
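Before moving on, some context for the loss values in the training log: `nn.CrossEntropyLoss` takes raw scores (logits) plus an integer class index per sample, and by default averages $-\log \mathrm{softmax}(\text{logits})[\text{target}]$ over the batch. The sketch below computes the per-sample value in plain Python; `cross_entropy` is a hypothetical helper written for illustration, not a PyTorch function.

```python
import math


def cross_entropy(logits, target):
    """Per-sample cross-entropy: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Four equal scores: the model has no preference, so the loss is ln(4).
uniform = cross_entropy([0.0, 0.0, 0.0, 0.0], target=2)
print(round(uniform, 4))  # 1.3863

# A large margin for the correct class drives the loss towards zero.
confident = cross_entropy([10.0, 0.0, 0.0, 0.0], target=0)
print(confident < 1e-3)  # True
```

A four-class model that scores every class equally sits at about 1.386, which is in the neighborhood of the first few logged losses, while the 0.0000 printed at the end corresponds to very large margins on every training sample.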
4 Predicting results
Convert the predicted class index into fizz buzz form.
def fizz_buzz_decode(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]
Load the test set and run prediction.
simpleModel.eval()
# Run prediction
testX = np.array([binary_encode(i, NUM_DIGITS) for i in range(1, 101)])
predicts = simpleModel(torch.from_numpy(testX).float().to(device))
# Predicted classes
_, res = torch.max(predicts, 1)
print(res)
"""
tensor([0, 0, 0, 1, 0, 0, 0, 2, 1, 0, 1, 3, 3, 1, 1, 0, 0, 0, 0, 0, 0, 3, 1, 0,
        0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        1, 1, 1, 1, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 1, 0, 1, 1, 1, 0,
        0, 0, 0, 0], device='cuda:0')
"""
# Convert the format
predictions = [fizz_buzz_decode(i, prediction) for (i, prediction) in zip(range(1, 101), res)]
print(predictions)
"""
['1', '2', '3', 'fizz', '5', '6', '7', 'buzz', 'fizz', '10', 'fizz', 'fizzbuzz', 'fizzbuzz', 'fizz', 'fizz', '16', '17', '18', '19', '20', '21', 'fizzbuzz', 'fizz', '24', '25', '26', '27', '28', '29', '30', 'fizz', '32', '33', '34', '35', '36', '37', '38', '39', 'fizz', '41', 'fizz', '43', '44', '45', 'fizz', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', 'fizz', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', 'fizz', 'fizz', 'fizz', 'fizz', 'buzz', 'buzz', 'fizz', '80', '81', '82', '83', '84', '85', '86', '87', 'fizzbuzz', '89', '90', 'fizz', '92', 'fizz', 'fizz', 'fizz', '96', '97', '98', '99', '100']
"""
5 Where it went wrong
Compare against the ground-truth labels.
labels = []
for i in range(1, 101):
    if i % 15 == 0:  # fizzbuzz
        labels.append("fizzbuzz")
    elif i % 5 == 0:  # buzz
        labels.append("buzz")
    elif i % 3 == 0:  # fizz
        labels.append("fizz")
    else:
        labels.append(str(i))
print(labels)
print(labels == predictions)
"""
['1', '2', 'fizz', '4', 'buzz', 'fizz', '7', '8', 'fizz', 'buzz', '11', 'fizz', '13', '14', 'fizzbuzz', '16', '17', 'fizz', '19', 'buzz', 'fizz', '22', '23', 'fizz', 'buzz', '26', 'fizz', '28', '29', 'fizzbuzz', '31', '32', 'fizz', '34', 'buzz', 'fizz', '37', '38', 'fizz', 'buzz', '41', 'fizz', '43', '44', 'fizzbuzz', '46', '47', 'fizz', '49', 'buzz', 'fizz', '52', '53', 'fizz', 'buzz', '56', 'fizz', '58', '59', 'fizzbuzz', '61', '62', 'fizz', '64', 'buzz', 'fizz', '67', '68', 'fizz', 'buzz', '71', 'fizz', '73', '74', 'fizzbuzz', '76', '77', 'fizz', '79', 'buzz', 'fizz', '82', '83', 'fizz', 'buzz', '86', 'fizz', '88', '89', 'fizzbuzz', '91', '92', 'fizz', '94', 'buzz', 'fizz', '97', '98', 'fizz', 'buzz']
False
"""
Ha, False: it flopped. I tried many times, and getting True is hard.
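`labels == predictions` is all-or-nothing: a single wrong entry makes it False. A per-item comparison says more about how far off the model is. The sketch below is plain Python; `expected` and `mismatches` are hypothetical helper names, and the sample input is just the first 15 entries of the predictions list printed in section 4.

```python
def expected(i):
    """Ground-truth fizz buzz output for i."""
    if i % 15 == 0:
        return "fizzbuzz"
    if i % 5 == 0:
        return "buzz"
    if i % 3 == 0:
        return "fizz"
    return str(i)


def mismatches(predictions, start=1):
    """Return (number, predicted, expected) for every wrong prediction."""
    return [
        (i, p, expected(i))
        for i, p in enumerate(predictions, start)
        if p != expected(i)
    ]


# First 15 predictions copied from the output in section 4.
first_15 = ['1', '2', '3', 'fizz', '5', '6', '7', 'buzz', 'fizz', '10',
            'fizz', 'fizzbuzz', 'fizzbuzz', 'fizz', 'fizz']
wrong = mismatches(first_15)
correct = len(first_15) - len(wrong)
print(correct, "of", len(first_15), "correct")  # 4 of 15 correct
print(wrong[:3])  # [(3, '3', 'fizz'), (4, 'fizz', '4'), (5, '5', 'buzz')]
```

Passing the full `predictions` list to `mismatches` would give the test-set error count for a run, which is a more informative number to track across attempts than a single True or False.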