标准化（Normalization）和归一化实现

概念：

原因：
由于进行分类器或模型的建立与训练时，输入的数据范围可能比较大，同时样本中各数据可能量纲不一致，这样的数据容易对模型训练或分类器的构建结果产生影响，因此需要对其进行标准化处理，去除数据的单位限制，将其转化为无量纲的纯数值，便于不同单位或量级的指标能够进行比较和加权。

其中最典型的就是数据的归一化处理，即将数据统一映射到[0,1]区间上。

z-score标准化（零均值归一化zero-mean normalization）：
• 经过处理后的数据均值为0，标准差为1（正态分布）
• 其中μ是样本的均值， σ是样本的标准差

代码实现

import numpy as np
import matplotlib.pyplot as plt
#归一化的两种方式
def Normalization1(x):'''归一化（0~1）''''''x_=(x−x_min)/(x_max−x_min)'''return [(float(i)-min(x))/float(max(x)-min(x)) for i in x]
def Normalization2(x):'''归一化（-1~1）''''''x_=(x−x_mean)/(x_max−x_min)'''return [(float(i)-np.mean(x))/(max(x)-min(x)) for i in x]
#标准化
def z_score(x):'''x∗=(x−μ)/σ'''x_mean=np.mean(x)s2=sum([(i-np.mean(x))*(i-np.mean(x)) for i in x])/len(x)return [(i-x_mean)/s2 for i in x]l=[-10, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 30]
l1=[]
# for i in l:
#     i+=2
#     l1.append(i)
# print(l1)
cs=[]
for i in l:c=l.count(i)cs.append(c)
print(cs)
n=Normalization2(l)
z=z_score(l)
print(n)
print(z)
'''
蓝线为原始数据，橙线为z
'''
plt.plot(l,cs)
plt.plot(z,cs)
plt.show()