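I needed to normalize some data today, so this post records how I did it with min-max normalization, which rescales each column to the range [0, 1] via (x - min) / (max - min).
The code is as follows: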
import numpy as np
import pandas as pd

# Collects the per-column max/min of every dataset, side by side
data_zong1 = pd.DataFrame()
data_original = [data_1, data_2, data_3, data_4, data_5]
for a in range(len(data_original)):
    # Work on a float copy so the source data is untouched and
    # integer columns can hold the fractional results
    data_11 = data_original[a].copy().astype(float)
    max_data_zong = []
    min_data_zong = []
    for i in range(data_11.shape[1]):
        max_data = np.max(data_11.iloc[:, i])
        max_data_zong.append(max_data)
        min_data = np.min(data_11.iloc[:, i])
        min_data_zong.append(min_data)
        # Note: a constant column (max == min) makes the denominator zero
        for j in range(data_11.shape[0]):
            data_11.iloc[j, i] = (data_11.iloc[j, i] - min_data) / (max_data - min_data)
    ### Anonymize the column names (assumes 15 features plus the target y)
    data_11.columns = ['p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9',
                       'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'y']
    data_11.to_csv('./normal_data' + str(a + 1) + '.csv', index=False)
    max_data_zong1 = pd.DataFrame(max_data_zong, columns=['pici' + str(a + 1) + '_max'])
    min_data_zong1 = pd.DataFrame(min_data_zong, columns=['pici' + str(a + 1) + '_min'])
    data_zong1 = pd.concat([data_zong1, max_data_zong1, min_data_zong1], axis=1)
# Written once, after all datasets have been processed
data_zong1.to_csv('./max-min.csv', index=False)
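After normalization, the maximum and minimum of every column are saved to ./max-min.csv, with one pair of columns (pici1_max, pici1_min, and so on) per dataset.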
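Keeping the per-column minimum and maximum is what makes it possible to map normalized values back to the original scale later. The following is a minimal sketch of that inverse step, not part of the original code. The function name restore_from_min_max and the toy values are hypothetical, and it assumes the saved minimum and maximum have been loaded into Series indexed by column name.

```python
import pandas as pd

def restore_from_min_max(normalized, col_min, col_max):
    """Invert min-max scaling column by column: x = x_norm * (max - min) + min."""
    return normalized * (col_max - col_min) + col_min

# Hypothetical toy values standing in for a normalized file and its saved min/max
normalized = pd.DataFrame({"p1": [0.0, 0.5, 1.0], "y": [0.0, 0.25, 1.0]})
col_min = pd.Series({"p1": 1.0, "y": 10.0})
col_max = pd.Series({"p1": 3.0, "y": 50.0})

restored = restore_from_min_max(normalized, col_min, col_max)
print(restored)
```

Note that max-min.csv as written above stores the values by row position in the original column order, so the index would need to be set to the column names before using them this way.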
Note: data_1, data_2, data_3, data_4, data_5 stand for multiple datasets.
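The cell-by-cell loop in the code above can be slow on large datasets. As a rough alternative, the same min-max scaling can be expressed with pandas column-wise operations. This is only a sketch under a few assumptions: the function name min_max_normalize and the toy DataFrame are hypothetical, all columns are numeric, and a constant column is mapped to 0 rather than NaN, which differs from the loop version.

```python
import pandas as pd

def min_max_normalize(df):
    """Scale every column of df to [0, 1]; return (scaled, col_min, col_max)."""
    col_min = df.min()
    col_max = df.max()
    # Assumption: for a constant column use a span of 1, so it becomes 0 instead of NaN
    span = (col_max - col_min).replace(0, 1)
    scaled = (df - col_min) / span
    return scaled, col_min, col_max

# Hypothetical toy data standing in for one of data_1 ... data_5
demo = pd.DataFrame({"p1": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 40.0]})
scaled, col_min, col_max = min_max_normalize(demo)
print(scaled)
```

The returned col_min and col_max can be collected per dataset and written out in the same way as the max/min lists in the loop version.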
Reference:
https://blog.csdn.net/weixin_42499236/article/details/83998592