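Tianchi XGBoost
Link
Rewriting the bar chart code: I ignored complexity and aimed only for intuitive understanding
In the original tutorial, counting whether it rained at each location in order to draw the bar charts was far too cumbersome, so I rewrote it. The most troublesome part is the data processing. My approach is: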
- First, select all rows where it rains the next day:
    data[data['RainTomorrow'] == 'Yes']
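- Then group by location and count the rows in each group:
    groupby('Location').size()
  Without size() there is no useful output, because groupby alone only returns a lazy groupby object. The result without it:
    <pandas.core.groupby.generic.DataFrameGroupBy object at 0x000001FEF989EEE0>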
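- Finally, turn the counts into a proper column and give it a name:
    reset_index(name='Count')
  Without this call the result is a Series indexed by Location, with no name for the count column:
    Location
    Adelaide         513
    Albany           665
    Albury           454
    AliceSprings     169
    BadgerysCreek    425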
- Putting steps 1 to 3 together, the rainy and non-rainy data are:
    data_LocYes = data[data['RainTomorrow'] == 'Yes'].groupby('Location').size().reset_index(name='Count')
    data_LocNo = data[data['RainTomorrow'] == 'No'].groupby('Location').size().reset_index(name='Count')
  Output (first rows of data_LocYes):
            Location  Count
    0       Adelaide    513
    1         Albany    665
    2         Albury    454
    3   AliceSprings    169
    4  BadgerysCreek    425
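- Then visualize:
    import matplotlib.pyplot as plt
    import seaborn as sns

    plt.figure(figsize=(15, 15))

    # Left panel: number of rainy-tomorrow records per location
    plt.subplot(1, 2, 1)
    plt.title('RainTomorrow')
    sns.barplot(y=data_LocYes['Location'], x=data_LocYes['Count'], color="red")

    # Right panel: number of non-rainy-tomorrow records per location
    plt.subplot(1, 2, 2)
    plt.title('Not RainTomorrow')
    sns.barplot(y=data_LocNo['Location'], x=data_LocNo['Count'], color="blue")

    plt.show()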
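
To see what each step returns without loading the full dataset, the three steps can be tried on a tiny made-up DataFrame. This is only a rough sketch: the rows below are invented for illustration and are not from the real data, and the one-pass `unstack` variant at the end is an alternative I am assuming is useful here, not part of the original code.

```python
import pandas as pd

# Toy stand-in for the weather data (values are invented for illustration).
data = pd.DataFrame({
    "Location": ["Albury", "Albury", "Albury", "Adelaide", "Adelaide", "Albany"],
    "RainTomorrow": ["Yes", "No", "Yes", "Yes", "No", "No"],
})

# Step 1: keep only the rows where it rains the next day.
rainy = data[data["RainTomorrow"] == "Yes"]

# Step 2: group by location and count rows.
# The result is a Series indexed by Location.
rainy_counts = rainy.groupby("Location").size()

# Step 3: move the index into a column and name the count column.
data_LocYes = rainy_counts.reset_index(name="Count")
print(data_LocYes)

# One-pass variant: count both outcomes at once, one column per outcome.
# fill_value=0 keeps locations that never had a rainy (or dry) day.
counts = data.groupby(["Location", "RainTomorrow"]).size().unstack(fill_value=0)
print(counts)
```

One thing the toy data makes visible: a location with no rainy rows (Albany here) disappears from `data_LocYes` entirely, so the two bar charts may not list the same locations. The `unstack(fill_value=0)` variant keeps such locations with a count of 0.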