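This post is based on the book 《跟着迪哥学Python数据分析与机器学习实战》. I organized its content and added my own understanding to make these short notes. If anything here infringes copyright, contact me and I will take it down immediately.
As Dige (迪哥) says, there is no need to memorize most of the functions and methods below; look up the API and read the documentation as you go. He is right, so just follow along~~~
Matplotlib tutorial on 菜鸟教程 (Runoob)
Matplotlib official API reference
All code snippets below were run in Jupyter Notebook.
Go through this once a day, and see you at Tencent or Alibaba tomorrow~
1. Basic plotting methods
Import the packages; plt is the conventional alias for matplotlib.pyplot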
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import math
import random
%matplotlib inline
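To draw a simple line chart, just pair up the x and y values of the 2-D data points.
Here the x coordinates are [1,2,3,4,5] and the y coordinates are [1,4,9,16,25]; the x-axis and y-axis labels are set to 'xlabel' and 'ylabel'.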
plt.plot([1,2,3,4,5],[1,4,9,16,25])
plt.xlabel('xlabel',fontsize=16)
plt.ylabel('ylabel')
"""
Text(0, 0.5, 'ylabel')
"""
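Ⅰ. Detail settings
Character | Type |
---|---|
- | solid line |
-- | dashed line |
-. | dash-dot line |
: | dotted line |
. | point |
, | pixel |
o | circle |
^ | upward triangle |
v | downward triangle |
< | left triangle |
> | right triangle |
2 | tri-up (three-pronged, pointing up) |
1 | tri-down (three-pronged, pointing down) |
3 | tri-left (three-pronged, pointing left) |
4 | tri-right (three-pronged, pointing right) |
s | square |
p | pentagon |
h | hexagon 1 |
H | hexagon 2 |
* | star |
+ | plus sign |
x | x (cross) |
D | filled diamond |
d | filled thin diamond |
_ | horizontal line |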
Character | Color |
---|---|
b | blue |
g | green |
r | red |
c | cyan |
m | magenta |
y | yellow |
k | black |
w | white |
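The characters from the two tables can be combined into a single format string. Below is a minimal sketch of that idea; the string 'g^:' and the offset of 5 are my own illustrative choices, and the second call spells out what I understand the shorthand to mean using the standard `color`, `marker`, and `linestyle` keyword arguments.

```python
import numpy as np
import matplotlib.pyplot as plt

yy = np.arange(0, 10, 0.5)
fig, ax = plt.subplots()

# 'g^:' = green (g) + upward triangles (^) + dotted line (:)
(short_line,) = ax.plot(yy, yy, 'g^:')

# The same style written with explicit keyword arguments (shifted up by 5
# so that both lines are visible)
(long_line,) = ax.plot(yy, yy + 5, color='g', marker='^', linestyle=':')
```

The keyword form is longer, but it is easier to read later and it also accepts colors that have no one-letter code.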
fontsize sets the font size
plt.plot([1,2,3,4,5],[1,4,9,16,25],'-.')
plt.xlabel('xlabel',fontsize=16)
plt.ylabel('ylabel',fontsize=16)
"""
Text(0, 0.5, 'ylabel')
"""
plt.plot([1,2,3,4,5],[1,4,9,16,25],'-.',color='r')
"""
[<matplotlib.lines.Line2D at 0x23bf91a4be0>]
"""
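Calling plot() several times adds each result to the same figure.
Color and line style can also be written together in one string; for example, 'r--' means a red dashed line.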
yy = np.arange(0,10,0.5)
plt.plot(yy,yy,'r--')
plt.plot(yy,yy**2,'bs')
plt.plot(yy,yy**3,'go')
"""
[<matplotlib.lines.Line2D at 0x23bf944ffa0>]
"""
linewidth sets the line width
x = np.linspace(-10,10)
y = np.sin(x)
plt.plot(x,y,linewidth=3.0)
"""
[<matplotlib.lines.Line2D at 0x23bfb63f9a0>]
"""
plt.plot(x,y,color='b',linestyle=':',marker='o',markerfacecolor='r',markersize=10)
"""
[<matplotlib.lines.Line2D at 0x23bfb6baa00>]
"""
alpha sets the transparency
line = plt.plot(x,y)
plt.setp(line,color='r',linewidth=2.0,alpha=0.4)
"""
[None, None, None]
"""
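Ⅱ. Subplots and annotations
subplot(211) means the figure is laid out as 2 rows by 1 column, holding two subplots in total; the final 1 means the first subplot is the one being drawn now.
subplot(212) refers to the same layout, but draws the subplot in the second position.
The result is two subplots stacked vertically.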
plt.subplot(211)
plt.plot(x,y,color='r')
plt.subplot(212)
plt.plot(x,y,color='b')
"""
[<matplotlib.lines.Line2D at 0x23bfc84acd0>]
"""
To place them side by side, use 1 row and 2 columns.
plt.subplot(121)
plt.plot(x,y,color='r')
plt.subplot(122)
plt.plot(x,y,color='b')
"""
[<matplotlib.lines.Line2D at 0x23bfc8fc1c0>]
"""
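You are not limited to a single row or column; grids with several rows and columns work too. Here subplot(321) and subplot(324) pick positions 1 and 4 of a 3-row by 2-column grid, counted row by row.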
plt.subplot(321)
plt.plot(x,y,color='r')
plt.subplot(324)
plt.plot(x,y,color='b')
"""
[<matplotlib.lines.Line2D at 0x23bfca43ee0>]
"""
Add some annotations to the figure
plt.plot(x,y,color='b',linestyle=':',marker='o',markerfacecolor='r',markersize=10)
plt.xlabel('x:---')
plt.ylabel('y:---')
plt.title('beyondyanyu:---')  # figure title
plt.text(0,0,'beyondyanyu')  # add text at the given position
plt.grid(True)  # show the grid
# Add an arrow: give the point it points to (xy), the text position where it starts (xytext), and the arrow properties
plt.annotate('beyondyanyu',xy=(-5,0),xytext=(-2,0.3),arrowprops=dict(facecolor='red',shrink=0.05,headlength=20,headwidth=20))
"""
Text(-2, 0.3, 'beyondyanyu')
"""
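Sometimes, for a cleaner look, you may want to hide the axes (tick marks and tick labels). plt.gca() returns the current Axes object, whose properties can then be changed.
x = range(10)
y = range(10)
ax = plt.gca()  # the current Axes
plt.plot(x,y)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)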
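Hiding the two axis objects removes the ticks and labels but leaves the frame around the plot. As a rough comparison, the sketch below puts that approach next to `set_axis_off()`, which also removes the frame; the names `ax_ticks` and `ax_all` are only illustrative.

```python
import matplotlib.pyplot as plt

fig, (ax_ticks, ax_all) = plt.subplots(ncols=2)

# Left: hide only the x and y axis objects (ticks and tick labels);
# the surrounding frame stays visible
ax_ticks.plot(range(10), range(10))
ax_ticks.get_xaxis().set_visible(False)
ax_ticks.get_yaxis().set_visible(False)

# Right: switch the whole axes decoration off, frame included
ax_all.plot(range(10), range(10))
ax_all.set_axis_off()
```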
Create some random data
x = np.random.normal(loc=0.0,scale=1.0,size=300)
width = 0.5
bins = np.arange(math.floor(x.min())-width, math.ceil(x.max())+width, width)
ax = plt.subplot(111)
ax.spines['top'].set_visible(False)  # remove the top axis line
ax.spines['right'].set_visible(False)  # remove the right axis line
plt.tick_params(bottom=False,top=False,left=False,right=False)  # optionally hide the tick marks on the axes
plt.grid()  # add a grid
plt.hist(x,alpha=0.5,bins=bins)  # draw the histogram
"""
(array([ 0., 0., 1., 3., 2., 16., 29., 50., 50., 61., 48., 21., 10.,6., 3.]),array([-4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5, 0. , 0.5,1. , 1.5, 2. , 2.5, 3. ]),<BarContainer object of 15 artists>)
"""
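When the labels on the x-axis are long, writing them horizontally makes them pile up on each other; in that case they can be written at an angle.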
x = range(10)
y = range(10)
labels = ['beyondyanyu' for i in range(10)]
fig,ax = plt.subplots()
plt.plot(x,y)
plt.title('beyondyanyu')
ax.set_xticklabels(labels,rotation=45,horizontalalignment='right')
"""
[Text(-2.0, 0, 'beyondyanyu'),Text(0.0, 0, 'beyondyanyu'),Text(2.0, 0, 'beyondyanyu'),Text(4.0, 0, 'beyondyanyu'),Text(6.0, 0, 'beyondyanyu'),Text(8.0, 0, 'beyondyanyu'),Text(10.0, 0, 'beyondyanyu')]
"""
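When plotting several lines or several categories of data, use legend() to show which color belongs to which category.
loc='best' lets Matplotlib pick a suitable position for the legend by itself.
x = np.arange(10)
for i in range(1,4):
    plt.plot(x,i*x**2,label='Group %d'%i)
plt.legend(loc='best')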
"""
<matplotlib.legend.Legend at 0x23b811ee3d0>
"""
The help function prints all adjustable parameters of a function
help(plt.legend)
"""
Help on function legend in module matplotlib.pyplot:

legend(*args, **kwargs)
    Place a legend on the axes.

    Call signatures::

        legend()
        legend(labels)
        legend(handles, labels)

    ...

    Other Parameters
    ----------------
    loc : str or pair of floats, default: :rc:`legend.loc` ('best' for axes, 'upper right' for figures)
        The location of the legend.

        ...

        =============== =============
        Location String Location Code
        =============== =============
        'best'          0
        'upper right'   1
        'upper left'    2
        'lower left'    3
        'lower right'   4
        'right'         5
        'center left'   6
        'center right'  7
        'lower center'  8
        'upper center'  9
        'center'        10
        =============== =============

    bbox_to_anchor : `.BboxBase`, 2-tuple, or 4-tuple of floats
        Box that is used to position the legend in conjunction with *loc*.
        ...

    ncol : int, default: 1
        The number of columns that the legend has.

    ... (output truncated; run help(plt.legend) to see the full parameter list)
"""
A custom position can also be specified through loc, here combined with bbox_to_anchor and ncol
fig = plt.figure()
ax = plt.subplot(111)
x = np.arange(10)
for i in range(1,4):
    plt.plot(x,i*x**2,label='Group %d'%i)
ax.legend(loc='upper center',bbox_to_anchor=(0.5,1.15),ncol=3)
"""
<matplotlib.legend.Legend at 0x23b8119db50>
"""
Ⅲ. Style settings
List the styles that Matplotlib makes available
plt.style.available
"""
['Solarize_Light2','_classic_test_patch','bmh','classic','dark_background','fast','fivethirtyeight','ggplot','grayscale','seaborn','seaborn-bright','seaborn-colorblind','seaborn-dark','seaborn-dark-palette','seaborn-darkgrid','seaborn-deep','seaborn-muted','seaborn-notebook','seaborn-paper','seaborn-pastel','seaborn-poster','seaborn-talk','seaborn-ticks','seaborn-white','seaborn-whitegrid','tableau-colorblind10']
"""
A plot with the default style
x = np.linspace(-10,10)
y = np.sin(x)
plt.plot(x,y)
"""
[<matplotlib.lines.Line2D at 0x23bfce29b80>]
"""
plt.style.use() changes the current style
plt.style.use('dark_background')
plt.plot(x,y)
"""
[<matplotlib.lines.Line2D at 0x23bfcf07fd0>]
"""
plt.style.use('bmh')
plt.plot(x,y)
"""
[<matplotlib.lines.Line2D at 0x23bfceca550>]
"""
plt.style.use('ggplot')
plt.plot(x,y)
"""
[<matplotlib.lines.Line2D at 0x23bfcfc5f10>]
"""
2. Common chart types
Ⅰ. Bar charts
np.random.seed(0)
x = np.arange(5)
y = np.random.randint(-5,5,5)  # create some random data
fig,axes = plt.subplots(ncols=2)
v_bars = axes[0].bar(x,y,color='red')  # vertical bar chart
h_bars = axes[1].barh(x,y,color='red')  # horizontal bar chart
# Set the details of each subplot through its index
axes[0].axhline(0,color='gray',linewidth=2)
axes[1].axvline(0,color='gray',linewidth=2)
plt.show()
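Plots sometimes need error bars to show how much the data or experimental results deviate. This is easy to do: bar() already has yerr and xerr parameters, so just pass the values in:
mean_values = [1,2,3,4,5]  # values
variance = [0.2,0.4,0.6,0.8,1.0]  # error bar sizes
bar_label = ['bar1','bar2','bar3','bar4','bar5']  # names
x_pos = list(range(len(bar_label)))  # bar positions
plt.bar(x_pos,mean_values,yerr=variance,alpha=0.3)  # bar chart with error bars
max_y = max(zip(mean_values,variance))  # the (value, error) pair with the largest value
plt.ylim([0,(max_y[0]+max_y[1])*1.2])  # the axis ranges can be set manually
plt.ylabel('variable y')  # y-axis label
plt.xticks(x_pos,bar_label)  # x-axis tick labels
plt.show()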
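In practice the error values usually come from repeated measurements rather than a hand-written list. The sketch below shows one way to derive them, assuming made-up data (three groups of 50 normally distributed samples) and using the sample standard deviation as the error bar; `capsize` is a standard bar() keyword that adds small caps to the error bars.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical data: 3 groups, 50 repeated measurements each (one column per group)
samples = rng.normal(loc=[1.0, 2.0, 3.0], scale=[0.2, 0.4, 0.6], size=(50, 3))

means = samples.mean(axis=0)
stds = samples.std(axis=0, ddof=1)  # sample standard deviation per group

x_pos = np.arange(len(means))
fig, ax = plt.subplots()
bars = ax.bar(x_pos, means, yerr=stds, alpha=0.3, capsize=4)
ax.set_xticks(x_pos)
ax.set_xticklabels(['bar1', 'bar2', 'bar3'])
ax.set_ylabel('variable y')
```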
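More comparison details can be added. Draw the bar chart first; the details can be added step by step:
data = range(200,225,5)  # data
bar_labels = ['a','b','c','d','e']  # names of the categories to compare
fig = plt.figure(figsize=(10,8))  # size of the drawing area
y_pos = np.arange(len(data))  # the bars will be horizontal, so find each bar's position on the y-axis
plt.yticks(y_pos,bar_labels,fontsize=16)  # write the category names on the y-axis
bars = plt.barh(y_pos,data,alpha=0.5,color='g')  # draw the bars with a color and transparency
plt.vlines(min(data),-1,len(data)+0.5,linestyles='dashed')  # vertical line; needs at least 3 arguments: x position, ymin, ymax
for b,d in zip(bars,data):  # write a note next to each bar; the value here is an arbitrary ratio
    plt.text(b.get_width()+b.get_width()*0.05,b.get_y()+b.get_height()/2,'{0:.2%}'.format(d/min(data)))
plt.show()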
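The manual plt.text() loop can often be replaced by `Axes.bar_label`. This is a sketch under the assumption that Matplotlib 3.4 or later is installed (the method does not exist in older versions); it reuses the same data and the same ratio text, and the padding of 3 points is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

data = list(range(200, 225, 5))
bar_labels = ['a', 'b', 'c', 'd', 'e']
y_pos = np.arange(len(data))

fig, ax = plt.subplots()
bars = ax.barh(y_pos, data, alpha=0.5, color='g')
ax.set_yticks(y_pos)
ax.set_yticklabels(bar_labels)

# Same ratio text as in the manual loop: each value relative to the smallest one
ratios = ['{0:.2%}'.format(d / min(data)) for d in data]
texts = ax.bar_label(bars, labels=ratios, padding=3)  # one label at the end of each bar
```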
Bar charts can be made more distinctive by giving each bar a different fill pattern
patterns = ('-','+','x','\\','*','o','O','.')  # hatch patterns, one per bar
mean_value = range(1,len(patterns)+1)  # increasing heights look nicer
x_pos = list(range(len(mean_value)))  # vertical bars need a position each
bars = plt.bar(x_pos,mean_value,color='white')  # draw the bars
for bar,pattern in zip(bars,patterns):  # set the fill pattern of each bar
    bar.set_hatch(pattern)
plt.show()
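Ⅱ. Box plots
A box plot (boxplot) is built from five parts: the minimum (min), the lower quartile (Q1), the median, the upper quartile (Q3), and the maximum (max).
In each box these five parts appear from bottom to top. They are computed as follows:
IQR = Q3 - Q1, the difference between the upper and lower quartiles
min = Q1 - 1.5 × IQR, the lower limit of the normal range
max = Q3 + 1.5 × IQR, the upper limit of the normal range
The squares in the plot mark anomalies or outliers, that is, data points beyond the upper or lower limit.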
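The formulas above can be checked by hand with NumPy. This is a minimal sketch on made-up normally distributed data; the variable names are my own, and `np.percentile` uses linear interpolation by default, so the quartiles may differ slightly from other conventions.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(0, 1, 100)  # hypothetical sample

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                   # IQR = Q3 - Q1
lower_bound = q1 - 1.5 * iqr    # lower limit of the normal range
upper_bound = q3 + 1.5 * iqr    # upper limit of the normal range

# Points outside the normal range are the ones drawn as outliers
outliers = data[(data < lower_bound) | (data > upper_bound)]

print(f"Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
print(f"normal range: [{lower_bound:.2f}, {upper_bound:.2f}], outliers: {len(outliers)}")
```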
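boxplot() does the main plotting work
sym sets the marker used for outliers; it can be a square, a plus sign, and so on
vert controls whether the boxes are drawn vertically; like bar charts, box plots can also be drawn horizontally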
yy_data = [np.random.normal(0,std,100) for std in range(1,4)]
fig = plt.figure(figsize=(8,6))
plt.boxplot(yy_data,sym='s',vert=True)
plt.xticks([y+1 for y in range(len(yy_data))],['x1','x2','x3'])
plt.xlabel('x')
plt.title('box plot')
"""
Text(0.5, 1.0, 'box plot')
"""
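boxplot() has many more parameters; the most direct way to see them all is the help documentation. The main ones are:
Parameter | Purpose |
---|---|
x | the data to draw the box plot from |
notch | whether to draw notched boxes; not notched by default |
sym | marker for outliers; a plus sign by default |
vert | whether to draw the boxes vertically; vertical by default |
positions | positions of the boxes; [1,2,…,N] by default |
widths | width of the boxes; 0.5 by default |
patch_artist | whether to fill the box with color |
meanline | whether to show the mean as a line; shown as a point by default |
showmeans | whether to show the mean; hidden by default |
showcaps | whether to show the two caps at the ends of the whiskers; shown by default |
showbox | whether to show the box itself; shown by default |
showfliers | whether to show outliers; shown by default |
boxprops | properties of the box, such as edge color and fill color |
labels | labels for the boxes, similar in purpose to a legend |
flierprops | properties of the outliers, such as marker shape, size, and fill color |
medianprops | properties of the median, such as line style and width |
meanprops | properties of the mean, such as point size and color |
capprops | properties of the caps at the ends of the whiskers, such as color and width |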
help(plt.boxplot)
"""
Help on function boxplot in module matplotlib.pyplot:

boxplot(x, notch=None, sym=None, vert=None, whis=None, positions=None, widths=None, patch_artist=None, bootstrap=None, usermedians=None, conf_intervals=None, meanline=None, showmeans=None, showcaps=None, showbox=None, showfliers=None, boxprops=None, labels=None, flierprops=None, medianprops=None, meanprops=None, capprops=None, whiskerprops=None, manage_ticks=True, autorange=False, zorder=None, *, data=None)
    Make a box and whisker plot.

    Make a box and whisker plot for each column of *x* or each
    vector in sequence *x*.  The box extends from the lower to
    upper quartile values of the data, with a line at the median.
    The whiskers extend from the box to show the range of the
    data.  Flier points are those past the end of the whiskers.

    Parameters
    ----------
    x : Array or a sequence of vectors.
        The input data.

    ...

    sym : str, optional
        The default symbol for flier points.  An empty string ('') hides
        the fliers.  If `None`, then the fliers default to 'b+'.  More
        control is provided by the *flierprops* parameter.

    vert : bool, default: True
        If `True`, draws vertical boxes.
        If `False`, draw horizontal boxes.

    whis : float or (float, float), default: 1.5
        The position of the whiskers.

        If a float, the lower whisker is at the lowest datum above
        ``Q1 - whis*(Q3-Q1)``, and the upper whisker at the highest datum
        below ``Q3 + whis*(Q3-Q1)``, where Q1 and Q3 are the first and
        third quartiles.
        ...

    positions : array-like, optional
        Sets the positions of the boxes. The ticks and limits are
        automatically set to match the positions. Defaults to
        ``range(1, N+1)`` where N is the number of boxes to be drawn.

    ... (output truncated; run help(plt.boxplot) to see the full parameter list)
"""
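Another chart that looks somewhat like a box plot is the violin plot (violinplot).
A violin plot gives a sense of "fat" and "thin": the fatter it is at a position, the denser the data points there; the thinner it is, the sparser the data.
A violin plot does not show outliers; it spans from the minimum to the maximum of the data.
fig,axes = plt.subplots(nrows=1,ncols=2,figsize=(12,5))
yy_data = [np.random.normal(0,std,100) for std in range(6,10)]
axes[0].violinplot(yy_data,showmeans=False,showmedians=True)  # violin plot on the left
axes[0].set_title('violin plot')  # set the title
axes[1].boxplot(yy_data)  # box plot on the right
axes[1].set_title('box plot')  # set the title
for ax in axes:  # draw grid lines to make the comparison clearer
    ax.yaxis.grid(True)
    ax.set_xticks([y+1 for y in range(len(yy_data))])  # tick positions on the x-axis
    ax.set_xticklabels(['x1','x2','x3','x4'])  # tick names on the x-axis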
"""
[Text(1, 0, 'x1'), Text(2, 0, 'x2'), Text(3, 0, 'x3'), Text(4, 0, 'x4')]
"""
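Ⅲ. Histograms and scatter plots
A histogram shows the distribution of the data more clearly.
To draw a histogram you need to specify bins, that is, the intervals into which the data is divided.
For example: np.arange(-10,10,5) = array([-10,-5,0,5])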
data = np.random.normal(0,20,1000)
bins = np.arange(-100,100,5)
plt.hist(data,bins=bins)
plt.xlim([min(data)-5,max(data)+5])
plt.show()
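To see what the bin edges actually do, it helps to count a few values by hand. The sketch below uses `np.histogram`, which applies the same binning rule as plt.hist(); the seven data values are made up for illustration.

```python
import numpy as np

bins = np.arange(-10, 10, 5)  # the stop value 10 is excluded
print(bins)    # [-10  -5   0   5]

# Hypothetical values: 2 in [-10,-5), 1 in [-5,0), 4 in [0,5]
data = np.array([-9, -6, -4, 1, 2, 3, 4.5])
counts, edges = np.histogram(data, bins=bins)
print(counts)  # [2 1 4]
```

Four edges give only three bins, and because arange leaves out the stop value, anything above 5 would fall outside the histogram here. When the last interval matters, extend the stop value by one step.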
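To show the distributions of several categories at once, draw each one separately and make them more transparent; otherwise they cover each other.
data1 = [random.gauss(15,10) for i in range(500)]  # construct some random data
data2 = [random.gauss(5,5) for i in range(500)]  # two categories to compare
bins = np.arange(-50,50,2.5)  # bin edges
plt.hist(data1,bins=bins,label='class 1',alpha=0.3)  # draw each separately and keep alpha small so both stay visible
plt.hist(data2,bins=bins,label='class 2',alpha=0.3)
plt.legend(loc='best')  # different colors stand for different categories
plt.show()
A scatter plot is commonly used to show the correlation between features; just call scatter()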
N = 1000
x = np.random.randn(N)
y = np.random.randn(N)
plt.scatter(x,y,alpha=0.3)
plt.grid(True)
plt.show()
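The two random arrays above are independent, so their scatter plot shows no trend. For contrast, the sketch below builds data that is correlated on purpose and puts the Pearson coefficient from `np.corrcoef` in the title; the weights 0.8 and 0.6 are my own choice, picked so that the expected correlation is about 0.8.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
N = 1000
x = rng.standard_normal(N)
noise = rng.standard_normal(N)
y_correlated = 0.8 * x + 0.6 * noise  # hypothetical feature that depends on x

r = np.corrcoef(x, y_correlated)[0, 1]  # Pearson correlation coefficient

fig, ax = plt.subplots()
ax.scatter(x, y_correlated, alpha=0.3)
ax.set_title('r = %.2f' % r)
ax.grid(True)
```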
Ⅳ. 3D plots
Displaying three-dimensional data requires a 3D plot
fig = plt.figure()
ax = fig.add_subplot(111,projection='3d')  # draw an empty 3D plot
plt.show()
Fill the empty 3D plot with data
To look at the result from a different angle, add a call to ax.view_init() at the end and set the rotation angles in it
np.random.seed(1)  # fix the random seed so the results are reproducible
def randrange(n,vmin,vmax):  # helper that creates random data
    return (vmax-vmin)*np.random.rand(n)+vmin
fig = plt.figure()
ax = fig.add_subplot(111,projection='3d')  # draw a 3D plot
n = 100
for c,m,zlow,zhigh in [('r','o',-50,-25),('b','x',-30,-5)]:  # color, marker, and z range of each group
    xs = randrange(n,23,32)
    ys = randrange(n,0,100)
    zs = randrange(n,zlow,zhigh)
    ax.scatter(xs,ys,zs,c=c,marker=m)  # data for all three axes must be passed in
plt.show()
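As a small illustration of the view_init() call mentioned above, the sketch below sets the camera explicitly on some made-up points; the angles 40 and 0 are arbitrary, `elev` is the elevation above the x-y plane, and `azim` is the rotation about the z-axis, both in degrees.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, only needed on older Matplotlib versions

rng = np.random.default_rng(1)
xs, ys, zs = rng.random((3, 100))  # hypothetical points in the unit cube

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(xs, ys, zs, c='r', marker='o')

# Look at the scene from 40 degrees above the x-y plane, with no rotation about z
ax.view_init(elev=40, azim=0)
```

Calling view_init() again with other angles and redrawing is a simple way to compare viewpoints.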
Other chart types are drawn in 3D the same way; just call the corresponding plotting function
fig = plt.figure()
ax = fig.add_subplot(111,projection='3d')
for c,z in zip(['r','g','b','y'],[30,20,10,0]):
    xs = np.arange(20)
    ys = np.random.rand(20)
    cs = [c]*len(xs)
    ax.bar(xs,ys,zs=z,zdir='y',color=cs,alpha=0.5)
plt.show()
Ⅴ. Layout settings
ax1 = plt.subplot2grid((3,3),(0,0))  # 3×3 layout, the first subplot
ax2 = plt.subplot2grid((3,3),(1,0))  # the layout is 3×3 every time, only the position differs
ax3 = plt.subplot2grid((3,3),(0,2),rowspan=3)  # spans 3 rows
ax4 = plt.subplot2grid((3,3),(2,0),colspan=2)  # spans 2 columns
ax5 = plt.subplot2grid((3,3),(0,1),rowspan=2)  # spans 2 rows
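The same layout can also be written with a GridSpec, where slices take the place of rowspan and colspan. This is just an alternative sketch of the five subplots above, using `Figure.add_gridspec`; it is not taken from the book.

```python
import matplotlib.pyplot as plt

fig = plt.figure()
gs = fig.add_gridspec(3, 3)        # a 3x3 grid

ax1 = fig.add_subplot(gs[0, 0])    # row 0, column 0
ax2 = fig.add_subplot(gs[1, 0])    # row 1, column 0
ax3 = fig.add_subplot(gs[:, 2])    # all three rows of column 2
ax4 = fig.add_subplot(gs[2, 0:2])  # row 2, columns 0 and 1
ax5 = fig.add_subplot(gs[0:2, 1])  # rows 0 and 1 of column 1
```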
Subplots can have different sizes, and a subplot can also be nested inside another plot
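The parameters that position the nested plot are given as fractions of the Figure size and mean the following:
left: distance from the left edge of the plot area to the left edge of the Figure canvas
bottom: distance from the bottom edge of the plot area to the bottom edge of the Figure canvas
width: width of the plot area
height: height of the plot area
x = np.linspace(0,10,1000)  # create some data
y2 = np.sin(x**2)  # two plots are drawn, so two sets of data are needed
y1 = x**2
fig,ax1 = plt.subplots()
left,bottom,width,height = [0.22,0.42,0.3,0.35]  # position of the nested plot
ax2 = fig.add_axes([left,bottom,width,height])
ax1.plot(x,y1)
ax2.plot(x,y2)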
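To make the four numbers concrete, the sketch below converts them into inches for an assumed figure of 6 × 4 inches (the figsize is my own choice for the arithmetic) and reads the position back from the new axes.

```python
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(figsize=(6, 4))  # a figure of 6 x 4 inches
left, bottom, width, height = [0.22, 0.42, 0.3, 0.35]
ax2 = fig.add_axes([left, bottom, width, height])  # the nested axes

fig_w, fig_h = fig.get_size_inches()
inset_w_in = width * fig_w   # 0.3 * 6 = 1.8 inches wide
inset_h_in = height * fig_h  # 0.35 * 4 = 1.4 inches tall

# The stored position comes back as (left, bottom, width, height) in figure fractions
bounds = ax2.get_position().bounds
```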
"""
[<matplotlib.lines.Line2D at 0x23b8297c940>]
"""