5+ Simple One-Liners You’ve Been Looking for to Level Up Your Python Visualization

Insightful and aesthetic visualizations don’t have to be a pain to create. This article presents 5+ simple one-liners you can add to your code to increase its style and informational value.

Line plot into area chart

Consider the following standard line plot, created with seaborn’s lineplot using the husl palette and whitegrid style. The data is a sine wave with normally distributed noise added, shifted upward so that it sits above the x-axis.

Image for post

With a few styling choices, the plot looks presentable. However, there is one issue: by default, the y-axis of a Seaborn line plot does not start at zero, so the magnitude of the y-values is lost. Assuming the variables are named x and y, adding plt.fill_between(x, y, alpha=0.4) turns the plot into an area chart that starts at the zero baseline and emphasizes the y-axis.

Image for post

Note that this line is added alongside the original line plot, sns.lineplot(x=x, y=y), which provides the bold line at the top. The alpha parameter, which also appears in many seaborn plotting functions, controls the opacity of the area (the lower the value, the more transparent the fill). plt refers to matplotlib’s pyplot module. In some cases, an area chart may not be suitable.

When multiple area plots are drawn together, the shading can emphasize where the lines overlap and intersect, although, again, this may not be appropriate for every visualization context.

Image for post

Line plot to stacked area plot

Sometimes the relationship between lines calls for the area plots to be stacked on top of each other. This is easy to do with matplotlib’s stackplot: plt.stackplot(x, y, alpha=0.4). In this case, colors were specified manually through the colors parameter, which takes a list of color names or hex codes.

Image for post

Note that y is a list containing y1 and y2, which represent the noisy sine and cosine waves. These are stacked on top of each other in the area representation, which can make the relative distance between the two series easier to read.
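The stacked version might look like the following sketch. The data generation, the seed, and the two color names are assumptions chosen for illustration; only the stackplot call itself comes from the text above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Assumed example data: noisy sine and cosine waves, both lifted above zero.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y1 = np.sin(x) + rng.normal(scale=0.1, size=x.size) + 2
y2 = np.cos(x) + rng.normal(scale=0.1, size=x.size) + 2
y = [y1, y2]  # one entry per layer of the stack

fig, ax = plt.subplots(figsize=(8, 4))
# Each layer is drawn on top of the previous one, so the upper edge is y1 + y2.
layers = ax.stackplot(x, y, alpha=0.4, colors=["tab:blue", "tab:orange"])
```

Because the layers are cumulative, the top boundary shows the sum of the series rather than y2 itself, which is worth keeping in mind when reading the chart.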

Remove pesky legends

Seaborn adds a legend by default when the hue parameter is used to draw several versions of the same plot, one for each value in the hue column. These legends, while sometimes helpful, often cover up important parts of the plot and contain information that could be better expressed elsewhere (perhaps in a caption).

For example, consider the following medical dataset, which contains signals from various subjects. In this case, we want to use multiple line plots to visualize the general trend and range across different patients by setting the subject column as the hue (yes, drawing this many lines is known as a ‘spaghetti chart’ and is generally not advised). One can see how the default legend is a) not ordered, b) so long that it obstructs part of the chart, and c) not the point of the visualization.

Image for post

The legend can be removed by assigning the plot to a variable (commonly g), like so: g = sns.lineplot(x=…, y=…, hue=…). Then, by accessing the plot object’s legend attribute, we can remove it: g.legend_.remove(). If you are working with a grid object like PairGrid or FacetGrid, use g._legend.remove().

Image for post
Much better.

Manual x and y axis baselines

Seaborn does not draw the x and y axis lines by default, but the axes are important for understanding not only the shape of the data but also where it stands in relation to the coordinate system.

Matplotlib provides a simple way to draw the x-axis: add g.axhline(0), where g is the Axes object returned by the plotting call and 0 is the y-value at which the horizontal line is placed. Additionally, one can specify color (in this case color='black') and alpha (opacity, in this case alpha=0.5). linestyle controls the line pattern; setting it to '--' creates a dashed line.

Image for post

Similarly, a vertical line (for example, the y-axis) can be added through g.axvline(0).
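Put together, the two baseline calls could be sketched as follows. A plain matplotlib line stands in for the seaborn plot, and the data range is an assumption picked so that both axes pass through the visible area.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Assumed example data centered on the origin.
x = np.linspace(-5, 5, 200)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, np.sin(x))

# Horizontal line at y = 0 (the x-axis) and vertical line at x = 0 (the y-axis).
hline = ax.axhline(0, color="black", alpha=0.5, linestyle="--")
vline = ax.axvline(0, color="black", alpha=0.5, linestyle="--")
```

Both lines span the full width or height of the Axes regardless of the data limits, so they stay in place if the view is later zoomed or rescaled.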

You can also use axhline to display averages or benchmarks on, say, bar plots. For example, say that we want to show which plants meet the 0.98 petal_width benchmark, based on sepal_width.

Image for post
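The benchmark idea can be sketched with a small bar plot. The category labels and bar heights below are invented for illustration; only the 0.98 threshold is taken from the text, and the real chart in the article is based on the plant measurements it mentions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical bar heights; the benchmark value comes from the article.
labels = ["A", "B", "C", "D"]
values = np.array([0.70, 1.10, 0.95, 1.30])
benchmark = 0.98

fig, ax = plt.subplots(figsize=(6, 4))
# Color the bars by whether they reach the benchmark, then draw the benchmark itself.
colors = ["tab:green" if v >= benchmark else "tab:gray" for v in values]
ax.bar(labels, values, color=colors)
ax.axhline(benchmark, color="black", linestyle="--")

met = [label for label, v in zip(labels, values) if v >= benchmark]
```

Coloring the bars is optional; the dashed line alone already makes it clear which bars clear the threshold.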

Logarithmic Scales

Logarithmic scales are useful because equal distances along the axis correspond to equal percentage changes. In many scenarios, this is exactly what is needed: after all, an increase of $1000 for a business that normally earns $300 is not the same as an increase of $1000 for a megacorporation that earns billions. Instead of calculating percentages in the data yourself, you can have matplotlib convert the axis scales to logarithmic.

As with many matplotlib features, logarithmic scales operate on the ax of a standard figure created with fig, ax = plt.subplots(figsize=(x, y)). Then, a logarithmic x-scale is as simple as ax.set_xscale('log'):

Image for post
A sine wave. Note that matplotlib creates exponential-notation x-labels for you!

A logarithmic y-scale, which is more commonly used, can be set with ax.set_yscale('log'):

Image for post
y-logarithmic scale for a sine wave with noise, showing the percent change from the previous time step.
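Both scale changes can be tried side by side with a sketch like the one below. The two data series are assumptions: a slow sine wave for the x-scale panel and a made-up revenue series growing 5% per step for the y-scale panel.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, (ax_x, ax_y) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: logarithmic x-axis (x must be positive for a log scale).
x = np.linspace(1, 1000, 500)
ax_x.plot(x, np.sin(x / 50))
ax_x.set_xscale("log")

# Right panel: logarithmic y-axis on a hypothetical series growing 5% per step.
growth = 300 * 1.05 ** np.arange(100)
ax_y.plot(growth)
ax_y.set_yscale("log")
```

On the right panel the constant 5% growth appears as a straight line, which is the practical meaning of "equal distances are equal percentage changes".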

Honorable mentions

  • Invest in a good default palette. Color is one of the most important aspects of a visualization: it ties the plot together and expresses a theme. You can choose and set one of Seaborn’s many great palettes with sns.set_palette(name). Check out demonstrations and tips on choosing palettes here.

  • You can add grids and change the background color with sns.set_style(name), where name can be white, whitegrid, dark, darkgrid, or ticks.

  • Did you know that matplotlib and seaborn can process LaTeX, the beautiful mathematical formatting language? You can use it in your x/y axis labels, titles, legends, and more by enclosing LaTeX expressions within dollar signs: $expression$.

  • Explore different linestyles, annotation sizes, and fonts. Matplotlib is full of them, if only you have the will to explore its documentation pages.

  • Most plots have additional parameters, such as error bars for bar plots, and thickness, dotted lines, and transparency for line plots. Visiting the documentation pages and looking through the available parameters takes only a minute but has the potential to bring your visualization to top-notch aesthetic and informational value.

    For example, adding the parameter inner='quartile' in a violinplot draws the first, second, and third quartiles of a distribution in dotted lines. Two words for immense informational gain; I’d say that’s a good deal!

Image for post
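Several of the honorable mentions fit into one short sketch: a style, a palette, the inner='quartile' option, and LaTeX-style labels. The dataset, its group and value column names, and the label text are assumptions made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

sns.set_style("whitegrid")  # grid lines on a white background
sns.set_palette("husl")     # default colors for subsequent plots

# Assumed example data: three groups of normally distributed values.
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 100),
    "value": rng.normal(size=300),
})

fig, ax = plt.subplots(figsize=(6, 4))
# inner="quartile" marks the first, second, and third quartiles inside each violin.
sns.violinplot(data=df, x="group", y="value", inner="quartile", ax=ax)

# Raw strings keep the backslashes intact; text between $...$ is rendered as math.
ax.set_xlabel(r"group $g$")
ax.set_ylabel(r"$\frac{x - \mu}{\sigma}$")
ax.set_title(r"Standardized values, $\sigma = 1$")
```

The math here is handled by matplotlib's built-in math renderer, so no separate LaTeX installation should be needed for expressions like these.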

Additional Reading

All charts created by author.

Original article: https://towardsdatascience.com/5-simple-one-liners-youve-been-looking-for-to-level-up-your-python-visualization-42ebc1deafbc
