Responsible Data Visualization in the Age of Coronavirus

First, a little bit about me: I’m a data science grad student. I have been writing for Medium for a little while now. I’m a Scorpio. I like long walks on beaches. And writing for Medium made me realize the importance of taking personal responsibility for my data viz.

My Philosophy

I’ve always been libertarian when it comes to data dissemination; the more publicly available data, the more people tinkering with it from their basements and school libraries, the better. Data science is an increasingly pivotal field and I want others to get excited about it as much as I am. Great computer scientists are often made from 12-year-olds coding Tetris in Python (or whatever games they play now — Candy Crush?), although I myself didn’t open my first computer program until I was 21.

I’d love to see an army of 12-year-olds graphing covid-19 in novel ways, getting invested in the spread and chiding their aunts and uncles to wash their hands better at their (socially distant) Thanksgivings. Furthermore, the more publicly available data, the more perspectives data scientists can tie in when trying to make predictions and recommendations on the job. More data + more people interested in data = a better world, clean and simple.

Or so I thought.

The Moment of Truth

This morning I wrote an article about obtaining covid-19 data, performing your own exploratory analysis, and then graphing it with animation in R. The end product looked something like this:

[Image: the animated covid-19 chart]

Cool, right? Not only a hot topic at the moment (covid-19) but now it’s animated!

If anything, I felt like this was the right thing to do if it helped even one person visualize the pandemic from their own computer. Furthermore, I was using knowledge gained from my master’s to make a chart I thought both accurate and eye-catching, and I was genuinely proud of that.

I was just about to hit publish when I decided to do a few parting reads of how other Medium articles broached the topic. That’s when I started reading a host of posts published by public health and data viz experts that made a whole lot of sense — and not in a way that made me feel helpful or positive anymore.

The World is Suffering a Covid-19-Sized Visualization Bubble

“To sum it up — #vizresponsibly; which may mean not publishing your visualizations in the public domain at all.”

— Amanda Makulec

Instead of reading my article visualizing the coronavirus for the umpteenth time, I think you would be better off skimming one of these articles, written by actual professors and health experts:

In her sobering article, Amanda Makulec goes on to say:

The stakes are high around how we communicate about this epidemic to the wider public. Visualizations are powerful for communicating information, but can also mislead, misinform, and — in the worst cases — incite panic. We are in the middle of complete information overload, with hourly case updates and endless streams of information.

As a public health professional, might I ask:

Please consider if what you’ve created serves an actual information need in the public domain. Does it add value to the public and uncover new information?

If not, perhaps this is one viz that should be for your own use only.

Reading these posts, and taking a moment to think hard about my next steps and their consequences, made me realize that it was better to pull the article than to publish it. Sure, maybe I lost a few hours of my life by not publishing an article I had already finished, but there was a substantial chance I could do more harm than good, and in a tenth of the time I could be sharing articles written by people who know the topic better than I ever will.

Parting Thoughts

When it comes to a life-threatening disease that has impacted millions of people, I’ve come to realize that it is better to amplify the voices of experts than to add to the crowd of novices (myself included) throwing their hats into the ring. Whether or not I can personally make an accurate graph matters less than helping to share the select few data visualizations that are the most telling and honest in their depictions of this epidemic.

Photo by Edwin Andrade on Unsplash

Thank you to the visualization experts who unknowingly taught me today the importance of taking responsibility for my data viz. Having the ability to create information from data is like having a superpower, and you just saved me from becoming a villain.

Amanda West is a current master’s student in the School of Data Science at the University of Virginia. Prior to the program, she attended the University of Michigan, where she graduated with honors in economics and a math minor, studied abroad as a Gilman Scholar in Beijing, interned for the Ministry of Economic Development in Albania, trained and competed internationally in taekwondo, and held various jobs including as a Research Assistant and Data Visualization Consultant. You can contact her through her personal website here.

Translated from: https://towardsdatascience.com/a-students-first-encounter-with-responsible-data-viz-847f21c1c8e4
