The Arbiters of Truth
Coming out of college with a background in mathematics, I fell upward into the rapidly growing field of data analytics. It wasn’t until years later that I realized the incredible power that comes with the position. As Uncle Ben told Peter Parker (aka Spider-Man), “With great power comes great responsibility.” That proverb perfectly sums up an unspoken reality for data professionals of all levels and types. You have to wonder if Peter Parker’s real superpower was data expertise. Unlike Spider-Man’s, our enemies are not quite as obvious as a flying green monster. As data professionals, we must remain vigilant about data privacy, algorithmic bias, and presenting information objectively.
Data Ethics in the Government
My first encounter with sensitive data came at the U.S. Census Bureau back in 2016. My team was responsible for compiling and disseminating the U.S. International Trade in Goods and Services report each month. The report shows how much of various commodities the U.S. imports from and exports to other countries. The average person might never feel its impact on their life, but to an investor, this information is incredibly valuable.
Being an ambitious employee, I wanted to add a little pizzazz to their webpage. My plan was to display a fancy Tableau chart (yes, they were fancy back then) relating to the Trans-Pacific Partnership. This would be the equivalent of a news agency reporting the relevant facts for any major economic event. Sadly, I was shut down. I was told that the Census could not appear biased on the new free trade agreement. At the time, I did not quite understand. Looking back on it, however, I can fully appreciate the sensitivity. The Census controls incredibly valuable information that could have wide implications for the economy and its people. In order to be effective, it must remain non-partisan. Otherwise, the numbers become politicized and the truth becomes questionable.
Algorithmic Biases
“When a measure becomes a target, it ceases to be a good measure.” - Goodhart’s Law
I see the above statement quoted often, yet KPIs remain incredibly common in organizations. One of my previous digital transformation projects required my department to adopt a new CRM (Contact Relationship Management) software. With this new system, leadership requested KPIs to measure participation in the tool. Anyone who has installed a new system knows the challenges of culture change and adoption. The software and the process must go hand-in-hand to be successful. Therefore, we needed the best method for measuring and incentivizing user activity in the CRM.
In our system, users were expected to enter and update potential public policies that would impact the organization. We had users responsible for different regions around the globe. Some regions, such as Europe, had more policy activity than other regions. Some regions had more users to help keep the records up to date. Each region could vary in its importance from a financial perspective. In our CRM, you could measure logins, views, edits, added records, deleted records, and more. Each metric had an inherent bias in the calculation. To simplify things, we will assume that we can only calculate metrics at the region level and this will be on a biweekly basis. Let’s take a look at some of the options and their implications.
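To make the bias concrete, here is a minimal sketch of three candidate KPIs computed at the region level for a single biweekly period. The region names, user counts, edit counts, and open-policy counts are all invented for illustration and are not figures from the actual CRM. The point is only that each normalization quietly encodes a different assumption about what "good participation" means.

```python
from dataclasses import dataclass


@dataclass
class RegionActivity:
    """One region's CRM activity for a single biweekly period (hypothetical data)."""
    region: str
    users: int          # people responsible for the region
    edits: int          # record edits logged in the period
    open_policies: int  # policy records the region is tracking


# Invented numbers, chosen only to show how the rankings can diverge.
activity = [
    RegionActivity("Europe", users=10, edits=120, open_policies=60),
    RegionActivity("Asia-Pacific", users=3, edits=60, open_policies=20),
    RegionActivity("Latin America", users=2, edits=30, open_policies=6),
]


def rank(regions, metric):
    """Return region names ordered from highest to lowest metric value."""
    return [r.region for r in sorted(regions, key=metric, reverse=True)]


# KPI 1: raw edit count. Favors regions with more staff and more policy activity.
by_total_edits = rank(activity, lambda r: r.edits)

# KPI 2: edits per user. Favors small teams; hides uneven effort within a team.
by_edits_per_user = rank(activity, lambda r: r.edits / r.users)

# KPI 3: edits per tracked policy. Favors regions with few policies to maintain.
by_edits_per_policy = rank(activity, lambda r: r.edits / r.open_policies)

print("Total edits:     ", by_total_edits)
print("Edits per user:  ", by_edits_per_user)
print("Edits per policy:", by_edits_per_policy)
```

With these made-up inputs, each KPI puts a different region on top, so whichever metric leadership adopts also decides which team looks like the model citizen.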
When designing the appropriate KPIs for this new system, there were biases, assumptions, and incentives at play no matter which metric we chose. While mindlessly scrolling through Twitter, I recently came upon a quote that perfectly sums up the above process.
“The very act of turning something into a number is an assumption.” - Kareem Carr
Integrity is a Must
A few months back, I was working with a colleague who needed some assistance with the analysis and presentation of information that would be available to the public. As soon as they hear the words “public data,” any data professional’s mind will immediately gravitate towards data security. Fortunately, this was not an issue.
My colleague proceeded to explain what data we had (i.e. very little) and the purpose of the presentation. After some exploration, I realized that we could not provide any summary statistics at the requested level of detail. We could only provide an estimate of the overall total. This was insufficient for their project. There was pressure to “make some magic happen,” especially if I wanted to impress a few senior-level colleagues. In the short term, doing so would have boosted my reputation, but over the long term, it risked significant reputational damage for the organization (and for me).
Final Thoughts
As data becomes seamlessly woven into every process, it brings ethical risks that aren’t talked about enough. By the time data professionals start building black-box algorithms into your decision-making processes, it will be too late. Organizations need to instill a culture of ethical, data-driven decision making from the top.
As a data professional, you will frequently find yourself at the center of difficult decisions, especially if you work with colleagues who struggle with data and numbers. Your job is to bridge the gap between their subject matter expertise and the appropriate analysis or presentation of the information. In that gap lies an opportunistic, invisible enemy who wants you to take the shortcut. Follow in Spider-Man’s footsteps and proceed with integrity.
~ The Data Generalist
Translated from: https://towardsdatascience.com/the-arbiters-of-truth-d97ce1a4e4a6