Which of the Six Major Data Personas Are You?

When you were growing up, did you ever play the name game? The modern data organization has something similar, and it’s called the “Bad Data Blame Game.” Unlike the name game, however, the Bad Data Blame Game is played when data downtime strikes and no amount of rhyming and dancing can save the day.

Data downtime refers to periods of time when your data is partial, erroneous, missing, or otherwise inaccurate, and nine times out of ten, you have no idea what caused it. All you know is that it’s 3 a.m., your CEO is pissed, your dashboards are wrong, and you need to fix it — stat.

After speaking to over 200 data teams, we’ve identified the major data personas involved in the Bad Data Blame Game. Maybe you recognize one or two?

In this article, we’ll introduce these roles, zero in on their hopes, dreams, and fears, and share our approach to conquering data reliability at your company.

Chief Data Officer

Image courtesy of Javier Sierra on Unsplash.

This is Ophelia, your Chief Data Officer (CDO). Although she’s probably not (wo)manning your company’s data pipelines or Looker dashboards, Ophelia’s impact is hitched to the consistency, accuracy, relevance, interpretability, and reliability of the data her team provides.

Ophelia wakes up every day and asks herself two things. First, are different departments getting the data they need to be effective? And second, are we managing risk around that data effectively?

She would sleep much easier with a clear, bird’s-eye view showing that her data ecosystem is operating as it should. At the end of the day, if bad data gets in front of the CEO, out to the public, or to any other data consumer, she’s on the line.

Business Intelligence Analyst

Image courtesy of Christina on Unsplash.

Betty, the business intelligence lead or data analyst, wants a punchy and insightful dashboard she can share with her stakeholders in marketing, sales, and operations to answer their multifarious questions about how their business functions are performing. When things go wrong at the practitioner-level, Betty is the first one paged.

To ensure reliable data, she needs to answer these questions:

  • Are we translating data into metrics and insights that are meaningful to the business?
  • Are we confident that the data is reliable and means what we think it means?
  • Is it easy for others to access and understand these insights?

Null values and duplicated entries are Betty’s arch-nemeses, and she’s a fan of anything that can prevent data downtime from compromising her peace of mind. She’s fatigued by business stakeholders who ask her to investigate a funny value in a report: it’s a long process to chase the data upstream and validate whether it’s right!
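
To make Betty’s two arch-nemeses concrete, here is a minimal sketch of the kind of check she might run before trusting a report. It assumes the report rows are available as a list of Python dictionaries; the function name `profile_rows` and the sample columns (`order_id`, `region`, `revenue`) are hypothetical and only for illustration.

```python
from collections import Counter


def profile_rows(rows, key_fields):
    """Count null values per column and find keys that appear more than once."""
    null_counts = Counter()
    key_counts = Counter()
    for row in rows:
        for column, value in row.items():
            # Treat both None and the empty string as "null" for this sketch.
            if value is None or value == "":
                null_counts[column] += 1
        # Build the row's identity from the chosen key fields.
        key_counts[tuple(row.get(field) for field in key_fields)] += 1
    duplicates = {key: n for key, n in key_counts.items() if n > 1}
    return dict(null_counts), duplicates


# Hypothetical report rows; the column names are made up for illustration.
orders = [
    {"order_id": 1, "region": "EMEA", "revenue": 120.0},
    {"order_id": 2, "region": None, "revenue": 75.5},
    {"order_id": 2, "region": None, "revenue": 75.5},
    {"order_id": 3, "region": "APAC", "revenue": None},
]

nulls, dupes = profile_rows(orders, key_fields=["order_id"])
print(nulls)  # null count per column
print(dupes)  # keys seen more than once, with their counts
```

A check like this only tells Betty that something looks off; finding out why still means chasing the data upstream.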

Data Scientist

Image courtesy of Tim van der Kuip on Unsplash.

Sam, the data scientist, studied Forestry in undergrad, but decided to make the jump to industry to pay off his student loans. Somewhere between a line of Python code and a data visualization, he fell in love with data science. And the rest was history.

To do his job well, Sam needs to know 1) where the data comes from and 2) that it’s reliable, because if it’s not, his team’s A/B tests won’t work and all downstream consumers (analysts, managers, executives, and customers) will suffer.

Sam’s team spends roughly 80 percent of their time scrubbing, cleaning, and understanding the context of the data, so they need tools and solutions that can make their lives easier.

Data Governance Lead

Image courtesy of GAGA on Unsplash.

Proud owner of a seven-month-old puppy, Gerald is the company’s very first data governance specialist. He started off on the legal team, and then, when GDPR and CCPA entered the picture, eventually focused his efforts exclusively on data compliance. It’s a novel role, but one that is becoming increasingly important as the organization grows.

When it comes to data reliability, Gerald cares about 1) unified definitions of data and metrics across the company and 2) understanding who has access to, and visibility into, which data.

For Gerald, bad data can mean costly fines, erosion of customer trust, and lawsuits. Despite the criticality of his role, he sometimes jests that it’s like accounting: “you’re only front and center if something has gone wrong!”

Data Engineer

Image courtesy of Christina on Unsplash.

When it comes to data reliability, Emerson, the data engineer, is at the crux of the equation.

Emerson started out as a full-stack developer at a small e-commerce startup, but then as the company grew, so too did their data needs. Before she knew it, she was responsible not just for building their data product but also integrating the data sources the team relies on to make decisions about the business. Now, she’s a Snowflake expert, PowerBI guru, and general data tooling whiz.

Emerson and her team are the glue that holds the company’s data ecosystem together. They implement technologies that monitor the reliability of their company’s data, and if something goes awry, she’s the one who’s paged by the analytics team at 3 a.m. to fix it. Like Betty, she’s lost countless hours of sleep because of this.

To be successful at her job, Emerson must tackle a lot of things, including:

  • Designing a data platform solution that scales
  • Ensuring that data ingestion is reliable
  • Making the platform accessible to other teams
  • Being able to fix data downtime quickly when it happens
  • And above all else, making life sustainable for the entire data organization

Data Product Manager

Image courtesy of Elizeu Dias on Unsplash.

This is Peter. He’s a data product manager. Peter got his start as a back-end developer, but made the jump to product management a few years ago. Like Gerald, he’s the company’s first-ever hire in this role, which is simultaneously exciting and challenging.

He’s up to date on all the latest data engineering and data analytics solutions, and is often called upon to make decisions on what offerings his organization needs to invest in to be successful. He knows firsthand how automation and self-serve tooling make all the difference when it comes to delivering an accessible, scalable data product.

All other data stakeholders, from analysts to social media managers, are dependent on him for building a platform that ingests, unifies, and makes accessible data from a myriad of sources to consumers all over the business. Oh, and did we mention that this data must be compliant with GDPR, CCPA, and other industry regulations? It’s a challenging role and it’s difficult to keep everyone happy — it seems like his platform is always one transformation away from what BI actually wanted.

Who is responsible for data reliability?

So, who in your data organization owns the reliability piece of your data ecosystem?

As you can imagine, the answer isn’t simple. From your company’s CDO to your data engineers, it’s ultimately everyone’s responsibility to ensure data reliability. And although nearly every arm of every organization at every company relies on data, not every data team has the same structure, and various industries have different requirements. (For instance, it’s the norm for financial institutions to hire entire teams of data governance experts, but at a small startup, not so much. And for those startups that do — we commend you!)

Below, we outline our approach to mapping data responsibilities, from accessibility to reliability, across your data organization using the RACI (Responsible, Accountable, Consulted, and Informed) matrix guidelines:

Diagram courtesy of Monte Carlo.

At companies that ingest and transform terabytes of data (like Netflix or Uber), we’ve found that it is common for data engineers and data product managers to tackle the responsibility of monitoring and alerting for data reliability issues.
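
As one illustration of what monitoring and alerting for a reliability issue can look like, here is a minimal sketch that flags a table whose latest load is stale or much smaller than usual. The function name `check_table_health`, its inputs, and the thresholds (24 hours, half of the expected row count) are hypothetical choices for this example, not taken from any particular tool.

```python
from datetime import datetime, timedelta


def check_table_health(last_loaded_at, row_count, expected_rows, now,
                       max_age=timedelta(hours=24), min_ratio=0.5):
    """Return a list of alert messages; an empty list means no issue was found."""
    alerts = []
    # Missing data: nothing has landed in the table recently.
    if now - last_loaded_at > max_age:
        alerts.append("stale: no load in the last %s" % max_age)
    # Partial data: the latest load is far smaller than what is usually seen.
    if expected_rows > 0 and row_count < expected_rows * min_ratio:
        alerts.append("partial: %d rows, expected about %d" % (row_count, expected_rows))
    return alerts


# Hypothetical values for a single table, checked at 3 a.m.
now = datetime(2020, 9, 1, 3, 0)
alerts = check_table_health(
    last_loaded_at=datetime(2020, 8, 30, 23, 0),
    row_count=4_000,
    expected_rows=10_000,
    now=now,
)
print(alerts)
```

In practice the expected row count and acceptable age would come from the table’s own history rather than fixed numbers, but the shape of the check is the same.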

Barring these behemoths, the responsibility often falls on data engineers and product managers. They must balance the organization’s demand for data with what can be provided reliably. Notably, the brunt of any bad choices made here is often borne by the BI analysts, whose dashboards may wind up containing bad information or break from silent changes. In very early data organizations, these roles are often combined into a jack-of-all-trades data person or a product manager.

Regardless of your team’s situation, you’re not alone.

Fortunately, there’s a better way to start trusting your data: data observability. It’s an approach that’s taking off with the most innovative companies, no matter who is ultimately responsible for ensuring data reliability in your organization.

In fact, with the right data reliability strategy, the Bad Data Blame Game is a thing of the past and full end-to-end observability is in sight.

Interested in learning more? Reach out to Barr Moses, Will Robins, and the rest of the Monte Carlo team.

This article was written by Barr Moses and Will Robins.

Translated from: https://towardsdatascience.com/which-of-the-six-major-data-personas-are-you-8dbf434b7c9e
