How Good Do a Data Scientist’s Programming Skills Need to Be?
I have held the title of data scientist in two industries. I’ve interviewed for more than 30 additional data science positions. I’ve been the CTO of a data-centric startup. I’ve done many hours of data science consulting.
With that background, you will hopefully realize that I’m not a data denier. I’m a firm believer in the power of statistics, machine learning, and all the tools in a data scientist’s toolbox. I know that data science is a powerhouse field filled with amazing people that are changing the world.
That being said, many companies don’t need a data scientist.
No, that wasn’t strong enough. Let me try again.
The vast majority of companies that are looking for a data scientist don’t need one.
Of all the companies I’ve worked or interviewed with as a data scientist, I’d say 80% of them were looking for the wrong role.
Some of them just needed a data analyst. Others needed a data engineer or a data architect. The rest didn’t have a data need at all.
What problem are you looking to solve?
I always ask this question when someone is looking to hire me. Originally, I asked what they were looking to do with their data, but I’ve since realized that the answer to that latter question doesn’t matter. The focus needs to be on the problem, not the solution. Companies hire to solve problems.
Good companies don’t fill a position because it’s trendy to have around. They hire because, for every dollar an employee costs them, they get more than a dollar in return. It’s that simple. It’s all about ROI.
All companies understand that when it comes to positions like accounting and sales because they know how ROI works for accounting or sales. They know what problem needs to be solved and they know who can do it.
But data confuses companies. It especially confuses older companies, but startups are not immune. We’ve all been told that there’s gold in them thar data.
And who doesn’t love a good gold rush?
Just like the gold rush of old, most people don’t know where to look for the gold, many of them have fallen for fool’s gold, and no matter how much a vein has been picked clean, people keep coming back looking for scraps.
The underlying issue is that companies have been told their data is valuable. And it might be. But whether packaged for sale or used internally, data is a part of a solution, and every solution’s value is determined by the cost of the problem it is solving.
Without a problem, a solution is just an idea. And, as I’ve mentioned in multiple previous posts, ideas are worthless.
Data rushes happen because companies have a solution — data — and they are looking for a problem to apply it to. It’s a completely backward approach. You don’t decide to use screws because you have a screwdriver handy. You decide to use a screwdriver because you need to tighten a screw.
Data is a resource. So why is data not treated like any other resource?
Data is inherently different from other resources in one important way.
Let’s look at oil, a pretty standard resource. Unless you are The Beverly Hillbillies, you don’t just find oil lying around in your backyard. If you have thousands of tons of oil, you have it because you planned to have it for a specific purpose. And once you use it for that purpose, it’s gone.
But companies have exabytes of data. Maybe they had it for a purpose. Maybe there was a regulatory requirement for them to keep it. Maybe it was just easier to keep than to throw away.
Whatever the reason, they have it now, and they want to use it. They just don’t know what to use it for. And they often assume data scientists are the answer. After all, data is right there in the title, and scientists are smart.
S-c-i-e-n-t-i-s-t is not how you spell engineer
Let me give these companies the benefit of the doubt and say they actually do have problems that their data could solve. That still doesn’t necessarily make hiring a data scientist the correct next step.
Data scientists solve puzzles. They take billions of pieces of data and turn them into a single, cohesive picture. But they can’t do that if you don’t give them all the pieces.
If your data streams into ten different systems that don’t talk to each other, you are setting your data scientist up for failure. You need someone that can bridge those systems, bringing the data into a single place. That’s the job of a data engineer, not a data scientist. Depending on the situation, you may also need data architecture, data modeling, and database administration.
If you really want to, you can find a data scientist that can handle everything from the engineering to the DB admin work. I’ve been that data scientist. But my rate was much higher than what they would have paid to just hire the correct person for the job.
Why did they overpay? Because they didn’t yet understand the current status of their data or what a data scientist actually does.
Why did I take the job? Because I was too naive to know better.
Everyone would have been better off if the company had hired a data engineer, waited 6–12 months, then brought on a data scientist when they were fully prepared.
Ready? Have an aim? Hire!
Has your company identified problems that you need data science to solve?
Is your data in a state that a data scientist can work with?
If you answered both of these with a definitive ‘yes’, then you may need a data scientist. Congratulations, your company is doing things right. Pat yourselves on the back no more than three times, then go do some amazing things.
If you answered either question with a ‘no’ or a general look of confusion, then save your money and a data scientist’s sanity by taking down that job posting you just put up. Maybe replace it with a posting for a data engineer or data analyst. Or maybe just be happy not to have to go through the hiring process.
Not sure what you need? Talk to a data consultant before you waste your money.
Like this advice? Take 0.001% of the money you just saved and buy me a drink someday.
Translated from: https://medium.com/swlh/do-we-need-data-scientists-8d8e8062688a