In the last article, we showed what happens when data strips the emotion out of an action

In Part 1 of this series, we argued that data can turn anyone into a psychopath, and though that’s an extreme way of looking at things, it holds a certain amount of truth.

It’s natural to cheer at a newspaper headline proclaiming the downfall of a distant enemy stronghold, but is it OK to cheer while actually watching thousands of civilians inside that city die gruesome deaths?


No, it’s not.


But at the same time―if you cheer the headline showing a distant military victory, it means you’re a human, and not necessarily a psychopath.


The abstracted data of that headline strips the emotional currency of the event, and induces a psychopathic response from you.


That’s what headlines do, and they can induce a callous response from most anyone.


So if data can induce a state of momentary psychopathy, what happens when you combine data and algorithms?


Data can’t feel, and algorithms can’t feel either.


Is that a state of unfeeling multiplied by two?


Or is it a state of unfeeling squared?


Whatever the case, let’s not talk about the momentary psychopathy abetted by these unfeeling elements.


Let’s talk about bias.


Because if left unchecked, unfeeling algorithms can and will lead anyone into a state of bias, including you.


But before we try to understand algorithmic bias, we must take a moment to recognize how much we don’t understand our own algorithms.


Yes, humanity makes algorithms, and humanity relies upon them countless times every day, but we don’t understand them.


Computer with display of data — Photo by Markus Spiske on Unsplash

We no longer understand our own algorithms, no matter how much we think we do

At a high, high level, we could conceive of an algorithmic process as having three parts―an Input, the algorithm itself, and an Outcome.


Infographic — showing three parts of an Algorithm — An Input, the Algorithm and an Outcome
The three parts of an algorithm, viewed from a high level
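As a minimal sketch of this three-part view, here is a classic algorithm that is small enough to read in full, the Sieve of Eratosthenes. The function name and the input value of 30 are illustrative choices, not something taken from the article.

```python
from typing import List


def sieve_of_eratosthenes(limit: int) -> List[int]:
    """Return every prime number less than or equal to `limit`."""
    if limit < 2:
        return []
    # Assume every number is prime until a smaller prime crosses it out.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    candidate = 2
    while candidate * candidate <= limit:
        if is_prime[candidate]:
            # Cross out every multiple of this prime, starting at its square.
            for multiple in range(candidate * candidate, limit + 1, candidate):
                is_prime[multiple] = False
        candidate += 1
    return [n for n, flag in enumerate(is_prime) if flag]


# Input -> Algorithm -> Outcome
outcome = sieve_of_eratosthenes(30)
print(outcome)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Every step here can be traced by hand: one Input, a dozen lines of logic, and one Outcome that anyone can verify.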

But we are now far, far away from human-understandable algorithms like the Sieve of Eratosthenes, and though the above image might be great for an Introduction to Algorithms class―today’s algorithms can no longer be adequately described by the above three parts alone.

The tech writer Franklin Foer describes one of the reasons for this in his book World Without Mind: The Existential Threat of Big Tech―

Perhaps Facebook no longer fully understands its own tangle of algorithms — the code, all sixty million lines of it, is a palimpsest, where engineers add layer upon layer of new commands. (This is hardly a condition unique to Facebook. The Cornell University computer scientist Jon Kleinberg cowrote an essay that argued, “We have, perhaps for the first time ever, built machines we do not understand. . . . At some deep level we don’t even really understand how they’re producing the behavior we observe. This is the essence of their incomprehensibility.” What’s striking is that the “we” in that sentence refers to the creators of code.)

At the very least, the algorithmic codes that run our lives are palimpsests―documents that are originally written by one group of people, and then written over by another group, and then a third, and then a fourth―until there is no one expert on the code itself, or perhaps even one person who understands it.


And these algorithmic palimpsests are millions of lines of code long, or even billions.

Remember Mark Zuckerberg’s 2018 testimony before Congress?

Mark Zuckerberg testifying before Congress — Source — Guardian News — photo shown under the Fair Use Permit

That was the testimony of an individual who didn’t have the faintest understanding about 99% of Facebook’s inner workings.

Because no one does.


Larry Page and Sergey Brin don’t understand Google as a whole.

Because no one does.


And the algorithms that define our daily lives?


No one understands them completely, nor does anyone understand the massive amounts of data that they take in.


So let’s update our algorithm diagram. We need to understand that there are more Inputs than we can understand, and that the algorithms themselves are black boxes.

So here is a slightly more accurate, yet still high-level view of what is happening with our algorithms.


Infographic — showing three parts of an Algorithm — Multiple Inputs, the Algorithm as a Black Box — and an Outcome
The three parts of an algorithm, viewed from a slightly closer level — multiple Inputs, the Algorithm as a Black Box, and an Outcome

Again―there are more Inputs than we can understand, going into a black-box algorithm we do not fully understand.


And this can lead to many things, including bias.


A case study in algorithmic bias: a company is told to favor lacrosse players named Jared

A company recently ran a hiring algorithm, and the intent of the algorithm was to eliminate bias in hiring processes.


The algorithm’s purpose was to find the best candidates.


The company entered some training data into the algorithm based on past successful candidates, and then ran the algorithm again with a current group of candidates.


The algorithm, among other things, favored candidates named Jared that played lacrosse.

Lacrosse players — Photo by Forest Simon on Unsplash

The algorithmic Output was biased, but not in the way anyone expected.


How could this have happened?


Algorithms are not compassionate, let alone sentient, but they are really good at finding patterns

In the above case, the algorithm found a pattern within the training data that lacrosse players named Jared tend to be good hires.


That’s a biased recommendation of course, and a faulty one.
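To make the mechanism concrete, here is a toy sketch of how a naive pattern-finder could arrive at that recommendation. Everything in it is hypothetical: the training records, the scoring rule, and the function names are invented for illustration and are not the company's actual algorithm.

```python
from collections import Counter

# Hypothetical training data: traits of past hires labeled "successful".
past_successful_hires = [
    {"name": "Jared", "sport": "lacrosse"},
    {"name": "Jared", "sport": "lacrosse"},
    {"name": "Jared", "sport": "tennis"},
    {"name": "Maria", "sport": "lacrosse"},
    {"name": "Chen", "sport": "none"},
]


def learn_trait_weights(hires):
    """Weight each (field, value) pair by how often it appears among past hires."""
    counts = Counter(
        (field, value) for hire in hires for field, value in hire.items()
    )
    return {trait: count / len(hires) for trait, count in counts.items()}


def score(candidate, weights):
    """Sum the learned weights of a candidate's traits; unseen traits add nothing."""
    return sum(weights.get((field, value), 0.0) for field, value in candidate.items())


weights = learn_trait_weights(past_successful_hires)

# Hypothetical current applicants.
candidates = [
    {"name": "Aisha", "sport": "chess"},
    {"name": "Jared", "sport": "lacrosse"},
]
ranked = sorted(candidates, key=lambda c: score(c, weights), reverse=True)
for candidate in ranked:
    print(candidate["name"], round(score(candidate, weights), 2))
```

Nothing in this sketch knows what makes a good hire. It only rewards resemblance to the people who were hired before, so a first name and a sport end up at the top of the ranking.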


Why did it occur?


Deep Background Podcast Logo — photo shown under the Fair Use Permit

Well, beyond us recognizing that we don’t understand the algorithm itself, we can cite thinkers like Dr. Nicol Turner Lee of Brookings, who explained on Noah Feldman’s Deep Background podcast that external sources of algorithmic bias are often manifold.

There might be bias in the training data, and quite often the data scientists who made the algorithm might be of a homogeneous group, which might in turn encourage the algorithm to suggest the hiring of more candidates like themselves.


And of course, there is societal and systemic bias, which will inevitably work its way into an unfeeling, pattern-recognizing algorithm.


So to update our algorithm chart once again―


The three parts of a biased algorithm, viewed from a slightly closer level — multiple Inputs with Jared and lacrosse, the Algorithm as a Black Box, and an Outcome with Jared and lacrosse

There are faint echoes of Jared and lacrosse somewhere in the Inputs, and we certainly see them in the Outputs.


Of course, both the full scope of the Inputs and the algorithm itself remain a mystery.


The only thing we know for sure is that if your name is Jared, and you played lacrosse, you will have an advantage.


This was a humorous example, but what happens when the stakes are higher?

Hiring algorithms are relatively low stakes in the grand scheme of things, especially considering that virtually any rational company would take steps to eliminate a penchant for lacrosse-playing Jareds from their hiring processes as soon as they could.


But what if the algorithm is meant to set credit rates?


What if the algorithm is meant to determine a bail amount?


What if this algorithm leads to a jail term for someone who should have been sent home instead?


Photo by camilo jimenez on Unsplash

If you are spending the night in jail only because your name isn’t Jared and you didn’t play lacrosse, your plight is no longer a humorous cautionary tale.


And when considering the Outcome of a single unwarranted night in jail, there is one conclusion:


An Outcome like that cannot be.


Even if a robotic algorithm leads to 100 just verdicts in a row, if the 101st leads to an unjust jail sentence, that cannot be.


There are protections against this of course―the legal system understands, in theory at least, that an unjust sentence cannot be.


But we’re dealing with algorithms here, and they often operate at a level far beyond our understanding of what can and cannot be.


A brief aside: algorithms technically cannot show bias based on Constitutionally protected classes, but they often find ways to do so

It’s not just morality that prohibits bias in high-stakes algorithmic decisions; it’s the Constitution.


Algorithms are prohibited from showing bias―or preferences―based on ethnicity, gender, sexual orientation and many other things.


The US Constitution — Source — Wikimedia Commons

Those cannot be factors, because they are Constitutionally protected classes.

But what about secondary characteristics that imply any of the above?


Again, algorithms are great at finding patterns, and even if they are told to ignore certain categories, they can, and will, find patterns that act as substitutes for those categories.


Consider these questions―


  • What gender has a name like Jared?

  • What kind of background suggests that a person played lacrosse in high school?


And going a bit further―


  • What is implied by the zip code of the subject’s home address?


So no, an algorithm―particularly one born of a public institution like a courthouse―cannot show bias against Constitutionally-protected classes.


But it might, and probably will if we are not vigilant.
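The questions above describe what is often called a proxy: a permitted input that quietly carries the information of a prohibited one. As a rough sketch, assuming entirely made-up applicant records and a placeholder `group` field standing in for a protected attribute, one way to check for a proxy is to ask how well a remaining field lets you guess the field that was removed.

```python
from collections import Counter, defaultdict

# Hypothetical applicant records. "group" stands in for a protected attribute
# that the algorithm is told to ignore; "zip_code" is still available to it.
applicants = [
    {"zip_code": "11111", "group": "A"},
    {"zip_code": "11111", "group": "A"},
    {"zip_code": "11111", "group": "A"},
    {"zip_code": "11111", "group": "B"},
    {"zip_code": "22222", "group": "B"},
    {"zip_code": "22222", "group": "B"},
    {"zip_code": "22222", "group": "B"},
    {"zip_code": "22222", "group": "A"},
]


def baseline_accuracy(records, protected):
    """Accuracy of always guessing the single most common protected value."""
    counts = Counter(record[protected] for record in records)
    return counts.most_common(1)[0][1] / len(records)


def proxy_accuracy(records, feature, protected):
    """Accuracy of guessing the most common protected value per feature value."""
    by_feature = defaultdict(Counter)
    for record in records:
        by_feature[record[feature]][record[protected]] += 1
    correct = sum(counter.most_common(1)[0][1] for counter in by_feature.values())
    return correct / len(records)


print("guessing blind:", baseline_accuracy(applicants, "group"))
print("guessing from zip code:", proxy_accuracy(applicants, "zip_code", "group"))
```

In this made-up data, a blind guess is right half the time, while a guess based on zip code is right three times out of four. Deleting the protected column did not delete the information, which is why vigilance has to extend to the inputs that remain.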


Can algorithms make you biased? Considering that algorithms are everywhere, the answer may be yes.

You don’t have to be an HR person at a Tech company or a bail-setting judge to become biased by algorithms.


If you live in the modern world and engage in social media, read a news feed, go onto dating apps, or do just about anything online, that bias will be sent down to you.


Bias will influence the friends you choose, the beliefs you have, the people you date and everything else.


Smartphones — Photo by Daniel Romero on Unsplash

The average smartphone user engages with 9 apps per day, and spends about 2 hours and 15 minutes per day interacting with them.

And what are the inner workings of these apps?


That’s a mystery to the user.


What are the inner workings of the algorithms inside these apps?


Those are a black box to both the user and the company that designed them.


And of course, the constant stream of algorithmic data can perpetuate insidious and often unseen systemic bias

Dr. Lee gave this example on the podcast―


One thing for example I think we say in the paper which I think is just profound is that as an African-American who may be served more higher-interest credit card rates, what if I see that ad come through, and I click it just because I’m interested to see why I’m getting this ad, automatically I will be served similar ads, right? So it automatically places me in that high credit risk category. The challenge that we’re having now, Noah, is that as an individual consumer I have no way of recurating what my identity is.

Dr. Lee has a Doctorate and is a Senior Fellow at a prestigious institute, and has accomplished countless other things.


But if an algorithm sends her an ad for a high-interest credit card because of her profile, and she inadvertently clicks an ad, or even just hovers her mouse over an ad, that action is registered and added to her profile.

User on computer with credit card — Photo by rupixen.com on Unsplash

And then her credit is dinged, because another algorithm sees her as the type of person who clicks or hovers on ads for high-interest credit card rates.


And of course, if an algorithm sees that lacrosse-playing Jareds should be served ads for Individual Retirement Accounts, that may lead to a different Outcome.


Dr. Lee makes the point that this is no one’s fault per se, but systemic bias can certainly show up.


Every response you make to a biased algorithm is added to your profile, even if the addition is antithetical to your true profile.


And of course there is no way that any of us can know what our profile is, let alone recurate it.
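The loop Dr. Lee describes can be sketched in a few lines. This is a deliberately simplified, hypothetical model: the category names, the starting scores, and the rule that a hover counts as a tenth of a click are all assumptions made for illustration, not a description of how any real ad platform works.

```python
def pick_ad(profile):
    """Serve the ad category with the highest interest score in the profile."""
    return max(profile, key=profile.get)


def record_interaction(profile, ad_category, weight=1.0):
    """Add an interaction to the profile, whatever the user's real intent was."""
    profile[ad_category] = profile.get(ad_category, 0.0) + weight


def simulate(profile, rounds, hover_weight=0.1):
    """Serve ads for several rounds; each served ad receives a passive hover."""
    served = []
    for _ in range(rounds):
        ad = pick_ad(profile)
        served.append(ad)
        # Even a hover is logged as interest in the category that was served.
        record_interaction(profile, ad, hover_weight)
    return served


# Hypothetical profile that starts out balanced between two ad categories.
profile = {"high_interest_card": 1.0, "retirement_account": 1.0}

# One click made out of curiosity tips the balance.
record_interaction(profile, "high_interest_card")

served = simulate(profile, rounds=5)
print(served)
print(profile)
```

After the single curious click, every later round serves the same category, and every hover widens the gap. The retirement-account score never changes, because that ad is never shown again, and the user has no way to see or correct either number.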


So individuals and the system are unintentionally biased by algorithms. What do we do?

First of all, we don’t scrap the whole system.


Algorithms can make you biased, and as I showed in Part 1, data can lead you to a form of psychopathy.


But algorithms and data also improve our lives in countless other ways. They can cure diseases and control epidemics. They can improve test scores of the children from underserved communities.

Rockford, Illinois employed data and algorithms to end homelessness in their city.

They solved homelessness, and that is incredible.


So what do we do?


We tweak the system, and we tweak our own approach to it.


And we’ll do that in Part 3.


Stay tuned!


This article is Part 2 of a 3 Part series — The Perils and Promise of Data


Part 1 of this series is here: 3 ways data can turn anyone into a psychopath, including you

Part 3 of this series — Coming Soon!


Jonathan Maas has a few books on Amazon, and you can contact him through Medium or Goodreads.com/JMaas.

Translated from: https://medium.com/predict/algorithms-can-leave-anyone-biased-including-you-f39cb6abd127






