How Not to Know Ourselves

By Angela Xiao Wu, assistant professor at New York University

This blog post comes out of a paper by Angela Xiao Wu and Harsh Taneja that offers a new take on social sciences’ ongoing embrace of platform log data by questioning their measurement conditions. The distinct nature of platform datafication is foregrounded in comparison with the longer tradition of third-party audience measurement.

Surfing a wave of societal awe and excitement about “Big Data,” platforms formed a habit of releasing “data science” insights on what we search, like, express, purchase, obsess over, attempt to hide, and prefer to forget. These colorful graphics and juicy taglines (most notably from OKCupid and PornHub, whose data lay claim to the quirks and desires of our intimate lives) are always popular novelties to behold, ponder, and reference. If knowing ourselves through platform data is a practice of our age, it is certainly not confined to platforms themselves. Aspiring data scientists, curious programmers, vigilant data journalists, analysts of civic organizations and political campaigns, and (last but not least) academic social scientists such as myself make up the growing field that is figuring out who we are, what we do, and how we sway in the swathes of platform data.

Such data can be impressive due to their unprecedented granularity and volume, as well as the fact that they are seemingly “unobtrusive” recordings of our activities when no one is watching. These apparent strengths of data for social research are outweighed by a problem in what we call the “measurement conditions”: platform data are platforms’ records of their own behavioral experimentation. Trying to know ourselves through platform data tends to yield partial and contorted accounts of human behavior that conceal platform interventions. Moreover, though increasingly produced by non-corporate actors, such knowledge accounts and narratives tend to be amenable to platform money-making and image-building.

Trying to know ourselves through platform data tends to yield partial and contorted accounts of human behavior that conceal platform interventions.

To be clear, for years many have contested the ascendance of platform data as a staple in quantitative social sciences alongside conventional data collection methods, such as surveys and experiments. These contestations focus on issues about the data’s representativeness, privacy concerns, and precarious access at the mercy of platform companies. The “measurement conditions” problem, however, is entirely different. In our newly published paper, Harsh Taneja and I call for attention to the circumstances under which these data come about: what purpose does the measurement initially serve? As historians have told us, measurement — or converting parts of the social world into quantities according to some enduring instrument — is not an end in itself, but a means for managing events and coordinating actions. Measurement is thus a product of the social and institutional context (i.e., “measurement conditions”) in which it is called upon and carried out.

A closer look at the measurement conditions of platforms allows us to rethink the nature of platform log data: they are essentially “administrative data” that platforms generate to realize their own organizational goals, which go little beyond enlarging advertising income, harvesting intermediary fees, and attracting venture capital. These companies track user engagements with their platforms to evaluate and showcase “product performance.” Such data analytics are integral to the iterative process whereby platforms tinker with their digital architectures in attempts to shape usage in ways that maximize profits.

In other words, platform log data are not “unobtrusive” recordings of human behavior out in the wild. Rather, their measurement conditions determine that they are accounts of putative user activity: “putative” in the sense that platforms are often incentivized to keep bots and other fake accounts around, because, from their standpoint, it’s always a numbers game with investors, marketers, and the actual, oft-insecure users. With calculated neglect come calibrated nudges: platform user activity, in the first place, is induced, coaxed, and experimented on by the platform environment. From multilayered graphical organization to complex algorithmic recommendation, it is from all these platform arrangements that user activity arises. Conversely, it is to make decisions about these arrangements that platform companies measure usage.

Thus, it is difficult to tell to what extent the patterns emerging from platform data are about “us,” rather than testimonies to the effects of platform nudges.

Of course, when bulk platform log data become available for inquisitive parties to crunch, platforms keep the other part of the iterative process (the shifting platform arrangements aimed at nudging usage) in the dark. Thus, it is difficult to tell to what extent the patterns emerging from platform data are about “us,” rather than testimonies to the effects of platform nudges. When we are experimental subjects oblivious to platforms’ treatments of us, taking our induced behaviors as “natural” means regarding these platforms as benign, transparent vehicles for our inherent intentions, and thus obscuring their prevailing power.

Consider peeking into our innate preferences (by race, geography, and daily rhythms!) based on “patterns” that emerge from PornHub’s log data, when the site’s visual design, temporal pacing, and content curation are all about eliciting and extending the user’s state of pleasure and pleasure seeking; or using Twitter data to study the insurgent online protests during Occupy Wall Street when, due to unknown algorithmic workings, the very term failed to trend; or using Uber’s rides data to study commuting habits when Uber wields its driving force with strategies such as price surging under the name of (predicted but unverifiable) high demand; or using YouTube, or more fantastically Netflix data, to discern media preferences when these platforms’ entire business rests on herding sequences of viewing. (Each of these platform strategies has been creatively uncovered by critical scholars.)

…platforms’ intervention in human behavior is at once the center of platform business models and the secret that platforms strive to hide.

When we wind up finding human nature in platform data, we take administrative records from insulated digital experiments as expressions of humanity in our society. The data envelop a platform-shaped hole that may elude the scrutiny of the most sophisticated computational techniques. Such a data analytic pitfall, increasingly common in data science showcases, journalistic reporting, and academic research, effectively obscures platforms’ intervention in human behavior. And platforms’ intervention in human behavior is at once the center of platform business models and the secret that platforms strive to hide.

What are the human actions and predispositions that initially spark our curiosity? What is the kind of self-knowledge that we would cherish as a foundation for enriching our sociality, our civil and public institutions, and our democratic process? Readily resorting to platform data analytics for such knowledge risks taking platform environments as our entire world. Instead, when dealing with platform data we should aspire to “put the platforms in perspective,” foregrounding rather than obscuring their interventions in how we behave.

In this collective effort, non-corporate critical actors may find useful some of the strategies discussed in our paper.

Angela Xiao Wu is an assistant professor in Media, Culture and Communication at New York University researching information technology, knowledge production, and political cultures.

Translated from: https://points.datasociety.net/how-not-to-know-ourselves-5227c185569
