By Angela Xiao Wu, assistant professor at New York University
This blog post comes out of a paper by Angela Xiao Wu and Harsh Taneja that offers a new take on social sciences’ ongoing embrace of platform log data by questioning their measurement conditions. The distinct nature of platform datafication is foregrounded in comparison with the longer tradition of third-party audience measurement.
Surfing a wave of societal awe and excitement about “Big Data,” platforms formed a habit of releasing “data science” insights on what we search, like, express, purchase, obsess over, attempt to hide, and prefer to forget. These colorful graphics and juicy taglines (most notably from OKCupid and PornHub, whose data lay claim to the quirks and desires of our intimate lives) are always popular novelties to behold, ponder, and reference. If knowing ourselves through platform data is a practice of our age, it is certainly not confined to platforms themselves. Aspiring data scientists, curious programmers, vigilant data journalists, analysts of civic organizations and political campaigns, and (last but not least) academic social scientists such as myself make up the growing field that is figuring out who we are, what we do, and how we sway in the swathes of platform data.
Such data can be impressive due to their unprecedented granularity and volume, as well as the fact that they are seemingly “unobtrusive” recordings of our activities when no one is watching. These apparent strengths of data for social research are outweighed by a problem in what we call the “measurement conditions”: platform data are platforms’ records of their own behavioral experimentation. Trying to know ourselves through platform data tends to yield partial and contorted accounts of human behavior that conceal platform interventions. Moreover, though increasingly produced by non-corporate actors, such knowledge accounts and narratives tend to be amenable to platform money-making and image-building.
To be clear, for years many have contested the ascendance of platform data as a staple in quantitative social sciences alongside conventional data collection methods, such as surveys and experiments. These contestations focus on issues about the data’s representativeness, privacy concerns, and precarious access at the mercy of platform companies. The “measurement conditions” problem, however, is entirely different. In our newly published paper, Harsh Taneja and I call for attention to the circumstances under which these data come about: what purpose does the measurement initially serve? As historians have told us, measurement — or converting parts of the social world into quantities according to some enduring instrument — is not an end in itself, but a means for managing events and coordinating actions. Measurement is thus a product of the social and institutional context (i.e., “measurement conditions”) in which it is called upon and carried out.
A closer look at the measurement conditions of platforms allows us to rethink the nature of platform log data: they are essentially “administrative data” that platforms generate to realize their own organizational goals, which go little beyond enlarging advertising income, harvesting intermediary fees, and attracting venture capital. These companies track user engagements with their platforms to evaluate and showcase “product performance.” Such data analytics are integral to the iterative process whereby platforms tinker with their digital architectures in attempts to shape usage in ways that maximize profits.
In other words, platform log data are not “unobtrusive” recordings of human behavior out in the wild. Rather, their measurement conditions determine that they are accounts of putative user activity: “putative” in the sense that platforms are often incentivized to keep bots and other fake accounts around, because, from their standpoint, it’s always a numbers game with investors, marketers, and the actual, oft-insecure users. With calculated neglect come calibrated nudges: platform user activity, in the first place, is induced, coaxed, and experimented on by the platform environment. From multilayered graphical organization to complex algorithmic recommendation, it is from all these platform arrangements that user activity arises. Conversely, it is to make decisions about these arrangements that platform companies measure usage.
Of course, when large volumes of platform log data become available for inquisitive parties to crunch, platforms keep the other part of the iterative process (the shifting platform arrangements aimed at nudging usage) in the dark. Thus, it is difficult to tell to what extent the patterns emerging from platform data are about “us,” rather than testimonies to the effects of platform nudges. When we are experimental subjects oblivious to platforms’ treatments of us, taking our induced behaviors as “natural” means regarding these platforms as benign, transparent vehicles for our inherent intentions, and thus obscuring their prevailing power.
Consider peeking into our innate preferences (by race, geography, and daily rhythms!) based on “patterns” that emerge from PornHub’s log data, when the site’s visual design, temporal pacing, and content curation are all about eliciting and extending the user’s state of pleasure and pleasure seeking; or using Twitter data to study the insurgent online protests during Occupy Wall Street when, due to unknown algorithmic workings, the very term failed to trend; or using Uber’s rides data to study commuting habits when Uber wields its driving force with strategies such as surge pricing in the name of (predicted but unverifiable) high demand; or using YouTube, or more fantastically Netflix data, to discern media preferences when these platforms’ entire business rests on herding sequences of viewing. (Each of these platform strategies has been creatively uncovered by critical scholars.)
When we wind up finding human nature in platform data, we take administrative records from insulated digital experiments as expressions of humanity in our society. The data envelop a platform-shaped hole that may elude the scrutiny of even the most sophisticated computational techniques. Such a data analytic pitfall, increasingly common in data science showcases, journalistic reporting, and academic research, effectively obscures platforms’ intervention in human behavior. And platforms’ intervention in human behavior is at once the center of platform business models and the secret that platforms strive to hide.
What are the human actions and predispositions that initially spark our curiosity? What is the kind of self-knowledge that we would cherish as a foundation for enriching our sociality, our civil and public institutions, and our democratic process? Readily resorting to platform data analytics for such knowledge risks taking platform environments as our entire world. Instead, when dealing with platform data we should aspire to “put the platforms in perspective,” foregrounding rather than obscuring their interventions in how we behave.
In this collective effort, non-corporate critical actors may find useful some of the strategies discussed in our paper.
Angela Xiao Wu is an assistant professor in Media, Culture and Communication at New York University researching information technology, knowledge production, and political cultures.
Translated from: https://points.datasociety.net/how-not-to-know-ourselves-5227c185569