Managing Infodemics: Using Data Science to Slow the Spread of Misinformation

With more people than ever relying on social media to keep up with current events, hosting companies have an ethical responsibility to defend against false information. Disinformation, a type of misinformation that is intended to manipulate and mislead, can create unrest and panic. Other types of misinformation, such as rumors and hoaxes, can also bring mental and physical harm to unwary readers if left unchecked. The key to stopping the spread of misinformation is to act on it swiftly, since it tends to travel very quickly. In fact, studies show that falsehoods spread far faster than the truth (source). Social media companies have put protocols in place to limit the virality of inaccurate content, but these only take effect once the content has been reviewed by third-party fact-checking partners. The focus is therefore on rapid assessment of veracity. Technology companies have shown remarkable ingenuity here, namely in using Machine Learning algorithms to complement fact-checking programs in identifying inaccurate content. However, this is not yet a complete solution. In this article, we’ll study the process and explore how it might evolve.

How Misinformation Is Identified

[Figure: Fact-checking program workflow]

The process of evaluating a piece of content’s accuracy begins with an internal screening for potential falsehood, in which automation and Machine Learning models pick up various signals. If the content is flagged as potential misinformation, it is routed to fact-checking partners for further review. After manual research and/or consultation with the primary source, a content rating is assigned. The rating tells the social media company whether action needs to be taken, and it also helps train the Machine Learning models to become better at catching misinformation in the future. Machine Learning contributes to the process in the following ways:

  • The prediction models significantly reduce the number of reviews that third-party fact-checking partners need to perform
  • Finding duplicate or near-duplicate content frees up capacity for fact-checking partners to review new instances of misinformation
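
The near-duplicate idea in the list above can be sketched in a few lines. This is a minimal illustration and not any platform's actual method: it assumes word-level shingles compared with Jaccard similarity, and the names `shingles`, `jaccard`, `find_reviewed_match`, and the 0.6 threshold are all hypothetical choices made for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text, ignoring punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_reviewed_match(post, reviewed, threshold=0.6):
    """Return (rating, similarity) of the closest already-rated text, or None.

    `reviewed` maps previously fact-checked text to its rating.
    The threshold is an illustrative value, not a recommended setting.
    """
    post_shingles = shingles(post)
    best = None
    for text, rating in reviewed.items():
        similarity = jaccard(post_shingles, shingles(text))
        if similarity >= threshold and (best is None or similarity > best[1]):
            best = (rating, similarity)
    return best
```

A post that matches an already-rated item can inherit that rating instead of going back into the fact-checkers' queue. At real scale, pairwise comparison would be too slow, and an approximate scheme such as MinHash would be the usual substitute.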

It’s quite a robust process, but not one without challenges. The main ones are:

  • The large and growing number of active users makes the platform a target for coordinated propaganda attacks, which brings urgency and a heavy workload to the fact-checking program
  • The scarcity of verified deceptive content to use as a corpus for training predictive classification models is a roadblock for Machine Learning methods. The problem is exacerbated by the desire for narrower categories of “truthiness”, which require different treatments and therefore dilute the available data
  • “Bad actors” who hide misleading context behind genuine content are hard to detect. For example, a meme can layer text on top of a photo or video to form deceitful content
  • Satire may be misunderstood by people and is even more difficult for computers to interpret

[Figure: Monthly active users continue to grow as social media becomes the dominant medium for people to get news]

A Closer Look at the Screening Process

[Figure: Automation and Machine Learning look for signals to screen content]
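
The signals shown in the figure are not listed in the text, so the sketch below uses made-up ones purely to illustrate the shape of the screening step: several signals are combined into a single score, and only posts above a threshold are routed to fact-checkers. The signal names, weights, bias, and threshold are all assumptions. In a real system the weights would be learned from fact-checker ratings instead of being set by hand.

```python
import math
from dataclasses import dataclass, field

# Hypothetical signals and hand-set weights, chosen only for illustration.
WEIGHTS = {
    "user_reports": 0.8,          # readers flagging the post as false
    "source_prior_strikes": 1.2,  # earlier false ratings for the same source
    "comment_disbelief": 0.5,     # comments expressing disbelief
}
BIAS = -3.0


@dataclass
class Post:
    post_id: str
    signals: dict = field(default_factory=dict)


def misinformation_score(post):
    """Logistic score in (0, 1) computed from a weighted sum of the signals."""
    z = BIAS + sum(weight * post.signals.get(name, 0) for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def screen(posts, threshold=0.5):
    """Split posts into those routed to fact-checkers and those left alone."""
    to_review, passed = [], []
    for post in posts:
        if misinformation_score(post) >= threshold:
            to_review.append(post)
        else:
            passed.append(post)
    return to_review, passed
```

Raising the threshold sends fewer posts to human reviewers at the cost of missing more borderline cases, which is the trade-off behind reducing the number of third-party reviews.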

In Development

Technology companies are working to improve this process by significantly expanding the databases that will help them build Artificial Intelligence to combat sophisticated attacks such as “deep fakes” and “weaponized memes”. The effectiveness of the algorithms and models depends largely on having a diverse data set to train on. Fortunately, with wide collaboration on data sharing across the technology community, the models are becoming better at understanding content. Nevertheless, this is a work in progress.

Recommendations

Some measures are worth exploring for more immediate improvement. One recommendation that I’m exploring is the prioritization and specialization of content for third-party fact-checkers. We can run A/B tests that compare review turnaround time and overall virality to measure the impact of these measures.

  • Prioritization puts dangerous content that has a propensity to spread at the front of the review queue, before it goes viral
  • Specialization directs content to third-party fact-checkers within their area of expertise to cut the amount of time required for review
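
One way to picture the two recommendations together is a set of priority queues, one per topic, from which each fact-checker pulls the most urgent item in their own specialty. The sketch below is an assumption-laden illustration: the `ReviewRouter` class, the topic labels, and the priority formula (estimated harm multiplied by current share rate) are hypothetical and not taken from any fact-checking program.

```python
import heapq
from collections import defaultdict


class ReviewRouter:
    """One priority queue per topic; checkers pull the most urgent item they can handle."""

    def __init__(self):
        self._queues = defaultdict(list)
        self._counter = 0  # tie-breaker so equal priorities come out in submission order

    def submit(self, post_id, topic, harm, share_rate):
        # Hypothetical priority: expected harm weighted by how fast the post is spreading.
        priority = harm * share_rate
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._queues[topic], (-priority, self._counter, post_id))
        self._counter += 1

    def next_for(self, expertise):
        """Return the most urgent post_id among the checker's topics, or None."""
        candidates = [topic for topic in expertise if self._queues.get(topic)]
        if not candidates:
            return None
        most_urgent = min(candidates, key=lambda topic: self._queues[topic][0])
        return heapq.heappop(self._queues[most_urgent])[2]
```

A fact-checker covering health topics would call `router.next_for(["health"])` and receive the health post with the highest priority, while a generalist could pass several topics and receive the most urgent item across all of them.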

Summary

An infodemic is a disease that plagued us long before the recent health crisis. Without proper management, it can do tremendous harm to our society. Thankfully, there are technological tools to help us mitigate those risks. We reviewed the fact-checking process and, specifically, how Machine Learning is being applied in this use case.

Translated from: https://towardsdatascience.com/managing-infodemics-slowing-the-spread-of-misinformation-b8b74e3e2618
