A short history of AI
In the first of six weekly briefs, we ask how AI overcame decades of underdelivering
Over the summer of 1956 a small but illustrious group gathered at
Dartmouth College in New Hampshire; it included Claude Shannon, the
begetter of information theory, and Herb Simon, the only person ever to win
both the Nobel Memorial Prize in Economic Sciences awarded by the Royal
Swedish Academy of Sciences and the Turing Award awarded by the
Association for Computing Machinery. They had been called together by a
young researcher, John McCarthy, who wanted to discuss “how to make
machines use language, form abstractions and concepts” and “solve kinds of
problems now reserved for humans”. It was the first academic gathering
devoted to what McCarthy dubbed “artificial intelligence”. And it set a
template for the field’s next 60-odd years in coming up with no advances on
a par with its ambitions.
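illustrious: US [ɪˈlʌstriəs] famous; distinguished; eminent
begetter: US [bɪˈɡetər] father; originator; creator
memorial: US [məˈmɔːriəl] (noun) a monument, statue or other object of remembrance; (adjective) commemorative
the Nobel Memorial Prize in Economic Sciences: the Nobel prize in economics
on a par (with): equal to; level with; comparable to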
The Dartmouth meeting did not mark the beginning of scientific inquiry into
machines which could think like people. Alan Turing, for whom the Turing
Award is named, wondered about it; so did John von Neumann, an inspiration
to McCarthy. By 1956 there were already a number of approaches to the
issue; historians think one of the reasons McCarthy coined the term artificial
intelligence, later AI, for his project was that it was broad enough to
encompass them all, keeping open the question of which might be best.
Some researchers favoured systems based on combining facts about the
world with axioms like those of geometry and symbolic logic so as to infer
appropriate responses; others preferred building systems in which the
probability of one thing depended on the constantly updated probabilities of
many others.
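inquiry: US [ˈɪnkwəri] investigation; the act of asking or probing
scientific inquiry: scientific research
coin the term: to invent a name for something
axioms: US [ˈæksiəmz] principles accepted without proof as starting-points (plural of axiom); note the pronunciation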
The following decades saw much intellectual ferment and argument on the
topic, but by the 1980s there was wide agreement on the way forward:
“expert systems” which used symbolic logic to capture and apply the best of
human know-how. The Japanese government, in particular, threw its weight
behind the idea of such systems and the hardware they might need. But for
the most part such systems proved too inflexible to cope with the messiness
of the real world. By the late 1980s AI had fallen into disrepute, a byword for
overpromising and underdelivering. Those researchers still in the field
started to shun the term.
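ferment: US [fərˈment, ˈfɜːrment] (political or social) unrest; agitation; turmoil; heated debate
messiness: untidiness; disorder
disrepute: US [ˌdɪsrəˈpjut] bad reputation; discredit; disgrace
shun: US [ʃʌn] to avoid deliberately; to steer clear of
overpromising: promising more than can actually be delivered
Explaining “overpromising and underdelivering”
“Overpromising and underdelivering” means promising more than can actually be achieved and then failing to meet the expectations raised. In detail:
- Overpromising: making excessive promises, that is, pledging that something will turn out very well or reach a very high standard.
- Underdelivering: failing to keep a promise, that is, producing results that fall short of the promised standard or expectation.
Example: The project manager overpromised on the delivery date but underdelivered, causing delays and dissatisfaction.
Example: The new software was heavily advertised, but it underdelivered, lacking many of the promised features.
- “Overpromising and underdelivering” in the article:
- Sentence: By the late 1980s AI had fallen into disrepute, a byword for overpromising and underdelivering.
- Explanation: AI had promised far more than it could actually do, and so failed to meet expectations.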
It was from one of those pockets of perseverance that today’s boom was
born. As the rudiments of the way in which neurons (a type of brain cell)
work were pieced together in the 1940s, computer scientists began to
wonder if machines could be wired up the same way. In a biological brain
there are connections between neurons which allow activity in one to trigger
or suppress activity in another; what one neuron does depends on what the
other neurons connected to it are doing. A first attempt to model this in the
lab (by Marvin Minsky, a Dartmouth attendee) used hardware to model
networks of neurons. Since then, layers of interconnected neurons have been
simulated in software.
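perseverance: US [ˌpɜːrsəˈvɪrəns] persistence; determination; tenacity (note the pronunciation)
rudiments: US [ˈrudəmənts] the basics; elementary knowledge; first principles (plural of rudiment)
were pieced together: were assembled bit by bit into a whole
wire up: to connect
pockets: small groups
Explaining “wire up” and “pockets”
- Wire up: to connect machines or devices to one another in a particular way. The phrase often describes linking up equipment or systems, like attaching wires to a circuit board.
- Example: Engineers wired up the new system to ensure it worked seamlessly with the existing infrastructure.
- Sentence: Computer scientists began to wonder if machines could be wired up the same way.
- Explanation: Computer scientists began to wonder whether machines could be connected in the way that neurons are connected in a biological brain.
- Pockets: small groups or small areas; here, the people or teams who kept working in a particular field or place.
- Example: There are still pockets of resistance against the new policy in some regions.
- Sentence: It was from one of those pockets of perseverance that today’s boom was born.
- Explanation: Today’s boom grew out of one of the small groups that persevered.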
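The idea that one neuron's activity depends on the neurons connected to it can be sketched in a few lines. This is a simplified illustration, not a reconstruction of Minsky's hardware: the function name, the weights and the threshold are all assumptions chosen for the example. Positive weights stand for connections that trigger activity, negative ones for connections that suppress it.

```python
def neuron_output(inputs, weights, threshold=1.0):
    """Return 1 (fire) if the weighted sum of inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Hypothetical wiring: two excitatory connections and one inhibitory connection.
weights = [0.6, 0.6, -1.0]

# Both excitatory neighbours are active and the inhibitory one is silent.
fires_when_excited = neuron_output([1, 1, 0], weights)

# The same excitation, but the inhibitory neighbour is now active too.
silent_when_suppressed = neuron_output([1, 1, 1], weights)

print(fires_when_excited, silent_when_suppressed)
```

In this toy wiring the neuron fires in the first case and stays silent in the second, because the inhibitory connection pulls the weighted sum back below the threshold.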
These artificial neural networks are not programmed using explicit rules;
instead, they “learn” by being exposed to lots of examples. During this
training the strengths of the connections between the neurons (known as
“weights”) are repeatedly adjusted so that, eventually, a given input
produces an appropriate output. Minsky himself abandoned the idea, but
others took it forward. By the early 1990s neural networks had been trained
to do things like help sort the post by recognising handwritten numbers.
Researchers thought adding more layers of neurons might allow more
sophisticated achievements. But it also made the systems run much more slowly.
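To make “learning from examples” concrete, here is a minimal sketch of the oldest such procedure, the perceptron rule, applied to a single artificial neuron. It is an illustration under stated assumptions, not the method used in any system the article mentions: the task (the logical AND of two inputs), the learning rate and the number of passes are all chosen for the example. No rule for AND is written down; the weights are simply nudged whenever the output is wrong.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is positive, else stay silent (0)."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples, epochs=20, learning_rate=0.5):
    """Adjust the weights repeatedly until the examples are reproduced."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Labelled examples: the target is 1 only when both inputs are 1.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(examples)
outputs = [predict(weights, bias, inputs) for inputs, _ in examples]
print(outputs)  # [0, 0, 0, 1]
```

A single neuron of this kind can only learn very simple patterns, which is one reason researchers wanted networks with more layers.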
A new sort of computer hardware provided a way around the problem. Its
potential was dramatically demonstrated in 2009, when researchers at
Stanford University increased the speed at which a neural net could run 70-
fold, using a gaming PC in their dorm room. This was possible because, as
well as the “central processing unit” (CPU) found in all PCs, this one also had a
“graphics processing unit” (GPU) to create game worlds on screen. And the
GPU was designed in a way suited to running the neural-network code.
around: bypassing; avoiding
Explaining “around”
- Around: here it means getting past or avoiding something, that is, finding a way to sidestep or solve a problem. The word has several senses depending on context.
- Example: The team found a way around the scheduling conflict by rescheduling the meeting.
- Sentence: A new sort of computer hardware provided a way around the problem.
- Explanation: In the article, “around” stresses that a new technology made it possible to overcome an earlier difficulty.
Coupling that hardware speed-up with more efficient training algorithms
meant that networks with millions of connections could be trained in a
reasonable time; neural networks could handle bigger inputs and, crucially,
be given more layers. These “deeper” networks turned out to be far more capable.
The power of this new approach, which had come to be known as “deep
learning”, became apparent in the ImageNet Challenge of 2012. Image
recognition systems competing in the challenge were provided with a
database of more than a million labelled image files. For any given word,
such as “dog” or “cat”, the database contained several hundred photos.
Image-recognition systems would be trained, using these examples, to
“map” input, in the form of images, onto output in the form of one-word
descriptions. The systems were then challenged to produce such descriptions
when fed previously unseen test images. In 2012 a team led by Geoff
Hinton, then at the University of Toronto, used deep learning to achieve an
accuracy of 85%. It was instantly recognised as a breakthrough.
By 2015 almost everyone in the image-recognition field was using deep
learning, and the winning accuracy at the ImageNet Challenge had reached
96%—better than the average human score. Deep learning was also being
applied to a host of other “problems…reserved for humans” which could be
reduced to the mapping of one type of thing onto another: speech
recognition (mapping sound to text), face recognition (mapping faces to
names) and translation.
In all these applications the huge amounts of data that could be accessed
through the internet were vital to success; what was more, the number of
people using the internet spoke to the possibility of large markets. And the
bigger (ie, deeper) the networks were made, and the more training data they
were given, the more their performance improved.
spoke to: suggested; indicated
Explaining “spoke to”
- Spoke to: here it means “suggested” or “indicated”: the number of internet users pointed to the size of the potential market.
- Example: His success in multiple projects spoke to his expertise in the field.
Deep learning was soon being deployed in all kinds of new products and
services. Voice-driven devices such as Amazon’s Alexa appeared. Online
transcription services became useful. Web browsers offered automatic
translations. Saying such things were enabled by AI started to sound cool,
rather than embarrassing, though it was also a bit redundant; nearly every
technology referred to as AI then and now actually relies on deep learning
under the bonnet.
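automatic translation: translation done by machine
bonnet: US [ˈbɑːnɪt] the hinged cover over a vehicle's engine (British usage; “hood” in American English); “under the bonnet” means in the inner workings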
In 2017 a qualitative change was added to the quantitative benefits being
provided by more computing power and more data: a new way of arranging
connections between neurons called the transformer. Transformers enable
neural networks to keep track of patterns in their input, even if the elements
of the pattern are far apart, in a way that allows them to bestow “attention”
on particular features in the data.
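qualitative: US [ˈkwɑːlɪteɪtɪv] relating to nature or quality rather than quantity (note the pronunciation)
qualitative change: a change in kind rather than in degree
bestow: US [bɪˈstoʊ] to confer; to give; to present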
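The “attention” idea can be sketched as follows. This is a heavily simplified illustration of scaled dot-product attention, with several assumptions: each element of the input is a hand-made two-number vector, and the same vectors serve as queries, keys and values. Real transformers learn separate projections for each role and add information about position, none of which is shown here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors):
    """For each element, weigh every element of the sequence by similarity."""
    dim = len(vectors[0])
    all_weights, outputs = [], []
    for query in vectors:
        # Similarity of this element to every element, however far apart.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # The output is a blend of all elements, weighted by attention.
        blended = [sum(w * value[j] for w, value in zip(weights, vectors))
                   for j in range(dim)]
        all_weights.append(weights)
        outputs.append(blended)
    return all_weights, outputs


# Hypothetical sequence: the first and last elements resemble each other.
sequence = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
attention_weights, outputs = attend(sequence)
first_token_weights = attention_weights[0]
print([round(w, 3) for w in first_token_weights])
```

The first element puts as much weight on the last element as on itself, and less on its immediate neighbours: distance in the sequence plays no part in the score, only similarity.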
Transformers gave networks a better grasp of context, which suited them to
a technique called “self-supervised learning”. In essence, some words are
randomly blanked out during training, and the model teaches itself to fill in
the most likely candidate. Because the training data do not have to be
labelled in advance, such models can be trained using billions of words of
raw text taken from the internet.
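The blanking-out step can be sketched like this. It shows only how training examples can be built from raw text without any human labelling; the model that learns to fill the gaps is not shown. The function name, the `[MASK]` placeholder and the masking rate are assumptions made for the illustration. (GPT-style models use a closely related set-up in which the hidden word is always the next one.)

```python
import random


def mask_words(sentence, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Blank out a random subset of words and remember the hidden answers."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    masked, answers = [], {}
    for position, word in enumerate(sentence.split()):
        if rng.random() < mask_rate:
            masked.append(mask_token)
            answers[position] = word  # the "label" comes from the text itself
        else:
            masked.append(word)
    return masked, answers


sentence = "the cat sat on the mat because it was tired"
masked_words, answers = mask_words(sentence, mask_rate=0.3, seed=1)

print(" ".join(masked_words))
print(answers)
```

Because the answers are taken from the text itself, any stretch of raw text yields training examples for free, which is what lets such models scale to billions of words.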
Mind your language model
Transformer-based large language models (LLMs) began attracting wider
attention in 2019, when a model called GPT-2 was released by OpenAI, a startup
(GPT stands for generative pre-trained transformer). Such LLMs turned out to be
capable of “emergent” behaviour for which they had not been explicitly
trained. Soaking up huge amounts of language did not just make them
surprisingly adept at linguistic tasks like summarisation or translation, but
also at things—like simple arithmetic and the writing of software—which
were implicit in the training data. Less happily it also meant they reproduced
biases in the data fed to them, which meant many of the prevailing
prejudices of human society emerged in their output.
In November 2022 a larger OpenAI model, GPT-3.5, was presented to the public in
the form of a chatbot. Anyone with a web browser could enter a prompt and
get a response. No consumer product has ever taken off quicker. Within
weeks ChatGPT was generating everything from college essays to computer
code. AI had made another great leap forward.
Where the first cohort of AI-powered products was based on recognition, this
second one is based on generation. Deep-learning models such as Stable
Diffusion and DALL-E, which also made their debuts around that time, used a
technique called diffusion to turn text prompts into images. Other models
can produce surprisingly realistic video, speech or music.
cohort: US [ˈkoʊhɔːrt] a group of people with shared characteristics or behaviour; a batch; (in biology) a group
The leap is not just technological. Making things makes a difference. ChatGPT
and rivals such as Gemini (from Google) and Claude (from Anthropic,
founded by researchers previously at OpenAI) produce outputs from
calculations just as other deep-learning systems do. But the fact that they
respond to requests with novelties makes them feel very unlike software
which recognises faces, takes dictation or translates menus. They really do
seem to “use language” and “form abstractions”, just as McCarthy had hoped.
This series of briefs will look at how these models work, how much further
their powers can grow, what new uses they will be put to, as well as what
they will not, or should not, be used for. ■
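Written at 17:53 on 27 July 2024 in Shanghai. This is the 20th article I have read from the 20 July 2024 issue of The Economist. Next, I will turn to the following issue of The Economist.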