So … What Is ChatGPT Doing, and Why Does It Work?
The basic concept of ChatGPT is at some level rather simple. Start from a huge sample of human-created text from the web, books, etc. Then train a neural net to generate text that’s “like this”. And in particular, make it able to start from a “prompt” and then continue with text that’s “like what it’s been trained with”.
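The idea of "continuing a prompt with text like the training material" can be illustrated with a deliberately tiny sketch. This is not how ChatGPT actually works (it uses a large neural net, not lookup tables), but a toy bigram model captures the core notion of generating a continuation from the statistics of training text; all names here (`train_bigrams`, `continue_prompt`) are hypothetical:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record, for each word in the training text, which words follow it."""
    words = text.split()
    follows = defaultdict(list)
    for a, b in zip(words, words[1:]):
        follows[a].append(b)
    return follows

def continue_prompt(follows, prompt, n_words=5, seed=0):
    """Extend a prompt by repeatedly sampling a statistically plausible next word."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(n_words):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # no observed continuation; stop here
        out.append(rng.choice(candidates))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran"
model = train_bigrams(corpus)
print(continue_prompt(model, "the cat"))
```

Every word pair the sketch emits was seen somewhere in its "training data", which is the bigram analog of producing text "like what it's been trained with".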
As we’ve seen, the actual neural net in ChatGPT is made up of very simple elements—though billions of them. And the basic operation of the neural net is also very simple, consisting essentially of passing input derived from the text it’s generated so far “once through its elements” (without any loops, etc.) for every new word (or part of a word) that it generates.
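The "once through its elements, once per new word" structure can be sketched as follows. This is a drastically simplified stand-in for the real network (it conditions only on the single most recent token, where a transformer conditions on the whole context, and the weights here are just random placeholders), but it shows the key shape of the computation: the only loop is the outer autoregressive one, while the net itself is a straight-line pass with no internal loops:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8                               # toy vocabulary size (hypothetical)
layer1 = rng.normal(size=(VOCAB, 16))   # placeholder weights, not trained
layer2 = rng.normal(size=(16, VOCAB))

def forward(token_id):
    """One straight-line pass through the net: no loops, no recursion.
    Input: the most recent token; output: scores for each possible next token."""
    x = np.zeros(VOCAB)
    x[token_id] = 1.0                   # one-hot encode the current token
    h = np.maximum(x @ layer1, 0.0)     # hidden layer with ReLU
    return h @ layer2                   # next-token scores

def generate(start_token, n_tokens):
    """Autoregression: the only loop lives *outside* the net,
    one forward pass per newly generated token."""
    tokens = [start_token]
    for _ in range(n_tokens):
        scores = forward(tokens[-1])
        tokens.append(int(np.argmax(scores)))
    return tokens

print(generate(0, 5))
```

Each new token costs exactly one pass through the fixed stack of layers, which is why generation time scales with output length rather than with any notion of "how hard the net is thinking".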
But the remarkable—and unexpected—thing is that this process can produce text that’s successfully “like” what’s out there on the web, in books, etc. And not only is it coherent human language, it also “says things” that “follow its prompt” making use of content it’s “read”. It doesn’t always say things that “globally make sense” (or correspond to correct computations)—because (without, for example, accessing the “computational superpowers” of Wolfram|Alpha) it’s just saying things that “sound right” based on what things “sounded like” in its training material.
The specific engineering of ChatGPT has made it quite compelling. But ultimately (at least until it can use outside tools) ChatGPT is “merely” pulling out some “coherent thread of text” from the “statistics of conventional wisdom” that it’s accumulated. But it’s amazing how human-like the results are. And as I’ve discussed, this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought. ChatGPT has implicitly discovered it. But we can potentially explicitly expose it, with semantic grammar, computational language, etc.
What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar.
When it comes to training (AKA learning) the different “hardware” of the brain and of current computers (as well as, perhaps, some undeveloped algorithmic ideas) forces ChatGPT to use a strategy that’s probably rather different (and in some ways much less efficient) than the brain. And there’s something else as well: unlike even in typical algorithmic computation, ChatGPT doesn’t internally “have loops” or “recompute on data”. And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain.
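Why "no internal loops" limits computational capability can be made concrete with a small sketch. A feed-forward net performs a fixed number of layer applications regardless of its input, whereas genuine iteration runs for as long as the data demands; the function names below are hypothetical illustrations, not anything from ChatGPT itself:

```python
def halve_step(n):
    """One 'layer' of computation: a single halving step."""
    return n // 2 if n > 1 else n

def fixed_depth(n, depth):
    """A feed-forward pass: always exactly `depth` steps,
    like a net with a fixed number of layers."""
    for _ in range(depth):
        n = halve_step(n)
    return n

def with_loop(n):
    """Genuine iteration: keeps going until done, however long it takes."""
    steps = 0
    while n > 1:
        n = halve_step(n)
        steps += 1
    return steps

# Depth 3 is enough to reduce 8 to 1...
print(fixed_depth(8, 3))        # 1
# ...but not a million, while the loop always finishes the job.
print(fixed_depth(10**6, 3))    # still far from 1
print(with_loop(10**6))         # number of steps actually required
```

Any computation needing more sequential steps than the net has layers is simply out of reach in a single pass, which is one way of seeing why irreducible computations call for "outside tools".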
It’s not clear how to “fix that” and still maintain the ability to train the system with reasonable efficiency. But to do so will presumably allow a future ChatGPT to do even more “brain-like things”. Of course, there are plenty of things that brains don’t do so well—particularly involving what amount to irreducible computations. And for these both brains and things like ChatGPT have to seek “outside tools”—like Wolfram Language.
But for now it’s exciting to see what ChatGPT has already been able to do. At some level it’s a great example of the fundamental scientific fact that large numbers of simple computational elements can do remarkable and unexpected things. But it also provides perhaps the best impetus we’ve had in two thousand years to understand better just what the fundamental character and principles might be of that central feature of the human condition that is human language and the processes of thinking behind it.