Literature Translation
- Artificial Intelligence
- 《Meta-Learning with Memory-Augmented Neural Networks》
- one-shot learning
- Neural Turing Machines (NTMs)
- 《Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks》
- Meta-learning
- gradient steps
- fine-tuning
- 《Attention Is All You Need》
- 《ImageNet Classification with Deep Convolutional Neural Networks》
- 《Automatic Chain of Thought Prompting in Large Language Models》
Artificial Intelligence
《Meta-Learning with Memory-Augmented Neural Networks》
https://proceedings.mlr.press/v48/santoro16.pdf
Literature review:
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of “one-shot learning.” Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
Abstract (with vocabulary notes)
Despite recent breakthroughs (major advances) in the applications of deep neural networks, one setting (scenario) that presents (poses) a persistent (lasting; unrelenting; continuing) challenge is that of “one-shot learning” (learning from a single example). Traditional gradient-based (optimized by gradient descent) networks require a lot of data to learn, often through extensive (large-scale) iterative (repeated) training. When new data is encountered (met with), the models must inefficiently relearn their parameters to adequately incorporate (absorb) the new information without (in the absence of) catastrophic (disastrous) interference (disruption). Architectures (structures) with augmented (enhanced) memory capacities (capabilities), such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve (look up; fetch) new information, and hence (therefore) can potentially (possibly) obviate (remove the need for; eliminate) the downsides (drawbacks) of conventional (traditional) models. Here, we demonstrate (show) the ability of a memory-augmented neural network to rapidly (quickly) assimilate (absorb) new data, and leverage (make use of; a verb here, not “lever”) this data to make accurate (correct) predictions after only a few samples. We also introduce (present) a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based (position-based) focusing mechanisms.
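The last sentence of the abstract, access to external memory that focuses on memory content, is easy to make concrete. Below is a minimal sketch, assuming an NTM-style read head: a key vector emitted by the controller is compared against every memory row by cosine similarity, and a softmax over those similarities (sharpened by a strength parameter `beta`) gives the read weights. All names here (`read_memory`, `beta`, the array shapes) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of content-based memory addressing (NTM-style read).
# Illustrative only; not the paper's code or API.
import numpy as np

def cosine_similarity(key, memory, eps=1e-8):
    """Cosine similarity between a key (D,) and each memory row (N, D)."""
    key_norm = np.linalg.norm(key) + eps
    mem_norms = np.linalg.norm(memory, axis=1) + eps
    return memory @ key / (mem_norms * key_norm)

def read_memory(key, memory, beta=1.0):
    """Content-based read: softmax(beta * cosine similarity) over rows,
    then a weighted sum of memory rows as the retrieved vector."""
    scores = beta * cosine_similarity(key, memory)
    scores -= scores.max()  # numerical stability before exponentiating
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ memory, weights

# Usage: 8 memory slots of width 4; retrieve the row most similar to key.
memory = np.random.randn(8, 4)
key = memory[3] + 0.1 * np.random.randn(4)  # noisy copy of slot 3
read_vec, w = read_memory(key, memory, beta=5.0)
print(np.argmax(w))  # most read weight likely lands on slot 3
```

Because the addressing depends only on content, a stored row can be retrieved regardless of where in memory it was written, which is what allows new information to be bound to a slot and recalled after a single presentation.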