[Paper Reading] Research on Few-Shot Learning

Related Literature

Generalizing from a Few Examples: A Survey on Few-Shot Learning

Author: Yaqing Wang, Quanming Yao, James T. Kwok, Lionel M. Ni
Abstract: Artificial intelligence succeeds in data-intensive applications, but it lacks the ability of learning from a limited number of examples. To tackle this problem, Few-Shot Learning (FSL) is proposed. It can rapidly generalize from new tasks of limited supervised experience using prior knowledge. To fully understand FSL, we conduct a survey study. We first clarify a formal definition for FSL. Then we figure out that the unreliable empirical risk minimizer is the core issue of FSL. Based on how prior knowledge is used to deal with the core issue, we categorize different FSL methods into three perspectives: data uses the prior knowledge to augment the supervised experience, model constrains the hypothesis space by prior knowledge, and algorithm uses prior knowledge to alter the search for the parameter of the best hypothesis in the hypothesis space. Under this unified taxonomy, we provide thorough discussion of pros and cons across different categories. Finally, we propose possible directions for FSL in terms of problem setup, techniques, applications and theories, in hope of providing insights to following research.
Summary: Few-Shot Learning (FSL) aims to bridge the gap between AI and human-like learning. It can learn new tasks of limited supervised information by incorporating prior knowledge. FSL acts as a test-bed for AI, helps to relieve the burden of collecting large-scale supervised data for industrial usages, or makes the learning of rare cases possible. With both academic dream of AI and industrial needs for cheap learning, FSL draws much attention and becomes a hot topic. In this survey, we provide a comprehensive and systematic review of FSL. We first formally define FSL, and discuss the relatedness and difference of FSL with respect to relevant learning problems such as semi-supervised learning, imbalanced learning, transfer learning and meta-learning. Then, we point out the core issue of FSL based on error decomposition in machine learning. We figure out that it is the unreliable empirical risk minimizer that makes FSL hard to learn. This can be relieved by satisfying or reducing the sample complexity of learning. Understanding the core issue can help categorize different works into data, model and algorithm according to how they solve the core issue using prior knowledge: data augments the supervised experience of FSL, model constrains the hypothesis space of FSL, and algorithm alters the search strategy for the parameter of the best hypothesis in hypothesis space to solve FSL. Within each category, the pros and cons of different categories are thoroughly discussed and some summary of the insights under each category is presented. For future works, we provide possible directions including problem setup, techniques, applications and theories to explore, in hope of inspiring future research in FSL.
Keywords: Few-shot learning, One-shot learning, Low-shot learning, Small sample learning, Meta-learning, Prior knowledge

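The unreliable empirical risk minimizer is the core issue of FSL. Based on how prior knowledge is used to handle this core issue, FSL methods can be grouped into three perspectives: data uses prior knowledge to augment the supervised experience, model uses prior knowledge to constrain the hypothesis space, and algorithm uses prior knowledge to alter the search for the parameters of the best hypothesis in the hypothesis space. This survey introduces existing few-shot learning methods from the three aspects of Data, Model, and Algorithm, and explains in mathematical form where the difficulty of few-shot learning lies.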
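The "mathematical form" referred to here is, as far as these notes can tell, the standard error decomposition from statistical learning theory. The block below is a sketch of that decomposition; the symbol names are assumptions chosen for this note and may not match the survey's notation exactly.

```latex
% Notation (assumed for this sketch):
%   R(h)      expected risk of hypothesis h
%   R_I(h)    empirical risk of h on I training samples
%   \hat{h} = \arg\min_{h} R(h)                        (best function overall)
%   h^{*}   = \arg\min_{h \in \mathcal{H}} R(h)        (best function inside \mathcal{H})
%   h_{I}   = \arg\min_{h \in \mathcal{H}} R_{I}(h)    (empirical risk minimizer)
\mathbb{E}\big[ R(h_I) - R(\hat{h}) \big]
  = \underbrace{\mathbb{E}\big[ R(h^{*}) - R(\hat{h}) \big]}_{\text{approximation error (depends on } \mathcal{H})}
  + \underbrace{\mathbb{E}\big[ R(h_I) - R(h^{*}) \big]}_{\text{estimation error (depends on } \mathcal{H} \text{ and } I)}
```

With only a few samples, I is small, so the estimation error term is large and h_I is an unreliable stand-in for h^*. Read this way, the three perspectives correspond to three ways of shrinking that term: data raises the effective I, model shrinks the hypothesis space, and algorithm changes how h_I is searched for inside it.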

Dynamic Few-Shot Visual Learning without Forgetting

Author: Spyros Gidaris, Nikos Komodakis
Abstract: The human visual system has the remarkable ability to be able to effortlessly learn novel concepts from only a few examples. Mimicking the same behavior on machine learning vision systems is an interesting and very challenging research problem with many practical advantages on real world vision applications. In this context, the goal of our work is to devise a few-shot visual learning system that during test time it will be able to efficiently learn novel categories from only a few training data while at the same time it will not forget the initial categories on which it was trained (here called base categories). To achieve that goal we propose (a) to extend an object recognition system with an attention based few-shot classification weight generator, and (b) to redesign the classifier of a ConvNet model as the cosine similarity function between feature representations and classification weight vectors. The latter, apart from unifying the recognition of both novel and base categories, it also leads to feature representations that generalize better on “unseen” categories. We extensively evaluate our approach on Mini-ImageNet where we manage to improve the prior state-of-the-art on few-shot recognition (i.e., we achieve 56.20% and 73.00% on the 1-shot and 5-shot settings respectively) while at the same time we do not sacrifice any accuracy on the base categories, which is a characteristic that most prior approaches lack. Finally, we apply our approach on the recently introduced few-shot benchmark of Bharath and Girshick [4] where we also achieve state-of-the-art results.
Summary: In our work we propose a dynamic few-shot object recognition system that is able to quickly learn novel categories without forgetting the base categories on which it was trained, a property that most prior approaches on the few-shot learning task neglect to fulfill. To achieve that goal we propose a novel attention based few-shot classification weight generator as well as a cosine-similarity based ConvNet classifier. This allows to recognize in a unified way both novel and base categories and also leads to learn feature representations with better generalization capabilities. We evaluate our framework on Mini-ImageNet and the recently introduced few-shot benchmark of Bharath and Girshick [4] where we demonstrate that our approach is capable of both maintaining high recognition accuracy on base categories and to achieve excellent few-shot recognition accuracy on novel categories that surpasses prior state-of-the-art approaches by a significant margin.
Keywords: Few-Shot Visual Learning
code: https://github.com/gidariss/FewShotWithoutForgetting
dataset: Mini-ImageNet
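To make the cosine-similarity classifier from the abstract more concrete, here is a minimal sketch in plain Python. It assumes features have already been extracted by some backbone. The `scale` value, the toy vectors, and `novel_weight` are illustrative assumptions; in the paper the novel-class weight would come from the attention-based weight generator, which is not implemented here.

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def cosine_scores(feature, class_weights, scale=10.0):
    """One score per class: scale * cos(feature, class weight vector)."""
    z = l2_normalize(feature)
    scores = []
    for w in class_weights:
        w_hat = l2_normalize(w)
        scores.append(scale * sum(a * b for a, b in zip(z, w_hat)))
    return scores

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: two base-class weights plus one novel-class weight (all hypothetical).
base_weights = [[1.0, 0.0], [0.0, 2.0]]
novel_weight = [3.0, 3.0]
feature = [0.9, 1.1]

probs = softmax(cosine_scores(feature, base_weights + [novel_weight]))
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Because both the feature and the weights are normalized, the magnitude of a weight vector does not affect its score. That is one plausible reason base and novel weights can share a single classifier even though they are produced in different ways.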

Edge-Labeling Graph Neural Network for Few-shot Learning

Author: Jongmin Kim, Taesup Kim, Sungwoong Kim, Chang D. Yoo
Abstract: In this paper, we propose a novel edge-labeling graph neural network (EGNN), which adapts a deep neural network on the edge-labeling graph, for few-shot learning. The previous graph neural network (GNN) approaches in few-shot learning have been based on the node-labeling framework, which implicitly models the intra-cluster similarity and the inter-cluster dissimilarity. In contrast, the proposed EGNN learns to predict the edge-labels rather than the node-labels on the graph that enables the evolution of an explicit clustering by iteratively updating the edge-labels with direct exploitation of both intra-cluster similarity and the inter-cluster dissimilarity. It is also well suited for performing on various numbers of classes without retraining, and can be easily extended to perform a transductive inference. The parameters of the EGNN are learned by episodic training with an edge-labeling loss to obtain a well-generalizable model for unseen low-data problem. On both of the supervised and semi-supervised few-shot image classification tasks with two benchmark datasets, the proposed EGNN significantly improves the performances over the existing GNNs.
Summary: This work addressed the problem of few-shot learning, especially on the few-shot classification task. We proposed the novel EGNN which aims to iteratively update edge-labels for inferring a query association to an existing support clusters. In the process of EGNN, a number of alternative node and edge feature updates were performed using explicit intra-cluster similarity and inter-cluster dissimilarity through the graph layers having different parameter sets, and the edge-label prediction was obtained from the final edge feature. The edge-labeling loss was used to update the parameters of the EGNN with episodic training. Experimental results showed that the proposed EGNN outperformed other few-shot learning algorithms on both of the supervised and semi-supervised few-shot image classification tasks. The proposed framework is applicable to a broad variety of other meta-clustering tasks. For future work, we can consider another training loss which is related to the valid graph clustering such as the cycle loss [35]. Another promising direction is graph sparsification, e.g. constructing K-nearest neighbor graphs [50], that will make our algorithm more scalable to larger number of shots.
Keywords: Few-Shot Learning
code: https://github.com/khy0809/fewshot-egnn
dataset: miniImageNet, tieredImageNet
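The edge-labeling idea can be illustrated without any neural network. In the sketch below, an edge between two samples is labeled 1 when they share a class and 0 otherwise, and a query is classified by summing its predicted edge scores per support class. Both function names are hypothetical, and the voting rule is a simplification assumed for this note rather than the paper's exact formulation.

```python
def edge_labels(node_labels):
    """Ground-truth edge label matrix: 1 if two nodes share a class, else 0."""
    n = len(node_labels)
    return [
        [1 if node_labels[i] == node_labels[j] else 0 for j in range(n)]
        for i in range(n)
    ]

def classify_query(edge_probs_to_support, support_labels):
    """Pick the class whose support nodes receive the largest total edge score.

    edge_probs_to_support[j] is the predicted probability that the query and
    support sample j belong to the same class (here supplied by hand).
    """
    votes = {}
    for prob, label in zip(edge_probs_to_support, support_labels):
        votes[label] = votes.get(label, 0.0) + prob
    return max(votes, key=votes.get)

support_labels = ["cat", "dog", "cat"]
targets = edge_labels(support_labels)
prediction = classify_query([0.9, 0.2, 0.7], support_labels)
```

Since the prediction only depends on edges to whichever support nodes are present, nothing in this readout is tied to a fixed number of classes, which is consistent with the abstract's remark about handling various numbers of classes without retraining.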

Few-Shot Learning with Localization in Realistic Settings

Author: Davis Wertheimer, Bharath Hariharan
Abstract: Traditional recognition methods typically require large, artificially-balanced training classes, while few-shot learning methods are tested on artificially small ones. In contrast to both extremes, real world recognition problems exhibit heavy-tailed class distributions, with cluttered scenes and a mix of coarse and fine-grained class distinctions. We show that prior methods designed for few-shot learning do not work out of the box in these challenging conditions, based on a new “meta-iNat” benchmark. We introduce three parameter-free improvements: (a) better training procedures based on adapting cross-validation to meta-learning, (b) novel architectures that localize objects using limited bounding box annotations before classification, and (c) simple parameter-free expansions of the feature space based on bilinear pooling. Together, these improvements double the accuracy of state-of-the-art models on meta-iNat while generalizing to prior benchmarks, complex neural architectures, and settings with substantial domain shift.
Summary: In this paper, we have shown that past work on classical or few-shot balanced benchmarks fails to generalize to realistic heavy-tailed classification problems. We show that parameter-free localization from limited bounding box annotations, and improvements to training and representation, provide large gains beyond those previously observed in data abundant settings. Ours is but a first step in addressing broader questions of class balance and data scarcity.
Keywords: Few-Shot Learning
code:
dataset: mini-ImageNet, meta-iNat, Supercategory meta-iNat

LaSO: Label-Set Operations networks for multi-label few-shot learning

Author: Amit Alfassy, Leonid Karlinsky, Amit Aides, Joseph Shtok et al.
Abstract: Example synthesis is one of the leading methods to tackle the problem of few-shot learning, where only a small number of samples per class are available. However, current synthesis approaches only address the scenario of a single category label per image. In this work, we propose a novel technique for synthesizing samples with multiple labels for the (yet unhandled) multi-label few-shot classification scenario. We propose to combine pairs of given examples in feature space, so that the resulting synthesized feature vectors will correspond to examples whose label sets are obtained through certain set operations on the label sets of the corresponding input pairs. Thus, our method is capable of producing a sample containing the intersection, union or set-difference of labels present in two input samples. As we show, these set operations generalize to labels unseen during training. This enables performing augmentation on examples of novel categories, thus, facilitating multi-label few-shot classifier learning. We conduct numerous experiments showing promising results for the label-set manipulation capabilities of the proposed approach, both directly (using the classification and retrieval metrics), and in the context of performing data augmentation for multi-label few-shot learning. We propose a benchmark for this new and challenging task and show that our method compares favorably to all the common baselines. Our code will be made available upon acceptance.
Summary: In this paper we have presented the label set manipulation concept and have demonstrated its utility for a new and challenging task of the multi-label few-shot classification. Our results show that label set manipulation holds a good potential for this and potentially other interesting applications, and we hope that this paper will convince more researchers to look into this interesting problem. Natural images are inherently multi-label. We have focused on two major sources of labels: objects and attributes. Yet, other possible sources of image labels, such as the background context, object actions, interactions and relations, etc., may be further explored in a future work. One of the interesting future directions of this work includes exploring additional architectures for the proposed LaSO networks. For example an encoder-decoder architecture, where the encoder and the decoder subnets are shared between the LaSO networks, and the label-set operations themselves are implemented between the encoder and the decoder via the analytic expressions proposed in section 4.1.2. This alternative architecture has the potential to disentangle the feature space into a basis of independent constituents related to independent labels facilitating the easier use of analytic variants in such a disentangled space. Another interesting future research direction is to use the proposed techniques in the context of few-shot multi-label semi-supervised learning, where a large scale unlabeled data is available, and the proposed approach could be used for automatic retrieval of more auto-labeled examples with arbitrarily mixed label sets (obtained by mixing the few provided examples). In addition, the proposed approach might also prove useful for the interesting visual dialog use case, where the user can manipulate the returned query results by pointing out or showing visual examples of what she/he likes or doesn’t like.
Finally, the approach proposed in this work is related to a well known issue in Machine Learning, known as dataset bias [35] or out-of-context recognition [1, 5]. An interesting future work direction for our proposed approach is to help reduce the bias dictated by the specific provided set of images by enabling a better control over the content of the samples.
Keywords: Few-Shot Learning
code: https://github.com/leokarlin/LaSO
dataset: MSCOCO, CelebA
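The label-set side of LaSO is easy to show directly. The sketch below computes the label sets that a synthesized feature vector would be expected to carry after an intersection, union, or subtraction of two inputs. It only covers the targets; the networks that operate in feature space are not shown. The function names and the toy vocabulary are assumptions for illustration.

```python
def target_label_sets(labels_a, labels_b):
    """Label sets a synthesized sample should carry for each set operation."""
    a, b = set(labels_a), set(labels_b)
    return {
        "intersection": a & b,
        "union": a | b,
        "subtraction": a - b,  # labels in A that are not in B
    }

def multi_hot(label_set, vocabulary):
    """Encode a label set as a 0/1 vector over a fixed label vocabulary."""
    return [1 if name in label_set else 0 for name in vocabulary]

vocabulary = ["car", "dog", "person", "tree"]
sets = target_label_sets(["person", "dog"], ["dog", "car"])
union_target = multi_hot(sets["union"], vocabulary)
```

A multi-hot target like `union_target` is the kind of vector a multi-label classification loss would compare against the classifier's prediction on the synthesized feature.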

Learning to Compare: Relation Network for Few-Shot Learning

Author: Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang et al.
Abstract: We present a conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only few examples from each. Our method, called the Relation Network (RN), is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting. Once trained, a RN is able to classify images of new classes by computing relation scores between query images and the few examples of each new class without further updating the network. Besides providing improved performance on few-shot learning, our framework is easily extended to zero-shot learning. Extensive experiments on five benchmarks demonstrate that our simple approach provides a unified and effective approach for both of these two tasks.
Summary: We proposed a simple method called the Relation Network for few-shot and zero-shot learning. Relation network learns an embedding and a deep non-linear distance metric for comparing query and sample items. Training the network end-to-end with episodic training tunes the embedding and distance metric for effective few-shot learning. This approach is far simpler and more efficient than recent few-shot meta-learning approaches, and produces state-of-the-art results. It further proves effective at both conventional and generalised zero-shot settings.
Keywords: Few-Shot Learning
code: https://github.com/lzrobots/DeepEmbeddingModel_ZSL
dataset: Omniglot, miniImageNet, Animals with Attributes (AwA), Caltech-UCSD Birds-200-2011 (CUB)
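Several papers in these notes rely on episodic training, where each episode is built to simulate the few-shot setting. The sketch below shows one common way to sample an N-way K-shot episode from a labeled pool. The function name, the argument names, and the toy data are assumptions for this note, not code from any of the listed repositories.

```python
import random

def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    data_by_class maps a class name to a list of its examples.
    Returns (support, query), each a list of (example, episode_label) pairs,
    where episode_label is an index in 0..n_way-1 local to this episode.
    """
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw disjoint support and query examples for this class.
        items = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in items[:k_shot]]
        query += [(x, episode_label) for x in items[k_shot:]]
    return support, query

# Toy pool: 6 classes with 10 placeholder examples each.
pool = {c: [f"{c}_{i}" for i in range(10)] for c in "abcdef"}
support, query = sample_episode(pool, n_way=5, k_shot=1, n_query=3,
                                rng=random.Random(0))
```

Relabeling classes to 0..N-1 inside each episode means the model cannot memorize a fixed class index and has to compare query items against the support items it is given.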

Matching Networks for One Shot Learning

Author: Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu et al.
Abstract: Learning from a few examples remains a key challenge in machine learning. Despite recent advances in important domains such as vision and language, the standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. We then define one-shot learning problems on vision (using Omniglot, ImageNet) and language tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches. We also demonstrate the usefulness of the same model on language modeling by introducing a one-shot task on the Penn Treebank.
Summary: In this paper we introduced Matching Networks, a new neural architecture that, by way of its corresponding training regime, is capable of state-of-the-art performance on a variety of one-shot classification tasks. There are a few key insights in this work. Firstly, one-shot learning is much easier if you train the network to do one-shot learning. Secondly, non-parametric structures in a neural network make it easier for networks to remember and adapt to new training sets in the same tasks. Combining these observations together yields Matching Networks. Further, we have defined new one-shot tasks on ImageNet, a reduced version of ImageNet (for rapid experimentation), and a language modeling task. An obvious drawback of our model is the fact that, as the support set S grows in size, the computation for each gradient update becomes more expensive. Although there are sparse and sampling-based methods to alleviate this, much of our future efforts will concentrate around this limitation. Further, as exemplified in the ImageNet dogs subtask, when the label distribution has obvious biases (such as being fine grained), our model suffers. We feel this is an area with exciting challenges which we hope to keep improving in future work.
Keywords: Few-Shot Learning
dataset: Omniglot, ImageNet, Penn Treebank
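The mapping from a support set and a query to a label can be sketched as attention over the support set: the prediction is a weighted sum of support labels, with weights given by a softmax over cosine similarities. The code below assumes embeddings are already computed and leaves out the paper's context-dependent embedding functions; the vectors are toy values.

```python
import math

def cosine(u, v, eps=1e-12):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def matching_predict(query_emb, support_embs, support_labels, n_classes):
    """Class probabilities as an attention-weighted sum of support labels."""
    sims = [cosine(query_emb, s) for s in support_embs]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    attention = [e / total for e in exps]  # softmax over the support set
    probs = [0.0] * n_classes
    for weight, label in zip(attention, support_labels):
        probs[label] += weight  # equivalent to weight * one_hot(label)
    return probs

support_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
support_labels = [0, 1, 0]
probs = matching_predict([1.0, 0.05], support_embs, support_labels, n_classes=2)
```

The loop over the whole support set for every query is also where the cost mentioned in the summary comes from: the work per prediction grows with the size of S.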

Learning to Learn: Meta-Critic Networks for Sample Efficient Learning

Author: Flood Sung, Li Zhang, Tao Xiang, Timothy Hospedales et al.
Abstract: We propose a novel and flexible approach to meta-learning for learning-to-learn from only a few examples. Our framework is motivated by actor-critic reinforcement learning, but can be applied to both reinforcement and supervised learning. The key idea is to learn a meta-critic: an action-value function neural network that learns to criticise any actor trying to solve any specified task. For supervised learning, this corresponds to the novel idea of a trainable task-parametrised loss generator. This meta-critic approach provides a route to knowledge transfer that can flexibly deal with few-shot and semi-supervised conditions for both reinforcement and supervised learning. Promising results are shown on both reinforcement and supervised learning problems.
Summary: We have presented a very flexible meta-learning method for few-shot learning that applies to both reinforcement and supervised learning settings, and can seamlessly exploit unlabelled data. Promising results are obtained on a variety of problems. In future work we would like to evaluate our method to more challenging problems in supervised learning and RL for control in terms of difficulty and input/output dimensionality; and extend it to the continual learning setting.
Keywords: Computer Science - Learning
dataset: sinusoidal a sin(x+b) and linear cx+d function regression, Dependent Multi-Arm Bandit

Meta-Learning for Semi-Supervised Few-Shot Classification

Author: Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell et al.
Abstract: In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its corresponding test set. In this work, we advance this few-shot classification paradigm towards a scenario where unlabeled examples are also available within each episode. We consider two situations: one where all unlabeled examples are assumed to belong to the same set of classes as the labeled examples of the episode, as well as the more challenging situation where examples from other distractor classes are also provided. To address this paradigm, we propose novel extensions of Prototypical Networks (Snell et al., 2017) that are augmented with the ability to use unlabeled examples when producing prototypes. These models are trained in an end-to-end way on episodes, to learn to leverage the unlabeled examples successfully. We evaluate these methods on versions of the Omniglot and miniImageNet benchmarks, adapted to this new framework augmented with unlabeled examples. We also propose a new split of ImageNet, consisting of a large set of classes, with a hierarchical structure. Our experiments confirm that our Prototypical Networks can learn to improve their predictions due to unlabeled examples, much like a semi-supervised algorithm would.
Summary: In this work, we propose a novel semi-supervised few-shot learning paradigm, where an unlabeled set is added to each episode. We also extend the setup to more realistic situations where the unlabeled set has novel classes distinct from the labeled classes. To address the problem that current few-shot classification datasets are too small for a labeled vs. unlabeled split and also lack hierarchical levels of labels, we introduce a new dataset, tieredImageNet. We propose several novel extensions of Prototypical Networks, and they show consistent improvements under semi-supervised settings compared to our baselines. As future work, we are working on incorporating fast weights (Ba et al., 2016; Finn et al., 2017) into our framework so that examples can have different embedding representations given the contents in the episode.
Keywords: Few-Shot Learning
code: https://github.com/renmengye/few-shot-ssl-public
dataset: Omniglot, miniImageNet, tieredImageNet
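One way to use unlabeled examples when producing prototypes is a single soft k-means step: each unlabeled embedding is softly assigned to the existing prototypes, and the prototypes are then recomputed as weighted means. The sketch below shows that basic variant only and does not handle distractor classes. The function names and toy numbers are assumptions for this note.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def refine_prototypes(prototypes, support_counts, unlabeled_embs):
    """One soft k-means refinement step.

    prototypes[c] is the mean support embedding of class c and
    support_counts[c] is how many support examples produced it.
    """
    dim = len(prototypes[0])
    # Start from the labeled examples: sum of embeddings = mean * count.
    sums = [[p[d] * n for d in range(dim)]
            for p, n in zip(prototypes, support_counts)]
    weights = [float(n) for n in support_counts]
    for x in unlabeled_embs:
        # Soft assignment: softmax over negative squared distances.
        logits = [-sq_dist(x, p) for p in prototypes]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        for c, e in enumerate(exps):
            z = e / total
            weights[c] += z
            for d in range(dim):
                sums[c][d] += z * x[d]
    return [[s / w for s in row] for row, w in zip(sums, weights)]

refined = refine_prototypes(
    prototypes=[[0.0, 0.0], [10.0, 10.0]],
    support_counts=[1, 1],
    unlabeled_embs=[[1.0, 1.0]],
)
```

In this toy case the unlabeled point sits next to the first prototype, so it pulls that prototype toward itself and leaves the second one essentially unchanged.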

Prototypical Networks for Few-shot Learning

Author: Jake Snell, Kevin Swersky, Richard S. Zemel
Abstract: We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class. Prototypical networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve excellent results. We provide an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning. We further extend prototypical networks to zero-shot learning and achieve state-of-the-art results on the CU-Birds dataset.
Summary: We have proposed a simple method called prototypical networks for few-shot learning based on the idea that we can represent each class by the mean of its examples in a representation space learned by a neural network. We train these networks to specifically perform well in the few-shot setting by using episodic training. The approach is far simpler and more efficient than recent meta-learning approaches, and produces state-of-the-art results even without sophisticated extensions developed for matching networks (although these can be applied to prototypical nets as well). We show how performance can be greatly improved by carefully considering the chosen distance metric, and by modifying the episodic learning procedure. We further demonstrate how to generalize prototypical networks to the zero-shot setting, and achieve state-of-the-art results on the CUB-200 dataset. A natural direction for future work is to utilize Bregman divergences other than squared Euclidean distance, corresponding to class-conditional distributions beyond spherical Gaussians. We conducted preliminary explorations of this, including learning a variance per dimension for each class. This did not lead to any empirical gains, suggesting that the embedding network has enough flexibility on its own without requiring additional fitted parameters per class. Overall, the simplicity and effectiveness of prototypical networks makes it a promising approach for few-shot learning.
Keywords: Few-Shot Learning, Prototypical Networks
dataset: Omniglot, miniImageNet
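The core computation described in the summary, class means as prototypes followed by a softmax over negative squared Euclidean distances, fits in a few lines. The sketch below assumes the embeddings have already been produced by some learned network, which is not shown; the function names and toy vectors are illustrative.

```python
import math

def class_prototypes(support_embs, support_labels):
    """Prototype of each class = mean of its support embeddings."""
    sums, counts = {}, {}
    for emb, label in zip(support_embs, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(emb)
            counts[label] = 0
        sums[label] = [s + e for s, e in zip(sums[label], emb)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict_proba(query_emb, prototypes):
    """Softmax over negative squared Euclidean distance to each prototype."""
    logits = {
        label: -sum((q - p) ** 2 for q, p in zip(query_emb, proto))
        for label, proto in prototypes.items()
    }
    m = max(logits.values())
    exps = {label: math.exp(l - m) for label, l in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

prototypes = class_prototypes(
    [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]],
    ["a", "a", "b"],
)
probs = predict_proba([1.0, 1.0], prototypes)
```

No per-class parameters are fitted at test time. A new class only needs its support embeddings averaged, which is the simpler inductive bias the abstract refers to.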

Siamese Neural Networks for One-shot Image Recognition

Author: Gregory Koch, Richard Zemel, Ruslan Salakhutdinov
Abstract: The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available. A prototypical example of this is the one-shot learning setting, in which we must correctly make predictions given only a single example of each new class. In this paper, we explore a method for learning siamese neural networks which employ a unique structure to naturally rank similarity between inputs. Once a network has been tuned, we can then capitalize on powerful discriminative features to generalize the predictive power of the network not just to new data, but to entirely new classes from unknown distributions. Using a convolutional architecture, we are able to achieve strong results which exceed those of other deep learning models with near state-of-the-art performance on one-shot classification tasks.
Summary: We have presented a strategy for performing one-shot classification by first learning deep convolutional siamese neural networks for verification. We outlined new results comparing the performance of our networks to an existing state-of-the-art classifier developed for the Omniglot data set. Our networks outperform all available baselines by a significant margin and come close to the best numbers achieved by the previous authors. We have argued that the strong performance of these networks on this task indicate not only that human-level accuracy is possible with our metric learning approach, but that this approach should extend to one-shot learning tasks in other domains, especially for image classification. In this paper, we only considered training for the verification task by processing image pairs and their distortions using a global affine transform. We have been experimenting with an extended algorithm that exploits the data about the individual stroke trajectories to produce final computed distortions (Figure 8). By imposing local affine transformations on the strokes and overlaying them into a composite image, we are hopeful that we can learn features which are better adapted to the variations that are commonly seen in new examples.
Keywords: Few-Shot Learning, Siamese Neural Networks
dataset: Omniglot, MNIST
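The step from verification to one-shot classification can be sketched as follows: score the query against the single example of each class with a weighted L1 distance passed through a sigmoid, then pick the class with the highest score. The embeddings and the `alpha` weights below are hand-picked toy values standing in for what the twin convolutional network would learn. This is a minimal sketch under those assumptions, not the authors' implementation.

```python
import math

def similarity(h1, h2, alpha):
    """Sigmoid of a weighted component-wise L1 distance between two embeddings."""
    s = sum(a * abs(x - y) for a, x, y in zip(alpha, h1, h2))
    return 1.0 / (1.0 + math.exp(-s))

def one_shot_classify(query_emb, support_by_class, alpha):
    """Return the class whose single support example is most similar to the query."""
    return max(
        support_by_class,
        key=lambda label: similarity(query_emb, support_by_class[label], alpha),
    )

# Negative weights (hypothetical) so that a larger distance gives a lower score.
alpha = [-1.0, -1.0]
support_by_class = {"A": [0.0, 0.0], "B": [1.0, 1.0]}
prediction = one_shot_classify([0.9, 1.2], support_by_class, alpha)
```

Because the same embedding function is applied to both inputs, the score is symmetric in its arguments, so the order in which the pair is presented does not matter.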