Researchers Demonstrate Less-than-One Shot Machine Learning
We’re accustomed to thinking that bigger is better in machine learning. If 10 samples are good, then 100 samples must be even better. However, researchers from the University of Waterloo recently demonstrated the feasibility of “less than one”-shot learning, in which a model learns to identify something even though it has never seen an example of it.
In their September paper, titled “‘Less Than One’-Shot Learning: Learning N Classes From M<N Samples,” researchers Ilia Sucholutsky and Matthias Schonlau explain how they created a machine learning model that can learn to classify something when trained with fewer than one example per class.
For example, consider an alien zoologist who lands on earth and is instructed to capture a unicorn. “It has no familiarity with local fauna and there are no photos of unicorns, so humans show it a photo of a horse and a photo of a rhinoceros, and say that a unicorn is something in between,” Sucholutsky and Schonlau write. “With just two examples, the alien has now learned to recognize three different animals.”
That is essentially what the pair did with their “less than one”-shot learning (or LO-shot learning) exercise. The researchers selected a k-Nearest Neighbors (kNN) classifier, a relatively simple supervised machine learning algorithm that is traditionally trained on sample data in which every point carries a single, clearly defined label.
But instead of feeding the kNN hard-labeled data points, they exposed it to manufactured “soft labels” in the training set. A soft label, the researchers write, is “the vector-representation of a point’s simultaneous membership to several classes.”
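To make the idea concrete, a soft label can be pictured as a list of per-class membership weights. The sketch below is illustrative only and is not code from the paper: the class names, the numeric weights, and the helper `is_valid_soft_label` are all assumptions, as is the choice to treat a soft label as non-negative weights that sum to one.

```python
# Hypothetical class order, borrowed from the article's unicorn analogy.
CLASSES = ["horse", "unicorn", "rhinoceros"]

# A hard label ties a sample to exactly one class.
hard_label = [1.0, 0.0, 0.0]   # "this is a horse, full stop"

# A soft label spreads membership across several classes at once.
# The weights are made up for illustration.
soft_label = [0.6, 0.4, 0.0]   # "mostly horse, partly unicorn"


def is_valid_soft_label(label, tol=1e-9):
    """Assumed convention: weights are non-negative and sum to one."""
    return all(w >= 0 for w in label) and abs(sum(label) - 1.0) <= tol


for name, label in (("hard", hard_label), ("soft", soft_label)):
    memberships = {c: w for c, w in zip(CLASSES, label) if w > 0}
    print(name, memberships, "valid:", is_valid_soft_label(label))
```

Under this convention a hard label is just the special case in which one class holds all of the weight.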
The researchers theorized that if the algorithm is trained on samples that are not tied one-to-one to a specific class, but instead reflect a continuous spectrum between two points, it will be able to infer the correct class even though it has never actually seen an example of it.
The researchers trained their algorithm, called the distance-weighted soft-label prototype k-Nearest Neighbors (SLaPkNN), on the soft labels and found that it could correctly identify classes for which the training data contained no dedicated example. In effect, SLaPkNN learned to identify unicorns by seeing pictures of a horse and a rhinoceros, and being told that the unicorn is somewhere in the middle.
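The following is a minimal sketch of how a distance-weighted, soft-label prototype classifier of this kind might work, not the authors' implementation. It assumes each of the k nearest prototypes contributes its soft label weighted by inverse distance, and that the class with the largest total wins. The function name `slap_knn_predict`, the one-dimensional "horse to rhinoceros" feature axis, the soft-label values, and the small `eps` guard against division by zero are all hypothetical.

```python
import math

# Hypothetical setup: two prototypes on a 1-D feature axis, three classes.
ANIMALS = ["horse", "unicorn", "rhinoceros"]
prototypes = [[0.0], [1.0]]            # one point per end of the axis
prototype_labels = [
    [0.6, 0.4, 0.0],                   # mostly horse, partly unicorn
    [0.0, 0.4, 0.6],                   # mostly rhinoceros, partly unicorn
]


def slap_knn_predict(x, prototypes, soft_labels, k=2, eps=1e-12):
    """Return the index of the class with the largest distance-weighted score."""
    dists = [math.dist(x, p) for p in prototypes]
    # Indices of the k prototypes closest to x.
    nearest = sorted(range(len(prototypes)), key=lambda i: dists[i])[:k]
    n_classes = len(soft_labels[0])
    scores = [0.0] * n_classes
    for i in nearest:
        weight = 1.0 / (dists[i] + eps)   # closer prototypes count for more
        for c in range(n_classes):
            scores[c] += weight * soft_labels[i][c]
    return max(range(n_classes), key=lambda c: scores[c])


for point in ([0.1], [0.5], [0.9]):
    predicted = slap_knn_predict(point, prototypes, prototype_labels)
    print(point, "->", ANIMALS[predicted])
# [0.1] -> horse
# [0.5] -> unicorn
# [0.9] -> rhinoceros
```

With only two prototypes, the sketch separates three classes: near either end one prototype dominates, while in the middle the shared "unicorn" weight from both prototypes adds up to the largest score.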
“…[O]ur contributions lay the theoretical foundations necessary to establish ‘less than one’-shot learning as a viable new direction in machine learning research,” Sucholutsky and Schonlau write. “We have shown that even a simple classifier like SLaPkNN can perform LO-shot learning, and we have proposed a way to analyze the robustness of the decision landscapes produced in this setting.”
The researchers hope their work will spur more investigation into LO-shot methods. “Improving prototype design is critical for speeding up instance-based, or lazy learning, algorithms like kNN by reducing the size of their training sets,” they write. “However, eager learning models like deep neural networks would benefit more from the ability to learn directly from a small number of real samples to enable their usage in settings where little training data is available. This remains a major open challenge in LO-shot learning.”
The paper is available here.