Faster Training by Selecting Samples Using Embeddings (2019)

Santiago Gonzalez, Joshua Landgraf, and Risto Miikkulainen

Long training times have increasingly become a burden for researchers, slowing the pace of innovation, with some models taking days or weeks to train. In this paper, a new, general technique is presented that aims to speed up the training process by using a thinned-down training dataset. By leveraging autoencoders and the unique properties of embedding spaces, we are able to filter training datasets to include only the samples that matter the most. Evaluation on a standard CIFAR-10 image classification task shows the technique to be effective: training times can be reduced with minimal loss in accuracy. Conversely, given a fixed training time budget, the technique was shown to improve accuracy by over 50%. This intelligent dataset sampling technique is a practical tool for achieving better results with large datasets and limited computational budgets.
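The general idea of filtering a dataset in embedding space can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the rule used here (greedily dropping samples whose embedding lies close to one already kept) is an assumption for illustration only, not the paper's method. The embeddings are assumed to have been produced beforehand by an autoencoder's encoder; the function name `select_by_embedding`, the `min_dist` parameter, and the toy 2-D data are all hypothetical.

```python
import math
import random


def select_by_embedding(embeddings, min_dist):
    """Return indices of a thinned subset of samples.

    Assumed rule (not taken from the paper): a sample is kept only if its
    embedding is at least `min_dist` away from every embedding kept so far,
    so near-duplicates in embedding space are dropped.
    """
    kept = []
    for idx, vec in enumerate(embeddings):
        if all(math.dist(vec, embeddings[k]) >= min_dist for k in kept):
            kept.append(idx)
    return kept


# Toy stand-in for encoder outputs: two tight clusters in a 2-D embedding space.
random.seed(0)
centers = [(0.0, 0.0), (5.0, 5.0)]
embeddings = [
    (cx + random.gauss(0, 0.1), cy + random.gauss(0, 0.1))
    for cx, cy in centers
    for _ in range(50)
]

kept = select_by_embedding(embeddings, min_dist=0.5)
print(f"kept {len(kept)} of {len(embeddings)} samples")
```

A model would then be trained only on the samples whose indices appear in `kept`. The quadratic pairwise check is acceptable for a toy example; a dataset the size of CIFAR-10 would likely need an approximate nearest-neighbor index or a clustering step instead.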

Citation:

In Proceedings of the 2019 International Joint Conference on Neural Networks, 1-7, Budapest, Hungary, July 2019.

People

Santiago Gonzalez	Ph.D. Alumni	slgonzalez [at] utexas edu
Risto Miikkulainen	Faculty	risto [at] cs utexas edu

Areas of Interest

Unsupervised Learning, Clustering, and Self-Organization
Other Areas