Abstract

Training deep neural networks to learn visual features from images or videos in computer vision applications typically requires large amounts of labeled data. To avoid the cost of collecting and labeling large datasets, self-supervised learning methods, a subset of unsupervised learning methods, can be used instead. These methods learn general visual features of images and videos from unlabeled datasets. This paper implements a convolutional neural network trained on the pretext task of recognizing which geometric transformation, specifically which rotation, was applied to the input image. After this pretraining, the network was trained with transfer learning techniques on a small subset of labeled data for the task of image classification. On the STL10 dataset, an accuracy of 76% is achieved on the downstream image classification task.
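
To illustrate how a rotation task yields training labels without any manual annotation, the following is a minimal sketch, not the paper's implementation. It assumes the four rotations 0°, 90°, 180° and 270° (the abstract does not state which rotation angles were used), square images, and a hypothetical helper name `make_rotation_batch`; the random array only stands in for unlabeled images.

```python
import numpy as np

def make_rotation_batch(images):
    """Build a self-labeled batch for rotation prediction.

    images: array of shape (N, H, W, C) with H == W (unlabeled images).
    Returns (rotated, labels), where rotated has shape (4N, H, W, C)
    and labels[i] in {0, 1, 2, 3} means a rotation by labels[i] * 90 degrees.
    """
    rotated, labels = [], []
    for k in range(4):  # assumed rotation set: 0, 90, 180, 270 degrees
        # Rotate every image in the batch in its (H, W) plane.
        rotated.append(np.rot90(images, k=k, axes=(1, 2)))
        # The label is simply the index of the rotation that was applied.
        labels.append(np.full(len(images), k, dtype=np.int64))
    return np.concatenate(rotated, axis=0), np.concatenate(labels, axis=0)

# Placeholder data standing in for 8 unlabeled 96x96 RGB images.
rng = np.random.default_rng(0)
unlabeled = rng.random((8, 96, 96, 3), dtype=np.float32)

x, y = make_rotation_batch(unlabeled)
print(x.shape, y.shape)
```

A network trained to predict `y` from `x` needs no human-provided labels; under the approach described above, its learned features would then be reused when training on the small labeled subset.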

Keywords: self-supervised learning methods, neural networks
Published on website: 1.10.2022
Attached files: MCerovic_NEW.pdf