Subpixel CNN in Upsampling Applications

Improve segmentation models and generative models by using a subpixel CNN as the upsampling operation.

Start date: October 2016
Category: Applied Research
Contact point: Eder Santana -

Problem description

Upsampling feature maps in convolutional neural networks is useful in many types of models, such as generative models [1] and image segmentation models [2]. These models currently use transposed convolution (also called “deconvolution”) as their upsampling operation. Subpixel CNNs [3] are a promising new way to upsample feature maps that may be preferable to transposed convolution. So far, this approach appears to have been used in only one CVPR 2016 paper on super-resolution [3]. We propose to investigate the use of subpixel CNNs in a wide range of models that rely on upsampling, in particular segmentation models and generative models.
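As a rough illustration of the rearrangement step at the core of a subpixel CNN, the sketch below takes a feature map with C*r*r channels and reshuffles it into C channels that are r times larger in each spatial dimension. It is a minimal NumPy sketch, not the implementation from [3]: the function name pixel_shuffle, the channel-first (C, H, W) layout, and the channel ordering are assumptions made for this example, and frameworks may order channels differently. In a full model, an ordinary convolution would produce the C*r*r channels at low resolution before this step.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into a (C, H*r, W*r) array.

    Hypothetical helper for illustration; assumes channel-first layout
    and that channel index c*r*r + i*r + j maps to sub-pixel offset (i, j).
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    # Split the channel axis into (C, r, r): x[c, i, j, h, w]
    x = x.reshape(c, r, r, h, w)
    # Interleave the sub-pixel offsets with the spatial axes: (C, H, r, W, r)
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge (H, r) and (W, r) into the upsampled spatial axes
    return x.reshape(c, h * r, w * r)


# Four 2x2 low-resolution channels become one 4x4 channel for r = 2.
features = np.arange(16).reshape(4, 2, 2)
upsampled = pixel_shuffle(features, 2)
print(upsampled.shape)  # (1, 4, 4)
```

Because the rearrangement has no parameters of its own, all learning happens in the convolution that precedes it, which runs at the lower resolution.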

Why this problem matters

Generative models have grown into one of the main topics in deep learning. For example, experts in the field consider Generative Adversarial Networks to be one of the most important research directions in deep learning [4]. Image segmentation, on the other hand, is one of the fundamental techniques in the new industry of self-driving cars, where it could be used to identify drivable paths and to detect obstacles, pedestrians, other cars, etc. Beyond self-driving cars, image segmentation is useful for any classification problem where one needs more information than whether or not an object is present in a scene.

How to measure success

Success measures are problem-specific. For the segmentation problem, we should compare networks with a similar number of operations and check whether the subpixel CNN improves the number of correctly segmented pixels. For generative models, the measure to track is the log-likelihood of generated samples (on small datasets like MNIST). Beyond the quality of generated samples, we should also test the stability of the generative models and how well parts of them can be reused for other tasks, such as feature extraction for classification [1].
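For the segmentation measure, one simple way to count correctly segmented pixels is per-pixel accuracy: the fraction of pixels whose predicted class matches the ground-truth label. The sketch below is only one possible reading of that measure; the function name pixel_accuracy and the integer label-map inputs are assumptions made for this example, and other segmentation metrics could be substituted.

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels where the predicted class equals the label.

    Hypothetical helper for illustration; pred and target are integer
    class maps of identical shape, e.g. (H, W).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    return float(np.mean(pred == target))


# Three of the four pixels agree.
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 1]]
print(pixel_accuracy(pred, target))  # 0.75
```

Comparing this number between a transposed-convolution baseline and a subpixel variant is only meaningful when the two networks have a similar operation count, as noted above.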


Generative models:

Image segmentation:

Datasets generated from video games: