Access Type

Open Access Dissertation

Date of Award

January 2019

Degree Type

Dissertation

Degree Name

Ph.D.

Department

Computer Science

First Advisor

Ming Dong

Abstract

With the rapid development of innovative models and their success across a wide range of applications, deep learning has attracted enormous attention in computer vision, machine learning, and artificial intelligence. Numerous studies have validated the strong performance and broad applicability of deep learning models, especially when combined with high-performance computing on GPUs and parallel computation. Nonetheless, drawbacks such as a strong dependency on supervision (sufficient labeled data) and a narrow reliance on categorical labels hinder the advancement of deep learning.

In this dissertation, we explore and exploit possibilities for deep learning that do not use data and labels in the traditional supervised manner. Specifically, we propose a three-step pipeline: ranking instead of classification and regression, transfer learning including domain adaptation, and finally data synthesis without supervised labels.

First, we propose a novel ranking-based Convolutional Neural Network (CNN) architecture that takes advantage of both ranking algorithms and features learned by CNN models. Specifically, instead of using labels as in classification or regression, it takes ordinal information into consideration. Meanwhile, features learned by CNN-based models can significantly outperform engineered features, leading to superior performance.
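
To illustrate how ordinal information can replace class labels as a training signal, the following is a minimal sketch of a generic pairwise hinge ranking loss. It is not the dissertation's formulation: the function name `pairwise_hinge_loss`, the `margin` parameter, and the use of plain score lists in place of CNN outputs are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_hinge_loss(scores, ranks, margin=1.0):
    """Mean hinge loss over all pairs whose ordinal labels differ.

    scores: model outputs, one float per item (hypothetical stand-in
            for the scores a CNN would produce).
    ranks:  ordinal labels; a larger value means the item should
            receive a higher score.
    """
    total, count = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if ranks[i] == ranks[j]:
            continue  # tied pairs carry no ordering information
        # sign is +1 when item i should outscore item j, otherwise -1
        sign = 1.0 if ranks[i] > ranks[j] else -1.0
        total += max(0.0, margin - sign * (scores[i] - scores[j]))
        count += 1
    return total / count if count else 0.0
```

Only the relative order of the labels enters the loss, so the model is penalized for misordered pairs rather than for missing an exact class or regression target.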

Then, we propose a transfer learning framework that also serves the functions of knowledge distillation and domain adaptation. In this step, we address the case in which inadequate or even no labels are available for a target domain by taking advantage of a source domain. Furthermore, our approach can utilize information across platforms and architectures as long as a forward pass of the source network is obtainable.
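
The requirement that only a forward pass of the source network be obtainable can be illustrated with a standard knowledge-distillation loss, sketched below under stated assumptions. This is the common temperature-softened KL divergence between source (teacher) and target (student) outputs, not necessarily the objective used in the dissertation; the names `log_softmax` and `distillation_loss` and the default `temperature` are illustrative choices.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_norm for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    teacher_logits come from a forward pass of the source network;
    no ground-truth labels for the target domain are needed.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
```

Because the loss depends only on the source network's outputs, the source model can be treated as a black box regardless of the platform or architecture it runs on.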

Last, we propose an efficient and scalable model for cross-dataset one-shot person re-identification. Here, we address the problem of determining the relationship between a pair of query and gallery images captured in different camera styles. We adopt concepts from style transfer together with adversarial training to boost performance and improve robustness.
