Topics in Classification with Deep Learning

The task of classification has been attracting increasing attention from researchers in recent years. The objective is to assign labels to samples given their attributes. Classification is practical in real-world applications and is widely explored in fields such as computer vision, natural language processing and information retrieval. Recent advances in deep learning provide many efficient solutions to the classification task. This dissertation comprises four chapters: 1) k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks, 2) Automatic Ontology Learning from Domain-Specific Short Unstructured Text Data, 3) Concept Drift and Covariate Shift Detection Ensemble with Lagged Labels and 4) Open Set Domain Adaptation by Extreme Value Theory.

In the first chapter, we mimic the k-Nearest Neighbors method with two families of deep networks. k-Nearest Neighbors is one of the most fundamental yet effective classification models. We propose two families of models, built on a sequence to sequence model and a memory network model, that mimic the k-Nearest Neighbors model: they generate a sequence of labels, a sequence of out-of-sample feature vectors and a final label for classification, and can therefore also function as oversamplers. We also propose 'out-of-core' versions of our models, which assume that only a small portion of the data can be loaded into memory.

In the second chapter, we provide an efficient and effective approach to automatic ontology learning. Ontology learning is a critical task in industry; it deals with identifying and extracting concepts reported in text so that these concepts can be used in downstream tasks, e.g. information retrieval. The problem is non-trivial for several reasons, and there is only a limited amount of prior work that automatically learns a domain-specific ontology from data. We propose a two-stage classification system to automatically learn an ontology from unstructured text. In the proposed model, the first-stage classifier separates candidate concepts into relevant and irrelevant concepts, and the second-stage classifier assigns specific classes to the relevant concepts. The proposed system is deployed as a prototype at General Motors, and its performance is validated using complaint and repair verbatim data collected from different data sources.

In the third chapter, we propose a drift detection ensemble that detects concept drifts and covariate shifts and automatically selects the retraining data. In model serving, keeping one fixed model throughout an often life-long inference process is usually detrimental to model performance, because the data distribution evolves over time and a model trained on historical data becomes unreliable. It is therefore important to detect changes and retrain the model in time. Existing methods generally have three weaknesses: 1) they use only the classification error rate as a signal, 2) they assume ground-truth labels are available immediately after the features of a sample are received and 3) they cannot decide what data to use to retrain the model when a change occurs. We address the first problem by utilizing six different signals that capture a wide range of characteristics of the data, and we address the second problem by allowing a lag of labels, where the labels of corresponding features are received after a lag in time. For the third problem, our proposed method automatically decides, based on the signals, what data to use for retraining.
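To make the lagged-label monitoring idea concrete, the following is a minimal, illustrative sketch in Python. It stands in for the dissertation's six-signal ensemble with just two signals: a Kolmogorov-Smirnov test on one feature dimension for covariate shift, and a windowed error rate (updated only when delayed labels arrive) for concept drift. The class name, thresholds and retraining-window heuristic are assumptions made for illustration, not the actual method.

# Illustrative sketch only: a two-signal stand-in for the six-signal ensemble.
# Signal choices, thresholds and the retraining-window heuristic are assumptions.
from collections import deque
import numpy as np
from scipy.stats import ks_2samp

class LaggedDriftMonitor:
    def __init__(self, ref_features, ref_error_rate, window=200, alpha=0.01):
        self.ref_features = np.asarray(ref_features)   # reference (training) feature sample
        self.ref_error_rate = ref_error_rate           # error rate on held-out training data
        self.window = window
        self.alpha = alpha
        self.feature_buf = deque(maxlen=window)        # recent features (always available)
        self.error_buf = deque(maxlen=window)          # recent 0/1 errors (arrive after a lag)

    def add_features(self, x):
        # Features arrive at inference time.
        self.feature_buf.append(x)

    def add_lagged_label(self, y_pred, y_true):
        # Ground-truth labels arrive later; update the error signal only then.
        self.error_buf.append(int(y_pred != y_true))

    def check(self):
        """Return (drift_detected, size_of_window_to_retrain_on)."""
        if len(self.feature_buf) < self.window:
            return False, 0
        recent = np.asarray(self.feature_buf)
        # Signal 1: covariate shift via a KS test on the first feature dimension.
        _, p = ks_2samp(self.ref_features[:, 0], recent[:, 0])
        covariate_shift = p < self.alpha
        # Signal 2: concept drift via error-rate increase, once enough labels have arrived.
        concept_drift = (
            len(self.error_buf) >= self.window // 2
            and np.mean(self.error_buf) > 2 * self.ref_error_rate
        )
        drift = covariate_shift or concept_drift
        # Naive retraining-data choice: reuse the most recent window after a detection.
        return drift, (len(self.feature_buf) if drift else 0)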
In the fourth chapter, we solve the problem of open set domain adaptation by utilizing extreme value theory. Common domain adaptation techniques assume that the source domain and the target domain share an identical label space, which is not practical: when target samples are unlabeled, we have no knowledge of whether the two domains share the same label space. When the assumption is not satisfied, such methods fail to perform well because the additional unknown classes are also matched with the source domain during adaptation. In this chapter, we tackle the open set domain adaptation problem, which assumes the source and target label spaces only partially overlap; the task then becomes, when unknown classes exist, how to detect the target unknown classes and avoid aligning them with the source domain. We propose to 1) utilize an instance-level reweighting strategy for domain adaptation, where the weights indicate the likelihood of a sample belonging to the known classes, and 2) model the tail of the entropy distribution with Extreme Value Theory for unknown-class detection.
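As a rough illustration of these two ingredients, the sketch below fits the upper tail of the target samples' prediction-entropy distribution with a generalized Pareto distribution (a standard extreme value theory choice) and converts the fitted tail probability into per-sample weights, so that likely-unknown samples are down-weighted during adaptation. The quantile threshold, the Pareto parameterization and the weighting rule are illustrative assumptions, not the dissertation's exact formulation.

# Illustrative sketch only: EVT tail modeling of prediction entropy for
# unknown-class detection; threshold and weighting rule are assumptions.
import numpy as np
from scipy.stats import genpareto

def prediction_entropy(probs, eps=1e-12):
    # Shannon entropy of each row of softmax probabilities.
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def known_class_weights(target_probs, tail_quantile=0.8):
    """Return per-sample weights in [0, 1]; low weight = likely unknown class."""
    h = prediction_entropy(target_probs)
    u = np.quantile(h, tail_quantile)            # tail threshold
    excess = h[h > u] - u                        # exceedances over the threshold
    c, loc, scale = genpareto.fit(excess, floc=0.0)
    # P(unknown) grows with how far a sample's entropy sits in the fitted tail.
    p_unknown = np.where(h > u, genpareto.cdf(h - u, c, loc=loc, scale=scale), 0.0)
    return 1.0 - p_unknown                       # instance-level weights for adaptation

# Example: mostly confident (source-like) predictions plus some near-uniform ones.
rng = np.random.default_rng(0)
confident = rng.dirichlet([8.0, 1.0, 1.0], size=160)   # low entropy
uniformish = rng.dirichlet([1.0, 1.0, 1.0], size=40)   # high entropy
w = known_class_weights(np.vstack([confident, uniformish]))
print(w[:5], w[-5:])   # confident samples keep weight near 1; uniform ones are down-weighted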
