Deep Learning: Neural Network Applications and Architectures in Sequence and Image Classification


This thesis applies neural networks to three types of classification tasks, each presented in its own chapter. The objective of the first work is to use deep neural networks to make next-day crime count predictions over a fine-grained partition of a city. We make predictions using Chicago and Portland crime data, augmented with additional datasets covering weather, census data, and public transportation. The crime counts are broken into 10 bins, and our model predicts the most likely bin for each spatial region at a daily level. We train on this data using increasingly complex neural network structures, including variations suited to the spatial and temporal aspects of the crime prediction problem. With our best model we predict the correct bin for overall crime count with 75.6% and 65.3% accuracy for Chicago and Portland, respectively. The results show the efficacy of neural networks for this prediction problem and the value of using external datasets in addition to standard crime data.

In the second work, we study classification tasks that take groups of images, rather than single images, as input. To address such situations, we introduce a nested multi-instance deep network. The approach is generic in that it applies to general data instances, not just images. The network has several convolutional neural networks grouped together at different stages. It differs from previous work primarily in that we organize instances into relevant groups that are treated differently. We also introduce a method for replacing missing instances, which creates neutral input instances and consistently outperforms standard fill-in methods in real-world use cases. In addition, we propose a manual dropout method for cases where a whole group of instances is missing, which allows us to use sparser training data and obtain higher accuracy at the end of training. With specific pretraining, the model outperforms baseline methods on our real-world and public datasets, with improvements ranging from 1% to 5%.

Finally, some time series are amenable to recurrent neural network (RNN) solutions when treated as sequences, but others, such as asynchronous time series, provide a richer variety of feature types than current RNN cells take into account. To address such situations, we introduce a unified RNN that handles five different feature types, each in a different manner. Our RNN framework separates sequential features into two groups depending on their frequency, which we call sparse and dense features, and which affect cell updates differently. We also incorporate time features at the sequential level, which relate to the time between specified events in the sequence and are used to modify the cell's memory state. Lastly, we include two types of static (whole-sequence-level) features, one related to time and one not, which are combined with the encoder output. The experiments show that the proposed modeling framework improves performance compared to standard cells.
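
The binning setup described for the crime prediction work can be illustrated with a minimal sketch. It shows one way to turn raw daily counts into 10 class labels and to score predictions by bin accuracy. The abstract does not say how the bin edges are chosen, so the quantile-based edges here are an assumption, and the function names `make_bin_edges`, `to_bin`, and `bin_accuracy` are hypothetical, not taken from the thesis.

```python
from bisect import bisect_right

NUM_BINS = 10  # the abstract states 10 bins; how edges are chosen is assumed below


def make_bin_edges(counts, num_bins=NUM_BINS):
    """Return num_bins - 1 interior edges at evenly spaced quantiles.

    Assumption: quantile-based edges. The thesis may use a different scheme.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    ordered = sorted(counts)
    n = len(ordered)
    return [ordered[min(n - 1, (i * n) // num_bins)] for i in range(1, num_bins)]


def to_bin(count, edges):
    """Map a raw count to a bin index between 0 and len(edges), inclusive."""
    return bisect_right(edges, count)


def bin_accuracy(predicted_bins, true_bins):
    """Fraction of (region, day) cells whose predicted bin equals the true bin."""
    if len(predicted_bins) != len(true_bins) or not true_bins:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == t for p, t in zip(predicted_bins, true_bins))
    return hits / len(true_bins)
```

In this sketch, each (region, day) pair becomes one classification example whose label is `to_bin(count, edges)`, and the reported accuracies correspond to `bin_accuracy` computed over held-out examples.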
