Cells are often precisely organized into patterns within developing tissues. This precision must emerge from biochemical processes within, and between cells, that are inherently stochastic. I investigated the impact of stochastic gene expression on self-organized pattern formation, focusing on Senseless (Sens), a key target of Wnt and Notch signaling during...
In the short amount of time that genetic manipulation has been possible through CRISPR technology, myriad applications have been developed. Results from one of the most promising applications of this technology, pooled screens, have shown that single guide RNAs (sgRNAs), RNA sequences used to target specific regions of the genome,...
Commonsense inference is a critical capability of modern artificial intelligence (AI) systems. The machines need commonsense knowledge to perform tasks exactly like human being does. Learning commonsense inference from text has been a long standing challenge in the field of natural language processing due to reporting bias -- people do...
Modern design practices rely more and more on computer simulations due to their low cost compared with physical experiments. However, it is still an elusive task to fully unleash the advantages of the simulation models while mitigating their disadvantages for designing complex engineering systems. In simulation-based design, computer simulation models...
Sequential batches of time-evolving data for a set of persistent identifiable entities (e.g. online shopping behavior by month for a customer ID, or economic figures by year for a collection of countries) can exhibit temporal shifts in their underlying clustering structure. Methods for recovering this evolutionary clustering structure exploit natural...
Supervised learning model is one of the most fundamental machine learning models. It can provide powerful capability of prediction by learning complex patterns hidden in many, sometimes thousands, predictors. It can also be used as a building block of other machine learning tasks, like unsupervised learning and reinforcement learning. Such...
The ever growing desire for accurate estimation and efficient learning necessitates the efforts to quantitatively characterize uncertainties for models. In this thesis, four problems pertaining to uncertainty quantification are discussed: A sequential stopping framework of constructing fixed-precision confidence regions is proposed for a class of multivariate simulation problems where variance...
The advent of sequencing technologies has generated a large amount of biological and medical data. These data such as genetic sequencing data and lab experimental evidence data can help understand critical biomedical problems. This dissertation makes contribution in three different but related applications in biomedical research. In Chapter 2, we...
Gaussian process provides a principled and flexible approach for modeling the response surface or the latent function in many areas, including machine learning, statistics and computer experiment. In literature, Gaussian process models have already demonstrated their effectiveness and usefulness in a variety of applications. In this dissertation, we mainly focus...
Modeling human language is at the very frontier of machine learning and artificial intelligence. Statistical language models are probabilistic models that assign probabilities to sequences of words. For example, topic models are frequently used text-mining tools to organize a vast set of unstructured documents by exploring their theme structure. More...