Modeling human language is at the frontier of machine learning and artificial intelligence. Statistical language models are probabilistic models that assign probabilities to sequences of words. For example, topic models are widely used text-mining tools that organize large collections of unstructured documents by uncovering their thematic structure. More...
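To make the idea that a statistical language model "assigns probabilities to sequences of words" concrete, here is a minimal bigram-model sketch. The toy corpus and function names are illustrative, not from the dissertation, and no smoothing is applied:

```python
from collections import Counter

def train_bigram(corpus):
    """Estimate bigram probabilities P(w_i | w_{i-1}) from a list of sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return {pair: count / unigrams[pair[0]] for pair, count in bigrams.items()}

def sequence_prob(model, sentence):
    """Probability of a sentence as a product of bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for pair in zip(tokens[:-1], tokens[1:]):
        prob *= model.get(pair, 0.0)  # unseen bigrams get probability 0 here
    return prob

model = train_bigram(["the cat sat", "the dog sat"])
p = sequence_prob(model, "the cat sat")  # 1.0 * 0.5 * 1.0 * 1.0 = 0.5
```

In this toy corpus, "the" is always followed by "cat" or "dog" with equal frequency, so each full sentence receives probability 0.5; a real language model would add smoothing for unseen bigrams.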
This dissertation focuses on subgroup identification in longitudinal studies. There are two different but related topics. In chapters two and three, several longitudinal-based methods for subgroup identification with enhanced treatment effect are proposed to correct the deficiency of measuring treatment effect with a single summary statistic. In...
The advent of next-generation sequencing technologies has greatly promoted the development of metagenomics, and the analysis of compositional data has a wide range of applications in this area. Because of the constraint that species' relative abundances sum to 1, many traditional and classical statistical methods cannot...
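The sum-to-1 constraint mentioned above is commonly removed with a log-ratio transform before applying standard statistical methods. A minimal sketch of the centered log-ratio (CLR) transform, one standard option for compositional data (the dissertation may use a different transform; zero abundances would need pseudocounts in practice):

```python
import math

def clr(composition):
    """Centered log-ratio transform of a composition whose parts sum to 1.
    Requires strictly positive parts (add pseudocounts for zeros)."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

# Hypothetical relative abundances of three species
comp = [0.5, 0.3, 0.2]
z = clr(comp)  # CLR coordinates sum to zero, removing the unit-sum constraint
```

The transformed coordinates live in an unconstrained space (summing to zero rather than one), which is what allows classical multivariate methods to be applied.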
Randomization is considered the gold standard when it comes to evaluating the effectiveness of interventions, primarily due to its ability to avoid bias. However, in recent years, randomization has been heavily criticized in circumstances where subject randomization may not be ethical. In a randomized controlled trial, patients who are extremely...
A replication crisis has enveloped several scientific fields since the early 2000s (see Baker, 2016). This has given rise to improved research and reporting practices (e.g., F. S. Collins & Tabak, 2014), as well as a cottage industry of research into issues of replication and reproducibility (e.g., R. A. Klein...
The heart of computational materials science lies in providing fundamental insights and understanding of materials behavior and properties across different scales. The significance of this task is highlighted by the Materials Genome Initiative and the emergence of computational tools and frameworks such as materials by design, microstructure sensitive design, and...
Innovations are adopted by individuals and spread to other individuals. They are adopted at different rates: some are never adopted at all, some are abandoned, and some become new norms. Diffusion of innovations is an extensive, evidence-based research and practice paradigm that studies how innovations spread. This...
The focus of this thesis is on evaluating, designing, and applying statistical methods that elucidate molecular mechanisms by seeking to understand the pathways that contribute to disease. Chapter 1 introduces the field and motivates the work in this thesis. Chapters 2, 3, and 4 describe original work. Chapter 5 recapitulates...
In this thesis we present methods for estimating network metrics via random walk sampling. More specifically, we generalize the Hansen-Hurwitz estimator and the Horvitz-Thompson estimator to estimate the shortest path length distribution (SPLD), closeness centrality ranking, and clustering coefficients of a network. These are important network metrics, but...
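For readers unfamiliar with the classical Hansen-Hurwitz estimator that the thesis generalizes, here is a minimal sketch on a toy population (not the network setting of the thesis): units are drawn with replacement with unequal probabilities, and each observed value is reweighted by its draw probability to estimate the population total. In random-walk sampling the analogous weight is the walk's stationary probability, which is proportional to node degree.

```python
import random

def hansen_hurwitz(samples, values, probs):
    """Hansen-Hurwitz estimator of a population total:
    the average of y_i / p_i over draws made with replacement."""
    return sum(values[i] / probs[i] for i in samples) / len(samples)

# Toy population: 3 units with known values and unequal draw probabilities
values = [10.0, 20.0, 30.0]
probs = [0.5, 0.3, 0.2]
random.seed(0)
samples = random.choices(range(3), weights=probs, k=10_000)
est = hansen_hurwitz(samples, values, probs)  # close to the true total, 60
```

Reweighting by 1/p_i makes the estimator unbiased even though high-probability units are oversampled; the thesis's contribution is extending this idea to quantities such as the SPLD.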
This dissertation proposes an oracle-efficient estimator in the context of a sparse linear model. Chapter 1 introduces the penalty and the estimator that optimizes a penalized least squares objective. Unlike existing methods, the penalty is once differentiable, and hence the estimator does not engage in model selection. This...
Computer simulation experiments are commonly used as an inexpensive alternative to real-world experiments, forming a metamodel that approximates the input-output relationship of the real-world experiment. The metamodel can support decision making and prediction at inputs that have not yet been evaluated, since it can be evaluated...
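The metamodeling idea can be illustrated with the simplest possible surrogate: run an expensive simulator at a few design points, fit a cheap interpolating model, and predict at untried inputs. The simulator and the polynomial-interpolation metamodel below are illustrative stand-ins; actual metamodels are typically Gaussian processes or other regression models.

```python
def lagrange_predict(xs, ys, x):
    """Interpolating metamodel: predict the output at x from evaluated design points
    using Lagrange polynomial interpolation."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        w = 1.0
        for j, xj in enumerate(xs):
            if i != j:
                w *= (x - xj) / (xi - xj)
        total += yi * w
    return total

def simulator(x):
    """Stand-in for an expensive computer simulation (illustrative)."""
    return 2 * x**2 + 3 * x + 1

xs = [0.0, 1.0, 2.0]              # design points actually run through the simulator
ys = [simulator(x) for x in xs]   # expensive evaluations
pred = lagrange_predict(xs, ys, 1.5)  # cheap prediction at an untried input
```

Because the stand-in simulator is quadratic and three points determine a quadratic, the metamodel here reproduces it exactly; for a real simulator the metamodel only approximates the response, which is why design and uncertainty quantification matter.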
The spatial autoregressive model has been widely applied in science, in areas such as economics, public finance, political science, agricultural economics, environmental studies, and transportation analysis. The classical spatial autoregressive model is a linear model for describing spatial correlation. In this work, we expand the classical model to include time...
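The classical spatial autoregressive (SAR) model takes the form y = rho*W*y + X*beta + eps, with reduced form y = (I - rho*W)^{-1}(X*beta + eps). A minimal sketch of the reduced-form mean for a hypothetical two-unit system (hardcoded 2x2 inverse to stay self-contained; real applications use many units and matrix libraries):

```python
def sar_mean(rho, W, xb):
    """Mean of y in the SAR model y = rho*W*y + X*beta + eps:
    E[y] = (I - rho*W)^{-1} (X*beta), computed here for exactly two units."""
    a = 1.0 - rho * W[0][0]
    b = -rho * W[0][1]
    c = -rho * W[1][0]
    d = 1.0 - rho * W[1][1]
    det = a * d - b * c  # invert the 2x2 matrix (I - rho*W) by hand
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [inv[0][0] * xb[0] + inv[0][1] * xb[1],
            inv[1][0] * xb[0] + inv[1][1] * xb[1]]

W = [[0.0, 1.0], [1.0, 0.0]]           # two units that neighbor each other
mean_y = sar_mean(0.4, W, [1.0, 1.0])  # spatial feedback inflates the mean above 1
```

With rho = 0.4 and X*beta = 1 at both units, each unit's mean rises to 1/(1 - rho) = 5/3: the spatial lag feeds each unit's outcome back through its neighbor.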
The ever-growing desire for accurate estimation and efficient learning necessitates efforts to quantitatively characterize model uncertainties. In this thesis, four problems pertaining to uncertainty quantification are discussed: a sequential stopping framework for constructing fixed-precision confidence regions is proposed for a class of multivariate simulation problems where variance...
The advent of sequencing technologies has generated a large amount of biological and medical data. These data, such as genetic sequencing data and experimental lab evidence, can help address critical biomedical problems. This dissertation makes contributions to three different but related applications in biomedical research. In Chapter 2, we...
The Gaussian process provides a principled and flexible approach for modeling the response surface or the latent function in many areas, including machine learning, statistics, and computer experiments. In the literature, Gaussian process models have already demonstrated their effectiveness and usefulness in a variety of applications. In this dissertation, we mainly focus...
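As a minimal illustration of Gaussian process regression for response-surface modeling, the sketch below computes the GP posterior mean k_*^T (K + sigma^2 I)^{-1} y with a squared-exponential kernel. It is restricted to two training points so the 2x2 covariance matrix can be inverted by hand with only the standard library; the kernel, hyperparameters, and data are illustrative, not from the dissertation:

```python
import math

def rbf(x1, x2, length=1.0):
    """Squared-exponential (RBF) covariance kernel."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def gp_posterior_mean(xtrain, ytrain, xstar, noise=1e-6):
    """GP regression posterior mean at xstar for exactly two training points:
    mean = k_*^T (K + noise*I)^{-1} y, with the 2x2 inverse written out."""
    k11 = rbf(xtrain[0], xtrain[0]) + noise
    k22 = rbf(xtrain[1], xtrain[1]) + noise
    k12 = rbf(xtrain[0], xtrain[1])
    det = k11 * k22 - k12 * k12
    # alpha = (K + noise*I)^{-1} y
    a0 = (k22 * ytrain[0] - k12 * ytrain[1]) / det
    a1 = (k11 * ytrain[1] - k12 * ytrain[0]) / det
    return rbf(xstar, xtrain[0]) * a0 + rbf(xstar, xtrain[1]) * a1

# With tiny noise the posterior mean nearly interpolates the training data
at_train = gp_posterior_mean([-1.0, 1.0], [0.5, 0.5], -1.0)  # ~0.5
at_mid = gp_posterior_mean([-1.0, 1.0], [0.5, 0.5], 0.0)
```

The posterior mean passes (almost) through the observed points and smoothly extends between them; the same formula also yields a posterior variance, which is what makes GPs attractive for uncertainty-aware surrogate modeling.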
Cells are often precisely organized into patterns within developing tissues. This precision must emerge from biochemical processes, within and between cells, that are inherently stochastic. I investigated the impact of stochastic gene expression on self-organized pattern formation, focusing on Senseless (Sens), a key target of Wnt and Notch signaling during...
In the short amount of time that genetic manipulation has been possible through CRISPR technology, myriad applications have been developed. Results from one of the most promising applications of this technology, pooled screens, have shown that single guide RNAs (sgRNAs), RNA sequences used to target specific regions of the genome,...
Commonsense inference is a critical capability of modern artificial intelligence (AI) systems. Machines need commonsense knowledge to perform tasks the way humans do. Learning commonsense inference from text has been a long-standing challenge in natural language processing due to reporting bias -- people do...
Modern design practices rely more and more on computer simulations due to their low cost compared with physical experiments. However, it is still an elusive task to fully unleash the advantages of the simulation models while mitigating their disadvantages for designing complex engineering systems. In simulation-based design, computer simulation models...
Sequential batches of time-evolving data for a set of persistent identifiable entities (e.g. online shopping behavior by month for a customer ID, or economic figures by year for a collection of countries) can exhibit temporal shifts in their underlying clustering structure. Methods for recovering this evolutionary clustering structure exploit natural...