The Operating System (OS) kernel is a key component of modern computing infrastructure, yet it is prone to numerous vulnerabilities, many of which cause memory corruptions that can be exploited by attackers to perform malicious activities. While various techniques have been introduced to secure the Linux kernel, it still constantly...
Clustering is a fundamental task in unsupervised learning, which aims to partition the data set into several clusters. It is widely used for data mining, image segmentation, and natural language processing. One of the most popular clustering methods is centroid-based clustering, including k-medians and k-means clustering. k-medians and k-means clustering...
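As a rough illustration of the centroid-based clustering mentioned in the entry above, the following sketch alternates between assigning points to their nearest center and recomputing each center. It is a generic Lloyd-style loop on one-dimensional data, not the method of the thesis; the function name `centroid_clustering` and its parameters are hypothetical. Passing `mean` as the update rule gives k-means, and passing `median` gives k-medians.

```python
from statistics import mean, median


def centroid_clustering(points, centers, update=mean, iters=20):
    """Lloyd-style alternation on 1-D data (illustrative sketch only)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: mean gives k-means, median gives k-medians.
        # An empty cluster keeps its previous center.
        new_centers = [update(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # converged: the assignment no longer changes the centers
        centers = new_centers
    return centers, clusters


data = [1, 2, 3, 10, 11, 30]
print(centroid_clustering(data, [0, 12], update=mean)[0])
print(centroid_clustering(data, [0, 12], update=median)[0])
```

On this toy data the outlier 30 pulls the k-means center of the second cluster upward, while the k-medians center stays at a data point near the bulk of the cluster.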
Performing complex reasoning has been a long-standing challenge in artificial intelligence (AI). This thesis describes a class of AI systems designed to reason, extract knowledge, and answer questions on various domains such as process understanding, elementary science, and math word problems. Our approach differs from traditional logical reasoning systems since we...
Mission-critical systems are those imperative systems whose failures can result in catastrophic consequences. Traditional techniques, such as manual investigation and testing, cannot ensure the absence of errors and security vulnerabilities within these systems. This dissertation leverages formal methods to comprehensively examine several mission-critical systems and their essential components. For each...
In the late 2000s, scientific studies in cultural heritage saw a great advancement in macro X-ray fluorescence (XRF) imaging of paintings. These images are used to generate elemental distribution maps, which aid in identifying chemical elements and paint pigments as well as their locations throughout the layers of the...
In the Maximum-a-Posteriori (MAP) Inference problem, for any given probability distribution, the goal is to find the point in the support of that distribution with the highest probability. Potts models and Determinantal Point Processes (DPPs) are probabilistic models that were introduced in the context of statistical physics several decades ago....
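The MAP problem stated in the entry above can be made concrete on a toy example. The sketch below assumes the distribution is given as an explicit table mapping each outcome to its probability, and the function name `map_inference` is hypothetical. It simply enumerates the support, which is only feasible for tiny distributions; for models such as Potts models and DPPs the support is exponentially large, so enumeration like this does not scale.

```python
def map_inference(prob):
    """Return the support point with the highest probability (brute force)."""
    # Keep only outcomes with positive probability, i.e. the support.
    support = {x: p for x, p in prob.items() if p > 0}
    return max(support, key=support.get)


# Toy distribution over pairs of binary variables (made-up numbers).
toy = {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.3, (1, 1): 0.0}
print(map_inference(toy))
```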
As our world is increasingly filled with data visualizations, having the skills to leverage data visualizations is essential for participation in society. Confident engagement with data visualizations is critical for being an educated member of society; however, research has shown that it is difficult for individuals to digest and gain...
Machine learning is being woven into practical domains such as autonomous driving, wearable computing, and smart buildings. However, in the actual development and integration, especially when the learning-based components are frequently included as components of large complex systems where the physical instances can be included as interactable...
Task-oriented conversational systems are becoming increasingly popular, as shown by the rise of conversational recommendation systems across multiple platforms (e.g., Google Home, Alexa, and Siri) and domains (e.g., local establishments, e-commerce, books, music, and movies). However, users are still largely limited in what preferences they can express and how, as...
The production and spread of digital news involves a wide range of actors: journalists and the organizations that employ them, social media platforms, audiences, and myriad commentators, citizen journalists, bloggers, and other actors who contribute to the news ecosystem without inhabiting an official role. These actors interact in flexible, often...
Human communication has become increasingly reliant on systems made and managed by large technology companies like Google, Apple, Twitter, and Meta (formerly Facebook). These systems offer people many benefits, but they also present new challenges for society. In recent years, researchers, lawmakers, and journalists have suggested that large technology companies...
This dissertation introduces several novel computational imaging techniques that capture and analyze the 3D surface shapes and internal layered materials. The research proposes user-friendly and non-invasive imaging systems, constructed using only commercial off-the-shelf (COTS) components, which provide accurate measurement of 3D information that was previously inaccessible. The dissertation focuses on...
Due to their widespread applicability, graphs and networks appear in various contexts. The increasing scale of graphs encountered in the real world requires the development of efficient algorithms that run reasonably fast and produce close to optimal solutions. The main focus of this thesis is the development of fast graph algorithms for...
Many computing technologies are primarily useful because of the existence of some set of data created by people, intentionally in some cases and unintentionally in others. For instance, technologies like search engines, recommender systems, classifiers, and language models are all dependent on digital records of things people have said, done,...
A massive amount of data is generated every second all around the world. Machine learning has become an attractive way to consume this data and turn it into productivity. It has yielded great results in many fields, such as healthcare, marketing, and finance. Machine learning models are usually designed...
Public-facing data-driven technologies such as social media platforms and search engines rely on data producers, such as users and crowd workers, to be feasible and financially sustainable. Recently, it became clear that the goals of these data-driven technologies do not always align with those of the public, causing public backlashes...
While there is high demand for university computer science (CS) courses, students often struggle when learning to program. Prior work has identified that student perceptions of their programming ability may contribute to these challenges. For example, studies show that students often perceive that they do not belong, are not capable...
The current view in neuroscience holds that the brain, together with its sensory and motor structures and the environment, form a closed-loop system – a sensorimotor loop – in which the brain receives information from the environment and converts it into a motor response while simultaneously making predictions about future...
Memory management and address translation need significant optimizations in order to not be hindrances in the near future. Currently, plenty of work has started to address issues within the current abstraction of the hardware-software codesign of paging. I argue that a new abstraction is needed in order to properly address this...
The advent of metamaterials—hierarchical structures that manifest properties beyond those found in nature through geometry rather than material composition—inspired new possibilities and research in many fields. In mechanics, periodic metamaterials exhibit behaviors ranging from unprecedented compressibility to extreme stiffness. Numerous geometric classes of metamaterials with these properties have been discovered,...
Wearable visual systems, such as ego-centric wearable cameras, have failed to integrate into everyday life. We have witnessed the abandonment of wearable visual systems as consumer devices (e.g., Google Glass) and as research tools (e.g., SenseCam). While it is natural for some technologies to die out, visual wearable systems are...
In this dissertation, we aim to develop algorithms that achieve optimality with provable complexity guarantees under various settings in reinforcement learning (RL). Specifically, in Markov decision processes (MDPs), we study single-agent and multi-agent online RL, respectively, and offline RL under the presence of unobserved confounders. Single-agent online RL. We design...
In recent years, machine learning on graphs (or networks) has gone from a niche topic with only a few active researchers worldwide, to a heavily invested field with novel use cases for dealing with relationships and/or interactions within complex systems in the natural and social sciences. Traditionally, choosing the right...
X-ray imaging at the nano- and micro-scale is of great importance for materials science and the defense industry. The large penetration depth and short wavelength of X-rays offer significant potential to image objects at high resolution and in a non-invasive process. While the ever-growing community is pursuing novel applications and looking...
Visual Question Answering (VQA) is attracting increasing attention from industry and academia. It requires the model to provide a natural language answer given an image and a related natural language question. It also relates to multidisciplinary research such as natural language understanding, visual information retrieval, and multimodal reasoning. As a multimodality task,...
Imagine sitting in a room listening to some friends play a song. Perhaps one friend is playing guitar, another playing bass, and a third is playing drums. The musical content in this scene is extraordinarily complex, yet it contains many types of structure that are easy for us to comprehend....
Wearable-based human activity recognition is well-studied in the machine learning and pervasive computing community. A large corpus of studies focused on using wearable sensors to recognize health-related behaviors that involve high periodicity in the sensed signal, such as sitting, walking, and running. Other activities that occur less frequently throughout the...
Asymmetric relationships between creators and consumers in peer-produced knowledge repositories produce inequitable knowledge representation, or knowledge gaps. These gaps result in unequal access to information, and downstream technologies that leverage peer-produced data perpetuate these inequities. Effective knowledge gap identification represents a necessary first step towards equitable knowledge representation. However, while prior...
The past decade has seen the rapid progress of deep learning, which has become a game-changing technique in different data-intensive domains, with the availability of large-scale data, cost-effective computing hardware, and more advanced learning theory and algorithms. Despite the rapid progress of deep learning methods in daily-life applications, such...
Automated driving has become a very popular topic in recent years and is becoming more and more of a reality. In this new trend, High Definition (HD) maps play an important role in many ways that will provide a safer and more efficient driving experience, especially in terms of...
Modern data sets are increasingly vast, not only in the number of samples, but also in the number of measurements, or features, that they contain. This high dimensionality poses a unique set of problems for data analysis due to a set of phenomena known as “the curse of dimensionality.” This thesis...
Art has been tied to scientific and technological advancements throughout history, providing methods and mediums for communication, expression, and exploration. Art is a dialogic domain that evolves with the technological advances in society, incorporating technology and computational tools to create new genres of art. We live in an increasingly computational and...
Since the invention of ARPANET in 1969, network protocols and communication systems have continued to emerge. Especially in the past decade, the prosperity of mobile internet and cloud computing has resulted in a large number of network protocols and communication systems, which have become critical infrastructure for our society. Availability...
The rise and racial gap in maternal mortality and morbidity in the US are a growing public health crisis. The US maternal mortality rate is double that of peer countries such as the UK and Canada. Even more striking, Black women are 243% more likely to die from childbirth-related causes. According to...
We consider general utility models and information structures of the agents and illustrate when economic conclusions for designing simple mechanisms in classical settings extend to general environments. We show that whether economic conclusions can be generalized depends on the details of the generalizations. For example, in single-item auctions, competition and...
Next-generation cellular networks are expected to support a massive data traffic volume and satisfy a vast number of users that have latency-critical quality-of-service expectations. Towards serving this demand, it is envisaged that the interference management problem will be the main bottleneck due to the likelihood of a heavily interfering...
At its core, the purpose of microscopy is to make objects and their underlying structures visible under high magnification. With the remarkable progress of electron microscopy, the sub-micron “high” magnification of light microscopy has been completely refashioned to encompass subatomic length scales. Unfortunately, higher magnification does little to negate existing interpretability...
Existing nonlinear optimization methods have proven reliable over the past few decades for a wide range of applications but have critically relied on accurate function and gradient evaluations. Modern nonlinear optimization problems arising from machine learning and scientific computing applications are increasingly complex and large scale, which makes accurate evaluations...
Language models are the foundation of many natural language tasks such as machine translation, speech recognition, and dialogue systems. Modeling the probability distributions of text accurately helps capture the structures of language and extract valuable information contained in various corpora. In recent years, many advanced models have achieved state-of-the-art performance...
We live in an increasingly computational world; one that, in the near term, may require everyone to be computationally literate. Computer science (CS) education has greatly increased its reach in the last two decades with an increasing number of students having access to formal computer science classroom experiences in the...
The dissertation builds on my current research to demonstrate the connection between affect and learning through machine learning and qualitative analysis of interactions where players use a complex systems game. The project is threefold: First, I developed a thinking and learning intervention, the agent-based modeling simulation Ant Adaptation. I showed...
Recent developments in deep learning have led to breakthroughs in rendering novel views from sparse input views of a scene. While the accuracy of these algorithms has improved dramatically, it has come at a huge computational cost. While developments in graphics hardware have ameliorated some of the computational burdens, deep learning-based...
This thesis studies the Bayesian robustness of algorithm design. The main perspective requires that the performance of a single fixed algorithm approximate the optimal performance when its inputs are independent and identically distributed (i.i.d.) draws from an unknown distribution, for every distribution in a known, large class of distributions....
In this thesis, we aim to develop efficient algorithms with theoretical guarantees for noisy nonlinear optimization problems, with and without constraints, under various assumptions. Apart from Chapter 1, which provides relevant background, the remainder of the thesis is divided into four chapters. In Chapter 2, we establish the theoretical convergence...
Over the past decade as smartphones and wearable tracking devices have grown in popularity, more individuals have begun collecting their own health and behavioral data. Innovations in sensor technology now allow individuals to continuously collect data over long periods of time with minimal effort. As a result, more data has...
This research looks at the robotic shape formation problem, which is one of the fundamental problems in robotic swarm systems. Here, the task is to move a group of robots to form a user-specified shape. In this dissertation, the task of shape formation is divided into four problems: (i) using...
Manufacturing processes are known for their intricacies in changing material shapes and properties. New generations of manufacturing technologies, known as flexible manufacturing, are moving toward design freedom, which allows producing parts with optimized geometries and high customizations at an affordable cost even for low-volume productions. Two prominent flexible manufacturing processes...
Human language processing is incremental. In this dissertation, I explore how an incremental perspective can help us clarify our understanding of transformational syntax, which typically proceeds bottom-up. As part of our exploration, I develop an incremental head-driven parsing algorithm for Minimalist Grammars. The two main innovations of this parsing algorithm...
Security and robustness are two critical problems in modern computing systems. In this dissertation, we study these two problems in both hardware systems and learning systems. Firstly, we discuss the robustness problem in hardware systems. Modern microprocessors suffer from significant on-chip variation at the advanced technology nodes. The development of...