Data-driven Functional Materials Design and Discovery


Functional electronic materials have transformed modern society into a highly digitized and interconnected global community. The ever-growing demand for electronic devices with superior functionality poses a great challenge to state-of-the-art field-effect transistors owing to the limited charge density afforded by silicon. Materials scientists and chemists have been working closely to identify novel microelectronic materials, yet the design and discovery of these materials from the atomic level is anything but trivial. With recent advances in machine learning algorithms and the advent of various crystalline materials databases containing both experimental and simulated data, we are now able to exploit the strengths of data-driven methods in combination with ab initio simulations to efficiently and effectively discover novel materials with desired functionality. In this thesis, I employ a variety of techniques to address the electronic materials design challenge. Specifically, I focus on the lacunar spinel family, which exhibits a metal-insulator transition upon structural distortion, by applying density functional theory simulations to understand the phase-transition mechanisms and explore the materials phase space. Next, I introduce the adaptive optimization engine (AOE), a novel materials design workflow that learns directly from chemical compositions to realize multiple-property optimization. The AOE frees chemists from relying solely on their intuition in materials design. It also enables the co-design of functional materials and is capable of efficiently identifying the compositions exhibiting superior functionality. Then, I present deepKNet, a deep neural network that learns from the momentum-space crystal structure genome to make property classifications.
A quantitative understanding of the structure-property relationship in crystalline materials is a key step towards efficient materials design, in which structure types and chemical compositions are optimized in a round-robin fashion. Lastly, I introduce the symbolic regression (SR) technique and its potential applications in materials science. This method is particularly helpful for building a surrogate model that maps input features or descriptors to the output. SR automatically searches for the best functional form among candidates generated by genetic programming. Unlike black-box machine learning models, SR offers improved interpretability and insight into the quantitative model, which is invaluable to materials researchers. I hope that my work can inspire more chemists and materials scientists with domain expertise in synthesis, characterization, theoretical simulation, and informatics to work collaboratively to further unleash the power of data-driven materials design and discovery.
