Utilization and Computational Generation of Enzymatic Reaction Rules to Predict and Analyze Biochemical Pathways


This thesis focuses on computational methods for identifying novel enzymatic pathways. In particular, it uses the Biological Network Integrated Computational Explorer (BNICE) software suite to predict de novo enzymatic pathways for the production of commercially relevant compounds, and it introduces improvements to the program that have the potential to increase both its universality and the ease with which its predictions can be verified. BNICE uses generalized chemical operators to generate networks of probable biochemical reactions, which include not only known enzymatic reactions but also likely reactions not previously reported in the literature. In the first part of this thesis, BNICE is used to predict enzymatic pathways for the production of propionic acid from pyruvate. Sixteen such pathways were found that consist of four or fewer enzymatic reactions. A key reaction in most of these pathways was the reduction of acrylic acid to propionic acid. Collaborators experimentally confirmed that this reaction is catalyzed by Oye2p from Saccharomyces cerevisiae, an activity not previously known for this enzyme. Next, a method is developed for the automatic generation of BNICE chemical operators. Previously, these operators were generated manually, which left the operator set unable to describe many enzymatic reactions. The new method allowed for the generation of operator sets capable of describing every atom-balanced reaction in the MetaCyc database. Furthermore, a process is introduced for intuitively adjusting the specificity of the generated operators by allowing the user to specify the groupings of reactions that each operator should describe. Finally, this technique for automatic operator generation is combined with Conserved Domain Database (CDD) superfamily information to create a set of operators such that each operator describes reactions associated with similar genes. BNICE is then used to apply these operators to every compound in Escherichia coli, generating a list of 688,787 compounds. This list is compared to the DrugBank database to identify 205 pharmaceutically relevant products that require the addition of only a single reaction to be produced from Escherichia coli. Because this method associates each predicted reaction with a CDD superfamily, it also expedites the identification of promising enzyme candidates. These results illustrate the power and flexibility of BNICE and the operator generation program in identifying promising enzymatic reactions and associating them with promising enzymes.
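
The one-step screening described above (apply every operator to every native compound, then compare the products against a reference list) can be illustrated with a minimal sketch. This is not BNICE code: it assumes a toy model in which a compound is just a set of functional-group labels, and the `Operator` class, the `one_step_hits` function, and all operator, superfamily, and compound names are hypothetical placeholders.

```python
from dataclasses import dataclass

# Toy model (an assumption for illustration only): a compound is a frozenset
# of functional-group labels, not a real molecular graph.
Compound = frozenset


@dataclass(frozen=True)
class Operator:
    name: str          # hypothetical operator label
    superfamily: str   # hypothetical CDD superfamily label
    consumes: str      # functional group the operator acts on
    produces: str      # functional group it leaves in its place

    def apply(self, compound):
        """Return the product, or None if the operator does not match."""
        if self.consumes not in compound:
            return None
        return Compound((compound - {self.consumes}) | {self.produces})


def one_step_hits(native, operators, targets):
    """Map each target reachable in one reaction to the superfamilies that reach it."""
    hits = {}
    for compound in native:
        for op in operators:
            product = op.apply(compound)
            # Skip non-matching operators and products the organism already has.
            if product is None or product in native:
                continue
            if product in targets:
                hits.setdefault(product, set()).add(op.superfamily)
    return hits


# Placeholder data standing in for the native compound list and the reference list.
native = {Compound({"alkene", "carboxylate"}), Compound({"ketone", "carboxylate"})}
operators = [
    Operator("ene_reduction", "superfamily_A", "alkene", "alkane"),
    Operator("keto_reduction", "superfamily_B", "ketone", "alcohol"),
]
targets = {Compound({"alkane", "carboxylate"}), Compound({"amine"})}

hits = one_step_hits(native, operators, targets)
```

Keeping the superfamily label on each operator means every screened product carries its candidate enzyme families with it, which mirrors the motivation stated above for associating predicted reactions with CDD superfamilies.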
