Genomic Interrogation of Pseudomonas aeruginosa Virulence and Antimicrobial Resistance

Pseudomonas aeruginosa is an important gram-negative opportunistic pathogen whose large genome allows it to thrive in diverse environments. There is a wide range of phenotypic variation within the species, which can be attributed both to variation in sequences present in most isolates (the core genome) and to the presence or absence of sequences found in only some isolates (the accessory genome). In this dissertation, I present two bacterial genomics studies examining the relationship between the P. aeruginosa genome and phenotypes, one focusing on antimicrobial resistance and the other on virulence.

Antimicrobial resistance is a major barrier to treatment of P. aeruginosa infections, with multidrug-resistant infections disproportionately caused by globally distributed sequence types (STs) known as “high-risk clones”. Examining bacterial collections from Northwestern Memorial Hospital, we identified a number of isolates belonging to ST298 that showed substantial drug resistance. ST298, along with the closely related ST446, is part of the larger clonal complex (CC) 446, which has previously been identified as responsible for multidrug-resistant infections around the world. Genomic and phylogenetic analyses identified a subclade of ST298, which we named ST298*, that has caused repeated infections at our institution for at least 16 years and has thus far been found only at our institution. The last common ancestor of this subclade was estimated to date to 1980, suggesting that the subclade may have been a problem for even longer than appreciated. Many isolates within this subclade harbored a large (~415 kb) plasmid, which contributed to antimicrobial resistance through the presence of a novel class 1 integron. We found that this plasmid was part of a family of large Pseudomonas genus plasmids. In this project, we both uncover a prolonged local epidemic of highly drug-resistant P. aeruginosa and propose that CC446 is an emerging high-risk clone in need of further study.

P. aeruginosa isolates show a wide range of virulence in infection models, but virulence is a complex and combinatorial phenotype with many contributing factors. We took a machine learning approach to predict the virulence (high or low) of P. aeruginosa isolates based on genomic content. Using a training set of 115 isolates, we found that the accessory genome could be used to predict virulence level, with nested cross-validation accuracy ranging from 72% to 75% depending on the algorithm used. We confirmed this finding using a test set of 25 isolates, on which an accessory genome-based random forest model correctly identified virulence level 72% of the time. Individual accessory genomic elements showed low importance in the accessory genome-based random forest model, which appears to be learning a diffuse genomic fingerprint. We also showed that core genome single nucleotide variants and whole-genome k-mers could be used to predict virulence. While genomic content could be used to predict virulence in P. aeruginosa, it was not predictive of persistence in a collection of early cystic fibrosis isolates. In sum, we found that there is signal within the P. aeruginosa genome that is predictive of an isolate’s virulence in mice. This project can serve as a starting point for future machine learning studies examining the relationship between bacterial genomics and diverse phenotypes.
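
As a rough illustration of the nested cross-validation mentioned above, the following sketch estimates classification accuracy on a binary presence/absence matrix of accessory genomic elements. It is not the dissertation's pipeline: the data are synthetic, the function names (`toy_data`, `nested_cv`, and the helpers) are hypothetical, and a simple k-nearest-neighbor classifier on Hamming distance stands in for the random forest and other algorithms the abstract refers to, so that the example needs only the Python standard library. The point it shows is the structure: an inner loop picks a hyperparameter using only the outer training isolates, and the outer loop scores that choice on isolates the inner loop never saw.

```python
import random
from collections import Counter


def toy_data(n=40, n_elements=30, seed=1):
    """Synthetic presence/absence rows; label 1 = high, 0 = low virulence (assumed coding)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        # High-virulence rows carry the first 10 elements more often (made-up signal).
        row = [int(rng.random() < (0.7 if label and j < 10 else 0.3))
               for j in range(n_elements)]
        X.append(row)
        y.append(label)
    return X, y


def hamming(a, b):
    """Number of elements present in one isolate but not the other."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training isolates closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: (hamming(train_X[i], x), i))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def k_folds(indices, n_folds):
    """Split a list of indices into n_folds disjoint folds."""
    return [indices[i::n_folds] for i in range(n_folds)]


def accuracy(X, y, train_idx, test_idx, k):
    """Fraction of test isolates classified correctly from the training isolates."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def inner_cv_score(X, y, train_idx, k, n_folds):
    """Mean validation accuracy for one k, using only the outer training isolates."""
    scores = []
    for val in k_folds(train_idx, n_folds):
        val_set = set(val)
        fit_idx = [i for i in train_idx if i not in val_set]
        scores.append(accuracy(X, y, fit_idx, val, k))
    return sum(scores) / len(scores)


def nested_cv(X, y, ks=(1, 3, 5), outer=5, inner=3, seed=0):
    """Mean outer-fold accuracy, with k chosen independently inside each outer fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for held_out in k_folds(idx, outer):
        held_set = set(held_out)
        train_idx = [i for i in idx if i not in held_set]
        best_k = max(ks, key=lambda k: inner_cv_score(X, y, train_idx, k, inner))
        outer_scores.append(accuracy(X, y, train_idx, held_out, best_k))
    return sum(outer_scores) / len(outer_scores)


X, y = toy_data()
print(f"nested CV accuracy on toy data: {nested_cv(X, y):.2f}")
```

Because the hyperparameter is selected without reference to the held-out outer fold, the resulting accuracy is a less optimistic estimate than one obtained by tuning and scoring on the same folds. An independent test set, such as the 25 isolates described above, remains a separate check.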
