Interrogation of the Proteoform Landscape of Immune Activation after Organ Transplantation: Translational Top-Down Proteomics for Next-Generation Diagnostics and Personalized Medicine


What follows is a strictly post-genomic dissertation. Over the past five years, I have strived to leave the nucleus behind, and even the endoplasmic reticulum and Golgi apparatus. The human genome was sequenced over 15 years before the submission of this document, and the world changed somewhat less than anticipated, besides the fact that I can now send a vial of spit and a check to 23andMe® and get an ancestral analysis for only $75 (with a Groupon®). While knowledge of the genetic code is invaluable for basic researchers and the growth of the biotechnology enterprise, its specifically biomedical value is tempered by the reality that disease tends to be extraordinarily complex at the molecular and systems levels. Proteins, the biomolecules that are encoded by sequences of nucleotides as a house is laid out by a structural blueprint, are in fact the molecular effectors of disease. Hence, “the future belongs to proteomics” as HHMI investigator Stanley Fields prophesied in Science shortly after the completion of the Human Genome Project (2001, 291(5507):1221-24). Just as many things can happen to a house during (and after) the handoff of the blueprint from the architect to the construction foreman, proteins exist in a very dynamic and changeable world: their abundances change through space and time, they interact and decorate each other with biologically meaningful chemical modifications (or cut each other to pieces), and they change in response to environmental stimuli. This landscape, not just post-transcriptional but post-translational, is the true blueprint of phenotype, and it never stops changing. To capture it and understand the whole of it, only the best analytical strategies will stand a chance. These strategies do not exist yet. An underlying theme of my dissertation is that existing proteomics techniques are reductive and miss the point.
The most prevalent and widespread proteomics workflows digest proteins and proteomes with commercially available enzymes and then use well-established mass spectrometry instrumentation and methods to identify the resulting peptides by their distinctive mass-to-charge ratios and, after fragmentation, those of the amino acids that compose them. This is known as “bottom-up proteomics” because proteins are sequentially degraded and then identified post-analytically by database searching against the annotated genome and parsimonious inference of the original protein family that yielded the detected peptide or collection of peptides. To continue the original house analogy, this is like trying to infer the existence of a blender in the house because you know that somewhere inside it contains an extension cord and a frozen banana, when in reality those items could belong to a toaster and a leftover banana split. The only existing technology that can detect intact proteins with all of their combinations of modifications, termed proteoforms (i.e. the entire toaster and banana split), is appropriately termed “top-down” proteomics. Top-down is extremely difficult and limited by the capabilities of existing technology, from sample preparation to protein fractionation to mass spectrometry instrumentation, just as the genome project was hampered by existing genomics technologies in the late 1980s, before a public/private bubble grew around it and fulfilled Moore’s Law. Still, proteoform-resolved analyses by top-down have recently begun to gain traction as researchers realize the value of molecular specificity in biomedical problem-solving. The emergence of top-down proteomics as a truly specific protein characterization tool is exciting, but it may sputter unless the field’s potential translational value and a measurable biomedical impact are realized.
In this document, I take top-down proteomics out of the few specialized and highly technical basic research labs that regularly practice it and into the realm of translational medicine, specifically applied to the elucidation of proteoform-resolved biomarkers of clinical acute transplant rejection of the kidney and liver from peripheral blood. The first chapter is a thorough review of the scientific world of top-down, and lays out the trajectory toward achieving biomedical impact in the next 5-10 years. The following chapter summarizes in high technical detail my efforts to establish a discovery top-down workflow for clinical samples analogous to qualitative and quantitative bottom-up “shotgun proteomics” schemes. Two published applications of this technique to kidney and liver transplant rejection biomarkers follow, resolutely setting the cutting edge of translational top-down proteomics as it stands today. The final two chapters summarize my as-yet unpublished efforts to increase the capabilities of top-down protein measurements and add to the value of the technique in translational and clinical research. Chapter 5 lays the foundation for the first quantitative, targeted top-down workflow for proteoform-resolved biomarker validation, rapidly deployable across thousands of patients for multiple targets. Chapter 6 shows the value of top-down in completely uncharted territory: cell subtype-specific measurements and the currently unrealized cell-based human proteome project. An appendix chapter details in layman’s terms the complicated ion physics behind mass spectrometry-based proteomics, and is a useful starting point for those new to mass spectrometry fundamentals.
It is my hope that this dissertation helps to lay the foundation for the development of the analytical strategies that will maximize the potential of proteomics measurements in biomedicine and systems medicine, and finally begin to deliver a return on society’s investment in the “-omics” sciences in general.

Last modified
  • 01/25/2019