Methods for Protein Identification and Sequencing
Understand common methods, their strengths, limitations, and applications
There is no debate over the utility of genome sequencing to understand the complex molecular networks of cells in health and disease. However, understanding a protein’s amino acid sequence is equally important. This knowledge is essential for understanding its structure and function and can provide valuable insights into biological processes and disease mechanisms.
This article provides an overview of the most common approaches to protein identification and sequencing. We discuss how each method works, the information it offers, its pros, cons, and best applications. Additionally, we explore potential developments in this rapidly evolving field.
Common approaches to protein sequencing
Mass Spectrometry
Mass spectrometry is a commonly used technique for studying proteins. It involves breaking down proteins into smaller peptides, which are separated, fragmented, ionized, and captured by mass spectrometers. The end result is a mass spectrum containing ions characteristic of the sequence of amino acids in the selected peptide.
Mass spectrometry is versatile and can identify various proteins and protein modifications. It can handle complex mixtures and quantify proteins across diverse samples.
However, existing mass spectrometry methods can typically identify fewer than 50 percent of the proteins in a complex sample. Another major limitation of mass spectrometry is its inability to differentiate proteins with similar masses or structures. This low sensitivity can lead to inaccurate identifications or missing relevant proteins. Moreover, the process requires expertise in data analysis due to the complex nature of the results.
Immunoassays
Immunoassays, such as Western blotting and enzyme-linked immunosorbent assay (ELISA), are well-established protein detection and quantification techniques. These methods rely on the specificity of antibodies to recognize target proteins. While immunoassays are not typically used to determine the sequence of amino acids in a protein, they are valuable tools for confirming the presence of specific proteins in a sample.
Immunoassays are sensitive and specific tests that produce rapid results. In addition, they offer the benefit of not requiring complex equipment or the use of radioactive labels.
However, challenges with these techniques do exist. Antibody binding efficiency can be problematic, potentially leading to false positives or negatives. Cross-reactivity with similar proteins can also occur, impacting the accuracy of results. Finally, Western blotting and ELISA are multistep protocols; variations and errors can occur at any step, reducing the reliability and reproducibility of these techniques.
Next-generation sequencing (NGS)
While commonly associated with DNA and RNA sequencing, NGS has extended its applications to protein sequencing, improving the speed and ease in which researchers can correlate biological function with changes in protein sequence and modifications.
New NGS platforms enable very sensitive experiments and allow researchers to study proteins that are in lower abundance. This can help uncover previously hidden aspects of cellular biology and reveal valuable insights into protein variations and their effects.
Best applications for each method
Each method mentioned above excels in specific applications.
Immunoassays are well-suited for confirming the presence of specific proteins, validating results from other techniques, and routine protein quantification. Scientists can also use Western blotting and ELISA techniques to identify the presence and extent of post-translational modifications on proteins, including phosphorylation, glycosylation, or ubiquitination. However, detecting the broad spectrum of post-translational modifications is limited by the number of available antibodies, substantial specificity issues, and lot-to-lot variations, leading to reproducibility problems.
In contrast, mass-spectrometry identifies and quantifies specific post-translational modifications with high sensitivity and resolution. In addition, it is ideal for large-scale proteomic studies and characterizing protein and protein complexes. However, mass spectrometry often fails to identify rare proteins in a complex sample.
NGS can identify rare genetic variations, including mutations and single nucleotide polymorphisms that can impact protein sequences. This is particularly valuable for understanding how these variations translate to differences in protein function and disease susceptibility. Moreover, NGS can excel in detecting modifications that might remain unmatched or obscured in mass-spectrometry or present as unidentified in immunoassays. Additionally, NGS can facilitate de novo sequencing of proteins, a method in which the amino acid sequence of a protein is directly determined without prior knowledge of its DNA. This is useful for discovering novel proteins or protein variations.
The future: integration of protein sequencing with other assays
While the technologies outlined in this article offer advantages, there is a trade-off in protein analysis. Mass-spectrometry can survey a large number of proteins, but its sensitivity can vary depending on the specific approach and instrumentation employed. On the other hand, immunoassays tend to be highly sensitive, but offer a very narrow survey of the proteome. Methods like NGS, which enable single-molecule sequencing of proteins, help bridge this gap by offering comprehensive proteome coverage and high sensitivity. Moreover, the ability to perform de novo sequencing of proteins with NGS can be a game-changer in protein characterization, especially in cases where prior sequence knowledge is limited or absent. These advantages position NGS as a valuable addition to existing proteomic methods.
In the near-term, many studies will likely seek to combine the analysis of protein, DNA, and RNA sequence data. This integrative approach could lead to a deeper understanding of disease mechanisms and help identify key proteins involved in various conditions. In addition, advancements in protein sequencing could enable personalized treatment approaches, tailoring interventions based on an individual’s protein profile.
As these advancements unfold, it’s essential to consider streamlining data analysis pipelines and formulating standardized protocols. Such efforts not only enhance accessibility but also improve the reproducibility of protein sequencing techniques. With further development, protein sequencing methods can become sensitive, scalable, and accessible for various proteomic applications. Additionally, they will likely become increasingly portable, enabling their applications in field-based settings.