Amit Yadav - Proteogenomic analysis of Mycobacterium tuberculosis by high resolution mass spectrometry

Version 1

      Publication Details (including relevant citation information):

    1. Dhanashree S. Kelkar¹
    2. Dhirendra Kumar²
    3. Praveen Kumar¹
    4. Lavanya Balakrishnan¹
    5. Babylakshmi Muthusamy³
    6. Amit Kumar Yadav²
    7. Priyanka Shrivastava²
    8. Arivusudar Marimuthu¹
    9. Sridhar Anand⁴
    10. Hema Sundaram⁴
    11. Reena Kingsbury⁴
    12. H. C. Harsha¹
    13. Bipin Nair⁵
    14. T. S. Keshava Prasad¹
    15. Devendra Singh Chauhan⁶
    16. Kiran Katoch⁶
    17. Vishwa Mohan Katoch⁷
    18. Prahlad Kumar⁴
    19. Raghothama Chaerkady⁸
    20. Srinivasan Ramachandran²
    21. Debasis Dash²
    22. Akhilesh Pandey⁸,*

      mcp.M111.011627.

      Abstract:

      Genome sequencing of the H37Rv strain of Mycobacterium tuberculosis was completed in 1998, followed by whole genome sequencing of a clinical isolate, CDC1551, in 2002. Since then, the genomic sequences of a number of other strains have become available, making it one of the better studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of its high GC content and dissimilarity to other model prokaryotes. To this end, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both the MS and MS/MS levels. In all, we identified 3,176 proteins from Mycobacterium tuberculosis, representing ~80% of its total predicted gene count. In addition to the protein database search, we carried out a genome database search, which led to the identification of ~250 novel peptides. Based on these novel genome search specific peptides (GSSPs), we discovered 41 novel protein-coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N-terminus-derived peptides confirmed 745 existing annotations of translational start sites and corrected those for 33 proteins. We report, for the first time, a high-confidence set of protein-coding regions in the Mycobacterium tuberculosis genome obtained by tandem mass spectrometry with high resolution at both the precursor and fragment detection steps. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced, yielding a more accurate catalog of protein-coding genes.

      Address (URL): http://mcponline.org/content/early/2011/10/03/mcp.M111.011627.abstract