NMR; protein structure determination; protein structure
Abstract :
[en] Knowledge of the three-dimensional (3D) structure of a protein is essential for designing drugs, predicting protein function and studying the mechanisms of that function. 3D structures can be determined with two experimental techniques: X-ray crystallography and NMR spectroscopy. However, these techniques have limitations: they are time-consuming, labour-intensive and sometimes technically difficult. To overcome these limitations, several approaches that combine computational power with sparse experimental NMR data, such as backbone chemical shifts, incomplete sets of NOE distances and backbone residual dipolar couplings, have been proposed to determine 3D protein structures rapidly. To assess whether these automated methods can indeed produce structures that closely match those manually refined by experts from the same experimental data, the Critical Assessment of Automated Structure Determination of Proteins from NMR Data [1] (CASD-NMR) was created. The CASD-NMR concept closely resembles that of the Critical Assessment of protein Structure Prediction [2] (CASP), which aims to assess the performance of methods that predict protein structure from sequence.
Following the CASD-NMR 2013 [3] recommendations, we are comparing different automated protein structure determination methods driven by backbone chemical shifts and by homology modeling, in terms of how well the resulting 3D structures fit the experimental NMR data. For homology modeling, we have been using two approaches, I-TASSER [4] and MODELLER [5], which are among the best publicly available, CASP-assessed protein structure prediction servers. Structure calculations guided by NMR backbone chemical shifts were carried out with three kinds of approaches: (i) ROSETTA, which is probably the most widely used and cited backbone-chemical-shift-based structure calculation method in the community. Several ROSETTA-based methods use only backbone chemical shifts as experimental data, and three of them were used for our comparison: CS-ROSETTA [6], CS-HM-ROSETTA [7] and RASREC CS-ROSETTA [8]; (ii) Cheshire, the first backbone-chemical-shift-based method, which performed as well as CS-HM-ROSETTA during CASD-NMR 2013; and (iii) CS23D, a web server reported to run 1,000 to 10,000 times faster than competing methods (Cheshire [9] and CS-ROSETTA).
Each of these approaches was used to determine the 3D structures of a benchmark set of 50 proteins. These 50 proteins were randomly selected from the proteins for which NMR backbone chemical shifts are available in the BMRB data bank, excluding proteins that had been used in CASD-NMR or CASP. In addition, we will apply these methods to cold shock proteins for which we have our own experimental data.
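The random selection of the benchmark set described above can be illustrated with a minimal sketch. It assumes that the candidate entries and the entries already used in CASD-NMR or CASP are available as plain lists of identifiers; the function name `select_benchmark`, its parameters and the fixed seed are hypothetical and are not taken from the original work.

```python
import random


def select_benchmark(candidates, excluded, size=50, seed=0):
    """Randomly pick `size` entries from `candidates`, skipping `excluded` ones.

    `candidates`: identifiers of entries with backbone chemical shifts (assumed input).
    `excluded`:   identifiers already used in CASD-NMR or CASP (assumed input).
    `seed`:       fixed seed so that the selection can be reproduced.
    """
    # Remove duplicates and excluded entries; sort so the result depends
    # only on the seed and not on set iteration order.
    pool = sorted(set(candidates) - set(excluded))
    if len(pool) < size:
        raise ValueError(f"only {len(pool)} eligible entries, need {size}")
    # Sample without replacement using a local generator.
    return random.Random(seed).sample(pool, size)
```

Sorting the pool and using a seeded local generator is one possible design choice: it makes the benchmark list reproducible, which helps when several structure determination methods have to be run on exactly the same set of proteins.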