Among the ab initio approaches for calculating the 3D structures of proteins
from their primary sequences, a few, including OSIRIS, use restricted dihedral
spaces and empirical energy equations. All of these approaches were calibrated
on a few proteins or protein fragments. To optimize the calculation over a larger
diversity of structures, we first need to define, for each sequence, good
calculation conditions, so that a consensus procedure that best fits most
3D structures can be chosen. This requires an objective classification of the
calculated 3D structures. In this work, populations of avian and bovine pancreatic
polypeptides (APP, BPP) and of calcium-binding protein (CaBP) were obtained by
varying the rate of the angular dynamics in the second step of OSIRIS. The
3D structures were then clustered with a nonhierarchical method, SICLA, using the
rmsd as the distance parameter. A good clustering was obtained with four
subpopulations of APP, BPP and CaBP. Each subpopulation was characterized by its
barycenter, relative frequency and dispersion. For the three alpha-helix proteins,
most secondary structures were correct after step 1 of OSIRIS, but the molecules
had a few atomic contacts. Step 2, i.e., the angular dynamics, resolved those
atomic contacts, and the clustering demonstrates that it generates subpopulations
of topological conformers, as the barycenter topologies show.
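
The classification described above (pairwise rmsd as the distance, a nonhierarchical partition, and a summary of each subpopulation) can be illustrated with a minimal sketch. It is not the SICLA implementation, whose algorithm is not detailed here. As stated assumptions, it uses the standard Kabsch superposition to compute the rmsd, a plain k-medoids partition as a stand-in for the nonhierarchical method, the medoid as a proxy for the barycenter, and the mean rmsd to the medoid as the dispersion. All function names are hypothetical.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = P - P.mean(axis=0)          # remove translation
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Force a proper rotation (determinant +1), not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = P @ R.T - Q              # rotate P onto Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def rmsd_matrix(structures):
    """Symmetric matrix of pairwise rmsd values."""
    n = len(structures)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = kabsch_rmsd(structures[i], structures[j])
    return D

def k_medoids(D, k, n_iter=100):
    """Nonhierarchical partition into k clusters (assumed stand-in, not SICLA)."""
    # Deterministic start: most central structure, then farthest-point picks.
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                within = D[np.ix_(members, members)].sum(axis=1)
                new[c] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

def describe(D, medoids, labels):
    """Representative, relative frequency and dispersion of each cluster."""
    summary = []
    for c, m in enumerate(medoids):
        members = np.flatnonzero(labels == c)
        summary.append({
            "representative": int(m),  # medoid, used here as a barycenter proxy
            "frequency": float(members.size / len(labels)),
            "dispersion": float(D[members, m].mean()) if members.size else 0.0,
        })
    return summary
```

Given a list of (N, 3) coordinate arrays for one protein, calling `rmsd_matrix`, then `k_medoids` with `k=4`, then `describe` yields one summary per subpopulation. A medoid is always an actual member of the population, which keeps the representative physically meaningful; a true barycenter would instead require averaging superposed coordinates.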
Disciplines :
Biochemistry, biophysics & molecular biology
Author, co-author :
Benhabiles, N.
Gallet, X.
Thomas, Annick ; Université de Liège - ULiège > Chimie et bio-industries > Centre de Bio. Fond. - Section de Biologie moléc. et numér.
Brasseur, Robert ; Université de Liège - ULiège > Gembloux Agro-Bio Tech
Language :
English
Title :
A Descriptive Analysis Of Populations Of Three-Dimensional Structures Calculated From Primary Sequences Of Proteins By OSIRIS