NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.


Berg JM, Tymoczko JL, Stryer L. Biochemistry. 5th edition. New York: W H Freeman; 2002.

By agreement with the publisher, this book is accessible by the search feature, but cannot be browsed.

Proteins are linear polymers formed by linking the α-carboxyl group of one amino acid to the α-amino group of another amino acid with a peptide bond (also called an amide bond). The formation of a dipeptide from two amino acids is accompanied by the loss of a water molecule (Figure 3.18). The equilibrium of this reaction lies on the side of hydrolysis rather than synthesis. Hence, the biosynthesis of peptide bonds requires an input of free energy. Nonetheless, peptide bonds are quite stable kinetically; the lifetime of a peptide bond in aqueous solution in the absence of a catalyst approaches 1000 years.


Figure 3.18

Peptide-Bond Formation. The linking of two amino acids is accompanied by the loss of a molecule of water.
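The bookkeeping behind this condensation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the amino acid and water masses are standard rounded reference values supplied here as assumptions, and `peptide_mass` is a hypothetical helper name.

```python
# Average molecular masses in g/mol for two free amino acids and water.
# These are rounded reference values assumed for illustration; they do not
# come from the text.
FREE_AMINO_ACID_MASS = {
    "Gly": 75.07,
    "Ala": 89.09,
}
WATER_MASS = 18.02  # one water molecule is lost per peptide bond formed


def peptide_mass(residues):
    """Mass of a linear peptide: sum of free amino acids minus one water per bond."""
    if not residues:
        raise ValueError("a peptide needs at least one residue")
    n_bonds = len(residues) - 1
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - n_bonds * WATER_MASS


print(f"Gly-Gly: {peptide_mass(['Gly', 'Gly']):.2f} g/mol")
print(f"Gly-Ala: {peptide_mass(['Gly', 'Ala']):.2f} g/mol")
```

With these rounded inputs, the dipeptide Gly-Gly comes out about 18 g/mol lighter than two free glycine molecules, which is the mass of the water released.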

A series of amino acids joined by peptide bonds forms a polypeptide chain, and each amino acid unit in a polypeptide is called a residue. A polypeptide chain has polarity because its ends are different, with an α-amino group at one end and an α-carboxyl group at the other. By convention, the amino end is taken to be the beginning of a polypeptide chain, and so the sequence of amino acids in a polypeptide chain is written starting with the amino-terminal residue. Thus, in the pentapeptide Tyr-Gly-Gly-Phe-Leu (YGGFL), tyrosine is the amino-terminal (N-terminal) residue and leucine is the carboxyl-terminal (C-terminal) residue (Figure 3.19). Leu-Phe-Gly-Gly-Tyr (LFGGY) is a different pentapeptide, with different chemical properties.


Figure 3.19

Amino Acid Sequences Have Direction. This illustration of the pentapeptide Tyr-Gly-Gly-Phe-Leu (YGGFL) shows the sequence from the amino terminus to the carboxyl terminus. This pentapeptide, Leu-enkephalin, is an opioid peptide that modulates the perception (more...)
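The writing convention can be made concrete with a short Python sketch. It is only an illustration: the helper name `describe` and the small lookup table are assumptions introduced here, covering just the four amino acids in the example.

```python
# One-letter codes for the four amino acids that appear in the example.
THREE_TO_ONE = {"Tyr": "Y", "Gly": "G", "Phe": "F", "Leu": "L"}


def describe(peptide):
    """Return (one-letter string, N-terminal residue, C-terminal residue).

    The peptide is given as hyphen-separated three-letter codes, written
    from the amino terminus to the carboxyl terminus.
    """
    residues = peptide.split("-")
    one_letter = "".join(THREE_TO_ONE[name] for name in residues)
    return one_letter, residues[0], residues[-1]


forward = "Tyr-Gly-Gly-Phe-Leu"
reverse = "-".join(reversed(forward.split("-")))

print(describe(forward))
print(describe(reverse))
# Same composition, different sequence: the two strings are not equal.
print(describe(forward)[0] == describe(reverse)[0])
```

The two peptides contain the same residues, yet the comparison is false because the order from N terminus to C terminus differs.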

A polypeptide chain consists of a regularly repeating part, called the main chain or backbone, and a variable part, comprising the distinctive side chains (Figure 3.20). The polypeptide backbone is rich in hydrogen-bonding potential. Each residue contains a carbonyl group, which is a good hydrogen-bond acceptor and, with the exception of proline, an NH group, which is a good hydrogen-bond donor. These groups interact with each other and with functional groups from side chains to stabilize particular structures, as will be discussed in detail.


Figure 3.20

Components of a Polypeptide Chain. A polypeptide chain consists of a constant backbone (shown in black) and variable side chains (shown in green).

Most natural polypeptide chains contain between 50 and 2000 amino acid residues and are commonly referred to as proteins. Peptides made of small numbers of amino acids are called oligopeptides or simply peptides. The mean molecular weight of an amino acid residue is about 110, and so the molecular weights of most proteins are between 5500 and 220,000. We can also refer to the mass of a protein, which is expressed in units of daltons; one dalton is equal to one atomic mass unit. A protein with a molecular weight of 50,000 has a mass of 50,000 daltons, or 50 kd (kilodaltons).


Dalton: A unit of mass very nearly equal to that of a hydrogen atom. Named after John Dalton (1766-1844), who developed the atomic theory of matter.
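The mass estimates quoted above follow from simple multiplication, sketched below in Python. The mean residue mass of 110 is the figure given in the text; the function names are hypothetical, and the results are rough estimates only, because real proteins deviate from the mean composition.

```python
MEAN_RESIDUE_MASS_DA = 110  # approximate mean residue mass quoted in the text


def estimate_mass_kd(n_residues, mean_residue_mass=MEAN_RESIDUE_MASS_DA):
    """Rough protein mass in kilodaltons from its residue count."""
    return n_residues * mean_residue_mass / 1000


def estimate_residue_count(mass_da, mean_residue_mass=MEAN_RESIDUE_MASS_DA):
    """Rough residue count for a protein of the given mass in daltons."""
    return round(mass_da / mean_residue_mass)


for n in (50, 2000):
    print(n, "residues ->", estimate_mass_kd(n), "kd")
print("50 kd ->", estimate_residue_count(50_000), "residues")
```

Run in reverse, the same rule of thumb suggests that a 50-kd protein contains roughly 450 residues.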

In some proteins, the linear polypeptide chain is cross-linked. The most common cross-links are disulfide bonds, formed by the oxidation of a pair of cysteine residues (Figure 3.21). The resulting unit of linked cysteines is called cystine. Extracellular proteins often have several disulfide bonds, whereas intracellular proteins usually lack them. Rarely, nondisulfide cross-links derived from other side chains are present in some proteins. For example, collagen fibers in connective tissue are strengthened in this way, as are fibrin blood clots.


Figure 3.21

Cross-Links. The formation of a disulfide bond from two cysteine residues is an oxidation reaction.

3.2.1. Proteins Have Unique Amino Acid Sequences That Are Specified by Genes

In 1953, Frederick Sanger determined the amino acid sequence of insulin, a protein hormone (Figure 3.22). This work is a landmark in biochemistry because it showed for the first time that a protein has a precisely defined amino acid sequence. Moreover, it demonstrated that insulin consists only of L amino acids linked by peptide bonds between α-amino and α-carboxyl groups. This accomplishment stimulated other scientists to carry out sequence studies of a wide variety of proteins. Indeed, the complete amino acid sequences of more than 100,000 proteins are now known. The striking fact is that each protein has a unique, precisely defined amino acid sequence. The amino acid sequence of a protein is often referred to as its primary structure.

A series of incisive studies in the late 1950s and early 1960s revealed that the amino acid sequences of proteins are genetically determined. The sequence of nucleotides in DNA, the molecule of heredity, specifies a complementary sequence of nucleotides in RNA, which in turn specifies the amino acid sequence of a protein. In particular, each of the 20 amino acids of the repertoire is encoded by one or more specific sequences of three nucleotides (Section 5.5).
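The flow of information from DNA to RNA to protein can be sketched as two lookups. The Python example below is a simplified illustration under stated assumptions: the codon table is only a partial excerpt of the standard genetic code, the DNA template is a made-up sequence chosen to encode Tyr-Gly-Gly-Phe-Leu (it is not the real gene), and `transcribe` and `translate` are hypothetical helper names. Start and stop codons and strand-direction bookkeeping are ignored.

```python
# Partial excerpt of the standard genetic code; only the codons needed here.
CODON_TABLE = {
    "UAU": "Tyr", "UAC": "Tyr",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "UUU": "Phe", "UUC": "Phe",
    "UUA": "Leu", "UUG": "Leu",
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
}

# Base-pairing rule used to build the complementary RNA from a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna):
    """Complement a DNA template strand (written 3'->5') into mRNA (5'->3')."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)


def translate(mrna):
    """Read the mRNA three nucleotides at a time and look up each codon."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return "-".join(CODON_TABLE[codon] for codon in codons)


template = "ATACCACCGAAAGAC"  # hypothetical template strand, for illustration
mrna = transcribe(template)
print(mrna)
print(translate(mrna))
```

Note that glycine appears twice in the peptide but is encoded here by two different codons (GGU and GGC), an example of an amino acid being specified by more than one triplet.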

Knowing amino acid sequences is important for several reasons. First, knowledge of the sequence of a protein is usually essential to elucidating its mechanism of action (e.g., the catalytic mechanism of an enzyme). Moreover, proteins with novel properties can be generated by varying the sequence of known proteins. Second, amino acid sequences determine the three-dimensional structures of proteins. Amino acid sequence is the link between the genetic message in DNA and the three-dimensional structure that performs a protein's biological function. Analyses of relations between amino acid sequences and three-dimensional structures of proteins are uncovering the rules that govern the folding of polypeptide chains. Third, sequence determination is a component of molecular pathology, a rapidly growing area of medicine. Alterations in amino acid sequence can produce abnormal function and disease. Severe and sometimes fatal diseases, such as sickle-cell anemia and cystic fibrosis, can result from a change in a single amino acid within a protein. Fourth, the sequence of a protein reveals much about its evolutionary history (see Chapter 7). Proteins resemble one another in amino acid sequence only if they have a common ancestor. Consequently, molecular events in evolution can be traced from amino acid sequences; molecular paleontology is a flourishing area of research.

3.2.2. Polypeptide Chains Are Flexible Yet Conformationally Restricted

Examination of the geometry of the protein backbone reveals several important features. First, the peptide bond is essentially planar (Figure 3.23). Thus, for a pair of amino acids linked by a peptide bond, six atoms lie in the same plane: the α-carbon atom and CO group from the first amino acid and the NH group and α-carbon atom from the second amino acid. The nature of the chemical bonding within a peptide explains this geometric preference. The peptide bond has considerable double-bond character, which prevents rotation about this bond.

Figure 3.23

Peptide Bonds Are Planar. In a pair of linked amino acids, six atoms (Cα, C, O, N, H, and Cα) lie in a plane. Side chains are shown as green balls.

The inability of the bond to rotate constrains the conformation of the peptide backbone and accounts for the bond's planarity. This double-bond character is also expressed in the length of the bond between the CO and NH groups. The C-N distance in a peptide bond is typically 1.32 Å, which is between the values expected for a C-N single bond (1.49 Å) and a C═N double bond (1.27 Å), as shown in Figure 3.24. Finally, the peptide bond is uncharged, allowing polymers of amino acids linked by peptide bonds to form tightly packed globular structures.

Figure 3.24

Typical Bond Lengths Within a Peptide Unit. The peptide unit is shown in the trans configuration.

Two configurations are possible for a planar peptide bond. In the trans configuration, the two α-carbon atoms are on opposite sides of the peptide bond. In the cis configuration, these groups are on the same side of the peptide bond. Almost all peptide bonds in proteins are trans. This preference for trans over cis can be explained by the fact that steric clashes between groups attached to the α-carbon atoms hinder formation of the cis form but do not occur in the trans configuration (Figure 3.25). By far the most common cis peptide bonds are X-Pro linkages. Such bonds show less preference for the trans configuration because the nitrogen of proline is bonded to two tetrahedral carbon atoms, limiting the steric differences between the trans and cis forms (Figure 3.26).

Figure 3.25

Trans and Cis Peptide Bonds. The trans form is strongly favored because of steric clashes that occur in the cis form.

Figure 3.26

Trans and Cis X-Pro Bonds. The energies of these forms are relatively balanced because steric clashes occur in both forms.

In contrast with the peptide bond, the bonds between the amino group and the α-carbon atom and between the α-carbon atom and the carbonyl group are pure single bonds. The two adjacent rigid peptide units may rotate about these bonds, taking on various orientations. This freedom of rotation about two bonds of each amino acid allows proteins to fold in many different ways. The rotations about these bonds can be specified by dihedral angles (Figure 3.27). The angle of rotation about the bond between the nitrogen and the α-carbon atoms is called phi (φ). The angle of rotation about the bond between the α-carbon and the carbonyl carbon atoms is called psi (ψ). A clockwise rotation about either bond as viewed from the front of the back group corresponds to a positive value. The φ and ψ angles determine the path of the polypeptide chain.

Dihedral angle—

A measure of the rotation about a bond, usually taken to lie between -180° and +180°. Dihedral angles are sometimes called torsion angles.

Figure 3.27

Rotation About Bonds in a Polypeptide. The structure of each amino acid in a polypeptide can be adjusted by rotation about two single bonds. (A) Phi (φ) is the angle of rotation about the bond between the nitrogen and the α-carbon atoms, (more...)
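A dihedral angle can be computed from the coordinates of four consecutive atoms. The sketch below uses a common vector formula in plain Python; it is an illustration, not a method taken from the text. The function name `dihedral_degrees` and the test coordinates are made up for the example. By the usual convention (stated here as background rather than from this section), φ is defined by the atoms C(i-1), N, Cα, C and ψ by N, Cα, C, N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, -180 to +180) about the p1-p2 bond.

    Looking from p1 toward p2, a clockwise rotation from the p0 direction
    to the p3 direction gives a positive value.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the central bond runs along the z axis.
front, a, b = (1, 0, 0), (0, 0, 0), (0, 0, 1)
print(dihedral_degrees(front, a, b, (1, 0, 1)))   # back atom eclipses front atom
print(dihedral_degrees(front, a, b, (0, 1, 1)))   # quarter turn, clockwise
print(dihedral_degrees(front, a, b, (-1, 0, 1)))  # back atom opposite front atom
```

The same function applied to the Cα, C, N, Cα atoms spanning a peptide bond would distinguish the cis arrangement (near 0°) from the trans arrangement (near 180°).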

Are all combinations of φ and ψ possible? G. N. Ramachandran recognized that many combinations are forbidden because of steric collisions between atoms. The allowed values can be visualized on a two-dimensional plot called a Ramachandran diagram (Figure 3.28). Three-quarters of the possible (φ, ψ) combinations are excluded simply by local steric clashes. Steric exclusion, the fact that two atoms cannot be in the same place at the same time, can be a powerful organizing principle.

Figure 3.28

A Ramachandran Diagram Showing the Values of φ and ψ. Not all φ and ψ values are possible without collisions between atoms. The most favorable regions are shown in dark green; borderline regions are shown in light green. (more...)

The ability of biological polymers such as proteins to fold into well-defined structures is remarkable thermodynamically. Consider the equilibrium between an unfolded polymer that exists as a random coil (that is, as a mixture of many possible conformations) and the folded form that adopts a unique conformation. The favorable entropy associated with the large number of conformations in the unfolded form opposes folding and must be overcome by interactions favoring the folded form. Thus, highly flexible polymers with a large number of possible conformations do not fold into unique structures. The rigidity of the peptide unit and the restricted set of allowed φ and ψ angles limit the number of structures accessible to the unfolded form sufficiently to allow protein folding to occur.
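A back-of-the-envelope calculation shows why restricting the unfolded state matters. The Python sketch below uses a deliberately crude model that is an assumption of this example, not a result from the text: each residue independently samples a fixed number of conformations in the unfolded chain, the folded form has exactly one, and the conformational entropy is S = R ln W. The chain length, temperature, and conformations-per-residue values are illustrative only.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def entropic_cost_kj(n_residues, conformations_per_residue, temperature_k=298.0):
    """Entropic free-energy cost of folding, -T*dS, in kJ/mol (toy model).

    Assumes W = k**N conformations for the unfolded chain and a single
    folded conformation, so -T*dS = R * T * N * ln(k).
    """
    return R * temperature_k * n_residues * math.log(conformations_per_residue) / 1000


n = 100  # hypothetical chain length
unrestricted = entropic_cost_kj(n, 40)  # assumed count without steric exclusion
restricted = entropic_cost_kj(n, 10)    # one-quarter as many conformations remain

print(f"unrestricted chain: {unrestricted:.0f} kJ/mol")
print(f"restricted chain:   {restricted:.0f} kJ/mol")
print(f"saving:             {unrestricted - restricted:.0f} kJ/mol")
```

In this toy model, cutting the accessible conformations per residue to one-quarter lowers the entropic cost by R T ln 4 per residue, whatever starting count is assumed. The absolute numbers should not be taken literally; only the trend is meaningful.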
