Protein Synthesis: Understanding The Process and Its Importance

Published: Feb 7, 2024

Words: 593 | Page: 1 | 3 min read

Biology LibreTexts

6.4: Protein Synthesis

  • Suzanne Wakim & Mandeep Grewal
  • Butte College

The Central Dogma of Biology

Your DNA, or deoxyribonucleic acid, contains the genes that determine who you are. How can this organic molecule control your characteristics? DNA contains instructions for all the proteins your body makes. Proteins, in turn, determine the structure and function of all your cells. What determines a protein’s structure? It begins with the sequence of amino acids that make up the protein. Instructions for making proteins with the correct sequence of amino acids are encoded in DNA.

How proteins are made

DNA is found in chromosomes. In eukaryotic cells, chromosomes always remain in the nucleus, but proteins are made at ribosomes in the cytoplasm or on the rough endoplasmic reticulum (RER). How do the instructions in DNA get to the site of protein synthesis outside the nucleus? Another type of nucleic acid is responsible. This nucleic acid is RNA, or ribonucleic acid. RNA is a small molecule that can squeeze through pores in the nuclear membrane. It carries the information from DNA in the nucleus to a ribosome in the cytoplasm and then helps assemble the protein. In short:

DNA → RNA → Protein

Discovering this sequence of events was a major milestone in molecular biology. It is called the central dogma of biology. The two processes involved in the central dogma are transcription and translation.

Transcription translation; mRNA to protein

Transcription

Transcription is the first part of the central dogma of molecular biology: DNA → RNA. It is the transfer of genetic instructions in DNA to mRNA. Transcription happens in the nucleus of the cell. During transcription, a strand of mRNA is made that is complementary to a strand of DNA called a gene. A gene can easily be identified from the DNA sequence. A gene contains three basic regions: a promoter, a coding sequence (reading frame), and a terminator. A gene has further parts, which are illustrated in Figure 3.

gene regions

Steps of Transcription

Transcription takes place in three steps, called initiation, elongation, and termination. The steps are illustrated in Figure 4.

  • Initiation is the beginning of transcription. It occurs when the enzyme RNA polymerase binds to a region of a gene called the promoter. This signals the DNA to unwind so the enzyme can “read” the bases in one of the DNA strands. The enzyme is ready to make a strand of mRNA with a complementary sequence of bases. The promoter is not part of the resulting mRNA.
  • Elongation is the addition of nucleotides to the mRNA strand.
  • Termination is the ending of transcription. It occurs when RNA polymerase reaches a termination sequence in the gene; the completed mRNA strand then detaches from the DNA.
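The base-pairing rule behind elongation can be sketched in a few lines of Python. This is a minimal illustration, not a model of RNA polymerase: the function name `transcribe` and the example template sequence are assumptions made for this sketch, and the template strand is assumed to be written 3'→5' so that the resulting mRNA reads 5'→3'.

```python
# Pairing from each DNA template base to the RNA base placed opposite it.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a template strand written 3'->5'.

    Hypothetical helper for illustration; promoter and terminator handling
    are deliberately left out.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Made-up nine-base template, used only as an example input.
print(transcribe("TACTTCATT"))  # AUGAAGUAA
```

The dictionary lookup raises a `KeyError` on any character that is not a DNA base, which is a deliberate choice here so that malformed input is not silently transcribed.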

Transcription steps, initiation, elongation, and termination

Processing mRNA

In eukaryotes, the new mRNA is not yet ready for translation. At this stage, it is called pre-mRNA, and it must go through more processing before it leaves the nucleus as mature mRNA. The processing may include the addition of a 5' cap, splicing, editing, and the addition of a 3' polyadenylation (poly-A) tail. These processes modify the mRNA in various ways. Such modifications allow a single gene to be used to make more than one protein. See Figure 5 as you read below:

  • 5' cap protects mRNA in the cytoplasm and helps in the attachment of mRNA with the ribosome for translation.
  • Splicing removes introns from the protein-coding sequence of mRNA. Introns are regions that do not code for the protein. The remaining mRNA consists only of regions called exons that do code for the protein.
  • Editing changes some of the nucleotides in mRNA. For example, a human protein called APOB, which helps transport lipids in the blood, has two different forms because of editing. One form is smaller than the other because editing adds an earlier stop signal in mRNA.
  • Polyadenylation adds a “tail” to the mRNA. The tail consists of a string of As (adenine bases). It signals the end of mRNA. It is also involved in exporting mRNA from the nucleus, and it protects mRNA from enzymes that might break it down.
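The processing steps listed above can be mimicked on a plain string. The sketch below is a simplification under stated assumptions: intron positions are supplied by the caller as index pairs (a real cell recognises splice sites itself), the 5' cap is represented by the text marker `cap-`, editing is omitted, and the function name, the tail length and the example sequence are all hypothetical.

```python
def process_pre_mrna(pre_mrna, introns, poly_a_length=10):
    """Splice out introns, then add a 5' cap marker and a poly-A tail.

    introns: list of (start, end) index pairs, end exclusive, marking the
    regions to remove. Everything outside those regions is treated as exon.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    return "cap-" + spliced + "A" * poly_a_length


# Made-up pre-mRNA: exon "AUGGCC", a 12-base intron, then exon "AAGUAA".
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "AAGUAA"
print(process_pre_mrna(pre_mrna, [(6, 18)], poly_a_length=5))
# cap-AUGGCCAAGUAAAAAAA
```

Passing a different list of intron positions for the same pre-mRNA yields a different mature mRNA, which is one way to picture how a single gene can give rise to more than one protein.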

transcript splicing process

Translation

Translation is the second part of the central dogma of molecular biology: RNA → Protein. It is the process in which the genetic code in mRNA is read to make a protein. Translation is illustrated in Figure 6. After mRNA leaves the nucleus, it moves to a ribosome, which consists of rRNA and proteins. Translation happens on the ribosomes floating in the cytosol, or on the ribosomes attached to the rough endoplasmic reticulum. The ribosome reads the sequence of codons in mRNA, and molecules of tRNA bring amino acids to the ribosome in the correct sequence.

To understand the role of tRNA, you need to know more about its structure. Each tRNA molecule has an anticodon for the amino acid it carries. An anticodon is complementary to the codon for an amino acid. For example, the amino acid lysine has the codon AAG, so the anticodon is UUC. Therefore, lysine would be carried by a tRNA molecule with the anticodon UUC. Wherever the codon AAG appears in mRNA, a UUC anticodon of tRNA temporarily binds. While bound to mRNA, tRNA gives up its amino acid. With the help of rRNA, bonds form between the amino acids as they are brought one by one to the ribosome, creating a polypeptide chain. The chain of amino acids keeps growing until a stop codon is reached.
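The codon and anticodon pairing in the lysine example can be checked with a short Python sketch. The function name `anticodon` is an assumption for illustration, and the anticodon is written base-by-base opposite the codon, which is the convention the paragraph above uses.

```python
# RNA-to-RNA base pairing between an mRNA codon and a tRNA anticodon.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the anticodon, written base-by-base opposite the given codon."""
    return "".join(RNA_PAIR[base] for base in codon.upper())


print(anticodon("AAG"))  # UUC, the lysine example from the text
```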

Ribosomes, which are just made out of rRNA (ribosomal RNA) and protein, have been classified as ribozymes because the rRNA has enzymatic activity. The rRNA is important for the peptidyl transferase activity that bonds amino acids. Ribosomes have two subunits of rRNA and protein. The large subunit has three active sites called E, P, and A sites. These sites are important in the catalytic activity of ribosomes.

Just as with mRNA synthesis, protein synthesis can be divided into three phases: initiation, elongation, and termination. In addition to the mRNA template, many other molecules contribute to the process of translation, such as ribosomes, tRNAs, and various enzymatic factors.

Translation Initiation: The small subunit binds to a site upstream (on the 5' side) of the start of the mRNA. It proceeds to scan the mRNA in the 5'→3' direction until it encounters the START codon (AUG). The large subunit attaches and the initiator tRNA, which carries methionine (Met), binds to the P site on the ribosome.

Translation Elongation: The ribosome shifts one codon at a time, catalyzing each process that occurs in the three sites. With each step, a charged tRNA enters the complex, the polypeptide becomes one amino acid longer, and an uncharged tRNA departs. The energy for each bond between amino acids is derived from GTP, a molecule similar to ATP. Briefly, the ribosomes interact with other RNA molecules to make chains of amino acids called polypeptide chains, due to the peptide bond that forms between individual amino acids. Inside the ribosome, three sites participate in the translation process, the A, P, and E sites. Amazingly, the E. coli translation apparatus takes only 0.05 seconds to add each amino acid, meaning that a 200-amino acid polypeptide could be translated in just 10 seconds.
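The timing figure quoted above is simple arithmetic, shown here as a small Python sketch. The constant and function names are assumptions for illustration; the 0.05 seconds per amino acid is the E. coli value given in the text.

```python
# Time to add one amino acid, taken from the E. coli figure in the text.
SECONDS_PER_AMINO_ACID = 0.05


def translation_time(polypeptide_length: int) -> float:
    """Estimated elongation time in seconds for a polypeptide of the given length."""
    return polypeptide_length * SECONDS_PER_AMINO_ACID


# A 200-amino-acid polypeptide, as in the example above: about 10 seconds.
print(round(translation_time(200), 2))
```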

Translation Termination: Termination of translation occurs when a stop codon (UAA, UAG, or UGA) is encountered (see Figure 7). When the ribosome encounters the stop codon, the growing polypeptide is released with the help of various release factors, and the ribosome subunits dissociate and leave the mRNA. After many ribosomes have completed translation, the mRNA is degraded so the nucleotides can be reused in another transcription reaction.
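The three phases can be put together in a toy translator. This is a sketch under assumptions, not a model of the ribosome: it uses the standard genetic code with one-letter amino acid symbols (`*` marks a stop codon), it starts at the first AUG it finds while scanning from the 5' end, and the function name and example mRNA are made up for illustration.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') into one-letter amino acid codes."""
    start = mrna.find("AUG")          # initiation: scan for the START codon
    if start == -1:
        return ""                     # no START codon, nothing is made
    polypeptide = []
    for i in range(start, len(mrna) - 2, 3):   # elongation: one codon at a time
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":         # termination: UAA, UAG or UGA
            break
        polypeptide.append(amino_acid)
    return "".join(polypeptide)


print(translate("GGCAUGAAGUUUUAAGG"))  # MKF (Met-Lys-Phe)
```

Bases before the START codon and after the stop codon are ignored, which mirrors the untranslated regions at each end of an mRNA.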

Translation steps

What Happens Next?

After a polypeptide chain is synthesized, it may undergo additional processes. For example, it may assume a folded tertiary shape due to interactions among its amino acids. It may also bind with other polypeptides or with different types of molecules, such as lipids or carbohydrates. Many proteins travel to the Golgi apparatus within the cytoplasm to be modified for the specific job they will do.

Summary of Central Dogma

Transcription translation

  • Relate protein synthesis and its two major phases to the central dogma of molecular biology.
  • Identify the steps of transcription, and summarize what happens during each step.
  • Explain how mRNA is processed before it leaves the nucleus.
  • Describe what happens during the translation phase of protein synthesis.
  • What additional processes may a polypeptide chain undergo after it is synthesized?
  • Where does transcription take place in eukaryotes?
  • Where does translation take place?
  • Contains the codons
  • Contains the anticodons
  • Makes up the ribosome, along with proteins
  • What is the complementary sequence on the other DNA strand?
  • What is the complementary sequence in the mRNA? What is this sequence called?
  • What is the resulting sequence in the tRNA? What is this sequence called? What do you notice about this sequence compared to the original DNA triplet on the template strand?
  • Both A and B
  • True or False. Introns in mRNA bind to tRNA at the ribosome.
  • True or False. tRNAs can be thought of as the link between amino acids and codons in the mRNA.

Attributions

  • How proteins are made by Nicolle Rager, National Science Foundation, public domain via Wikimedia Commons
  • Gene structure eukaryote by Thomas Shafee, licensed CC BY 4.0 via Wikimedia Commons
  • Components of a gene by Mandeep Grewal, CC BY 4.0
  • Transcription by Calibuon, released into the public domain via Wikimedia Commons
  • Transcript and splicing by Ganeshmanohar, CC BY-SA 4.0 via Wikimedia Commons
  • Initiation and elongation by Jordan Nguyen, CC BY-SA 4.0 via Wikimedia Commons
  • Protein synthesis by OpenStax, CC BY 4.0
  • Gene regulation by OpenStax, CC BY 4.0
  • Text adapted from Human Biology by CK-12 licensed CC BY-NC 3.0

RNA and protein synthesis review

Transcription and translation.

  • Codons and mutations

Structure of RNA

  • RNA uses the sugar ribose instead of deoxyribose.
  • RNA is generally single-stranded instead of double-stranded.
  • RNA contains uracil in place of thymine.

Types of RNA

Central dogma of biology

Substitutions

  • Silent mutations do not affect the sequence of amino acids during translation.
  • Nonsense mutations result in a stop codon where an amino acid should be, causing translation to stop prematurely.
  • Missense mutations change the amino acid specified by a codon.
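The three kinds of substitution can be told apart by comparing what a codon encodes before and after the change. The sketch below assumes the standard genetic code and an original codon that encodes an amino acid (not a stop codon); the function name `classify_substitution` is hypothetical.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(original_codon: str, mutated_codon: str) -> str:
    """Classify a substitution as 'silent', 'nonsense' or 'missense'.

    Assumes original_codon encodes an amino acid rather than a stop signal.
    """
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutated_codon]
    if after == before:
        return "silent"      # same amino acid, protein unchanged
    if after == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # a different amino acid


print(classify_substitution("AAA", "AAG"))  # silent   (Lys -> Lys)
print(classify_substitution("AAA", "UAA"))  # nonsense (Lys -> stop)
print(classify_substitution("AAA", "GAA"))  # missense (Lys -> Glu)
```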

Insertions and deletions

Common mistakes and misconceptions.

  • Amino acids are not made during protein synthesis. Some students think that the purpose of protein synthesis is to create amino acids. However, amino acids are not being made during translation, they are being used as building blocks to make proteins.
  • Mutations do not always have drastic or negative effects. Often people hear the term "mutation" in the media and understand it to mean that a person will have a disease or disfigurement. Mutations are the source of genetic variety, so although some mutations are harmful, most are unnoticeable, and many are even good!
  • Insertions and deletions that are multiples of three nucleotides will not cause frameshift mutations. Rather, one or more amino acids will just be added to or deleted from the protein. Insertions and deletions that are not multiples of three nucleotides, however, can dramatically alter the amino acid sequence of the protein.
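The multiple-of-three rule in the last point can be seen by splitting a sequence into codons before and after a deletion. This is a minimal sketch; the helper names and the example sequence are assumptions made for illustration.

```python
def causes_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def codons(sequence: str) -> list:
    """Split a sequence into complete codons, dropping any leftover bases at the end."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


original = "AUGAAGUUUGGC"                      # made-up example sequence
one_deleted = original[:3] + original[4:]      # delete 1 base after the start codon
three_deleted = original[:3] + original[6:]    # delete one whole codon

print(codons(original))       # ['AUG', 'AAG', 'UUU', 'GGC']
print(codons(one_deleted))    # ['AUG', 'AGU', 'UUG']  every later codon changes
print(codons(three_deleted))  # ['AUG', 'UUU', 'GGC']  one codon is lost, the rest are intact
```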

Role of proteins in the body.

Proteins are molecules made of amino acids. They are coded for by our genes and form the basis of living tissues. They also play a central role in biological processes. For example, proteins catalyse reactions in our bodies, transport molecules such as oxygen, keep us healthy as part of the immune system and transmit messages from cell to cell.

Protein synthesis

A gene is a segment of a DNA molecule that contains the instructions needed to make a unique protein. All of our cells contain the same DNA molecules, but each cell uses a different combination of genes to build the particular proteins it needs to perform its specialised functions.

Protein synthesis has 2 main stages. The 1st stage is known as transcription, where a messenger molecule (mRNA) is formed. This molecule is transcribed from the DNA molecule and carries a copy of the information needed to make a protein. In the 2nd stage, the mRNA molecule leaves the nucleus for the cytoplasm where the cell’s ribosomes read the information and start to assemble a protein in a process called translation.

During translation, the ribosomes read the mRNA sequence of bases 3 at a time. These 3-letter combinations (called codons) each code for a particular amino acid. For example, the mRNA codon AAA, which is transcribed from the DNA template sequence TTT, codes for the amino acid lysine.

There are 4 bases (adenine, thymine, guanine and cytosine) and therefore 64 (4³) possible codons specified using some combination of 3 bases. However, only 20 amino acids are required to build all of the proteins in our bodies (some amino acids are specified by more than 1 codon). It is the particular sequence of amino acids that determines the shape and function of the protein.
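The codon count can be reproduced by enumeration. The sketch below assumes the standard genetic code and writes codons in the RNA alphabet (U in place of the thymine named above); the variable names are made up for illustration, and `*` marks a stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(all_codons, AMINO_ACIDS))

# How many codons specify each amino acid (stop codons are left out).
codons_per_amino_acid = Counter(aa for aa in codon_table.values() if aa != "*")

print(len(all_codons))                      # 64, i.e. 4 ** 3
print(len(codons_per_amino_acid))           # 20 amino acids
print(sum(codons_per_amino_acid.values()))  # 61 codons specify an amino acid
print(codons_per_amino_acid["L"])           # 6 codons for leucine
print(codons_per_amino_acid["M"])           # 1 codon for methionine
```

The remaining 3 of the 64 codons are stop signals, which is why 61 codons are shared among 20 amino acids.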

Protein synthesis, like many other biological processes, can be affected by environmental factors. These include maternal nutrition, temperature stress, oxygen levels and exposure to chemicals.

Different types of proteins

There are many different types of proteins in our bodies. They all serve important roles in our growth, development and everyday functioning. Here are some examples:

  • Enzymes are proteins that facilitate biochemical reactions, for example, pepsin is a digestive enzyme in your stomach that helps to break down proteins in food.
  • Antibodies are proteins produced by the immune system to help remove foreign substances and fight infections.
  • DNA-associated proteins regulate chromosome structure during cell division and/or play a role in regulating gene expression, for example, histones and cohesin proteins.
  • Contractile proteins are involved in muscle contraction and movement, for example, actin and myosin.
  • Structural proteins provide support in our bodies, for example, the proteins in our connective tissues, such as collagen and elastin.
  • Hormone proteins co-ordinate bodily functions, for example, insulin controls our blood sugar concentration by regulating the uptake of glucose into cells.
  • Transport proteins move molecules around our bodies, for example, haemoglobin transports oxygen through the blood.

Alternative roles for proteins

Each protein has a specific role in our body. However, scientists have discovered that some proteins perform more than 1 role.

For example, Dr Julia Horsfield leads the Chromosome Structure and Development Group at the University of Otago. Her lab investigates how cohesin proteins, which regulate chromosome structure during cell division, are also involved in making sure that genes are switched on or off at the correct times during development. Julia and her colleagues focus on the impact of a reduction in cohesin proteins on gene expression in zebrafish and use these results to better understand particular human diseases.

Useful link

Visit the Learn Genetics website to go on animated tours covering DNA, genes, chromosomes, proteins, heredity and traits.

Uncovering protein structure

Elliott J Stollar, David P Smith; Uncovering protein structure. Essays Biochem 8 October 2020; 64 (4): 649–680. doi: https://doi.org/10.1042/EBC20190042

Structural biology is the study of the molecular arrangement and dynamics of biological macromolecules, particularly proteins. The resulting structures are then used to help explain how proteins function. This article gives the reader an insight into protein structure and the underlying chemistry and physics that are used to uncover protein structure. We start with the chemistry of amino acids and how they interact within and between proteins. We also explore the four levels of protein structure and how proteins fold into discrete domains. We consider the thermodynamics of protein folding and why proteins misfold. We look at protein dynamics and how proteins can take on a range of conformations and states. In the second part of this review, we describe the variety of methods biochemists use to uncover the structure and properties of proteins that were described in the first part. Protein structural biology is a relatively new and exciting field that promises to provide atomic-level detail to more and more of the molecules that are fundamental to life processes.

Proteins are one of the most important classes of molecules for life and underpin the field of biochemistry. To fully understand their role, it is essential to explore both their structure and function and this review focuses on how we uncover protein structure. To understand structure, we explore the chemical nature of amino acids which are the building blocks of proteins. We consider how interactions between amino acids help proteins fold and fluctuate as they adopt a variety of structures. Furthermore, to understand how we experimentally study protein structure, we explore fundamental concepts in physics and associated computational methods. This topic is truly interdisciplinary and in addition to biochemistry, spans the fields of biophysics, structural biology and computational biology.

We start by describing the four levels of protein structure and how a variety of protein domains and architectures exist. Proteins are biological molecules produced in living cells, and we must also consider how a long chain of amino acids that are produced from the ribosome can transition to a folded structure that is central to the protein’s function. As such, we consider protein folding thermodynamics and also what happens when proteins misfold inside a cell. We also explore other universal properties of proteins that include their ability to change their shape known as conformational change. In particular, although proteins usually exist in one dominant conformation, we discuss how proteins actually exist in a population (ensemble) of rapidly interconverting conformations that allow them to be flexible and adapt their shapes required for function. We then discuss in detail the primary techniques used to study protein structure and dynamics that have provided these insights. Given the interdisciplinary nature of this topic, along the way, we have provided some stand-alone boxes to give more details about the fundamental science behind these concepts.

Proteins are one of the four major classes of molecules that direct life, alongside nucleic acids (deoxyribonucleic acid (DNA) and RNA), lipids (fats) and polysaccharides (sugars). All of these large ‘macromolecules’ are carbon-based covalent compounds that use weak reversible non-covalent interactions to fold and interact with their targets, giving the molecules and their complexes distinct shapes and dynamics. Proteins are polymers of typically hundreds of amino acids joined together by peptide bonds, whereas shorter polypeptides (less than 30 amino acids) are typically referred to as peptides. Each amino acid has a common structure containing a central α carbon atom (Cα) that is joined to an amino group (–NH2) and a carboxylic acid group (–COOH), both of which are used to form peptide bonds. What is most interesting is that, for 19 of the 20 different amino acids, the Cα atom is also bonded to a different R group, giving every amino acid its unique ‘side chain’. The side chain gives the amino acid distinctive structural and chemical properties, as side chains differ in size, shape, polarity, charge and hydrophobicity (Figure 1). Amino acids are also chiral and can be configured in two possible mirror images (stereoisomers), as the Cα atom is bonded to four unique groups that form a chiral centre. As mirror images, stereoisomers cannot be superimposed, in the same way that your hands are mirror images and cannot be rotated to match. The two stereoisomers for each of the 19 chiral amino acids are denoted D and L; however, only the L-stereoisomer is used in nature to construct proteins (glycine has hydrogen for a side chain and is not chiral).

L-Amino acids

(A) All 20 amino acids have a common structure with distinct chemical and physical properties that are determined by their R groups (side chains). Each has its own name (e.g. alanine), three-letter abbreviation (Ala) and one-letter code. They are grouped according to their size, charge, polarity, and, in certain cases, by special features they impart to the polypeptide backbone. Amino acids are shown as residues in short polypeptide chains, with the N- and C-termini indicated at the ends. Carbon atoms do not show the letter C and are represented at bond junctions; hydrogens attached to carbons are also not shown (this representation is commonly used in organic chemistry). The polypeptide backbone is shown in black and the side chains are coloured. (B) Nonpolar residues typically have side chains that lack polar bonds and have non-polar bonds instead (i.e. they have many C–H bonds). The non-polar amino acids are hydrophobic, as they tend to cluster together to get away from water. (C) Polar amino acids are hydrophilic, meaning that their side chains interact strongly with water and each other. (D) Aromatic residues are unique in that they contain rings with alternating double bonds (tryptophan and tyrosine cannot be easily categorised as hydrophobic or hydrophilic; each has a large side chain with polar and non-polar features). (E) Charged residues are fully ionised at pH 7 and exist predominantly in their deprotonated, negatively charged form or protonated, positively charged form. In addition to side chains, the N- and C-termini of the polypeptide chain are ionised at physiological pH. (F) Glycine and proline are shown as amino acids and are classed as special cases. Glycine has a hydrogen for a side chain and allows polypeptides to be flexible. Proline can only exist in two conformations because its side chain is directly bonded to its amino group, which constrains the backbone into a narrower range of shapes.

Once amino acids are linked together to form a polypeptide chain, the sidechains and backbone groups interact with each other through many weak interactions, including van der Waals interactions, hydrogen bonds and electrostatic interactions, as well as the hydrophobic effect, to bring about a protein’s shape and target interactions (Figure 2). For example, the side chain of lysine has a long hydrocarbon chain which is non-polar, yet the end of the chain is positively charged, allowing it to interact with other molecules using any of the weak interactions described above. Glutamic acid, on the other hand, is a similar size but carries a negative charge and has fewer potential ways of interacting. In addition to the sidechain interactions, the peptide bond carries a dipole due to the electronegative properties of the bound oxygen, allowing it to form hydrogen bonds with backbone and mainchain groups. Since there are 20 different amino acids, and they can be arranged in any order, there are a vast number of possible linear combinations, and organisms have evolved tens of thousands of different proteins and peptides. Proteins do not usually exist as extended chains; through sidechain and backbone interactions they fold in on themselves, leading to a unique shape. Each shape has a way of moving and interacting with other molecules, bringing about its function. The range of shapes means proteins are extremely versatile, sometimes acting as enzymes to catalyse chemical reactions, sometimes as a type of messenger that binds to a specific partner to relay a message and other times acting as a structural scaffold within the cell (Figure 3). In contrast, DNA usually adopts the classic double-helical structure regardless of its sequence, which suits its function to store genetic information.
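To put a number on the "vast number of possible linear combinations", note that a chain of n residues drawn from 20 amino acids can be ordered in 20ⁿ ways. The short Python sketch below works this out; the function name is an assumption for illustration, and the chain lengths are arbitrary examples.

```python
NUMBER_OF_AMINO_ACIDS = 20


def possible_sequences(chain_length: int) -> int:
    """Number of distinct linear sequences for a chain of the given length."""
    return NUMBER_OF_AMINO_ACIDS ** chain_length


print(possible_sequences(2))    # 400 possible dipeptides
print(possible_sequences(3))    # 8000 possible tripeptides

# For a modest 100-residue chain the count has 131 decimal digits.
print(len(str(possible_sequences(100))))  # 131
```

Python integers have arbitrary precision, so the 100-residue figure is computed exactly rather than approximated.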

Intermolecular interactions

Interactions between amino acid side chains help to stabilise the folded structures of proteins and allow proteins to interact with each other. These interactions can include (A) van der Waals interactions when molecules with complementary shapes approach each other. These molecules can be uncharged and only contain non-polar bonds, yet at close contact an instantaneous dipole can be induced in these non-polar bonds, allowing weak electrostatic interactions between oppositely (partially) charged groups. Although an individual van der Waals force is weak, many such interactions across non-polar surfaces can allow two proteins to interact with each other. Non-polar groups can also be attracted to each other through the hydrophobic effect, which will be considered when discussing protein folding. (B) Hydrogen bonding occurs when two interacting molecules each contain dipoles (i.e. they contain polar covalent bonds), where the electrostatic attraction occurs between a partially negative N or O atom (with a lone pair of electrons) and a partially positive hydrogen atom that is covalently bound to a different N or O atom. Unlike van der Waals interactions, these bonds depend not just on the magnitude of the partial charges and the distance between them but also on the orientation of the groups involved. When the hydrogen is in line with the covalently attached N or O and the interacting N or O (i.e. all three atoms and the lone pair of electrons appear on a line), the strength is maximal. As such, proteins must fold and interact with other proteins using very precise geometries that satisfy this directional dependence in order to form hydrogen bonds that are strong and significant. (C) Ionic interactions (salt bridges) are attractive interactions between oppositely charged ions. Since ions carry more charge than the dipoles discussed above, they form the strongest intermolecular interaction involving charge. (D) Disulphide bonds are sulphur–sulphur covalent bonds formed by the oxidation of two cysteine residues, and can form within a single protein chain or between two separate chains. Because these bonds are covalent, they are the strongest of the interactions described here; however, they can be broken if a protein is exposed to a reducing environment and becomes reduced.

Proteins have diverse structures and functions

Proteins are the workhorses for all living organisms and as such have an enormous range of functions that are facilitated by a range of different structures and associated dynamics. Note that the proteins shown here are not to scale and are coloured by polypeptide chain. Some proteins function to provide a structural scaffold, such as the 180 copies of envelope proteins that make up the Zika virus outer shell, which contains the RNA necessary to infect (pdb code: 5ire). Three conformations of the envelope protein are coloured differently to reveal the incredible symmetry that generates an icosahedral (20-faced) shell. Multiple copies of the monosaccharide N-acetyl glucosamine are also shown (cyan). The outer shell of the virus is shown by representing the atoms in the proteins as spheres, generating a surface or space-filling representation. Some proteins function as enzymes, which catalyse chemical reactions by reducing the activation barrier that must be crossed when substrates convert into products, such as hexokinase, which catalyses the first step in glycolysis (pdb code: 2yhx). This protein has a large upper (sub)domain and a smaller lower (sub)domain, which create the active site between them where catalysis occurs. When glucose binds to the active site, the domains clamp down and the mouth of the active site closes, which facilitates conversion into glucose-6-phosphate using ATP. The protein is shown with a transparent surface and only the polypeptide backbone is shown inside as a cartoon representation, with thin loops connecting α-helices as spiralled tubes and β-strands as thick arrows, where the end of the arrow indicates the C-terminus. Many proteins function by binding to another protein, membrane or small molecule, to allow transport of molecules and to signal within and between cells in response to outside stimuli. 
For example, antibodies in the blood bind to foreign antigens (usually proteins from a foreign microorganism or virus) and elicit an immune response, which requires precise protein interactions to avoid interactions with self-proteins (pdb code: 1igt). Typically, these β-sheet rich antibodies are made up of four polypeptide chains (two long heavy chains in yellow and cyan and two shorter light chains in pink and green) that together form a stem with two flexible arms that connect to two binding sites where antigen binding occurs. The binding sites are unique for every antibody and the flexibility and dynamics of these sites allow every antibody to recognise a unique foreign molecule and attack it from multiple angles. Other examples of protein interactions include the DNA binding domain from the transcription factor Oct1 binding to DNA (pdb code: 1oct). This interaction needs to be very specific in order to only bind to the correct DNA promoter sequence so that only specific genes are turned on. The sugar–phosphate backbone of DNA is represented as a cartoon and the four DNA bases are coloured differently to highlight the unique sequence recognised by Oct1. Hormones are an important class of molecules that also rely on precise protein–protein interactions. For example, the α-helical protein insulin is a small hormone that is made up of two chains (green and cyan) held together by disulphide bonds (pdb code: 4ins). Insulin is essential for maintaining blood glucose levels by binding to the insulin receptor found on the outside of cells in many tissues, such as liver, muscle and heart. Insulin binding promotes the uptake of glucose from the blood after a meal and controls many different metabolic processes by changing the activity of enzymes and transporter proteins. Finally, proteins interact specifically with small molecules to transport them across membranes or to other locations in our bodies. 
For example, deoxy haemoglobin is a heterotetrameric protein made up of two α subunit chains (green) and two β subunit chains (blue) that transports oxygen (pdb code: 2hhb). Each chain folds into an α-helical domain that includes a ring-like haem group (pink) containing an iron atom. Oxygen binds reversibly to these iron atoms and allows this crucial gas to be transported from the lungs in the blood to other tissues in the body. Abbreviations: ATP, adenosine triphosphate; pdb, Protein Data Bank; RNA, ribonucleic acid.

Since almost every function crucial to life is mediated by proteins, changes to their structure due to damage, mutation or modification can often explain the cause of a disease at the molecular level. The classic example is that of haemoglobin. When an individual inherits a variant haemoglobin gene in which the glutamic acid (R group is charged) at residue position six is changed to valine (R group is hydrophobic), this leads to sickle cell disease. This one amino acid difference changes the surface of haemoglobin by removing the negative charge and forming a hydrophobic ‘sticky’ patch (in the absence of oxygen), causing deoxyhaemoglobin to clump together. Since this protein is present at high concentrations in red blood cells, the clumping converts the cells from a standard disc into a sickled shape, which reduces cell lifetimes, leading to anaemia, and can result in the blockage of capillaries, leading to tissue damage.
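The single-residue change described above can be sketched in a few lines. This is an illustrative example only: the eight-residue fragment is assumed to be the start of the mature human β-globin chain, and the residue classes are a deliberately simplified grouping chosen for this sketch rather than a complete classification.

```python
# Simplified side-chain classes (an assumption for illustration only).
NEGATIVE = set("DE")        # aspartic acid, glutamic acid
POSITIVE = set("KR")        # lysine, arginine
NON_POLAR = set("AVLIMFP")  # a subset of the hydrophobic residues


def side_chain_class(residue: str) -> str:
    """Return a coarse label for a one-letter amino acid code."""
    if residue in NEGATIVE:
        return "negatively charged"
    if residue in POSITIVE:
        return "positively charged"
    if residue in NON_POLAR:
        return "non-polar"
    return "other"


def mutate(sequence: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position, counted from the N-terminus."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position is outside the sequence")
    return sequence[:index] + new_residue + sequence[index + 1:]


# Assumed N-terminal fragment of the mature beta-globin chain (illustrative).
NORMAL_FRAGMENT = "VHLTPEEK"
sickle_fragment = mutate(NORMAL_FRAGMENT, 6, "V")

if __name__ == "__main__":
    before = side_chain_class(NORMAL_FRAGMENT[5])
    after = side_chain_class(sickle_fragment[5])
    print(f"{NORMAL_FRAGMENT} -> {sickle_fragment}")
    print(f"Residue 6: {before} -> {after}")
```

Counting from the N-terminus, as the text does, the only difference between the two fragments is the loss of a negative charge and the gain of a non-polar side chain at position six.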

To understand protein structures in more detail, we next explore the four levels that determine their shape.

Protein structure

Protein structure is described at four different levels. The arrangement of amino acids in a polypeptide chain is referred to as its primary structure. Each amino acid in a polypeptide chain is referred to as a residue and the linked series of carbon, nitrogen and oxygen atoms is known as the main chain or protein backbone. The end of the chain with the free amino group is known as the N-terminus, and the end with the carboxylic acid group is the C-terminus. When we count or write the residues in a polypeptide chain, we start with the N-terminus. The locations of disulphide bonds that covalently link different parts of the polypeptide chain together are also considered part of the primary structure. These bonds are formed between two cysteine residues via their side chain thiol groups (–SH) and they significantly stabilise protein structures.

Protein secondary structure refers to the way the primary structure of a protein arranges itself as a result of regular hydrogen bonds forming between the backbone C=O and NH groups of each peptide bond. However, the peptide bond itself cannot rotate as it has double-bond character due to resonance stabilisation (Figure 4), where the nitrogen donates its lone pair of electrons to the carbonyl carbon, pushing electrons towards the oxygen. This results in the electrons being delocalised over multiple atoms, which increases bond stability and decreases rotation (Figure 5A). Therefore, rotation can only occur about the bond between the NH group and the Cα (the phi (φ) angle) and the bond between the Cα and the C=O group (the psi (ψ) angle). In effect, the polypeptide backbone is composed of a repeating series of two rotatable bonds followed by one non-rotatable (peptide) bond. However, not all 360° of the phi and psi angles are possible, as neighbouring sidechains can clash due to steric hindrance. In effect, for certain angles and amino acid combinations, atoms would have to occupy the same physical space, which is impossible, and this partly explains why some amino acids have a higher propensity (likelihood) to form different types of secondary structure. Within these restraints, the two principal local conformations that avoid steric hindrance and maximise backbone–backbone hydrogen bonding are the α-helix and the β-sheet secondary structures (Figure 5).
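Phi and psi are torsion (dihedral) angles, each defined by four consecutive backbone atoms: phi by C(i−1), N, Cα and C, and psi by N, Cα, C and N(i+1). The sketch below shows one common way to compute such an angle from three-dimensional coordinates using only the standard library. The function names and the test coordinates are assumptions made for illustration; real coordinates would come from a structure file.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return tuple(x * s for x in a)


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond (-180 to 180)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the bond
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Rotation about the N-Calpha bond."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Rotation about the Calpha-C bond."""
    return dihedral(n, ca, c, n_next)


if __name__ == "__main__":
    # Hypothetical coordinates: the outer atoms lie on opposite sides (trans).
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Plotting phi against psi for every residue of a protein gives the familiar picture of which angle combinations avoid steric clashes.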

Resonance stabilisation causes the peptide bond to have double-bond character and carry a dipole

Brackets: The double-headed arrow signifies that the peptide bond is a hybrid of two states. With resonance, the nitrogen is able to donate its unhybridised lone pair of electrons to the carbonyl carbon and push electrons from the carbonyl double bond towards the oxygen, forming the oxygen anion. Right hand image: The resonance structure of the peptide bond is shown in purple. The nitrogen has a tendency to share its lone pair of electrons with the carbonyl carbon, delocalising electrons among the nitrogen, carbon and oxygen atoms. Also shown is the individual dipole moment (arrow) associated with the bond. The dashed line indicates the resonance of the peptide bond and the additional stability results in a non-rotatable peptide bond.

Protein secondary structural elements

(A) Diagram of a generic polypeptide chain. Residue side chains are denoted as R. Coloured rectangles indicate sets of six atoms that are coplanar due to the double-bond character of the peptide bond. Arrows indicate the bonds that are free to rotate, with the angle of rotation about the N–Cα bond known as phi and about the Cα–C bond known as psi. Note that only peptide backbone bonds are labelled; in most cases the R group bond is also free to rotate. (B) Line drawing of the chemical structure of the polypeptide backbone of three β-strands within a β-sheet. Hydrogen bonds between the main chain –CO and –NH groups are shown as dotted lines. Parallel sheets contain β-strands that run in the same direction, whereas antiparallel sheets contain β-strands that run in the opposite direction to their neighbours. (C) Cartoon representation (also known as a ribbon diagram) of an antiparallel β-sheet region from a larger protein. In this example, three β-strands are connected by short loops. Arrows representing β-strands point towards the C-terminus by convention. The hydrogen bonds holding the sheets together are shown as dotted lines. (D) Side view of the same β-sheet showing the individual residue sidechains. The atoms are coloured with carbon in pink, sulphur in yellow, oxygen in red and nitrogen in blue. Note that the residues on the non-polar side are mainly non-polar, carbon-containing residues, whereas the residues on the polar side have oxygen and nitrogen atoms and are a mixture of ionic and polar sidechains. Each strand has a slight twist that can be seen in the image. (E) Stick representation of an α-helix with the sequence NH2–SGEFARICRDLSHIG–COOH. Hydrogen bonds between backbone atoms are indicated with dashed lines. The atoms are coloured with carbon in light blue, sulphur in yellow, oxygen in red and nitrogen in blue. Note that the peptide bonds in an α-helix all point in the same direction and each is hydrogen-bonded to a residue four places along the chain. (F) Cartoon representation of the same α-helix as seen in larger protein structures. (G) Rotated view of the α-helix; side chains radiate outwards, away from the centre of the helix.

The α-helix is a right-handed coil in which each backbone NH group hydrogen-bonds to the backbone C=O group of the amino acid located four residues earlier along the protein sequence. This results in a polypeptide chain that twists in a regular coil shape with the R groups pointing outwards, away from the peptide backbone. It takes approximately 3.6 residues to complete a full turn of a helix.
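The two rules in this paragraph (an NH to C=O hydrogen bond reaching back four residues, and 3.6 residues per turn) can be turned into a short sketch. It treats the helix as ideal and ignores end effects, which is an assumption; the example sequence is the one used for the α-helix in the figure caption, and the function names are illustrative.

```python
RESIDUES_PER_TURN = 3.6
# One full turn is 360 degrees, so each residue advances by about 100 degrees.
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN


def helix_hbond_pairs(sequence: str):
    """List (donor, acceptor) residue numbers, 1-based from the N-terminus.

    The NH of residue i donates to the C=O of residue i - 4, so the first
    four residues have no acceptor within the helix.
    """
    return [(i, i - 4) for i in range(5, len(sequence) + 1)]


def helical_wheel(sequence: str):
    """Return (residue, angle in degrees) as seen looking down the helix axis."""
    return [(residue, (index * DEGREES_PER_RESIDUE) % 360.0)
            for index, residue in enumerate(sequence)]


if __name__ == "__main__":
    helix = "SGEFARICRDLSHIG"
    for donor, acceptor in helix_hbond_pairs(helix):
        print(f"NH of {helix[donor - 1]}{donor} -> C=O of {helix[acceptor - 1]}{acceptor}")
    for residue, angle in helical_wheel(helix):
        print(f"{residue}: {angle:.0f} degrees")
```

The wheel angles show why side chains radiate outwards in all directions around the helix rather than lining up along one face.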

β-sheets are composed of two or more extended polypeptide chains called β-strands that run alongside each other. The strands can run in the same direction, forming a parallel β-sheet, or in opposite directions, forming an antiparallel β-sheet. The residues arrange themselves in a regular zigzag manner with adjacent peptide bonds pointing in opposite directions. In this arrangement, the NH group and the C=O group of each amino acid are hydrogen-bonded to the C=O group and NH group, respectively, on the adjacent strands. Sidechains from each of the residues point away from the sheet and alternate in opposite directions between residues. It is common to see a pattern of alternating hydrophilic and hydrophobic residues in the primary structure, giving the β-sheet hydrophilic and hydrophobic faces.
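Because side chains alternate between the two sides of a strand, residues at odd and even positions end up on opposite faces. The sketch below splits a strand accordingly and reports how hydrophobic each face is. The strand sequence is hypothetical and the hydrophobic set is a simplified assumption made for this example.

```python
# Simplified set of hydrophobic residues (an assumption for illustration).
HYDROPHOBIC = set("AVLIMFWC")


def strand_faces(strand: str):
    """Split a strand into its two faces by taking alternate residues."""
    return strand[0::2], strand[1::2]


def hydrophobic_fraction(residues: str) -> float:
    """Fraction of residues in the string that belong to HYDROPHOBIC."""
    if not residues:
        return 0.0
    return sum(residue in HYDROPHOBIC for residue in residues) / len(residues)


if __name__ == "__main__":
    # Hypothetical strand with alternating hydrophobic and hydrophilic residues.
    strand = "VTVKVSIE"
    face_one, face_two = strand_faces(strand)
    print(face_one, hydrophobic_fraction(face_one))
    print(face_two, hydrophobic_fraction(face_two))
```

A strand with this alternating pattern places all of its hydrophobic residues on one face, which is how a sheet can present one side to water and the other to the protein interior.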

The overall three-dimensional (3D) appearance of a protein is known as its tertiary structure and is brought about by the interactions between the side chains (R groups) and the way in which the secondary structure packs together to fold the protein. Quaternary structure refers to how multiple folded protein chains (called subunits) interact and arrange to form a larger multisubunit protein complex. Examples of the tertiary and quaternary structures were seen in some of the first proteins that had their structures solved using X-ray crystallography, as seen in Figure 3 . Protein structures are often viewed as models in which the β-strands are represented as arrows and the α-helix as a ribbon or tube. For example, haemoglobin is an α-helical protein with a quaternary structure comprising four subunits, known as a tetramer. The structure can be seen in Figure 3 as a cartoon covered by a transparent molecular surface of the protein. As more proteins were solved, it became clear that there were many different protein shapes and folds, and they appeared to be organised into distinct units called protein domains. Currently, there are approximately 165000 protein structures, and their tertiary and quaternary structures are classified into groups according to two major classification systems called CATH and SCOP. We will focus on the concept of protein domains in the next section.

Protein domains

A protein’s shape comes from the arrangement of secondary structure elements such as α-helices and β-sheets into recognisable conformations called motifs (or super secondary structure). Motifs are short segments of a protein’s structure, and the same arrangement can be found in many different proteins. For example, the β-turn links β-strands together and consists of four consecutive residues which allow the polypeptide chain to fold back on itself by nearly 180 degrees. The β-α-β motif consists of two parallel β-strands that are connected by an α-helix that crosses the two strands. Secondary structure elements and motifs are arranged in individual proteins into compact independent 3D structures called domains. Unlike motifs, domains fold independently of the rest of the full-length polypeptide chain. Larger proteins are often formed of multiple different domains linked together, with each domain having a structural or functional role (Figure 6). The arrangement of secondary structure elements that describes a protein domain’s shape is called its fold. For example, a Rossmann fold has two β-α-β motifs with a shared middle β-strand forming the domain. This particular domain is found in many larger proteins, giving them the ability to bind nucleotides. Remarkably, there are only ∼2200 recognisable protein folds despite the vast number of amino acid combinations possible.

Motifs, Domains and Full-length proteins

(A) Secondary structure often packs into motifs. These motifs are stable, easily folded arrangements but cannot exist independently. (B) A protein domain is a conserved part of a given full-length protein sequence with a defined tertiary structure that can evolve, function and exist independently of the rest of the protein chain. Each domain forms a compact 3D structure that is often independently stable and folded, usually with a distinct function. (C) Large proteins are usually made up of several independently folded domains. The protein is represented by a straight line from the N- to the C-terminus, with any protein domains it contains represented as boxes. The amino acid sequence highlighted at the C-terminus is predicted to be disordered because of its low complexity, consisting of just proline (P) and alanine (A).

Globular (roughly spherical) and soluble proteins (for example, enzymes found in the cytoplasm)

Membrane proteins within the cell or organelle membrane (for example, receptors)

Fibrous proteins (characterised by the presence of repetitive sequence motifs, for example, collagen)

Intrinsically disordered proteins (described later in the text)

There are many other ways to classify protein domains, and two of the most commonly used systems are the CATH and Structural Classification of Proteins (SCOP) systems (Table 1). Both are hierarchical domain classification systems in which proteins are organised into different levels based on structural and sequence similarities, and each has a website that you can explore.

Over the course of evolution, multicellular organisms have generated new large proteins by mixing and matching existing domains into new combinations. Since each domain has a particular function (such as binding, catalysis or gene activation), these new proteins have a unique combination of properties depending on the domains they contain. Proteins sharing more than a few common domains are usually encoded by evolutionarily related genes. They therefore make up gene families that have a common ancestor, and equivalent domains within the family have high sequence conservation. These domains are called orthologues and the proteins they reside in usually play a similar role in all species. Genes for proteins that share only one or a few domains may belong to a gene superfamily. Superfamily members can have one function in common, but their sequences are otherwise unrelated. Similar domains found in different full-length proteins in the same organism are called paralogues. Often, they diverged from a common ancestor a long time ago and these domains usually only have the most essential structural and functional properties conserved. Each domain of a large multidomain protein needs to fold from the initial linear polypeptide chain, and this process is considered next.

Protein folding

A protein domain in its functional and/or assembled form is referred to as being in its native state. This state results from the amino acid side chains present on the polypeptide chain making favourable interactions with each other and stabilising the protein. However, when a protein domain is first translated by a ribosome from mRNA, it exists as a linear chain of amino acids which lacks structure and is referred to as being unfolded or ‘denatured’. In this state, these interactions are yet to form. If the unfolded protein domain were to randomly search through all possible conformations it could make by testing out all the possible combinations of interactions, the process of finding the native state would take longer than the age of the universe! However, most protein domains can fold spontaneously into their native state on the order of 10⁻⁶ to 10⁻¹ s. The process by which the unfolded protein domain gains its compact 3D native state is known as protein folding and is studied using thermodynamics and kinetics [for a reminder of the fundamentals of thermodynamics and kinetics, please refer to the Essential Chemistry for Biochemists article in this series and the Thermodynamics Box (Box 1)].
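The claim that a random search would outlast the universe can be checked with a back-of-the-envelope estimate. Every number below is an assumption chosen for illustration and none comes from the text: a 100-residue chain, three conformations per residue, 10¹³ conformations sampled per second, and an age of the universe of about 13.8 billion years.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
# Assumed age of the universe, about 13.8 billion years, in seconds.
AGE_OF_UNIVERSE_S = 13.8e9 * SECONDS_PER_YEAR


def random_search_time_s(n_residues: int,
                         conformations_per_residue: int = 3,
                         sampling_rate_per_s: float = 1e13) -> float:
    """Time in seconds to visit every conformation once, one after another."""
    total_conformations = conformations_per_residue ** n_residues
    return total_conformations / sampling_rate_per_s


if __name__ == "__main__":
    search_time = random_search_time_s(100)
    print(f"Exhaustive search: {search_time:.1e} s")
    print(f"Age of universe:   {AGE_OF_UNIVERSE_S:.1e} s")
    print(f"Ratio:             {search_time / AGE_OF_UNIVERSE_S:.1e}")
```

Under these assumptions the exhaustive search is many orders of magnitude slower than the observed folding times, so folding cannot proceed by random trial of every conformation.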

To understand why protein folding occurs, we must consider the field of thermodynamics that aims to understand whether any chemical reaction will occur. In other words, is it favourable for a reaction to convert its reactants into products? This will involve recapping some of the basics covered in the Essential Chemistry for Biochemists review in this series. To begin with, it is important to define a system as the reaction we are interested in (i.e. the protein folding reaction to include the unfolded and folded proteins and any solvent or solute molecules that interact with these proteins) and the surroundings as everything else in the universe that is outside the system. As biochemists, to make a prediction about a system reaction, we are interested in three system quantities called enthalpy, H; entropy, S and Gibbs free energy, G and the first and second laws of thermodynamics will help us appreciate where these quantities come from.

The first law of thermodynamics states that the total amount of energy in the universe is constant: energy can neither be created nor destroyed, but it can be transformed from one form to another. From this law, we can start to keep track of energy. For example, if heat energy is lost from a reaction as products are made, the energy of the system goes down and the energy of the surroundings goes up by the same amount, because that heat is simply transferred. The change in heat content for a reaction is defined as ΔH and depends on the bonds that have been broken and the new bonds that have been formed during the reaction. Whenever a bond is formed, heat energy is released to the surroundings, and whenever a bond is broken, heat energy is taken up by the system from the surroundings. Therefore, to calculate ΔH, one must consider the sum of the broken bond energies and the sum of the formed bond energies. If more energy is released from bond formation than is taken up in breaking bonds, energy is released overall and ΔH is negative (the reaction is exothermic). Conversely, if less energy is released from bond formation than is taken up in breaking bonds, energy is absorbed overall and ΔH is positive (the reaction is endothermic). For protein folding, the interactions involved are usually the weak non-covalent interactions we discussed earlier: the hydrophobic effect, hydrogen bonds, van der Waals and other electrostatic interactions. When a protein folds, more energy is most often released from forming these interactions than is taken up in breaking any pre-existing interactions present in the unfolded state (i.e. ΔH is negative). However, proteins can sometimes fold even when ΔH is positive. To appreciate why this is the case, we must also take into account the second law of thermodynamics.

  • The second law of thermodynamics states that the entropy of the universe always increases. In other words, for protein folding to be favourable, the entropy of the universe must increase as a result of this process. Entropy is often described as disorder, which is a familiar term to most of us in a physical sense. For example, as we have seen in the main text, water molecules that surround and interact with an unfolded protein are quite ordered and constrained; it is only when the protein folds and expels these water molecules that they can leave the protein surface, move around more and so increase their disorder. A better way to think of entropy is in terms of the number of ways energy can be distributed in a system. For example, if an object is hot, it has lots of thermal energy concentrated in one place (in the object). However, if you place that object in some cold water, heat always transfers to the water and heats it up as the thermal energy is dispersed and spread away from the object into the water. This happens because energy dispersal increases the number of ways that energy can be distributed. In fact, whenever there is greater movement of bonds or atoms in molecules, there are more ways to distribute energy. In an exothermic reaction, energy is released to the surroundings, which increases the entropy of the universe because the energy has now been dispersed. Therefore, the entropy of the universe can increase in two ways: through an increase in the entropy of the system (ΔS > 0) or through dispersal of energy from the system to the surroundings (ΔH < 0). The quantity of Gibbs free energy is used to keep track of the entropy change of the universe (eqn 1):

    $$\Delta S_{\text{universe}} = -\frac{\Delta G}{T} \qquad (1)$$

    where $\Delta S_{\text{universe}}$ is the change in entropy of the universe; ΔG is the change in Gibbs free energy as products are made (i.e. unfolded to folded); and T is the temperature (in Kelvin).
  • When ΔG is negative, $\Delta S_{\text{universe}}$ is positive and the reaction will occur, and vice versa. It is hard to keep track of entropy and enthalpy changes in the whole universe. Fortunately, we can simply focus on the entropy and enthalpy change of the protein folding reaction (system only) and ignore changes in the surroundings, because ΔG is also a function of the enthalpy and entropy of the system reaction (eqn 2):

    $$\Delta G = \Delta H - T\,\Delta S \qquad (2)$$

    where ΔH is the change in enthalpy as products are made from reactants (i.e. unfolded to folded) and ΔS is the change in entropy as products are made from reactants (i.e. unfolded to folded).
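As a worked illustration of eqns 1 and 2, the short sketch below evaluates ΔG and the corresponding entropy change of the universe for a hypothetical folding reaction in which ΔH is positive but folding is still favourable. The numerical values of ΔH, ΔS and T are invented for the example and do not describe any particular protein.

```python
# Worked example of eqn 2 (dG = dH - T*dS) and eqn 1 (dS_universe = -dG / T).
# The input values are hypothetical and chosen only to illustrate the sign logic.

def gibbs_free_energy_change(delta_h, delta_s, temperature):
    """Return dG in J/mol, given dH in J/mol, dS in J/(mol K) and T in K."""
    return delta_h - temperature * delta_s


def entropy_change_of_universe(delta_g, temperature):
    """Return dS_universe in J/(mol K), given dG in J/mol and T in K."""
    return -delta_g / temperature


delta_h = 20_000.0   # J/mol, unfavourable (endothermic) enthalpy change
delta_s = 100.0      # J/(mol K), favourable system entropy change
temperature = 298.0  # K

delta_g = gibbs_free_energy_change(delta_h, delta_s, temperature)
delta_s_universe = entropy_change_of_universe(delta_g, temperature)

print(f"dG = {delta_g:.0f} J/mol")
print(f"dS_universe = {delta_s_universe:.1f} J/(mol K)")
print("Folding is favourable" if delta_g < 0 else "Folding is not favourable")
```

Here the TΔS term outweighs the positive ΔH, so ΔG is negative and the entropy of the universe increases, which is the situation described above for proteins that fold despite an unfavourable enthalpy change.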

It should be noted that biochemists cannot predict ΔH and ΔS and must rely on experimental calorimetry measurements to determine these values. As can be seen in ( eqn 2 ), for a protein (and its surrounding interacting water molecules) to fold spontaneously, it will have more free energy in the unfolded state and less free energy in the folded state. To represent the change in free energy of a protein ensemble, it is useful to show the reaction progress that is measured experimentally using a classical energy diagram as described in the main text.

Every spontaneous (favourable) reaction in nature lowers its free energy, as dictated by the laws of thermodynamics. For example, the folding of protein domains is a spontaneous reaction when a negative change in Gibbs free energy (G) occurs and the protein domain moves to a lower energy state. The change of Gibbs free energy (ΔG) has two components whose balance is influenced by temperature: the change of enthalpy (ΔH, a measure of the formed and broken bond energies in the system) and the change of entropy (ΔS, a measure of the change of system ‘disorder’), as seen in (eqn 2) in the Thermodynamics Box (Box 1). The driving force for protein folding is a result of hydrophobic collapse, hydrogen bond formation, electrostatic interactions and van der Waals interactions that lower the free energy. According to (eqn 2), for ΔG to be negative and protein folding to become thermodynamically favourable, the change in these interactions must result in a favourable change in system enthalpy (ΔH), in system entropy (ΔS) or in both.

When amino acids form new hydrogen bonds, van der Waals and other electrostatic interactions it results in releasing heat, while breaking these bonds with water results in absorbing heat. Therefore, the relative amount of bond formation to bond breakage in the unfolded and folded states will determine ΔH. However, the basis of the hydrophobic effect (collapse) is an increase in the entropy of protein-associated water and is the most important driving force in protein folding. When a protein domain is present in its unfolded state, water molecules have to order themselves in ice-like structures around the hydrophobic groups of the polypeptide chain which forces order on the system and so has less entropy than the free water molecules. Solvent entropy is increased by the protein domain collapsing and placing the hydrophobic side chains into the middle of the protein ( Figure 7 A). As a result, the hydration shells around the side chains are no longer required, and these water molecules become disordered (free to sample multiple states and interactions), causing a positive change in entropy for the system (ΔS). It should be noted that as a protein domain folds, the polypeptide chain loses entropy as it adopts a single dominant folded conformation (shape), however this decrease in entropy is often offset by the hydrophobic effect described above.

Two-state folding of a small protein

(A) Hydrophobic collapse. In the compact fold (to the right), the hydrophobic amino acids (shown as black spheres) collapse towards the centre to become shielded from the aqueous environment. (B) The classical view of protein folding. The diagram represents the free energy of the native and denatured ensembles of a protein under conditions where the native state is favoured, as the native state has a lower free energy than the unfolded state. The free energy difference between these states (ΔG) is a measure of the stability of the protein. The transition state ensemble is a population of short-lived and partially folded conformations that cannot be directly observed in experiments but must be passed through in order to fold; it defines the activation barrier for folding (ΔG# folding) and unfolding (ΔG# unfolding).


For any given protein, several different folding pathways exist that allow the same protein to reach its native state by different routes. Experimentally, we cannot distinguish these individual ‘microscopic’ pathways and can only monitor the ‘macroscopic’ changes along the reaction coordinate using spectroscopic methods (for example, by measuring the fluorescence or CD signal changes as the protein folds in real time). If we represent the associated free energy of the ‘macroscopic’ ensemble of pathways, we generate a classical energy diagram (Figure 7B) that shows the free energy of the protein as it goes from an unfolded ensemble (left) to a folded ensemble (right). Often, small protein domains of a few hundred amino acids can fold in a single step, passing through a high energy transition state ensemble. However, larger protein domains often pass through a number of intermediate states that are stable but not fully folded before the process is complete. The classical view is useful for interpreting experimental measurements; however, theoretical and computational studies are now working on the new view of folding, which tries to understand and represent the microscopic pathways. Here, proteins are multistate objects that fold through multiple unpredictable routes and intermediate conformations. This folding is represented by a more complex funnel-shaped energy landscape in which the protein’s energy and number of conformational states decrease as the protein moves down the funnel.

Protein domains fold because the native state releases water to a more disordered state (increasing entropy) and the new bonds (compared with the old bonds) usually result in heat being released, decreasing the enthalpy. Together, this causes the Gibbs free energy to decrease and makes folding spontaneous. However, just because the folded protein is lower in energy than the unfolded protein, this only indicates that the process is favourable to occur. The speed at which it occurs (the rate of the reaction) is independent of ΔG and instead is governed by the size of the barrier between the energy of the unfolded protein ensemble and the energy of the transition state ensemble (also known as the activation barrier or ΔG # for folding). The lower the energy of the transition state ensemble, the faster the protein folds, which can be as fast as microseconds. We still do not fully understand how to predict how a protein domain will fold, how favourable it will be and how fast it will proceed. One approach to learn the rules is to study how humans engage in a protein folding game. If you want to get involved and have fun trying to fold your own protein using your computer, please visit the ‘ fold.it ’ site.
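The distinction between how favourable folding is (ΔG) and how fast it happens (the activation barrier) can be sketched numerically. The example below is a minimal illustration that assumes a simple two-state model: the standard equilibrium relation K = exp(−ΔG/RT) gives the folded fraction, and an exponential dependence of rate on barrier height with an identical, unspecified pre-exponential factor is used to compare two barriers. Neither formula is stated in this article, and the energies used are hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol K)


def fraction_folded(delta_g_folding, temperature=298.0):
    """Folded fraction at equilibrium for a two-state protein.

    delta_g_folding is G(folded) - G(unfolded) in J/mol, so a negative
    value favours the native state.
    """
    k_eq = math.exp(-delta_g_folding / (R * temperature))  # [folded]/[unfolded]
    return k_eq / (1.0 + k_eq)


def relative_folding_rate(barrier_a, barrier_b, temperature=298.0):
    """Ratio of folding rates (a over b) for two activation barriers in J/mol.

    Assumes both reactions share the same pre-exponential factor, so only
    the difference in barrier height matters.
    """
    return math.exp(-(barrier_a - barrier_b) / (R * temperature))


# Hypothetical values: a protein stabilised by 20 kJ/mol, and two barriers.
print(f"Fraction folded: {fraction_folded(-20_000.0):.4f}")
print(f"Rate ratio (40 vs 50 kJ/mol barrier): {relative_folding_rate(40_000.0, 50_000.0):.1f}")
```

Note that the folded fraction depends only on ΔG, whereas the rate comparison depends only on the barrier heights, mirroring the point made above that the speed of folding is independent of ΔG.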

Every domain within a full-length protein will usually fold independently; however, sometimes, one or more domains can misfold, and the protein sometimes gets tangled up, forming a protein aggregate, which is described next.

Protein misfolding

Most of the time, folded proteins stay folded. However, under certain conditions, normally stable natively folded proteins can partially unfold and assemble into a multisubunit aggregated form known as amyloid. The formation of amyloid is associated with a range of increasingly common human disorders, including Alzheimer’s and Parkinson’s diseases as well as type II diabetes, where it builds up in organs and tissues throughout the body. Different proteins can form amyloid and each is associated with its own disease. What is intriguing about amyloid is that, for a range of protein structures, the final amyloid material they adopt shares a remarkably similar structure.

Under an electron microscope, amyloid looks like long unbranching fibres composed of filaments that wrap around each other like threads in a rope. At the protein structural level, the filaments are made up of parallel extended β-sheet structures, known as cross-β. Individual β-strands are stacked in-register, one on top of the other, running perpendicular to the fibril axis. Sidechains protrude from the sheets, and the hydrogen bonds that hold the sheets together run along the length of the cross-β fibril. Since β-sheets are held together by peptide backbone interactions that all proteins can make, this helps explain why so many proteins can adopt this structure (Figure 8). As well as having similar structures, amyloid fibrils from different proteins all have the ability to bind histological dyes.

Cross-β structure of amyloid material

NMR atomic-resolution structure of an amyloid triplet fibril (right) fitted into a cryo-EM reconstruction (centre). The background image of the fibril (left) was taken using Transmission Electron Microscopy (scale bar, 50 nm). The constituent β-sheets are shown in a ribbon representation in blue; oxygen, carbon and nitrogen atoms are shown in red, grey and blue, respectively. Note that in a cross-β structure, β-strands are stacked one on top of the other. Image adapted with permission from Fitzpatrick, Debelouchina, Bayro, Clare, Caporini, Bajaj, Jaroniec, Wang, Ladizhansky and Müller (2013) Atomic structure and hierarchical assembly of a cross-β amyloid fibril. Proc. Natl. Acad. Sci. U.S.A. 110, 5468–5473. Abbreviations: cryo-EM, cryogenic electron microscopy; NMR, nuclear magnetic resonance.


Amyloid assembly starts with the protein adopting a partially unfolded conformation. This state is neither fully folded nor unfolded and retains secondary structural elements such as β-sheets and α-helices. However, it loses the defined tertiary structure and tight packing of a folded protein. Experimentally, low pH, high temperature and low concentrations of denaturants are all known to promote adoption of partially unfolded conformations. Mutations, where one amino acid is swapped for another, can also cause the partially unfolded conformation to be more easily adopted. As people age, there is also a gradual breakdown in the cell’s ability to remove occasionally misfolded proteins. Together, these factors help explain why amyloid diseases can be hereditary (for example, early-onset Alzheimer’s) as well as age-related. It was initially thought that the amyloid fibrils were responsible for bringing about disease, but it is now more widely accepted that a structure called an oligomer, populated in the early stages of amyloid formation, is the most toxic entity. Oligomers are flexible and soluble, and exist in several forms. They bring about a toxic gain of function, and solving their structure remains one of the significant challenges of structural biology. In fact, all folded and misfolded proteins, regardless of their structure, have a range of flexibility and dynamics that is central to their function, and this is considered next.

Protein dynamics

Proteins are not static and often appear to change their initial native folded structure to allow binding or catalysis to occur. For example, the enzyme hexokinase ( Figure 3 ) that is involved in the first step of glycolysis (the breakdown of glucose) changes conformation. This enzyme contains two (sub)domains and the active site is found between them. Interfaces between protein domains are an ideal place to create active sites as the two parts can shift relative to each other in response to what happens between them. When the substrate glucose binds to the active site region in the open conformation, the two domains change their position to ‘clamp down’ on the substrate to form a closed conformation. This conformational change allows hexokinase to position its catalytic residues around glucose. Once enclosed in the active site, the substrate is phosphorylated using a molecule of bound adenosine triphosphate (ATP), resulting in the production of glucose-6-phosphate.

Hexokinase and other proteins in general are not limited to just a few conformational states. Instead, proteins are better thought of as dynamic molecules undergoing exchange between states. They are continually undergoing motions where atoms vibrate, bonds wiggle and, at times, more significant fluctuations occur as the protein samples other possible conformations. These structural changes and dynamic motions are essential for substrate binding and many other functions. With computer simulations, we are starting to visualise the complete process in real time, in molecular movies generated by molecular dynamics simulations. These movies highlight why folding, structure, dynamics and interactions are central to understanding protein biology. As we will see, some intrinsically disordered proteins (IDPs) naturally exist unfolded all of the time yet do not form amyloid; they are therefore even more dynamic and have many more interactions with other biomolecules.

An interesting consequence of conformational change is that after one ligand has bound to a protein, it may change the shape of a separate binding site such that the binding affinity of another ligand at that distant site also changes. In other words, the second ligand may have a different affinity to its target protein depending on whether the first ligand is bound. This concept is known as allostery and is central to the regulation of proteins and enzymes. For example, in haemoglobin, there are four subunits, each containing a haem group that binds oxygen ( Figure 3 ). Oxygen binding at the four haem sites does not necessarily happen simultaneously. Once the first haem binds oxygen, it introduces small changes in the structure of the corresponding protein chain (subunit). These changes nudge the neighbouring chain causing a subtle rotation into a different shape, which allows further oxygen molecules to bind more easily ( Figure 9 ). This effect is called positive allostery as it makes the next event more likely to occur. Allostery is central to regulating metabolic pathways as enzymes at the start of the pathway can be inhibited when the levels of product rise too high via feedback inhibition. The final product usually causes a conformational change in the first (committed) step enzyme such that its substrate can no longer bind as well to its active site. This process is called negative allostery ( Figure 10 ).
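The effect of positive allostery on oxygen binding can be illustrated with the Hill equation, a widely used phenomenological description of cooperative binding. This is a swapped-in simplification rather than the sequential model discussed in this article, and the half-saturation pressure (26 mmHg) and Hill coefficient (2.8) used below are illustrative values assumed for the example.

```python
# Fractional oxygen saturation from the Hill equation:
#   Y = pO2^n / (P50^n + pO2^n)
# n = 1 describes independent (non-cooperative) sites; n > 1 describes positive
# cooperativity. P50 and n below are assumed, illustrative values.

def hill_saturation(p_o2, p50=26.0, hill_coefficient=2.8):
    """Return the fraction of binding sites occupied at oxygen pressure p_o2 (mmHg)."""
    return p_o2 ** hill_coefficient / (p50 ** hill_coefficient + p_o2 ** hill_coefficient)


for pressure in (10.0, 26.0, 40.0, 100.0):
    cooperative = hill_saturation(pressure)
    independent = hill_saturation(pressure, hill_coefficient=1.0)
    print(f"pO2 = {pressure:5.1f} mmHg: cooperative {cooperative:.2f}, "
          f"non-cooperative {independent:.2f}")
```

With a Hill coefficient above 1, the curve is steeper around the half-saturation point: binding is weaker at low oxygen pressure and stronger at high oxygen pressure than for independent sites, which is the behaviour expected when the first binding event makes later ones easier.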

Positive allostery in haemoglobin

This diagram illustrates the ‘sequential’ model of cooperativity, which suggests that oxygen binding to one subunit of haemoglobin starts a sequence of conformational changes in the other haemoglobin subunits that increase their affinity for oxygen. The binding of oxygen (blue circle) in one subunit causes a structural change in a neighbouring subunit (purple) that makes it more able to bind another oxygen molecule.


Feedback inhibition in metabolic pathways

The production of the metabolite E in this four-step metabolic pathway allows it to bind to the first enzyme in the pathway to turn it off, thus regulating the amount of E in the cell. When levels of E drop, the pathway will be turned back on again as the first enzyme is no longer inhibited. Frequently this feedback inhibition is caused by negative allostery that involves a change in the conformation of the active site by another molecule binding elsewhere on the enzyme.


Proteins undergo many other types of motions, such as internal vibrations and rotations of methyl groups, and collective motions of groups of atoms, such as wigwag motions of long sidechains or flipping of short peptide loops. Each of these movements is extremely important and is often central to the protein’s function (Table 2). As the number of exchanging conformations increases, it is simply not possible to represent a protein with a single structure. Instead, one must describe it as a population of multiple interconverting conformations known as a structural ensemble. Structural ensembles are especially relevant when a protein has a large intrinsically disordered region (IDR) that has a low number of bulky hydrophobic amino acids, so that in isolation it remains unfolded despite other parts of the protein being folded. Some proteins are so flexible and dynamic that they are classed as intrinsically disordered proteins and have no defined secondary structure at all. This is a relatively recent understanding, as unfolded proteins were previously thought to result only from conditions such as extreme heat or acidity or from severe mutations. In fact, there appears to be a continuum, from proteins in which one structure dominates to others that are fully disordered and better described as a dynamic ensemble of unstructured conformations.

Over the past 20 years, there has been considerable interest in disordered proteins, as approximately one-third of human proteins contain disordered regions that are 30 or more amino acid residues long. Due to their fluctuating structures, disordered proteins offer many advantages for cellular function. The flexibility of a disordered protein means that it can easily be accessed by enzymes that post-translationally modify it (a kinase, for example, would add a phosphate group). In many cases, when a disordered protein or region binds a target, it undergoes a conformational change to a better defined structure (Figure 11). The same protein can act as a molecular hub and bind a range of molecules including small ligands, membrane surfaces or other proteins. Folding on binding does not always have to happen, and multiple binding events can simultaneously work together on a disordered protein to change its structure and dynamics. The disordered protein may need to bind several molecules before it gains a 3D shape. For the correct combination of stimuli, this will create a new specific ensemble that forms an appropriate binding site to bind the next target in a signalling cascade, leading to the correct response. It is, therefore, not surprising that disordered proteins provide a way to regulate cell signalling. In this process, signals that come from outside the cell are converted into responses inside the cell. The same disordered protein can process multiple stimuli, enabling quick and flexible responses to the changing conditions that cells face. Disordered proteins are also involved in cell cycle activities, transcription and translation, cargo transport and apoptosis. Another exciting area for disordered proteins is their ability to self-assemble into multiprotein complexes and still maintain a fairly extended, non-globular shape, rather than the compact shape that would be expected for independently folded proteins. These extended conformations allow disordered proteins to become a molecular glue that has a much larger surface area for contacts between proteins and cements the complex together, as can be seen in the assembly of the yeast ribosome.

Cartoon of the coupled folding and binding

PUMA is an intrinsically disordered protein (green) that folds on binding to the folded MCL-1 protein (white). Before binding, PUMA is modelled as an ensemble of rapidly interconverting unfolded states.


Finally, disordered proteins with ‘multivalent’ or ‘multiple interaction’ sites have been shown to engage in rapid, dynamically exchanging interactions with each other that can cause liquid–liquid phase separation. In these cases, instead of forming amyloid or a defined large complex, some disordered proteins come together to form a separate liquid phase inside the cell that is enriched with these multivalent molecules. The new liquid phase that is formed is called a biomolecular condensate and allows the cell to organise and concentrate molecules involved in a given biochemical reaction, just like classic membrane-bound organelles such as mitochondria. For this reason, condensates are often referred to as membrane-less organelles. Some of these liquid phases can be seen directly under the microscope, such as the nucleolus and Cajal bodies, and the molecules within them carry out distinct roles in the cell. Disordered proteins are therefore extremely important in the cell and in the future will prove central to further understanding how protein structure explains function.

The insights concerning protein folding, structure, dynamics and interactions have all come from using a range of experimental tools, which we will now explore.

A wide range of tools is available to the structural biologist to allow protein structure and dynamics to be determined. Basic spectroscopic methods such as circular dichroism (CD) or fluorescence give general information about structure, whereas high-resolution methods such as X-ray crystallography, nuclear magnetic resonance (NMR) and cryogenic electron microscopy (cryo-EM) can provide atomic descriptions of protein structure and dynamics. Each of these tools requires pure protein, usually in the form of recombinant protein. Today, a wide variety of biological organisms can be genetically modified to produce the required protein in large quantities, which has led to huge progress in methods that can study protein structure and dynamics.

Spectroscopy and light

To study proteins, we use electromagnetic radiation (see Box 2, Properties of Light Box) to probe their structural and functional properties using a fundamental experimental technique called spectroscopy. Spectroscopy is the study of the interaction of electromagnetic radiation (light) with matter, in our case proteins. Several closely related events can occur depending on the amount of energy that the radiation carries. In the first example, absorption, electromagnetic radiation is captured by a protein sample, which converts the energy of the photon into internal energy. Atoms within proteins are composed of a nucleus, containing neutrons and protons, and dispersed electrons. Electrons, however, are not merely floating within the atom but are instead confined to electron orbitals. There are multiple electron orbitals within an atom, and each has its own energy level associated with it. Since the energy levels of matter are quantised, only light of an energy that can cause a transition from one existing energy level to another will be absorbed. The amount of energy carried by a photon depends on its wavelength. The shorter the wavelength, the higher the energy carried by a photon; hence, ultraviolet (UV) light carries more energy than visible light. When a molecule absorbs a photon of the correct energy, an electron is promoted from its ground state to an excited state. This occurs only if the energy of the photon corresponds to the energy gap between the ground state and an empty higher energy level (the excited state). After absorption, the energy is lost to the solvent as heat (thermal energy) when the electron drops back to the ground state. An absorption spectrum is obtained by measuring the amount of light that passes through a sample at a variety of wavelengths. The spectrum depends on the type and arrangement of atoms in the sample, which makes absorption spectra useful for identifying different molecules. 
In this way, absorption spectroscopy can be used to reveal some very basic information about the structure and conformational states of a protein.

Light is a type of energy. The nature of light is best explained based on the idea of wave-particle duality. This means that in certain experiments, light acts as a particle (photon) with discrete energy (quanta). In other experiments, light acts as a wave that oscillates as it travels, carrying electromagnetic radiation (Figure B1). Each wave is made up of an electric field and a magnetic field that oscillate perpendicular to each other and to the direction of travel, and is described by a periodic function, for example a cosine function. These oscillations consist of successive troughs and crests in the electric and magnetic fields, where the distance between two adjacent crests or troughs is called the wavelength (λ), which is related to the frequency of repeats within a given distance. The peaks and troughs of the electric and magnetic fields are in phase with each other and reach minima (troughs) and maxima (peaks) together. The amplitude (the height of the wave) determines how bright or dim the light is.

  • The visible light that can be seen by the human eye is radiation within a small portion of the electromagnetic spectrum. This spectrum also includes radiowaves, microwaves, infrared (IR), (visible) light, UV, X-rays and γ rays, which are named according to their wavelength. Visible light has a wavelength in the 400–700 nm range (1 nm = 10⁻⁹ m), whereas radio waves are within the metre to kilometre range (up to 10³ m). The wavelength is linked to the frequency of a wave, that is, its rate of oscillation (how many repeats are completed each second), which is measured in Hertz, where one Hertz is equal to one oscillation per second. Shorter wavelengths have higher frequencies and longer wavelengths have lower frequencies. The frequency of an electromagnetic wave is directly related to the energy of the photon, so shorter wavelengths carry higher energy; this means that an X-ray beam is higher in energy than a radiowave. Frequency and wavelength are interconverted using (eqn 3):

    $$f = \frac{c}{\lambda} \qquad (3)$$

    where f is the frequency (Hz), λ is the wavelength (m) and c is the speed of light (3 × 10⁸ m s⁻¹).
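Eqn 3 can be applied directly to compare different kinds of light. The sketch below converts wavelength to frequency and then to photon energy using the Planck relation E = hf, which is not written out in this box but is the standard link between frequency and photon energy. The example wavelengths (280 nm for UV and 500 nm for visible light) are chosen only for illustration.

```python
# Convert wavelength to frequency (eqn 3, f = c / wavelength) and to photon
# energy (Planck relation, E = h * f). Example wavelengths are illustrative.

SPEED_OF_LIGHT = 2.998e8     # m/s
PLANCK_CONSTANT = 6.626e-34  # J s


def frequency_from_wavelength(wavelength_m):
    """Return the frequency in Hz for a wavelength given in metres."""
    return SPEED_OF_LIGHT / wavelength_m


def photon_energy(wavelength_m):
    """Return the energy in joules of one photon of the given wavelength."""
    return PLANCK_CONSTANT * frequency_from_wavelength(wavelength_m)


for label, wavelength_nm in (("UV", 280.0), ("visible", 500.0)):
    wavelength_m = wavelength_nm * 1e-9  # convert nm to m
    print(f"{label:8s} {wavelength_nm:5.0f} nm: "
          f"f = {frequency_from_wavelength(wavelength_m):.3e} Hz, "
          f"E = {photon_energy(wavelength_m):.3e} J")
```

The shorter UV wavelength gives both a higher frequency and a higher photon energy than the visible wavelength, consistent with the ordering of the electromagnetic spectrum described above.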

Besides being absorbed, an electromagnetic wave can also be scattered or refracted when it interacts with an atom, which deflects it from its straight path. When many waves scatter together as they pass through an object, the process is called diffraction, and all the scattered waves can be collected on a detector to form a pattern called a molecular transform. Diffracted light can be detected only at the specific angles where the scattered waves interfere constructively; little or nothing is detected where they interfere destructively. As shown in Figure B2, if the crest of one wave lines up with the crest of another wave, they are in phase and undergo constructive interference (the waves add up). If the waves are out of phase, for example when the crest of one wave lines up with the trough of another, they undergo destructive interference (waves of equal amplitude cancel each other out). As such, the scattered light that remains has been ‘transformed’ by constructive interference, which depends on the spacing and structural relationship of the atoms in the molecule being studied, hence the name molecular transform.
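Constructive and destructive interference can be checked numerically. The sketch below assumes two cosine waves of the same wavelength and uses the standard result that their sum has amplitude sqrt(A1² + A2² + 2·A1·A2·cos φ), where φ is the phase difference; the function name is illustrative and not taken from the text.

```python
import math


def resultant_amplitude(a1, a2, phase_difference_rad):
    """Amplitude of the sum of two cosine waves of equal wavelength."""
    squared = a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * math.cos(phase_difference_rad)
    # Guard against tiny negative values caused by floating point rounding.
    return math.sqrt(max(0.0, squared))


# Crest aligned with crest (in phase): the waves add up.
print(resultant_amplitude(1.0, 1.0, 0.0))
# Crest aligned with trough (half a wavelength out of phase): equal waves cancel.
print(resultant_amplitude(1.0, 1.0, math.pi))
# A quarter-wavelength shift gives an intermediate amplitude.
print(resultant_amplitude(1.0, 1.0, math.pi / 2))
```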

(B1) A wave of light can be described by two periodic functions representing the electric and magnetic fields that are perpendicular to each other, where their amplitude changes along the x-axis. If you draw a beam of light in the form of a wave, the distance between two crests is called the wavelength. The frequency that the waves repeat themselves determines their wavelengths. For most of our text, we only show the electric component. (B2) When light waves are in phase (start at the same position within the periodic function), light interferes constructively and they add together to make a bigger wave (top panel). Light interferes destructively annihilating each other when waves are out of phase, for example when the peak of one wave is aligned with the trough of another (bottom panel).

For a given wavelength, if the extinction coefficient (ε) is known then by measuring the amount of light that a protein sample absorbs (A) and using the known pathlength (l), we can work out its molar concentration (c).
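The relationship described here is usually written A = ε × c × l (the Beer-Lambert law), so the concentration follows by rearranging to c = A / (ε × l). The sketch below is a minimal example of that arithmetic; the extinction coefficient and absorbance used in the demonstration are made-up illustrative numbers, not values for any particular protein.

```python
def molar_concentration(absorbance, extinction_coefficient, pathlength_cm=1.0):
    """Rearranged Beer-Lambert law: c = A / (epsilon * l).

    extinction_coefficient is in M^-1 cm^-1 and pathlength in cm,
    so the result is a molar concentration (M).
    """
    return absorbance / (extinction_coefficient * pathlength_cm)


# Hypothetical protein: epsilon = 50,000 M^-1 cm^-1, A = 0.5 in a 1 cm cuvette.
concentration = molar_concentration(0.5, 50000.0, 1.0)
print(f"{concentration * 1e6:.1f} micromolar")
```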

Circular Dichroism

Circular Dichroism (CD) spectroscopy is a form of UV light absorption spectroscopy that is used to determine the secondary structure of proteins. To understand how the process works, we must investigate the properties of light. Each wavelength of light has associated time-dependent electric and magnetic fields that oscillate between peaks and troughs as the wave travels. The intensity of the light is related to the amplitude, a measure of the relative height of the wave. Wavelength is the distance between adjacent peaks, measured in metres. Light waves are said to be in phase if the peaks and troughs of the waves line up (see Box 2, Properties of Light Box). It is possible, by use of filters, to generate plane polarised light with an electric field that oscillates in just a single plane. If you were looking into the path of this light and could see it coming towards you, vertically polarised light would oscillate up and down in a single plane. If you combine in-phase horizontally polarised light with vertically polarised light of equal amplitude, you generate a plane polarised light wave that oscillates back and forth at 45 degrees (midway between the two) (Figure 12A). Something exciting happens when you combine two perpendicular plane polarised light waves of equal amplitude that differ in phase by a quarter of a wavelength: they generate circularly polarised light. The result is an electric vector that rotates either clockwise (left) or anticlockwise (right) as it propagates. In this case, if you could see the peak of the electric field as the wave came towards you, it would appear to rotate (Figure 12B). This circularly polarised light is shown as a spiral and referred to as left- and right-circularly polarised light (LCP or RCP in Figure 12C). To view some excellent movies that illustrate how circularly polarised light is generated from combining plane polarised light, see the references in further reading.
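The two combinations described above can be reproduced with a few lines of Python. This is a simplified sketch that follows the tip of the electric field at a single point in space, treating the horizontal and vertical components as cosines of equal amplitude; the function names and the eight sampled time points are arbitrary choices for illustration.

```python
import math


def field_tip(time_fraction, phase_shift_fraction, amplitude=1.0):
    """Return (Ex, Ey) at one point in space.

    time_fraction: fraction of one period that has elapsed (0 to 1).
    phase_shift_fraction: lag of the vertical wave, as a fraction of a wavelength.
    """
    angle = 2 * math.pi * time_fraction
    ex = amplitude * math.cos(angle)
    ey = amplitude * math.cos(angle - 2 * math.pi * phase_shift_fraction)
    return ex, ey


def trace(phase_shift_fraction, steps=8):
    """Sample the field tip at evenly spaced times over one period."""
    return [field_tip(i / steps, phase_shift_fraction) for i in range(steps)]


# In phase: Ex equals Ey at every instant, so the tip moves along a 45 degree line.
for ex, ey in trace(0.0):
    print(f"in phase       Ex={ex:+.2f} Ey={ey:+.2f}")

# Quarter-wavelength shift: the length of the vector stays constant while its
# direction rotates, so the tip traces a circle.
for ex, ey in trace(0.25):
    print(f"quarter shift  Ex={ex:+.2f} Ey={ey:+.2f} length={math.hypot(ex, ey):.2f}")
```

Changing the sign of the quarter-wavelength shift reverses the sense of rotation, which is the difference between the two circular polarisations.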

CD spectroscopy

Light waves can travel at any angle and, through the use of a special polarising lens, light can be selected for a single plane, i.e. a vertical (represented in red) or horizontal (represented in green) plane. (A) When horizontally and vertically polarised light are combined in phase, the resulting plane polarised light wave oscillates back and forth at 45 degrees (represented in blue). (B) Circularly polarised light consists of two perpendicular plane waves of equal amplitude that differ in phase by ¼ of a wavelength. At a single point in space, the circularly polarised light will trace out a circle over one period of the wave, shown here as a spiral. Depending on the rotation direction, it is called left-handed (LCP) or right-handed (RCP) circularly polarised light. (C) A chiral molecule such as a protein (indicated as a red box) will absorb LCP and RCP to different extents, as indicated by the size of each spiral to the right of the red box. A CD instrument allows the absorption of LCP and RCP light to be measured. (D) LCP and RCP are represented as vectors on the detector. When LCP and RCP are absorbed by the same amount (left), their combination gives a linear (blue) vector that oscillates up and down. However, when LCP and RCP are absorbed differently (in this case RCP has been absorbed by the protein, decreasing its amplitude), their combination gives elliptically polarised light. This happens because, when the short vector from RCP is combined with the longer vector of LCP, the resultant rotating (blue) vector describes an ellipse. The angle made by the major axis of the ellipse with respect to the original polarisation plane is measured in degrees (θ). Only the electric components of the light waves are shown for clarity (the magnetic component is always perpendicular to the electric component).

If left- and right-circularly polarised light are superimposed and, after absorbance, their amplitudes are equal, the result is plane polarised light again (Figure 12D, left). However, if the amplitudes are unequal because one is absorbed more than the other as the light passes through a protein sample (as seen in Figure 12C), the resulting light is elliptically polarised (Figure 12D, right). The angle made by the major axis of the ellipse with respect to the original polarisation plane is measured in degrees (θ), which are the units seen on a raw CD spectrum. Since this value is usually quite small, it is often quoted in millidegrees (1/1000 of a degree). Symmetrical molecules absorb left- and right-circularly polarised light equally. Non-symmetric (chiral) molecules, such as proteins that contain secondary structure, absorb the left- and right-circularly polarised components differently. Differences in the absorption of left- and right-handed circularly polarised light by the secondary structural components of a protein over a range of wavelengths give rise to a CD spectrum.

When working with proteins, mean residue molar CD (Δε_MR) is used, which reports the molar CD per residue instead of for the whole protein. This allows direct comparison between proteins of different sizes. To do this, the mean residue concentration (molarity multiplied by the number of amino acids) is used in place of the molarity when calculating the molar CD, essentially treating the protein as a solution of its free amino acids.
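The conversion from a raw CD reading to a mean residue value can be sketched as follows. Two things here are assumptions rather than statements from the text: the molar CD is taken as Δε = ΔA / (c × l), by analogy with ordinary absorbance, and the raw ellipticity is converted to a difference in absorbance using the commonly quoted factor θ (millidegrees) ≈ 32,980 × ΔA. The sample values are invented for illustration.

```python
# Commonly quoted conversion: ellipticity in millidegrees ~ 32980 * delta_A.
# This factor is an assumption added here; it does not appear in the text.
MDEG_PER_ABSORBANCE_UNIT = 32980.0


def delta_absorbance(ellipticity_mdeg):
    """Convert a raw CD reading (millidegrees) into A(left) - A(right)."""
    return ellipticity_mdeg / MDEG_PER_ABSORBANCE_UNIT


def mean_residue_delta_epsilon(ellipticity_mdeg, molarity, n_residues, pathlength_cm):
    """Mean residue molar CD in M^-1 cm^-1.

    The mean residue concentration (molarity * number of amino acids)
    replaces the protein molarity, as described in the text.
    """
    mean_residue_concentration = molarity * n_residues
    return delta_absorbance(ellipticity_mdeg) / (mean_residue_concentration * pathlength_cm)


# Hypothetical sample: -20 mdeg, 10 micromolar protein, 100 residues, 0.1 cm cell.
value = mean_residue_delta_epsilon(-20.0, 10e-6, 100, 0.1)
print(f"mean residue delta epsilon = {value:.2f} M^-1 cm^-1")
```

Because the result is normalised per residue, a protein twice as long at the same molarity and the same raw signal would give half the mean residue value.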

CD spectroscopy is well suited to proteins as the peptide bonds that dictate secondary structure are optically active. Different secondary structure types absorb left- and right-circularly polarised light to different extents, meaning α-helices and β-sheets have different far-UV CD spectra with recognisable shapes (Figure 13). An α-helical protein, for example, has a positive peak at ∼190 nm and negative peaks at 208 and 222 nm, giving a characteristic double-humped spectrum in the far-UV wavelength range (between 180 and 260 nm). These spectra can be compared with reference spectra for proteins that are 100% α-helix, β-sheet or random coil (Figure 13), as well as with more complex libraries of proteins with mixed structures. A mathematical process known as deconvolution can then be used to work out the relative fractions of each secondary structural type by summing different combinations of these reference spectra. Another common use of CD with proteins exploits the absorbance of the side chains of Phe, Tyr and Trp in the near-UV wavelength range (250–350 nm) to give limited information about the tertiary structure of a protein. The absorption of these side chains indicates how well the secondary structure elements are packed together and can reveal interactions with ligands that bind to the protein surface.
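The idea behind deconvolution can be sketched as a least-squares fit: find the fractions of each reference spectrum whose weighted sum best reproduces the measured spectrum. The example below is a toy version under strong assumptions. The three "reference spectra" are invented numbers at five wavelengths, not real CD data, and dedicated analysis programs use much larger reference sets and additional constraints (for example, fractions that are non-negative and sum to one).

```python
def solve_linear_system(matrix, vector):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def deconvolute(measured, references):
    """Least-squares fractions f so that sum_j f[j] * references[j] ~ measured."""
    n = len(references)
    # Normal equations: (R R^T) f = R m
    normal = [[sum(a * b for a, b in zip(references[i], references[j]))
               for j in range(n)] for i in range(n)]
    rhs = [sum(a * m for a, m in zip(references[i], measured)) for i in range(n)]
    return solve_linear_system(normal, rhs)


# Invented reference spectra at five wavelengths (NOT real CD data).
REFERENCES = {
    "helix": [7.0, -3.5, -3.0, -3.5, -1.0],
    "sheet": [3.0, 1.0, -1.5, -2.0, -0.5],
    "coil": [-4.0, -1.0, 0.2, 0.3, 0.1],
}
reference_list = list(REFERENCES.values())

# Build a synthetic "measured" spectrum from known fractions, then recover them.
true_fractions = [0.5, 0.3, 0.2]
measured = [sum(f * ref[i] for f, ref in zip(true_fractions, reference_list))
            for i in range(5)]
fractions = deconvolute(measured, reference_list)
for name, fraction in zip(REFERENCES, fractions):
    print(f"{name}: {fraction:.2f}")
```

With noise-free synthetic input the fit should return the fractions used to build the spectrum; real spectra contain noise, so real fits carry uncertainty.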

Characteristic CD spectra

CD spectroscopy can be used to estimate the secondary structural content of a protein. Each secondary structural type has a characteristic spectrum. α-helical proteins like insulin (blue) have a double-humped spectrum with negative bands at 222 and 208 nm and a positive band at 193 nm. Proteins with well-defined antiparallel β-sheets like immunoglobulins (red) have a negative band at 218 nm and a positive band at 195 nm. Disordered proteins such as the micro-exon gene 14 (green) have very low signal above 210 nm and a negative band near 195 nm.

CD is widely used to see if proteins are folded following purification and before attempting more involved techniques such as X-ray crystallography. CD gives detailed information about a protein’s secondary structure, but it does not tell us about the precise 3D structure, and for that, we need more complex methods such as X-ray crystallography, NMR and cryo-EM.

X-ray crystallography

Although CD spectroscopy indicates the secondary and tertiary structure of a protein in solution, it does not provide structural detail at the atomic level. X-ray crystallography, however, reveals the accurate structure of biomolecules held within crystals. X-rays are used because their wavelength closely approximates the length of covalent bonds, which makes them ideal for resolving atoms separated by these distances. Modern crystallography is usually performed at cryogenic temperatures, allowing large (between 2 and 100 nm), complex structures to be determined. One of the most important structures determined was that of the ribosome, which led to the award of the 2009 Nobel Prize in Chemistry to Ada Yonath, Venkatraman Ramakrishnan and Thomas A. Steitz. Such discoveries and innovations have transformed our understanding of biology by allowing access to the atomic detail of biomolecules.

The hanging drop vapour diffusion method is a common method used to form crystals. The process begins by using a highly concentrated pure protein in a buffered solution. The protein sample is suspended as a drop over a liquid reservoir of buffer in a sealed container. The drop contains a lower concentration of buffer components than the reservoir. Equilibrium between the drop and the reservoir is achieved by the water vapour leaving the drop and moving to the reservoir. The movement of water between the drop and the reservoir increases the concentration of the protein until it becomes supersaturated and starts to form a crystal.

Crystals are a highly ordered arrangement of individual protein molecules. When an X-ray beam is focussed on a protein crystal, the electric component of the electromagnetic X-ray waves interacts with the electron clouds surrounding the nuclei of the atoms, leading to diffraction (Figure 14). The diffracted X-rays generate spots (also called reflections) of measurable intensity on a detector (digital camera). The spots are the result of reflections from the crystal at a certain angle (2θ) relative to the original beam, according to the geometric laws of constructive interference in crystals first described by Bragg (see Box 3, Bragg’s Law box). The wave that generated each measured reflection can be represented as a periodic cosine wave, y = A cos(x + θ), where A is the amplitude, which can be calculated from the intensity of the diffraction spot (as its square root), and θ is here the phase of the wave, which unfortunately cannot be recorded. Due to the flat nature of the detector, only a subset of diffraction spots is recorded at any given X-ray-to-crystal angle. Therefore, the experiment is repeated with the crystal rotated to multiple different orientations, which changes the angle of the incoming X-rays with respect to the crystal and provides new reflections. After all diffraction patterns are recorded for a given protein, a dataset of spots is collated that corresponds to many of the possible constructive interference diffraction events for the crystal. Unlike visible light microscopy, there is no lens to refocus these rays, and we need the mathematical power of a computer to convert the X-ray data into electron density, which allows us to form an image of the protein. X-ray crystallography thus requires four main components: an X-ray source, a protein crystal, a detector and a computer.

The X-ray crystallography set up

Protein crystals are made up of a repeating array of unit cells that contain one or more copies of a protein. When these crystals are exposed to X-rays, the light changes its path and those diffracted X-rays that undergo constructive interference are measured on a detector and are called reflections. Experiments are repeated for multiple orientations of the crystal and all measured reflections are combined to create a full set of data to be analysed by a computer to generate a protein structure.

Bragg determined that constructive interference (see Box 2, Properties of Light Box) between diffracted X-rays will only occur for certain values of d and θ, when the term 2d sin θ is an integer multiple of the X-ray wavelength λ (i.e. 1λ, 2λ, 3λ, …). The equation is formally written nλ = 2d sin θ, where n is a whole number, d is the distance between parallel planes and θ is the angle of approach between the incoming X-rays and the crystal (Figure B3 and Figure 14). This means that a reflection will only be recorded for certain incoming X-ray-to-crystal angles, when the equivalent repeating electrons in a protein crystal are found on a set of planes with the appropriate interplanar distance. For example, when X-rays with a wavelength of 1.54 Angstroms (1.54 × 10⁻¹⁰ m) hit a protein crystal such that θ is 45 degrees, only electrons that repeat on planes separated by 1.1, 2.2 or 3.3 Angstroms (for n = 1, 2 or 3) will contribute to a recorded reflection. That is why the crystal is rotated: it allows different values of θ to be used in the experiment and thus generates information about repeating electrons found on different Bragg planes. The smaller the interplanar spacing d, the larger the X-ray-to-crystal angle must be for strong diffraction to occur, which means that a higher-resolution structure, with more detail from electrons spaced closer together, needs crystals that diffract to higher angles. Crystallographers can determine which set of planes is responsible for a given reflection (i.e. what values of h, k and l it has) from the relative position of the spot on the detector. Furthermore, from the spot intensities, they can determine how many electrons lie on these planes.
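The worked example in this box can be reproduced directly from nλ = 2d sin θ. The sketch below rearranges the equation both ways: for the spacings d that diffract at a given angle, and for the angle at which a given spacing diffracts. The function names are illustrative.

```python
import math


def bragg_spacings(wavelength, theta_deg, max_order=3):
    """Plane spacings d that satisfy n * wavelength = 2 * d * sin(theta), n = 1..max_order."""
    sin_theta = math.sin(math.radians(theta_deg))
    return [n * wavelength / (2 * sin_theta) for n in range(1, max_order + 1)]


def bragg_angle_deg(wavelength, spacing, order=1):
    """Angle theta (degrees) at which planes of a given spacing diffract.

    Returns None when no angle can satisfy Bragg's law (sin(theta) would exceed 1).
    """
    ratio = order * wavelength / (2 * spacing)
    if ratio > 1:
        return None
    return math.degrees(math.asin(ratio))


# The example from the text: 1.54 Angstrom X-rays at theta = 45 degrees.
print([round(d, 1) for d in bragg_spacings(1.54, 45.0)])

# Smaller spacings need larger angles, which is why high resolution
# data are found at high diffraction angles.
for spacing in (4.0, 2.0, 1.0):
    print(spacing, "Angstrom ->", round(bragg_angle_deg(1.54, spacing), 1), "degrees")
```

The `None` return shows a hard limit built into Bragg's law: spacings smaller than half the wavelength cannot give a reflection at any angle.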

Two in-phase waves R1 and R2 (shown as straight lines instead of oscillating waves) are scattered by an angle θ, relative to the periodic array (red dots). If the additional distance travelled by R2 (i.e. two times the distance BC) is a whole number of wavelengths, n, then the waves will remain in phase and give constructive interference. If the extra distance travelled by R2 to cover 2BC was a fraction of a wavelength (for example 0.5 of a wavelength), then the peaks and troughs of R2 would be shifted relative to R1 and the waves would cancel out through destructive interference. For R2, a line is shown going through point B to indicate that the waves diffract with an angle of 2θ with respect to the original X-ray beam.

Ordered crystals act to amplify diffraction and convert a very weak scattering that would result from one protein molecule into a strong diffraction pattern described by Bragg’s Law (see Box 3 , Bragg’s Law box ). The intensity of each spot in a diffraction pattern from a crystal is determined by how many electrons lie on a particular set of imaginary parallel planes called Bragg planes that cross through the crystal. Each set of planes is named by having a unique integer value for the letters h, k and l, which describe the spacing and direction of these planes in three dimensions. There are billions of copies of protein molecules in a crystal which are arranged in a regular lattice, which means that the parallel Bragg planes cross the same place in the protein for every copy. For any given set of planes, this allows all copies of the protein to contribute to the same given diffraction event through constructive interference, increasing the signal of the spots on the detector.

Ultimately, it is the electron density around all the protein atoms in a crystal that scatters X-rays to form the diffraction patterns. The electron density and diffraction patterns are converted between each other by a mathematical operation called a Fourier Transform ( Figure 15 ). To convert a set of reflections into an electron density map, we need the amplitudes of each reflection as well as the phase. The amplitude is measured as the square root of the spot intensity; however the phase is unknown. It is also possible to back-calculate an estimate of the amplitude and phase of possible diffraction peaks from the known electron density (atomic positions).
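A one-dimensional toy calculation shows why both amplitude and phase are needed. In the sketch below each "reflection" is a cosine wave indexed by a whole number h, its amplitude is taken as the square root of an intensity, and the "electron density" is the sum of the waves along a unit cell of length 1. This is a deliberately simplified illustration, not a real crystallographic calculation; the intensities and phases are invented.

```python
import math


def amplitude_from_intensity(intensity):
    """The measurable amplitude is the square root of the spot intensity."""
    return math.sqrt(intensity)


def fourier_synthesis(reflections, n_points=50):
    """Sum cosine waves into a 1-D density profile.

    reflections: list of (h, amplitude, phase_rad) tuples.
    Returns the density sampled at n_points fractional positions x in [0, 1).
    """
    density = []
    for i in range(n_points):
        x = i / n_points
        density.append(sum(amp * math.cos(2 * math.pi * h * x - phase)
                           for h, amp, phase in reflections))
    return density


# Invented intensities for three reflections.
intensities = {1: 16.0, 2: 9.0, 3: 4.0}
amplitudes = {h: amplitude_from_intensity(i) for h, i in intensities.items()}

# Same amplitudes, two different sets of phases.
true_phases = {1: 0.0, 2: math.pi / 2, 3: math.pi}
guessed_phases = {1: 0.0, 2: 0.0, 3: 0.0}

true_map = fourier_synthesis([(h, amplitudes[h], true_phases[h]) for h in amplitudes])
guessed_map = fourier_synthesis([(h, amplitudes[h], guessed_phases[h]) for h in amplitudes])

# The position of the highest density peak depends on the phases used.
print("peak with true phases at x =", true_map.index(max(true_map)) / 50)
print("peak with guessed phases at x =", guessed_map.index(max(guessed_map)) / 50)
```

Identical amplitudes with different phases give different maps, so measured intensities alone cannot define the density.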

Fourier Transformation

The Fourier mathematical operation sums the contributions of several simple functions with different frequencies, amplitudes and phases (on left) to make a complicated function (on right). Simple functions could be used to describe each reflection in a diffraction pattern or each electron position in a protein crystal. Complicated functions generated after transforming a set of reflections or set of electrons could be a complete electron density map or a complete diffraction pattern respectively. To get sufficient signal from a crystal, Bragg’s law must be obeyed, which is only satisfied for certain diffraction events that limits the number of reflections to be Fourier transformed after a diffraction experiment. In this example, all waves are in phase but most waves representing reflections are usually out of phase with each other, meaning they would not all start at the same point on the curve and their phases would need to be estimated in order to solve the phase problem. NMR also uses the Fourier transformation to convert a complicated FID generated from multiple atoms into a series of simple functions with different frequencies and amplitudes.

As the phase information of the reflections cannot be measured, we do not have the complete mathematical description of these functions and therefore we cannot apply a Fourier Transform to them to get to the electron density. This is known as the phase problem. One way to solve this is called Molecular Replacement, where we use a model protein structure (previously determined) that we assume is very similar to the unknown structure for which we have a set of measured diffraction intensities. A computer program tries all possible positions and orientations of this model in the unknown crystal to find a match between the measured diffraction pattern and the pattern predicted for each model orientation being tested (calculated by a Fourier Transformation). When the correct orientation has been found, we assume the model has crystallised in the unknown crystal with this orientation and borrow the phase values generated from this model for the unknown phases. With these phases solved by molecular replacement and the experimentally observed amplitudes, an initial map of the unknown structure is now possible by Fourier transformation.

Using features of the initial electron density map, we start to build an atomic model by threading the known polypeptide chain sequence into the density. Once we have built an initial model, we can calculate all the theoretical reflections and phases that would result from the model structure being in the crystal. This is done in a computer by applying a Fourier transform to a series of simple functions that represent the electron positions in that crystal. We then combine these new theoretical phases with the experimental intensities and apply another Fourier transform to create a new and improved electron density map. The phases improve because the original phases were only an estimate from a similar structure (using molecular replacement), whereas now we can build into the map, define solvent molecules and bound ligands, and avoid building into density that appears to be noise. Thus, applying a Fourier Transform to this model gives a new set of wave functions with more accurate phases than the starting set of reflections. This cycle of model building, calculating new phases, creating an improved electron density map and making an improved model is called refinement, and it continues until no further improvements can be made. At this stage of the refinement, different electron density maps are calculated that allow crystallographers to see any mistakes in the model that are not consistent with the experimental reflections, locate missing atoms and make adjustments to the final structure. In Figure 16, where the final electron density map has been calculated, there is clear evidence of the peptide backbone and side chains, and the primary amino acid sequence is easily fitted into this map.

An electron density map

An electron density map can be calculated using the information from the intensities of experimental reflections combined with the best possible phases. A model (shown as sticks and balls) can be built into this electron density (sticks). The post-refinement electron density shown is from the human synaptotagmin 1 C2B domain.

Once the structure has been fully refined, the x, y, z coordinates of each of the atoms within it are shared with other scientists by depositing them into a global database called the Protein Data Bank (PDB). X-ray crystallography finds the structure of proteins fixed in the crystal lattices. The crystals can contain 20–80% solvent, and protein molecules are generally observed to be in the active state as has been demonstrated for many enzymes. However, structures tend to represent a snapshot of the protein, like taking a photo of an object, and some dynamic information is missing. We will next review the process of studying the structure of proteins using Nuclear Magnetic Resonance (NMR), which studies proteins completely free in solution.

NMR can reveal the structure and dynamics of biomolecules in solution, which is how they exist inside cells. In fact, NMR has been used to directly study proteins in the cell, even if they are unfolded. We will see that NMR can be used to reveal all the atomic positions within proteins and how these move and change in real-time when interacting with other molecules such as other proteins or drugs. Such information can tell the structural biologist how a protein can exert its function inside the cell. With this information, we can better understand how proteins lose function in disease, how to engineer them to be more effective or how to design drugs to alter their behaviour.

NMR is based on the observation that certain atomic nuclei have a property called a spin magnetic moment. A common analogy is that each nucleus behaves as if it were a tiny bar magnet pointing in a particular direction. Protein NMR relies on magnetically active nuclei with a spin value of ½, for example ¹H, ¹⁵N and ¹³C. A typical protein sample at millimolar concentrations will contain 10¹⁷ molecules and therefore 10¹⁷ copies of each given atom (Figure 17). The ‘bar magnets’ that represent the many identical copies of these atoms point in all directions and, on average, cancel out, leaving no net magnetic moment (dipole). However, something special happens when this sample is placed in a superconducting NMR magnet, as the spin ½ nuclei are now exposed to a very strong external magnetic field (B₀). After a short time (a few seconds), equilibrium is reached and the bar magnets rotate (or precess) around the external field, all at the same resonance (or Larmor) frequency, although at a range of different angles, like a large collection of spinning tops tilted to various degrees from the ground (second column in Figure 17). However, instead of cancelling out, slightly more of them are now oriented parallel to the external magnetic field (B₀), because this orientation has the lowest energy. This slight bias within the group produces, on average, a small net upward magnetic moment, which we call bulk magnetisation. At equilibrium this magnetisation is stationary, pointing along the z-axis; no net magnetisation is present in the x–y plane because the spins are out of phase and do not precess together in synchrony.
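The Larmor frequency mentioned above scales with the strength of the external field. The text does not give the formula, so the sketch below assumes the standard relation ν = (γ / 2π) × B₀, with approximate textbook values of γ / 2π for the three nuclei named above (the value for ¹⁵N is given as a magnitude, since its gyromagnetic ratio is negative). The field strength of 14.1 tesla is an illustrative choice.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# Standard reference values added here as assumptions; the 15N value is a magnitude.
GAMMA_OVER_2PI_MHZ_PER_T = {
    "1H": 42.577,
    "13C": 10.708,
    "15N": 4.316,
}


def larmor_frequency_mhz(nucleus, field_tesla):
    """Larmor (resonance) frequency in MHz for a nucleus in a field B0."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * field_tesla


# A 14.1 T magnet is what is usually described as a "600 MHz" spectrometer,
# because that is roughly the 1H resonance frequency at this field.
for nucleus in ("1H", "13C", "15N"):
    print(f"{nucleus}: {larmor_frequency_mhz(nucleus, 14.1):.1f} MHz")
```

Different types of nuclei resonate at very different frequencies in the same magnet, which is why each can be excited and observed separately.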

How bulk magnetisation is generated and manipulated for multiple copies of the same atom

A given atom in a protein is represented as many vectors (with different directions) as there will be many copies in the sample (bottom panel). The individual vectors average (or sum) to generate a bulk magnetisation vector (thick black line) with properties that represent all of these identical atoms (top panel). Before an external magnetic field (B0) is applied, individual vectors point in all directions and no bulk magnetisation vector is present (left). However, after a B0 field is applied (grey arrow in bottom panel) the sample generates a net magnetisation along the magnetic field direction (the z-axis) which can be represented by a bulk magnetisation vector (thick black arrow in top panel). When a short RF-Pulse (along the x-axis) has been applied, the bulk magnetisation is nudged into the x–y plane and immediately afterwards starts to rotate about the z-axis in a corkscrew motion at its Larmor frequency (chemical shift) as it returns back to its equilibrium position. The x-component of the rotating bulk magnetisation following the pulse is measured by the spectrometer’s coil as a decaying oscillating electric field called an FID. The RF-pulse is effective as it generates a short-lived oscillating B1 magnetic field in the coil, along the x-axis, which is at the same Larmor frequency of the nuclei under study, allowing it to rotate magnetisation toward the x–y plane. This is similar to effectively pushing a child on a swing, one constant push (constant B1) is not as effective as pushing with the natural frequency of the swing (oscillating B1). This vector model only really applies to spins that are not ‘coupled’ to another spin and for a deeper understanding of NMR, we would need to consider the subatomic quantum realm, where conventional/familiar, classical physics, does not apply and is beyond the scope of this text.

When a very short, yet carefully chosen length, radiofrequency (RF) electrical pulse is applied through a wire coil close to the sample but wrapping around the x-axis, it generates a weak varying magnetic field (B 1 ) that is perpendicular to B 0 . This transverse RF magnetic field tips the bulk magnetisation away from the vertical axis exactly into the x–y plane. This happens as the RF-pulse is oscillating at the same Larmor frequency which ‘excites’ the spins, causing them to precess together in synchrony. Following the pulse, spins start to precess out of phase again, and the bulk magnetisation returns to align with the z-axis in a corkscrew motion, precessing around this z-axis at its distinctive resonance (Larmor) frequency ( Figure 17 ). The x-component of the rotating bulk magnetic field that is generated immediately after the pulse causes a simple oscillating electrical current with exponentially decaying amplitudes that is recorded as a time-dependent free-induction decay (FID) in the same coil that generated the pulse. The key to this experiment is measuring the oscillating signal away (perpendicular) from the B 0 field, which is much stronger and would mask this signal if you tried to measure along the z-axis. For clarity, Figure 17 only shows multiple copies of the same atom, however a protein contains many different atoms and so the FID we record is a mixture of different oscillations at different frequencies. Fourier transformation of this complicated FID by a computer generates a frequency-dependent spectrum consisting of signals separated by the Larmor frequencies of the atoms in the molecule. The number of signals is equal to the number of magnetically different atoms in the molecule. The position of signals is called the chemical shift and is measured in ppm (parts per million) units relative to the frequency of a standard chemical included in the sample. 
Using the ppm scale instead of a Larmor frequency scale makes spectra independent of the B₀ magnetic field used for a given NMR spectrometer. For a protein, when using an RF-pulse designed to only perturb hydrogens, there could be as many as 1000 ¹H nuclei within the combined amino acids.
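The field independence of the ppm scale can be shown with a few lines of arithmetic. This is a minimal sketch, not a description of any spectrometer software: the shielding values for the reference compound and for the amide hydrogen are invented for illustration, and the function names are hypothetical.

```python
GAMMA_1H_HZ_PER_T = 42.577e6  # 1H gyromagnetic ratio divided by 2*pi, in Hz per tesla


def larmor_hz(b0_tesla, shielding):
    """Larmor frequency of a 1H nucleus whose electrons screen a fraction of B0."""
    return GAMMA_1H_HZ_PER_T * b0_tesla * (1.0 - shielding)


def chemical_shift_ppm(freq_hz, ref_hz):
    """Chemical shift: frequency offset from the reference, in parts per million."""
    return (freq_hz - ref_hz) / ref_hz * 1e6


# Hypothetical shielding constants (illustrative values only).
SIGMA_REFERENCE = 30e-6  # well-shielded reference compound
SIGMA_AMIDE = 22e-6      # less-shielded amide hydrogen

shifts = {}
offsets_hz = {}
for b0 in (14.1, 18.8):  # two assumed magnet strengths, in tesla
    ref = larmor_hz(b0, SIGMA_REFERENCE)
    amide = larmor_hz(b0, SIGMA_AMIDE)
    offsets_hz[b0] = amide - ref
    shifts[b0] = chemical_shift_ppm(amide, ref)
    print(f"B0 = {b0} T: offset = {offsets_hz[b0]:.0f} Hz, shift = {shifts[b0]:.3f} ppm")
```

The offset in Hz grows with the magnet strength, but the shift in ppm stays the same, so spectra recorded on different spectrometers can be compared directly.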

Each ¹H atom in a protein is surrounded by a unique chemical environment from electrons in nearby atoms in the biomolecule that leads to a slightly different Larmor frequency compared with other ¹H atoms. These nearby electrons have the effect of shielding the nuclei from the full strength of the external B₀ magnetic field, which in turn affects its rate of precession (Larmor frequency). For example, if electrons are pulled away from a hydrogen, the Larmor frequency for that hydrogen is shifted downfield as the atom is less shielded, which causes the chemical shift to increase. This is seen for amide hydrogens which are attached to nitrogen, an electronegative atom (Figure 18).
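Because each hydrogen precesses at its own frequency, the recorded FID is a sum of decaying oscillations that a Fourier transform separates into peaks. The sketch below simulates this for two nuclei. All values (frequency offsets, amplitudes, decay time, sweep width) are assumptions chosen for illustration, and a plain discrete Fourier transform is used for clarity where real processing software would use a fast Fourier transform.

```python
import cmath
import math


def simulate_fid(freqs_hz, amplitudes, t2_s, sweep_width_hz, n_points):
    """Sum of exponentially decaying complex oscillations, one per nucleus."""
    dt = 1.0 / sweep_width_hz  # time between sampled points
    fid = []
    for n in range(n_points):
        t = n * dt
        oscillation = sum(a * cmath.exp(2j * math.pi * f * t)
                          for f, a in zip(freqs_hz, amplitudes))
        fid.append(oscillation * math.exp(-t / t2_s))
    return fid


def dft(signal):
    """Naive discrete Fourier transform (O(N^2)), adequate for a short example."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n)
                for m, x in enumerate(signal))
            for k in range(n)]


def peak_frequencies(spectrum, sweep_width_hz, n_peaks):
    """Frequencies (Hz) of the n_peaks strongest local maxima in the spectrum."""
    n = len(spectrum)
    mags = [abs(v) for v in spectrum]
    maxima = [k for k in range(n)
              if mags[k] > mags[k - 1] and mags[k] > mags[(k + 1) % n]]
    strongest = sorted(maxima, key=lambda k: mags[k], reverse=True)[:n_peaks]
    # Bins in the upper half of the spectrum represent negative frequencies.
    return sorted((k if k < n // 2 else k - n) * sweep_width_hz / n
                  for k in strongest)


# Two hypothetical hydrogens precessing 100 Hz and 240 Hz from the carrier.
fid = simulate_fid([100.0, 240.0], [1.0, 0.6], t2_s=0.05,
                   sweep_width_hz=1000.0, n_points=250)
peaks = peak_frequencies(dft(fid), sweep_width_hz=1000.0, n_peaks=2)
print(peaks)
```

The two input frequencies are recovered from the mixed signal. A protein spectrum works the same way, only with hundreds of overlapping components.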

An 1 H FID for a protein and its Fourier Transform

The FID on the left is the sum of FIDs for each different Hydrogen nucleus in the protein. Fourier transformation of this FID creates a set of component frequencies (seen as a peak for each individual FID). Conversion of Larmor frequency (Hz) into chemical shift (ppm) as seen in the 1D 1H NMR spectrum of a protein allows for values to be independent of the magnet strength used. Each peak represents the hydrogen atoms connected to different carbons or nitrogens in the protein. The chemical shifts are different because the 1H nuclei all experience slightly different magnetic environments based on their chemical group and position in the protein and thus their bulk magnetisation vectors rotate at slightly different frequencies. Hydrogens found in common chemical groups (in amides, aromatics, aliphatics, methyl etc.) are indicated above the spectrum. The well-dispersed peaks between 6 and 10 ppm in the backbone amide region indicate that the protein is well folded. It is common to make a higher dimensional spectrum such as the 2D spectrum that plots the chemical shift values for pairs of atoms connected by a covalent bond to better resolve the overlapping signals. Abbreviation: 2D, two dimensional.

Although studying one nucleus from one atom in a protein can be informative, in order to study all of them, we must know which chemical shift belongs to which atom in the protein. This requires a series of experiments that measure multiple different types of magnetically active nuclei (¹H, ¹³C, ¹⁵N) on recombinant proteins that have incorporated ¹³C and ¹⁵N isotopes. With these labelled proteins, we can determine how atoms are connected together using experiments that are designed to transfer bulk magnetisation from one atom to another through ‘coupled’ or connecting bonds in the protein, ultimately telling us the chemical shift of every atom.

A simple H-N two dimensional (2D) spectrum can be recorded on a recombinant protein that has incorporated the relatively inexpensive 15 N isotope. After our assignment experiments described above, we can label each peak in this spectrum as an amino acid according to the chemical shift value for its backbone amide nitrogen and hydrogen. This is very useful as it provides a unique “fingerprint” identification of the protein in an experiment that takes less than 30 min to run ( Figure 19 ). The simple H-N 2D spectrum is incredibly powerful, as it is an excellent check on the condition of a protein, before embarking on lengthy experiments (such as structure determination). It can tell you if the protein is folded, by checking if the peaks are well dispersed in the spectrum and not simply concentrated in the middle of the spectrum (between 7.8 and 8.8 ppm), which would indicate an unfolded or intrinsically disordered protein. It can tell you if the protein is aggregated by examining the shape of the peaks if they are spread out and broadened then that could indicate some form of self-association. It can also indicate if parts of your protein are dynamic as usually these peaks are missing in the spectrum. Crucially, H-N 2D spectra are frequently used to look at interactions with other proteins, ligands or drugs. Binding partners often are unlabelled (contain the natural 12 C and 14 N isotopes) to ensure they will not contribute to the spectra. However, when they are added to a labelled protein, we can quickly tell which of its amino acids are involved in binding, as these peaks will shift due to the new environment created by the binding partner. This allows us to map the binding surface on to the protein and estimate the strength of binding by titrating the binding partner into the protein and recording a series of 2D spectra to follow the peak positions.
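Peak movements in a binding experiment are often summarised as a single combined shift change per amino acid. The sketch below shows one common way of doing this. It is an assumption rather than a method given in the text: the nitrogen scaling factor of 0.14, the 0.05 ppm cut-off, the residue labels and all peak positions are illustrative, and the function names are hypothetical.

```python
import math


def combined_shift_change(free, bound, n_scale=0.14):
    """Combined 1H/15N shift change in ppm for one (H ppm, N ppm) peak pair.

    The 15N change is scaled down because nitrogen shifts span a much
    wider ppm range than hydrogen shifts (n_scale is an assumed weight).
    """
    d_h = bound[0] - free[0]
    d_n = bound[1] - free[1]
    return math.sqrt(d_h ** 2 + (n_scale * d_n) ** 2)


def shifted_residues(free_peaks, bound_peaks, threshold_ppm=0.05):
    """Residues whose peaks move by more than the threshold on binding."""
    hits = {}
    for residue, free in free_peaks.items():
        if residue not in bound_peaks:
            continue  # peak missing after binding; handled separately in practice
        change = combined_shift_change(free, bound_peaks[residue])
        if change > threshold_ppm:
            hits[residue] = change
    return hits


# Hypothetical assigned peaks: residue -> (1H shift, 15N shift) in ppm.
free_peaks = {"G12": (8.31, 109.2), "L45": (7.95, 121.4), "K63": (8.60, 118.0)}
bound_peaks = {"G12": (8.31, 109.2), "L45": (8.10, 122.4), "K63": (8.61, 118.1)}

hits = shifted_residues(free_peaks, bound_peaks)
for residue, change in hits.items():
    print(f"{residue}: {change:.3f} ppm")
```

Residues flagged this way can then be highlighted on the structure to outline a likely binding surface. Repeating the calculation across a titration series gives the curve from which binding strength is estimated.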

1 H 15 N-HSQC of a small protein domain

Each numbered peak in this 2D spectrum represents an amino acid in a simple protein domain through its backbone (or sidechain) amide group. An amide group has one nitrogen and one hydrogen and given each amino acid is in a slightly different chemical environment based on how the protein has folded and which sidechain it contains, the chemical shift values for each N and H pair are different for each amino acid. This creates a unique “fingerprint” identification for every protein.

To gain the full 3D structure of a protein, we need to assign all atoms to chemical shift values. This information is then used to determine which atoms are close together in space (not through bonds) through a variation of NMR experiment called Nuclear Overhauser Effect SpectroscopY (NOESY) experiment. In this experiment, the transfer of magnetisation from one hydrogen atom to another nearby hydrogen atom in 3D space is recorded. The size or strength of the bulk magnetisation vector after the transfer has occurred in a NOESY experiment tells us how close that atom was to the nearby atom. After identifying all possible Nuclear Overhauser Effects (NOEs) for the protein, we produce a series of atom–atom distances that connect the polypeptide to itself and help define its fold. We use a computer to find the fold that is consistent with all of these measured distances by doing a series of molecular dynamics simulations which is repeated approximately 100 times. With enough NOE restraints, most of the simulations will ‘converge’ on an ensemble of equivalent and similar low energy structures that all ‘fit’ with the distance restraints used.
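The link between NOE strength and distance can be sketched with the usual isolated spin-pair approximation, in which the NOE intensity falls off as the inverse sixth power of the distance between the two hydrogens. This relationship is not stated in the text and is introduced here as an assumption. The 2.5 Å reference distance and the intensities are illustrative values, and the function name is hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance_angstrom):
    """Estimate an H-H distance from an NOE intensity.

    Assumes intensity is proportional to r**-6 and calibrates against a
    reference pair of hydrogens whose separation is already known.
    """
    if intensity <= 0:
        raise ValueError("NOE intensity must be positive")
    return ref_distance_angstrom * (ref_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical calibration: a pair 2.5 angstroms apart gives intensity 1.0.
REF_INTENSITY = 1.0
REF_DISTANCE = 2.5

for intensity in (1.0, 0.25, 1.0 / 64.0):
    r = noe_distance(intensity, REF_INTENSITY, REF_DISTANCE)
    print(f"intensity {intensity:.4f} -> about {r:.2f} angstroms")
```

Because of the sixth-power dependence, a 64-fold weaker signal corresponds to only a doubling of the distance, which is why NOEs are detectable only between hydrogens that are a few ångströms apart. In practice, the estimates are usually applied as upper-limit restraints rather than exact distances.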

Protein NMR spectroscopy is powerful as once an ensemble of structures is determined, further experiments are performed that detail the dynamics of each atom and the bonds they form. NMR thus gives information about how the protein moves in solution and combined with additional molecular dynamics techniques ( Table 3 ), it is possible to estimate its conformational ensemble. NMR dynamics experiments can also be performed on assigned proteins which do not have an NMR structure determined, and the results simply mapped on to a model determined previously making the process quicker. As such, NMR can quickly provide a wealth of information as soon as protein has been purified. NMR is an essential tool as protein motions are central to function inside the cell as we saw when we considered how proteins fold and their associated dynamics.

One of the drawbacks of X-ray crystallography is the need for a crystal to produce the diffraction patterns and a drawback of NMR is there is a limitation on the size of the protein that can be studied. In the 1970s, Nigel Unwin was trying to determine the shape of a protein called bacteriorhodopsin. Unable to produce a crystal of the molecule, electron microscopy (EM) was used to gain the structural outline of this protein, demonstrating how it can move protons across a membrane. Improvements in the methodology enabled Richard Henderson in 1990 to determine the first atomic-resolution images of bacteriorhodopsin using newly developed cryo-EM methods. The development of cryo-EM by Jacques Dubochet, Joachim Frank and Richard Henderson led to the Nobel Prize in 2017. It opened the door to the structural determination and functional understanding of very large complex protein structures without the need for crystallisation. Such is the progress and quality of cryo-EM images that the images now rival those of X-ray crystallography, with all the additional advantages for easier sample preparation.

Transmission EM (TEM) operates on the same basic principles as a light microscope but uses a beam of electrons to examine the structures of cells and tissues. The electron beam has a wavelength of approximately 10 −10 m (about the size of an atom) meaning TEM can reveal the internal structure of cells that cannot be seen by the longer wavelength used in light microscopy. The incoming and outgoing lenses of a light microscope are replaced by a series of coil-shaped electromagnetic lenses through which the electron beam travels to produce magnified images. Only parts of the beam are transmitted through the sample depending on their thickness and electron transparency. A final lens then refocusses this and projects an image of the sample onto a camera detector. To help improve the contrast of the very thin samples, heavy metal stains are often used to bind the proteins and stop the transmission of the electrons. The image then shows regions of the specimen where the electron transmission has been prevented. Biological molecules such as individual proteins and complexes are not compatible with the high vacuum needed for TEM as the high energy electrons burn the protein and evaporate the water that surrounds them.

Cryo-EM uses the same principle as TEM but cools the samples to cryogenic temperatures and embeds them in an environment of vitreous ice, allowing proteins and protein complexes to be studied. To do this, an aqueous protein sample solution is applied to a grid-mesh and plunge-frozen in liquid ethane. The process is so quick that the water molecules do not have time to arrange into a crystalline lattice. In this ‘vitrified’ sample, the water is disordered but the 3D structure of the biomolecules in the sample is retained. Stains are not needed here as the surrounding buffer allows for enough contrast to observe the specimen; instead, multiple images are taken to improve contrast. Randomly orientated proteins are struck by the electron beam, producing a faint image on the detector. A computer then decides what is a faint molecular image of the proteins and what is the background. Similar images are then grouped together. Thousands of similar images are averaged by the computer to generate high signal-to-noise 2D images (Figure 20) that are used to clean up the dataset by removing contamination and other junk particles. Software then calculates how all the good molecular images relate to each other and generates a high-resolution 3D image or density map. The amino acid chain is then threaded into this map in a similar process to X-ray crystallography. Cryo-EM offers a significant advantage in that, through the direct acquisition of the images, the specimen can be statistically analysed, allowing the structural information to be reconstructed and different conformations to be determined in the same sample. It is also possible to control the chemical environment, which in turn allows for effective examination of different functional states of different types of molecules. The final major advantage of cryo-EM is that large intact complexes can be studied, allowing the 3D structures of ribosomes, proteins and viruses to be determined almost to the atomic scale.
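The benefit of averaging thousands of faint images can be illustrated with a toy simulation. This is a sketch under strong simplifying assumptions: each ‘image’ is a one-dimensional row of pixels, every copy is already perfectly aligned, and the noise is Gaussian. Real cryo-EM software must also find, align and classify the particles, which is not shown here. All names and values are hypothetical.

```python
import random
import statistics


def noisy_copy(signal, noise_sd, rng):
    """One simulated exposure: the true signal buried in Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in signal]


def average_images(images):
    """Pixel-by-pixel mean of a stack of aligned images."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]


def residual_noise(image, signal):
    """Standard deviation of the difference between an image and the truth."""
    return statistics.pstdev([a - b for a, b in zip(image, signal)])


rng = random.Random(0)  # fixed seed so the example is repeatable
# Toy 'particle': a bright block of 16 pixels on a dark background.
signal = [1.0 if 24 <= i < 40 else 0.0 for i in range(64)]

single = noisy_copy(signal, noise_sd=5.0, rng=rng)
stack = [noisy_copy(signal, noise_sd=5.0, rng=rng) for _ in range(400)]
averaged = average_images(stack)

noise_single = residual_noise(single, signal)
noise_averaged = residual_noise(averaged, signal)
print(f"noise in one image:      {noise_single:.2f}")
print(f"noise after 400 images:  {noise_averaged:.2f}")
```

For independent noise, the residual noise falls roughly with the square root of the number of images averaged, so 400 images give about a 20-fold improvement in this toy case.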

Cryo-EM process

Image processing outline illustrated with data from the small pore-forming toxin lysenin. To capture the initial images, protein samples are transferred onto a copper mesh grid coated with a perforated carbon film. The sample is then flash frozen in ethane at −190°C, causing the water to vitrify and capturing the proteins in random orientations within the holes of the carbon film. A beam of electrons is then used to capture a faint trace image of the protein. The computer determines what is protein and what is background. Similar images of the protein in the same orientation are placed into groups. Using thousands of similar images of the protein, the computer generates a high-resolution 2D image by averaging all the faint images. A 3D image is then calculated by working out how the 2D images relate to each other, producing an electron density map from which the structure is then determined. Image from Savva (2019) A beginner’s guide to cryogenic electron microscopy. Biochemist 41, 46–52.

Cryo-EM of proteins and their complexes promises to revolutionise structural biology, as many life processes depend on large dynamic macromolecular assemblies. However, like all the methods described here, it has some nuances. For example, it can be difficult to prepare a grid that has a well-represented number of orientations, as sometimes the proteins will preferentially align towards the hydrophobic air–water interface. On occasions the proteins will denature, and screening multiple grids with different conditions can be expensive. Nevertheless, datasets sufficient for high-resolution structures can be recorded in just a few hours or overnight, the amount of protein required is much less than for X-ray crystallography or NMR, and the samples do not have to be as pure, all of which helps balance the cost of this incredibly powerful technique.

Other methods

There are a vast range of other methods that are also used to study protein structure and their interactions, many of which can be performed in just one day and yield complementary information to the techniques mentioned above. Table 3 gives an overview of some of the more common methods and their applications.

The structural organisation of proteins and their range of shapes and conformations.

What influences the thermodynamics of protein folding.

How proteins’ ability to change their shape enables them to bind new partners as well as potentially misfold and aggregate, bringing about disease.

How to experimentally determine protein structures and their interactions at the molecular level.

We hope this will inspire readers to view some of the suggested resources which provide more detail on uncovering protein structure.

CD data for an immunoglobulin was simulated from the pdb code: 1igt using the PDB2CD [Mavridis and Janes (2017) PDB2CD: a web-based application for the generation of circular dichroism spectra from protein atomic coordinates. Bioinformatics 33 , 56–63]. CD Data for the soluble domain of micro-exon gene 14 protein (CD0004062000) and insulin (CD0000040000) were extracted from the PCDDBID [Whitmore, Miles, Mavridis, Janes and Wallace (2017) PCDDB: new developments at the Protein Circular Dichroism Data Bank. Nucleic Acids Res. 45 (D1), D303–D307] [Lopes, Orcia, Araujo, DeMarco and Wallace (2013) Biophys. J. 104 , 2512–2520]. CD figures were created by use of Szilágyi (2019): EMANIM: interactive visualisation of electromagnetic waves. Web application available at URL https://emanim.szialab.org . All other pdb codes for figures are indicated in their legends.

The authors declare that there are no competing interests associated with the manuscript.

E.J.S. and D.S., both contributed to writing the review and producing figures; however, E.J.S. took the lead in writing while D.S. took the lead in producing the figures.

We acknowledge Marie Phelan, Igor Barsukov, Ed Yates, Lia Ball, Christos Savva, Kathryn Garner, Svetlana Antonyuk and reviewers for critical comments on this work. We also acknowledge Bryan Sutton for providing the electron density figure.

Δ, change in a system
2D, two dimensional
3D, three-dimensional
B₀, magnetic field aligned to the z-axis
CD, circular dichroism
cryo-EM, cryogenic electron microscopy
DNA, deoxyribonucleic acid
EM, electron microscopy
FID, free-induction decay
G, Gibbs free energy
Hz, frequency in Hertz
IDP, intrinsically disordered protein
NMR, nuclear magnetic resonance
NOE, Nuclear Overhauser Effect
NOESY, Nuclear Overhauser Effect SpectroscopY
PDB, Protein Data Bank
ppm, parts per million
T, temperature
TEM, transmission EM
Tm, melting temperature at which 50% of a protein is unfolded
UV, ultraviolet

K12 LibreTexts

3.19: Protein Synthesis

How do you build a protein?

Your body needs proteins to create muscles, regulate chemical reactions, transport oxygen, and perform other important tasks in your body. But how are these proteins built? They are made up of units called amino acids. Just like there are only a few types of blocks in a set, there are a limited number of amino acids. But there are many different ways in which they can be combined.

Introduction to Protein Synthesis

A monomer is a molecule that can bind to other monomers to form a polymer. Amino acids are the monomers of a protein. The DNA sequence contains the instructions to place amino acids into a specific order.

When the amino acid monomers are assembled in that specific order, proteins are made, a process called protein synthesis. In short, DNA contains the instructions to create proteins. But DNA does not directly make the proteins. Proteins are made on the ribosomes in the cytoplasm, and DNA (in a eukaryotic cell) is in the nucleus. So the cell uses an RNA intermediate to produce proteins.

Each strand of DNA has many separate sequences that code for a specific protein. Insulin is an example of a protein made by your cells (Figure below). Units of DNA that contain code for the creation of a protein are called genes .

Amino acid sequence of insulin

Cells Can Turn Genes On or Off

There are about 22,000 genes in every human cell. Does every human cell have the same genes? Yes. Does every human cell make the same proteins? No. In a multicellular organism, such as us, cells have specific functions because they have different proteins. They have different proteins because different genes are expressed in different cell types (which is known as gene expression ).

Imagine that all of your genes are "turned off." Each cell type only "turns on" (or expresses) the genes that have the code for the proteins it needs to use. So different cell types "turn on" different genes, allowing different proteins to be made. This gives different cell types different functions.

Once a gene is expressed, the protein product of that gene is usually made. For this reason, gene expression and protein synthesis are often considered the same process.

  • DNA contains the instructions to assemble amino acids in a specific order to make protein.
  • Each cell type only "turns on" (or expresses) the genes that have the code for the proteins it needs to use.
  • Gene expression and protein synthesis are usually considered the same molecular process.
  • What is a gene?
  • What is an amino acid?
  • If every human cell has the same genes, how can they look and function so differently?
  • What is the relationship between DNA and proteins?

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.

Biochemistry, protein synthesis.

Jacob E. Hoerter; Steven R. Ellis.

Last Update: July 17, 2023.

  • Introduction

Our understanding of each of the biological sciences is heightened by the study of biochemistry and molecular biology. In the last few decades, advances in laboratory techniques for the study of these microscopic sciences have led us to a greater understanding of the central dogma of molecular biology: DNA is transcribed into RNA, which is then translated into protein. Understanding protein synthesis is paramount in studying various medical fields, from the molecular basis of genetic diseases through antibiotic development to expressing recombinant proteins as drugs or clinical laboratory reagents. As one of the foundational concepts in biology, protein synthesis is sufficiently complex that many believe it evolved once, giving the protein synthetic machinery in all organisms on the planet a common ancestry. Despite having certain underlying similarities in their mechanism, protein synthesis in the three major lines of descent (bacteria, archaea, and eukaryotes) has diverged to the point that substantive mechanistic differences have arisen. These differences have been exploited in nature as organisms produce compounds targeting the protein synthetic machinery of competitors as they vie for limited resources. Science has modified many of these compounds that target the machinery for protein synthesis in pathogenic microbes for use in the clinic as antibiotics. As our understanding of the mechanisms of protein synthesis continues to grow, there will likely be countless additional applications for this knowledge in medicine, research, and industry.

  • Fundamentals

Protein synthesis involves a complex interplay of many macromolecules.

  • The eukaryotic ribosome has two subunits: a 40S small subunit and a 60S large subunit. Together, the eukaryotic ribosome is 80S. There are several sites of functional significance, but the most important ones are the A (aminoacyl), P (peptidyl), and E (exit) sites. The eukaryotic ribosome is a ribonucleoprotein complex composed of 4 RNAs and 80 proteins. Many of the functions of the ribosome, including catalyzing peptide bond formation, are attributed to ribosomal RNA (rRNA) rather than ribosomal proteins, which instead play a primary role in subunit assembly. Ribosomes can be found either adherent to membranes of the endoplasmic reticulum or free within the cytoplasm. [1]
  • Bacterial ribosomes have two subunits, 30S and 50S, that join to form a 70S particle. In general, bacterial ribosomes are smaller than their eukaryotic counterparts, including fewer ribosomal proteins (55) and shorter rRNAs (3 in total). Certain regions of rRNA and some of the ribosomal proteins remain conserved between bacteria and eukaryotes. Other regions of rRNA and proteins are unique to either eukaryotes or bacteria and account in part for differences in mechanisms of protein synthesis discussed above.  
  • Eukaryotic cells contain a second type of ribosome found within the mitochondrion, which maintains a system of protein synthesis distinct from that found in the cytoplasm. Despite their presence in eukaryotic cells, the origins of the mitochondrial ribosome are traceable to bacteria, consistent with the endosymbiont theory of mitochondrial origins.  Care must be taken during antibiotic development to avoid targeting characteristics of the mitochondrial ribosome shared with bacterial ribosomes.
  • Messenger RNA (mRNA): the mRNA is another type of ribonucleic acid that functions to carry the coding section of a gene for protein synthesis. It contains portions of non-coding and coding sequences. The coding sequence groups nucleotides into codons, which are three specific nucleotides that correspond to a particular amino acid specified by the genetic code. [2]
  • Transfer RNA (tRNA): tRNAs are adaptors bridging the nucleotide sequence found in mRNAs to the amino acid sequence found in a growing protein. Transfer RNAs assume a cloverleaf-like secondary structure with an amino acid linked to its 3’ end through an ester linkage and a stretch of three nucleotides at the base of the cloverleaf referred to as the anticodon. The three bases of the anticodon base pair with complementary codon sequences in an mRNA during the process of protein synthesis.  This base-pairing interaction plays a critical role in the readout of the genetic code from mRNA to protein. There are 20 different aminoacyl-tRNA synthetases, one for each of the 20 common amino acids. Once an amino acid links to its cognate tRNA it is referred to as an aminoacyl tRNA, or “charged” tRNA. [3]
  • Genetic code: The genetic code consists of three-nucleotide sequences, originally encoded in an organism’s genome, that specify the individual amino acids found in proteins. There are 20 common amino acids used by the protein synthetic machinery and 64 possible three-base permutations of the four bases available to specify them.  Early studies revealed that the code is degenerate, with many of the amino acids specified by multiple 3-base combinations. In general, when multiple codons specify a single amino acid, the degeneracy is found at the third or “wobble” position. [1]   Sixty-one of the 64 permutations specify amino acids, whereas the remaining three serve as “stop” codons that terminate protein synthesis. While the code was initially thought to be the same for all living organisms, scientists now know that there are a small number of deviations from the universal code, found in mitochondria and specific bacterial species.
  • Genetic code and human disease: Appropriate readout of the genetic code is essential for human health. Mutations that alter protein-coding sequences can affect proteins in many different ways.  Mutations within a coding sequence can be classified as either synonymous or nonsynonymous, depending on whether they are predicted to alter the primary structure of a protein. 
  • Synonymous mutations relate to the degeneracy of the code and the fact that changes in base sequence may not have an effect on which amino acid a codon represents (though it should be noted that some synonymous mutations may affect pre-mRNA splicing and so influence a protein’s primary structure). Synonymous mutations typically fall in the third position of a codon.
  • Nonsynonymous mutations fall into three different classes:
  • Missense mutations, in which one amino acid is substituted for another.
  • Nonsense mutations, which introduce a premature termination codon in an mRNA sequence.  These mutations typically result in a truncated protein.
  • Frameshift mutations, which result from insertions or deletions (of a number of nucleotides not divisible by three) that shift the reading frame of a coding sequence such that the sequence downstream of the mutational event no longer codes for the correct amino acid sequence of a protein.
  • Protein factors: The process of protein synthesis requires multiple non-ribosomal proteins that transiently participate during the initiation, elongation, and termination phases of protein synthesis.  These factors are named for the phase in which they function (for example, eukaryotic initiation factor 2, eIF2). [2]
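The mutation classes described above can be illustrated with a small Python sketch. It is a teaching toy under stated assumptions, not a variant-calling method: the function names, the 12-nucleotide reference sequence, and the extra "in-frame indel" label are hypothetical, and only the standard genetic code is considered. It also confirms the count of 61 sense codons and 3 stop codons.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter amino acid symbols, listed in codon order
# with U, C, A, G cycling at each position (third base fastest); "*" is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(reference, mutant):
    """Assign a mutant coding sequence to one of the classes described in the text."""
    if len(mutant) != len(reference):
        # Insertions or deletions not divisible by three shift the reading frame.
        if (len(mutant) - len(reference)) % 3 != 0:
            return "frameshift"
        return "in-frame indel"  # hypothetical extra label, not a class from the text
    ref_protein, mut_protein = translate(reference), translate(mutant)
    if mut_protein == ref_protein:
        return "synonymous"
    if len(mut_protein) < len(ref_protein):
        return "nonsense"  # a premature stop codon truncated the protein
    return "missense"


# Hypothetical reference: Met-Glu-Trp followed by a stop codon.
REFERENCE = "AUGGAGUGGUAA"

sense = sum(1 for aa in CODON_TABLE.values() if aa != "*")
print(sense, len(CODON_TABLE) - sense)        # 61 3
print(classify(REFERENCE, "AUGGAAUGGUAA"))    # synonymous (GAG -> GAA, both Glu)
print(classify(REFERENCE, "AUGGUGUGGUAA"))    # missense   (GAG -> GUG, Glu -> Val)
print(classify(REFERENCE, "AUGGAGUGAUAA"))    # nonsense   (UGG -> UGA, Trp -> stop)
print(classify(REFERENCE, "AUGGGAGUGGUAA"))   # frameshift (one base inserted)
```

The synonymous example changes only the third ("wobble") base of the codon, which is where the text notes that degeneracy is usually found.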
  • Issues of Concern

The proteins that a cell expresses are the ultimate manifestation of its phenotype. Cells within tissues of the human body have variable phenotypic expression involved in defining tissue organization and function despite having identical genomes due to the differential expression of genes within the genome. While the differential regulation of gene expression primarily occurs at the level of transcription, regulation of gene expression can also take place at the post-transcriptional level, including regulated translation. Because of the importance of protein expression to the phenotypic properties of a cell, errors in the cellular proteome manifested at all levels of the correct readout of genetic information from gene to protein can have broad implications on health.

  • Cellular Level

The eukaryotic cell is compartmentalized, with different cellular compartments defined by biological membranes. The synthesis of components of the translational machinery begins with the transcription of mRNAs, tRNAs, and rRNAs in the nucleus by RNA polymerases II, III, and I, respectively. Transfer RNAs and the mRNAs encoding ribosomal proteins exit the nucleus, and the latter are translated in the cytoplasm. Ribosomal proteins then return to the nucleus, where they assemble hierarchically on rRNAs being transcribed by RNA polymerase I. This assembly process defines a compartment of the nucleus referred to as the nucleolus. Ribosome assembly is a complex process involving hundreds of accessory factors that transiently associate with ribosomal subunits during their maturation. While most of the steps involved in maturing ribosomal subunits occur within the nucleolus before the subunits exit through nuclear pores, the final steps in subunit maturation occur in the cytoplasm. Ribosomes translating most cellular mRNAs do so as free ribosomes in the cytoplasm. In contrast, ribosomes translating mRNAs that encode proteins destined for secretion from the cell, or resident proteins of the endoplasmic reticulum, Golgi apparatus, lysosome, or plasma membrane, are localized to the endoplasmic reticulum membrane. [4]

Briefly, translation can be broken down into three phases: initiation, elongation, and termination. Initiation consists of identifying the exact site in the nucleotide sequence of an mRNA at which to begin translation. This process differs significantly between eukaryotes (described here) and prokaryotes. Upon identification of the start site for translation, elongation ensues as the ribosome moves along the mRNA “reading” groups of three nucleotides that specify each amino acid added to the growing polypeptide chain. Finally, termination occurs when the ribosome encounters one of three termination codons, and the completed protein is released from the ribosome.

Translation begins with the assembly of an 80S initiation complex on mRNA. This process involves identifying the appropriate codon at which to initiate translation. The AUG codon specifies the amino acid methionine, and virtually all proteins specified by the genetic code begin with methionine. In eukaryotes, the AUG used to initiate protein synthesis is usually the first AUG downstream of the cap structure found at the 5’ end of the mRNA. A protein complex known as eIF4F recognizes the cap structure. The eIF4F complex then recruits the 43S pre-initiation complex, composed of a 40S subunit together with a ternary complex of the initiator tRNA (Met-tRNA), eIF2, and GTP, to the 5’ end of the mRNA. The 43S complex subsequently scans down the mRNA until it encounters the first AUG, at which point the 48S initiation complex forms. In addition to eIF4F and eIF2, multiple other initiation factors facilitate the formation of the 48S initiation complex. At this point, the 60S ribosomal subunit joins the 48S initiation complex, all initiation factors are released, and the elongation phase of translation is set to begin. In the 80S initiation complex, the initiator Met-tRNA is base-paired to the initiating AUG in the ribosomal P site, with the next codon of the mRNA positioned in the ribosomal A site. Translational re-initiation is facilitated by the interaction of the eIF4F complex with both the 5’ cap and the 3’ polyA tail of an mRNA. [5]

As with initiation, elongation requires the use of non-ribosomal proteins known as elongation factors. Eukaryotic EF1A (eEF1A) forms ternary complexes with aminoacyl-tRNAs and GTP.  These ternary complexes enter the empty A site of the ribosome, and if an appropriate codon-anticodon interaction forms between the incoming aminoacyl-tRNA and the codon in the A site, GTP is hydrolyzed and eEF1A is released. At this point, the peptidyl-transferase site of the ribosome catalyzes peptide bond formation as the free amino group of the incoming aminoacyl-tRNA attacks the ester bond linking the growing polypeptide to the tRNA in the ribosomal P site. The growing polypeptide chain previously in the P site is now elongated by one amino acid as it transfers to the aminoacyl-tRNA in the A site. The resultant uncharged tRNA occupying the P site moves to the E (exit) site and leaves the ribosome.  The peptidyl-tRNA in the A site is then translocated to the P site with the help of eEF2 and GTP. The A site is now empty, and the entire process repeats as the ribosome moves down the mRNA.

Termination occurs when eRF1, a release factor structurally analogous to tRNA, recognizes a termination codon in an mRNA and recruits eRF3 to hydrolyze the polypeptide chain from the tRNA occupying the P site. Termination of translation is completed by the dissociation of the ribosomal subunits, which are then capable of initiating another round of protein synthesis. Multiple ribosomes can translate a single mRNA simultaneously, forming complexes known as polysomes. [5] [6] [7]
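The three phases described above can be mirrored in a few lines of Python. This is only a sketch of the information flow, assuming the standard genetic code and a made-up mRNA; the function name `translate_mrna` and the example sequence are hypothetical. It models none of the protein factors, GTP hydrolysis, or ribosomal sites, and it uses the "first AUG" rule, which the text notes is the usual rather than universal case.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid symbols; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_mrna(mrna):
    """Return the polypeptide (one-letter code) read from an mRNA written 5' to 3'."""
    # Initiation: scan from the 5' end for the first AUG.
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    polypeptide = []
    # Elongation: read successive codons in the frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        # Termination: stop at the first in-frame stop codon.
        if amino_acid == "*":
            break
        polypeptide.append(amino_acid)
    return "".join(polypeptide)


# Hypothetical mRNA: 5' leader, then AUG GAG AAA UGG, a UAA stop, and a 3' tail.
example = "GGCACC" + "AUGGAGAAAUGG" + "UAA" + "GCU"
print(translate_mrna(example))  # MEKW
```

Note that the out-of-frame AUG inside `AAAUGG` is ignored once the reading frame has been set, which is the point of identifying the start site precisely.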

There are many possible methods of confirming that a particular protein is being synthesized.  

Immunostaining

Because of the large number of proteins synthesized in a typical cell, verifying the presence of a particular protein is understandably challenging. One way to confirm the presence of a specific protein in a clinical specimen is through immunostaining. This technique introduces an antibody to a protein of interest, and the exquisite specificity of the antibody serves for protein detection.

In immunostaining, the specimen is incubated with a primary antibody solution. This antibody can carry a fluorescent molecule on its heavy chain or an enzyme (such as horseradish peroxidase) that generates a detectable signal, such as light or a colored product, in the presence of a suitable substrate. The signal can be visualized under a microscope, or emitted light can be captured on photosensitive film in a dark room for later development. Immunostaining can be either direct, where the primary antibody itself carries the means of detection, or indirect, where a labeled secondary antibody raised against the primary antibody provides the detectable signal. [8]

Protein Electrophoresis

As with nucleic acids, proteins can be separated based on size and/or charge using gel electrophoresis. Proteins can be run in their native configurations or denatured before electrophoresis. In denaturing electrophoresis, a detergent such as sodium dodecyl sulfate (SDS) is used to disrupt non-covalent bonding forces within proteins. SDS also gives proteins a common charge-to-mass ratio, so the only effect separating proteins during SDS-polyacrylamide gel electrophoresis is the molecular sieving action of the polyacrylamide gel. Proteins separated in this manner can be detected either non-specifically with dyes like Coomassie blue or specifically using antibodies in a procedure referred to as Western blotting or immunoblotting. 

  • Pathophysiology

Many human diseases result from changes in protein sequence caused by mutations that alter the correct readout of genetic information from gene to a functional protein. Defects in the protein synthetic machinery also cause a small but growing number of human diseases.  Examples of such pathologies follow.

Sickle Cell Anemia 

Human hemoglobin contains two alpha and two beta chains to create a heterotetramer. In Sickle Cell Anemia, the sixth codon of the beta chain contains a missense mutation, in which glutamic acid, a charged amino acid, is replaced with valine, a neutral amino acid. This single amino acid difference affects the tertiary and quaternary structures of hemoglobin such that it distorts the biconcave shape of erythrocytes into sickle shapes in certain conditions. [9]
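As a small illustration of how a single base change produces this substitution, the Python sketch below compares the two codons. The specific codons used here (GAG for glutamic acid and GUG for valine, written as mRNA) are the commonly cited ones for the sickle allele and are an assumption added for illustration; the text above states only the amino acid change. The helper name `point_differences` is hypothetical.

```python
# Only the two codons relevant to this example are listed.
CODON_TO_AMINO_ACID = {"GAG": "glutamic acid", "GUG": "valine"}


def point_differences(codon_a, codon_b):
    """List (position, base_in_a, base_in_b) for each mismatching position (1-based)."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(codon_a, codon_b)) if x != y]


normal_codon, sickle_codon = "GAG", "GUG"
print(point_differences(normal_codon, sickle_codon))  # [(2, 'A', 'U')]
print(CODON_TO_AMINO_ACID[normal_codon], "->", CODON_TO_AMINO_ACID[sickle_codon])
```

Because the change falls in the second position of the codon rather than the wobble position, it alters the encoded amino acid and is therefore a missense mutation.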

Duchenne Muscular Dystrophy

Like many X-linked diseases, Duchenne muscular dystrophy (DMD) primarily affects males, with onset at an early age. It is characterized clinically by muscle weakness, calf pseudohypertrophy, and the Gowers sign in a child. One of the pathophysiologic origins of this disease is the formation of a premature stop codon in an early exon of the dystrophin gene, which leads to a truncated dystrophin protein that compromises the integrity of the sarcomere and the contractile function of the muscle. [10]

Diamond-Blackfan Anemia

While many human diseases result from mutations in the coding sequences of genes that affect protein production, Diamond-Blackfan anemia (DBA) is one of a growing number of conditions resulting from defects in the protein synthetic machinery. DBA is caused by autosomal dominant mutations in genes encoding proteins of either the 40S or 60S ribosomal subunit.  While the exact mechanisms underlying the pathophysiology of DBA are currently unknown, it seems likely that changes in cellular proteomes (the protein composition of a cell) resulting from suboptimal numbers of ribosomes contribute in part to the clinical features of the disease. These clinical features include a deficit in red blood cell production, small size, and a heterogeneous number of congenital anomalies. [11]

  • Clinical Significance

The clinical significance of protein synthesis lies not only in human translation but also in the differences between human and bacterial translation. The bacterial ribosome (70S) has the same core components and many structurally similar sites compared to the eukaryotic ribosome (80S). However, translational differences between humans and bacteria create targets for antimicrobial drugs. These differences allow certain antibiotics to bind selectively to bacterial ribosomes at low concentrations, either inhibiting growth or killing the microbe. Several commonly prescribed antibiotics target specific components of the bacterial ribosome and mRNA. Aminoglycosides target the 30S small ribosomal subunit; specifically, this class binds to the rRNA segment active in the A site. The tetracyclines operate similarly by competing for the A site with charged aminoacyl tRNA. The macrolide antibiotics act on the 50S large ribosomal subunit. Their binding to the rRNA of the large subunit prevents the formation of the peptide bond and promotes the early expulsion of the tRNA in the P site. [12] [3]

The clinical manifestations of differences in protein synthesis can also be useful in diagnosis. Native protein electrophoresis can help identify hemoglobinopathies in newborn screenings. Similarly, serum protein electrophoresis can identify characteristic M protein spikes of monoclonal protein expression in multiple myeloma.

Disclosure: Jacob Hoerter declares no relevant financial relationships with ineligible companies.

Disclosure: Steven Ellis declares no relevant financial relationships with ineligible companies.

This book is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ), which permits others to distribute the work, provided that the article is not altered or used commercially. You are not required to obtain permission to distribute this article, provided that you credit the author and journal.

  • Cite this Page Hoerter JE, Ellis SR. Biochemistry, Protein Synthesis. [Updated 2023 Jul 17]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.

DNA REPLICATION AND PROTEIN SYNTHESIS ESSAY.

Easy to read essay on DNA REPLICATION AND PROTEIN SYNTHESIS

  • Uploaded on September 9, 2019
  • Number of pages 1
  • Written in 2019/2020
  • Professor(s) Unknown
  • Grade Unknown
  • replication
  • protein synthesis
  • Institution Public School
  • Course Life Science Grade 12 Essays

LIFE SCIENCES ESSAYS GRADE 10-12

France Chavangwane

This document provides a clear structure for writing Life Sciences essays. It was compiled from information available on the internet and is supplied free of charge, not for any business purpose, to help South African Life Sciences learners by gathering the important information in one place. It is meant only to give learners a simple, clear aid to essay writing, with a compilation of essays for Grades 10 to 12. Read the essays with understanding and never try to memorize them, as memorization is never part of learning. As a non-profit organization, we aim to create independent and innovative South African thinkers.

Sources

  1. I’solezwe lesiXhosa, 17 September 2015, page 11
  2. Life Sciences Academics (Facebook page), DR Marian Ross
  3. http://www.testtakingpa.com/study/
  4. South African Department of Basic Education exam question papers and memorandums, available from WWW.dbe.gov.za
  5. Mr. Chaple's Science Class Blog, http://chaplescienceclass.blogspot.com/2017/09/dnastructure.html
  6. Eastern Cape Department of Education, https://www.ecexams.co.za/ExaminationPapers.htm

Protein synthesis Grade 12 Life Sciences Notes with Activities Questions and Answers

On this page you will find Protein synthesis Grade 12 Life Sciences Notes under Nucleic acids, which includes revision activities with questions and answers to help grade 12 learners to prepare for their tests and exams.

Protein synthesis is the process by which proteins are made in each cell of an organism to form enzymes, hormones and new structures for cells.

There are two main processes involved in protein synthesis, namely transcription and translation.

Transcription (takes place in the nucleus)

  • The DNA double helix unwinds and the two strands separate.
  • One DNA strand acts as a template for forming mRNA.
  • Free RNA nucleotides pair with the complementary bases on the DNA template to form mRNA. This process is called transcription.
  • The mRNA leaves the nucleus through the nuclear pores. Translation then begins when the mRNA attaches to a ribosome in the cytoplasm.
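The base-pairing step in transcription can be sketched in Python as follows. This is a minimal illustration, assuming the template strand is written 3' to 5' so that the mRNA comes out 5' to 3'; the function name `transcribe` and the example strand are hypothetical.

```python
# Pairing rules between a DNA template base and the RNA base added opposite it.
# RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Build the mRNA complementary to a DNA template strand, base by base."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)


template = "TACCTCACC"       # hypothetical DNA template strand
print(transcribe(template))  # AUGGAGUGG
```

The presence of U instead of T in the output is the same feature used to tell mRNA from DNA in exam diagrams.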

Translation (takes place in the cytoplasm on the ribosome)

  • Each tRNA brings a specific amino acid to the ribosome, where its anticodon pairs with the complementary codon on the mRNA. This is called translation.
  • The amino acids are linked together by peptide bonds to form a particular protein.
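The matching of tRNA anticodons to mRNA codons can be sketched in Python. This is a simplified illustration: the names `anticodon_for`, `build_polypeptide`, and `TRNA_CARGO` are hypothetical, only three tRNAs are listed, and each anticodon is written base-for-base beneath its codon, as in typical Grade 12 diagrams.

```python
# RNA base-pairing rules used between a codon and its anticodon.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# A few tRNAs, keyed by anticodon, with the amino acid each one carries.
TRNA_CARGO = {"UAC": "Met", "CUC": "Glu", "ACC": "Trp"}


def anticodon_for(codon):
    """Return the anticodon whose bases pair with the given mRNA codon."""
    return "".join(PAIR[base] for base in codon)


def build_polypeptide(mrna):
    """Read the mRNA three bases at a time and collect the amino acids delivered."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        chain.append(TRNA_CARGO[anticodon_for(codon)])
    return chain


print(anticodon_for("AUG"))            # UAC
print(build_polypeptide("AUGGAGUGG"))  # ['Met', 'Glu', 'Trp']
```

A common exam error is to read the amino acid from the anticodon using a codon table; the table is always read with the mRNA codon.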

  • DNA – double-stranded; look for presence of thymine; found in nucleus only.
  • Nuclear membrane – has nuclear pores through which mRNA moves.
  • mRNA – single-stranded; look for presence of uracil; contains triplets of bases (codons); found in nucleus and cytoplasm.
  • Ribosome – usually has mRNA attached to it.
  • tRNA – contains a triplet of bases (anticodon); look for attached amino acid.

Question and Answers Activities:

Find short and long questions for Grade 12 Life Sciences, which will help you to prepare for the exams, tests, practical tasks, and assignments.

EXAM TIP: The structure of the DNA and RNA molecules is very important and is examined almost every year in Grade 12. Make sure that you know the labels of each component. Remember to label the diagram first and then move on to the questions.

COMMENTS

  1. Protein Synthesis: Understanding the Process and its Importance: [Essay

    Importance of Protein Synthesis. Protein synthesis plays a critical role in many diseases, including cancer, cystic fibrosis, and Alzheimer's disease. Understanding protein synthesis is essential in the development of new drugs and treatments for these diseases. Protein synthesis is used in biotechnology to produce recombinant proteins for ...

  2. 6.4: Protein Synthesis

    It begins with the sequence of amino acids that make up the protein. Instructions for making proteins with the correct sequence of amino acids are encoded in DNA. Figure 6.4.1: Transcription and translation (Protein synthesis) in a cell. DNA is found in chromosomes.

  3. Essay on Protein Synthesis

    Essay # 7. Energetics of Protein Synthesis: It has been roughly estimated that the standard free energy of the hydrolysis of a peptide bond is about -5.0 kcal. Protein synthesis is an energy-consuming process and, in E. coli, it consumes as much as 90% of the cellular energy.

  4. The growing toolbox for protein synthesis studies

    Abstract. Protein synthesis stands at the last stage of the central dogma of molecular biology, providing a final regulatory layer for gene expression. Reacting to environmental cues and internal signals, the translation machinery can quickly tune the translatome from a pre-existing pool of RNAs, before the transcriptome changes.

  5. The Steps of Protein Synthesis

    The process consists of two steps, transcription and translation. The first step involves synthesizing messenger RNA (mRNA), which then leaves the nucleus and travels into the cytoplasm where it attaches to a ribosome. At this point, the second step of translation begins where ...

  6. RNA and protein synthesis review (article)

    RNA (ribonucleic acid): Single-stranded nucleic acid that carries out the instructions coded in DNA.
    Central dogma of biology: The process by which the information in genes flows into proteins: DNA → RNA → protein.
    Polypeptide: A chain of amino acids.
    Codon.

  7. Role of proteins in the body

    Protein synthesis. A gene is a segment of a DNA molecule that contains the instructions needed to make a unique protein. All of our cells contain the same DNA molecules, but each cell uses a different combination of genes to build the particular proteins it needs to perform its specialised functions. Protein synthesis has 2 main stages.

  8. Uncovering protein structure

    Abstract. Structural biology is the study of the molecular arrangement and dynamics of biological macromolecules, particularly proteins. The resulting structures are then used to help explain how proteins function. This article gives the reader an insight into protein structure and the underlying chemistry and physics that is used to uncover protein structure. We start with the chemistry of ...

  9. 3.19: Protein Synthesis

    When the amino acid monomers are assembled in that specific order, proteins are made, a process called protein synthesis. In short, DNA contains the instructions to create proteins. But DNA does not directly make the proteins. Proteins are made on the ribosomes in the cytoplasm, and DNA (in a eukaryotic cell) is in the nucleus.

  10. Peptide synthesis at the origin of life

    Small-molecule organocatalysis might have driven the emergence of peptide biochemistry. The chemical origin of life is full of chicken-and-egg conundrums. Among these is the origin of protein synthesis. Nature's protein-based enzyme catalysts are built from the polymerization of amino acids, yet this process itself requires enzymes, adenosine ...

  11. Protein Synthesis and Translational Control: A Historical Perspective

    Abstract. Protein synthesis and its regulation are central to all known forms of life and impinge on biological arenas as varied as agriculture, biotechnology, and medicine. Otherwise known as translation and translational control, these processes have been investigated with increasing intensity since the middle of the 20th century, and in ...

  12. Biochemistry, Protein Synthesis

    Protein synthesis involves a complex interplay of many macromolecules. Ribosomes: The eukaryotic ribosome has two subunits: a 40S small subunit and a 60S large subunit. Together, the eukaryotic ribosome is 80S. There are several sites of functional significance, but the most important ones are the A (aminoacyl), P (peptidyl), and E (exit) sites.

  13. Protein Synthesis Essay

    Protein Synthesis is the process whereby DNA (deoxyribonucleic acid) codes for the production of essential proteins, such as enzymes and hormones. Proteins are long chains of molecules called amino acids. Different proteins are made by using different sequences and varying numbers of amino acids. The smallest protein consists of fifty amino ...

  14. Protein Synthesis ( Read )

    amino acid: monomer of protein.
    gene: piece of DNA that passes genetic information from parent to offspring.
    protein synthesis: process in which cells make proteins; includes transcription (DNA to mRNA) and translation (mRNA to protein).

  15. Proteins: an Essay on HOW LIFE FUNCTIONS AT THE CELLULAR LEVEL

    Legumes are a great source of high-quality protein: 20-45% of their protein is rich in the amino acid lysine. Peas and beans contain 17-20% high-quality protein while soybeans have

  16. PDF DNA STRUCTURE AND FUNCTION PROTEIN SYNTHESIS

    The Nucleus. The nucleus is the most conspicuous organelle in all eukaryotic cells. The nucleus stores all the genetic information in the genes of the chromosomes. It is the CEO of the cell, directing all the functions for life, and in addition prepares the cell for growth and replication. The nucleus.

  17. PDF RNA & Protein Synthesis

    RNA & PROTEIN SYNTHESIS 12 FEBRUARY 2014 Lesson Description In this lesson we: ... Summary Structure (Structure of RNA from Life Sciences for all, Grade 12, Figure 4.14, Page 193) Types mRNA tRNA ribosomal RNA . Protein Synthesis Test Yourself Select the most correct answer from the options given. ... process in a mini essay. (20) Links Learn ...

  18. PDF Life Science Grade 12 Essay Questions

    Life Sciences Grade 12 essay questions 2018 LIFE SCIENCES GRADE 12 COMPILATION OF POSSIBLE ESSAYS COMPILED BY: TLHAKO JOSIAS YEAR: 2018 ... PROTEIN SYNTHESIS (Paper 2) 09 MEIOSIS (both P1 & P2) 10 MEIOSIS AND DOWN SYNDROME (Paper 2) 11 REPRODUCTION IN VERTEBRATES AND THEIR

  19. DNA REPLICATION AND PROTEIN SYNTHESIS ESSAY.

    Grade 12 Life Science Essays. 1. Essay - Hearing and reflex action essay. 2. Essay - DNA replication and protein synthesis essay. 3. Essay - Accommodation, hearing and balance essay. 4. Essay - Eutrophication and acid mine pollution.

  20. (PDF) LIFE SCIENCES ESSAYS GRADE 10-12

    By the end of the course, the learner should be able to: 1. communicate biological information in a precise, clear and logical manner 2. develop an understanding of interrelationships between plants and animals and between humans and their environment 3. apply the knowledge gained to improve and maintain the health of the individual, family and the community 4. relate and apply relevant ...

  21. PROTEIN SYNTHESIS ESSAY...

    PROTEIN SYNTHESIS ESSAY The Perfect Essay for Q4.2, June 2012 Proteins are essential for the functioning of all living cells and therefore must be... Life Science Academics - 2013

  22. Protein synthesis Grade 12 Life Sciences Notes with Activities

    Question and Answer Activities: Find short and long questions for Grade 12 Life Sciences, which will help you to prepare for the exams, tests, practical tasks, and assignments. EXAM TIP: The structure of the DNA and RNA molecules is very important and is examined almost every year in Grade 12. Make sure that you know the labels of each ...
