An Overview to Understanding SARS-COV2 Mutations
Mutations Illustrate Important Virus/Host Interactions
With the ongoing pandemic, many media outlets talk about variants and mutations. Have you ever read one of these pieces and wondered how you can tell whether a SARS-COV2 mutation benefits the virus, and how?
Here I’ll give an overview of how you can guess how a mutation may be beneficial to SARS-COV2, and provide a specific mutation to look at. This information won’t be concrete and there will be many nuances, but it will still give a good starting point for when you read articles.
Viral Mutations
All life on Earth follows the general pathway of:
Genome (Genetic material) → mRNA (code for a specific protein) → Protein.
The characteristics of amino acids determine the structure and function of proteins, so changing even a single amino acid may alter how a protein functions. For such a change to persist, the mutation must occur in the organism’s genome, since the genome is the material that is passed on through generations and encodes the amino acids that make up the organism’s proteins. An observed amino acid change is therefore indicative of a genomic change.
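To make the link between a genomic change and an amino acid change concrete, here is a minimal sketch in Python. The codon table is deliberately partial, and the codons are written with DNA-style letters (T instead of the U found in an RNA genome). The specific codons are chosen for illustration only and are not claimed to be the ones present in the viral genome. The function names `translate` and `point_mutate` are made up for this example.

```python
# Partial codon table: only the codons needed for this illustration.
# Codons use DNA-style letters (T instead of U).
CODON_TABLE = {
    "AAT": "N", "AAC": "N",  # asparagine
    "TAT": "Y", "TAC": "Y",  # tyrosine
    "GAT": "D", "GAC": "D",  # aspartic acid
    "GGT": "G", "GGC": "G",  # glycine
}


def translate(codon: str) -> str:
    """Return the one-letter amino acid code for a codon in the table."""
    return CODON_TABLE[codon.upper()]


def point_mutate(codon: str, position: int, new_base: str) -> str:
    """Return a copy of `codon` with the base at `position` (0-2) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


# A single base change in the first position swaps asparagine for tyrosine.
original = "AAT"
mutated = point_mutate(original, 0, "T")
print(translate(original), "->", translate(mutated))

# A change in the third position of the same codon is silent:
# the codon differs but the amino acid does not.
silent = point_mutate(original, 2, "C")
print(translate(original), "->", translate(silent))
```

The second case is worth keeping in mind: not every genomic mutation produces an amino acid change, which is part of why articles usually report mutations at the protein level.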
Viruses tend to have extremely high mutation rates. As noted in Peck et al. 2018:
Many viruses have high rates of evolution. These high evolutionary rates have been attributed to the large population sizes, short generation times, and high mutation rates of viruses. Mutation rate, specifically, is an important determinant of evolutionary rate across taxa (1–4). In the context of viruses, the mutation rate is the rate at which errors are made during replication of the viral genome. This is in contrast to the substitution rate, which is the rate at which mutations become fixed, or present within all individuals, in a population. Whereas mutation rates are used to estimate the amount of genetic diversity generated within a population of offspring, substitution rates are used to estimate the rate of evolution for a particular lineage or taxon.
Single-stranded RNA viruses have some of the highest mutation rates. Their RNA-dependent RNA polymerase (RdRp) lacks the proofreading ability that DNA polymerases have. Also, although the reasons are not fully clear, single-stranded and RNA genomes appear to confer higher mutation rates than double-stranded and DNA genomes. Coronaviruses are a partial exception: they contain an exoribonuclease enzyme that proofreads the genome during replication and corrects some errors, meaning they may have lower mutation rates than other RNA viruses.
Nonetheless, because this proofreading is not complete, we should still expect a relatively high level of mutations to occur with SARS-COV2.
Unless selective pressure is put directly onto the replication process, such as through the use of antivirals, mutations tend to be random. When someone is infected with a virus, all of the viral particles are able to undergo mutations. However, the mutations that persist will be the ones that benefit the virus, such as mutations that increase binding to host receptors or aid in avoiding the host’s immune system. This is a perfect example of natural selection: many different strains of a virus will develop over the course of an infection, but those that are best adapted can outcompete less well-adapted strains and gain dominance.
This is an extremely important point to make: when we talk about variants and how they gain dominance, these variants had to contain mutations that allowed them to outcompete their less well-adapted counterparts.
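The way a better-adapted strain takes over can be sketched with a very simple selection model. This is an illustration under stated assumptions, not a model of a real infection: there are only two strains, each has a fixed relative fitness, and there is no further mutation or random drift. The fitness values and starting frequency below are hypothetical numbers picked to show the shape of the curve.

```python
def next_frequency(p: float, w_mutant: float, w_original: float) -> float:
    """Advance the mutant frequency `p` by one generation of selection.

    Each strain contributes to the next generation in proportion to its
    current frequency multiplied by its relative fitness.
    """
    mean_fitness = p * w_mutant + (1 - p) * w_original
    return p * w_mutant / mean_fitness


def simulate(p0: float, w_mutant: float, w_original: float, generations: int) -> list:
    """Return the mutant frequency at generation 0, 1, ..., `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_mutant, w_original))
    return freqs


# Hypothetical numbers: the mutant starts as 1 in 1,000 viral particles and
# leaves 1.5 times as many descendants per generation as the original strain.
trajectory = simulate(p0=0.001, w_mutant=1.5, w_original=1.0, generations=30)
for generation in (0, 10, 20, 30):
    print(generation, round(trajectory[generation], 4))
```

Even a rare mutant ends up dominating once it has a consistent advantage, while a mutant with lower fitness than the original would shrink toward zero under the same rule.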
Amino Acids and Proteins
We can now look at what genomic mutations do to the amino acid sequence and how that affects a virus’ virulence. The structure, function, and interactions of proteins depend upon the characteristics of the amino acids that make them up. If we refer to an amino acid chart we can see that amino acids can be characterized by their side chains. Some amino acids are negatively charged, some are positively charged, some are hydrophobic (don’t like water), and some are polar but carry no charge (polar, uncharged).
When the side chains of amino acids interact with one another most of their interactions depend upon electrostatic and van der Waals interactions. These include interactions such as dipole-dipole attraction (the interactions between polar side chains), hydrogen bonding, and hydrophobicity.
Some amino acids, such as charged species, prefer opposite charges. Other amino acids, such as polar ones, will prefer to engage in hydrogen bonding with other residues. The hydrophobic amino acids like to clump together since it allows them to shield themselves from polar/ionic interactions. You’ll usually see these amino acids inside a protein’s structure.
You’ll also notice that there are special case amino acids. These tend to have unique structural properties. For example, cysteine and selenocysteine are able to form strong, covalent bonds with other cysteine/selenocysteine residues and help to form tight protein structures. Proline tends to be located near loops and helps to provide “kinks” in the structure. Glycine is the smallest amino acid, meaning it is flexible and gives a protein greater freedom to move and bend.
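If you don’t have a chart handy, the groupings described above can be written down as a small lookup table. This is a sketch of one common chart layout, not an authoritative classification: different charts place borderline residues differently (tyrosine has a polar hydroxyl group on an otherwise hydrophobic ring, and histidine is only partly charged at physiological pH). The names `SIDE_CHAIN_CLASS` and `classify` are made up for this example.

```python
# One-letter amino acid codes grouped by side-chain character.
# Borderline residues (Y, H) follow one common chart; others differ.
SIDE_CHAIN_CLASS = {
    # Negatively charged
    "D": "negative", "E": "negative",
    # Positively charged
    "K": "positive", "R": "positive", "H": "positive",
    # Polar but uncharged
    "S": "polar", "T": "polar", "N": "polar", "Q": "polar",
    # Hydrophobic
    "A": "hydrophobic", "V": "hydrophobic", "I": "hydrophobic",
    "L": "hydrophobic", "M": "hydrophobic", "F": "hydrophobic",
    "W": "hydrophobic", "Y": "hydrophobic",
    # Special cases: cysteine, selenocysteine (U), glycine, proline
    "C": "special", "U": "special", "G": "special", "P": "special",
}


def classify(residue: str) -> str:
    """Return the side-chain class for a one-letter amino acid code."""
    return SIDE_CHAIN_CLASS[residue.upper()]


for residue in "NYDGEK":
    print(residue, classify(residue))
```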
Applying Mutations to SARS-COV2: N501Y Mutation
With this information in mind we can look at the proteins of SARS-COV2 and see how mutations can change how infectious the virus is. In this case, we’ll pay close attention to the N501Y mutation located in the spike protein and draw inferences from there.
When looking at mutations you’ll notice the letter/number/letter notation. The first letter indicates what the original amino acid was (N = asparagine). The number indicates the position in the amino acid sequence where the mutation occurs (the 501st amino acid). The last letter indicates what the new amino acid is (Y = tyrosine). Putting it all together, this tells us that the amino acid residue at position 501 changed from an asparagine to a tyrosine. Note that mutation names usually refer to the amino acid change and not the underlying genetic change, since the amino acid change is what affects the protein’s function.
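The letter/number/letter notation is regular enough to pull apart with a few lines of Python. The sketch below handles only simple substitutions such as N501Y. Deletions, insertions, and other notations you may see in articles are outside its scope, and the names `Substitution` and `parse_substitution` are made up for this example.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str   # one-letter code of the original amino acid
    position: int   # 1-based position in the protein sequence
    new: str        # one-letter code of the new amino acid


# One letter, one or more digits, one letter, and nothing else.
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'N501Y' into its three parts."""
    match = _PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a letter/number/letter substitution: {label!r}")
    original, position, new = match.groups()
    return Substitution(original, int(position), new)


for label in ("N501Y", "D614G", "E484K"):
    print(parse_substitution(label))
```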
The N501Y mutation is one of the most common SARS-COV2 mutations and is found in many of the circulating variants.
The mutation is located within the receptor binding domain (RBD) of the spike protein, meaning that it is part of the region that binds to the ACEII receptor. To see why this mutation is important to binding we should look at the ACEII/SARS-COV2 interaction.
As indicated in Shang et al. 2020 (emphasis mine):
At the SARS-CoV/hACE2 interface, we previously identified two virus-binding hotspots 11,12: hotspot Lys31 (i.e., hotspot-31) consists of a salt bridge between Lys31 and Glu35, and hotspot Lys353 (i.e., hotspot-353) consists of a salt bridge between Lys353 and Asp38. Both salt bridges are weak, as judged by the relatively long distance of these interactions. Burial of these weak salt bridges in hydrophobic environments upon virus binding would enhance their energy due to reduction of the dielectric constant.
The authors noted that these 2 salt bridges are important for binding to ACEII, as indicated by previous work on SARS-COV/ACEII binding, and that a hydrophobic environment around them may increase binding affinity. Therefore, if a spike protein amino acid residue near one of the salt bridges changes into one that is hydrophobic, we should expect to see greater binding and thus greater infectivity.
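The dielectric constant argument can be made more tangible with Coulomb’s law, where the interaction energy between two charges is inversely proportional to the relative permittivity (dielectric constant) of the surrounding medium. The sketch below is a rough illustration, not a calculation from Shang et al.: it treats the surroundings as a uniform medium, the 4 Å separation is a hypothetical distance, and the values of about 80 for water and about 4 for a buried protein environment are commonly used approximations that I am assuming here.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, in coulombs
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, in farads per metre
AVOGADRO = 6.02214076e23      # particles per mole


def coulomb_energy_kj_per_mol(q1: int, q2: int, distance_angstrom: float,
                              relative_permittivity: float) -> float:
    """Coulomb energy between two point charges in a uniform medium.

    q1 and q2 are in units of the elementary charge. A negative result
    means the interaction is attractive.
    """
    r = distance_angstrom * 1e-10  # convert angstroms to metres
    joules = (q1 * q2 * E_CHARGE ** 2) / (
        4 * math.pi * EPSILON_0 * relative_permittivity * r
    )
    return joules * AVOGADRO / 1000.0


# Hypothetical salt bridge: +1 and -1 charges held 4 angstroms apart.
exposed = coulomb_energy_kj_per_mol(+1, -1, 4.0, relative_permittivity=80.0)
buried = coulomb_energy_kj_per_mol(+1, -1, 4.0, relative_permittivity=4.0)
print(f"water-exposed: {exposed:.1f} kJ/mol")
print(f"buried:        {buried:.1f} kJ/mol")
print(f"ratio:         {buried / exposed:.0f}x")
```

Under these assumptions the same pair of charges attract far more strongly once water is excluded, which is the sense in which burying a weak salt bridge "enhances its energy".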
There is a spike protein amino acid residue near the hotspot-353 salt bridge: the N501 residue. Therefore, a mutation that changes the N501 residue into something hydrophobic may make the virus more infectious and transmissible. We already know that the most common mutation at N501 is N501Y, so we just need to validate that it is a change into a hydrophobic amino acid. We can refer to our amino acid chart and check: asparagine is polar and uncharged, while tyrosine sits in the hydrophobic group.
So here we have outlined that the N501 residue of the spike protein is located near a critical region of the ACEII receptor. A mutation to tyrosine allows the new hydrophobic residue to shield the salt bridge, reduce the dielectric constant, and increase the binding affinity, thus making the virus more transmissible.
I’ll provide 2 quick examples of other mutations and discern their effects on viral activity.
D614G Mutation
This mutation is also gaining in prevalence. Unlike the N501Y mutation this mutation is located outside of the RBD of the spike protein. We can also see that it is a mutation that replaces an aspartic acid with a glycine residue, meaning that increased flexibility may play a role in its prevalence.
As noted in Laha et al. 2020 (emphasis mine):
The structural comparison of wild-type and in-silico generated D614G mutant shows that a change from Aspartic acid to Glycine alters the electro-static potential of the surface of the protein (Fig. 5A). This change creates a favourable environment in a hydrophobic pocket of the S protein (Fig. 5B). Moreover, we have also observed that D614 is at the proximity of the hinge bending region (CTD2/NTD linker) of RBD (Fig. 5C), therefore mutation of D to a small residue G without any side-chain might increase the flexibility for a smooth switch over from inactive DOWN state to the active UP state, makes the mutant containing variants more virulent in terms of its smoother binding with ACE2.
E484K Mutation
This mutation has become a cause for concern. Variants containing this mutation are more likely to escape neutralization by antibodies. This suggests that many antibodies that target the RBD of the spike protein use the E484 residue as a pivotal binding site; replacing it greatly hinders binding and may render those antibodies ineffective. Note that the change is from a negatively charged glutamic acid to a positively charged lysine.
As noted in Soh et al. 2021 (emphasis mine):
In addition, the E484K mutation found in the B.1.351 variant has equally been reported to increase binding affinity between the S protein and its receptor ACE2. In effect, the E484K mutation was reported to result in favorable electrostatic interactions, compensating the burial of the charged and polar groups upon binding of RBD with hACE2. This compensation in turn significantly improves the RBD-hACE2 binding affinity (Wang et al., 2021b). While the E484K mutation enhances binding of the S protein to its receptor, it reduces binding of neutralization antibodies. Reduced binding affinity of the neutralization antibodies allows for immune escape which contributes to increased transmissibility of variants bearing this mutation (Collier et al., 2021; Faria et al., 2021; Zhou et al., 2021).
The E484K example outlines something important to think about when examining binding affinities. Proteins aren’t locked into a static position, but are able to wiggle around to a more favorable position. The same goes for binding interactions; both proteins can wiggle until the binding region of both enter into a favorable position, similar to how you and a partner may wiggle around in a bed before finding your comfortable positions.
Rules to Consider
When you read an article and want to parse the information and figure out some answers for yourself, use the rules that we have outlined here.
1. Binding Region: If the mutation occurs in an antigen/receptor binding region, think about electrostatic interactions. Changing characteristics such as charge or hydrophobicity may lead to more favorable binding. Note that unfavorable mutations may occur as well (such as an N501D mutation), but because these are less favorable, natural selection will reduce their prevalence.
2. Outside Binding Regions: Mutations in this area may confer changes in the flexibility of the protein. Think of the special case amino acids; if a mutation removes or adds a proline or glycine it may be important to the structure’s flexibility.
3. Natural Selection/Survival of the Fittest is always at play: This concept tends to get lost when discussing mutations, but remember that viruses are constantly under selective pressure. Unless an external selective pressure such as an antiviral is applied, viruses will tend to accumulate random mutations, and whether or not these mutations make the virus better adapted will determine which variants win out. A viral infection will contain all types of variants, so it is important to remember this when thinking about why certain strains gain dominance.
4. More pivotal viral regions have less mutation variability: This goes along with Rule 3 but was not mentioned previously. Proteins that are less important to SARS-COV2 will tolerate many more mutations and more variability, since changes in these proteins are less likely to affect the function of the virus. However, more important regions such as the RBD are double-edged swords. As we outlined previously, a mutation that reduces binding will quickly kill off that strain of the virus and thus is less likely to propagate. This means that the RBD tends to be heavily conserved. It’s also why different variants tend to carry similar mutations in vital regions, since very few mutations in these regions will be beneficial to the virus.
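Rules 1 and 2 can be turned into a rough checklist in code. This is only a sketch of the rules of thumb above, not a predictor of what a mutation actually does: the residue groupings follow one common chart, histidine is treated as uncharged for simplicity (an assumption), and the function name `mutation_hints` is made up for this example. Whether a position lies in a binding region is something you have to supply from the article you are reading.

```python
# Formal side-chain charges; histidine is treated as neutral here (assumption).
CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}
HYDROPHOBIC = set("AVILMFWY")
FLEXIBILITY_RESIDUES = {"G", "P"}  # glycine and proline


def mutation_hints(original: str, new: str, in_binding_region: bool) -> list:
    """Return rule-of-thumb hints for a substitution; an empty list means none apply."""
    hints = []
    if in_binding_region:
        # Rule 1: look for changes in charge or hydrophobicity.
        if CHARGE.get(original, 0) != CHARGE.get(new, 0):
            hints.append("charge change: electrostatic interactions may shift")
        if (original in HYDROPHOBIC) != (new in HYDROPHOBIC):
            hints.append("hydrophobicity change: binding environment may shift")
    else:
        # Rule 2: look for a gained or lost glycine or proline.
        if original in FLEXIBILITY_RESIDUES or new in FLEXIBILITY_RESIDUES:
            hints.append("glycine/proline involved: flexibility may change")
    return hints


# The three examples from this article, with the binding-region flag
# set according to the descriptions above.
print("N501Y", mutation_hints("N", "Y", in_binding_region=True))
print("E484K", mutation_hints("E", "K", in_binding_region=True))
print("D614G", mutation_hints("D", "G", in_binding_region=False))
```

A hint only tells you where to look. Rules 3 and 4 still decide whether a change spreads, since an unfavorable change in a vital region will tend to be selected against.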
Using what we have outlined here see if you can figure out some information about the new R.1 variant from this Forbes article. I would personally consider this a good article since it outlines the mutations and a little bit of information about each mutation. Hopefully with the information here you can learn to piece this information together for yourself!
Answer:
In a case study published in the NEJM, an immunocompromised patient’s infection acquired several SARS-COV2 mutations. As indicated in the table, one of these was a Q493K mutation, a swap from a polar side chain to a positively charged one, which has greater affinity for the negatively charged glutamic acid. There is also evidence of a Q493R escape mutation.
In-Text Citations:
Peck et al. 2018. Complexities of Viral Mutation Rates. Taken from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6026756/
Shang et al. 2020. Structural basis of receptor recognition by SARS-CoV-2. Taken from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328981/
Laha et al. 2020. Characterizations of SARS-CoV-2 mutational profile, spike protein stability and viral transmission. Taken from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7324922/
Soh et al. 2021. The rapid adaptation of SARS-CoV-2–rise of the variants: transmission and resistance. Taken from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8390340/
Additional Info:
Viral Mutations: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5075021/
D614G Mutation: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693302/
Variants: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8252702/
This is a wonderful overview of how mutations arise and the consequences of mutations. The case examples were very interesting and showed how many variants can arise over time even in just one person. I would think a drug such as molnupiravir (nucleoside analog) would be helpful in such cases where current mAB treatments are not effective - especially as a last ditch effort to save lives of older (non-child bearing age) patients. Although data from India for molnupiravir is not looking good for severe disease. So much we don’t know yet…