Coronaviruses usually cause upper respiratory tract illness and are genetically classified into four major genera:
- Alphacoronavirus,
- Betacoronavirus,
- Gammacoronavirus, and
- Deltacoronavirus.
The former two genera primarily infect mammals and the latter two usually infect birds.
Six human coronaviruses had previously been identified:
- HCoV-229E and HCoV-NL63, which belong to the Alphacoronavirus genus; and
- HCoV-HKU1, HCoV-OC43, severe acute respiratory syndrome coronavirus (SARS-CoV), and Middle East respiratory syndrome coronavirus (MERS-CoV), which belong to the Betacoronavirus genus.
Another genus, Deltacoronavirus, has not yet been found in human cases.
Coronaviruses did not get worldwide attention until the 2003 SARS epidemic, followed by the 2012 MERS outbreak and, most recently, the 2019-nCoV outbreak.
SARS-CoV and MERS-CoV are highly pathogenic.
The coronavirus genome is the genetic material of the virus. It consists of single-stranded RNA (ribonucleic acid) associated with nucleocapsid proteins in a helically symmetrical arrangement.
The coronavirus genome ranges in size from approximately 26,000 to 32,000 bases and includes a variable number (from 6 to 11) of open reading frames (ORFs).
The first ORF represents approximately 67% of the entire genome and encodes 16 non-structural proteins (nsps), while the remaining ORFs code for structural proteins (proteins involved in structural maintenance) and accessory proteins (proteins that help stabilize the primary proteins).
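To make the notion of an ORF concrete, here is a minimal Python sketch that scans the three forward reading frames of a nucleotide string for a start codon (ATG) followed in-frame by a stop codon. The function name `find_orfs`, the `min_codons` parameter, and the toy sequence are illustrative assumptions and do not come from any annotation pipeline. Real coronavirus annotation must also handle features such as the ribosomal frameshift within the first ORF, which this sketch ignores.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return sorted (start, end, frame) tuples for forward-strand ORFs.

    An ORF here is an ATG followed in-frame by a stop codon. `end` is
    exclusive and includes the stop codon. RNA input (with U) is accepted
    and converted to the DNA alphabet first.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon seen in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)

# Toy sequence (hypothetical, far shorter than a real genome).
toy = "CCATGAAATTTGGGTAACC"
orfs = find_orfs(toy)
print(orfs)  # [(2, 17, 2)]

# Fraction of the toy sequence covered by its longest ORF, analogous to
# the "first ORF is about 67% of the genome" figure quoted above.
longest = max(end - start for start, end, _ in orfs)
print(f"{longest / len(toy):.0%} of the sequence")
```

On a real genome the same idea applies at a much larger scale: at 67% of a 26,000 to 32,000 base genome, the first ORF would span roughly 17,000 to 21,000 bases.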
The four major structural proteins are:
Spike Surface Glycoprotein (S)
Its size is approximately 150 kilodaltons. It helps the virus bind to receptors on the host cell and determines host tropism.
Small Envelope Protein (E)
It is found in very small quantities in the virion (the entire virus particle). Its size is approximately 8-12 kilodaltons.
E proteins are highly divergent across coronaviruses but share a common architecture. The E protein helps in the assembly and release of the virus.
Matrix Protein (M)
M protein is the most abundant structural protein in the virion. It is relatively small, at approximately 25-30 kilodaltons.
Nucleocapsid Protein (N)
N protein is the only protein found in the nucleocapsid. It helps tether the viral genome to the replicase-transcriptase complex (RTC) and subsequently helps package the genome into viral particles.
Systematic Comparison of 2019-nCoV and Several Other SARS and SARS-Like Viruses
The spike proteins of SARS-CoV and MERS-CoV bind to distinct host receptors via different receptor-binding domains (RBDs).
SARS coronavirus generally uses angiotensin-converting enzyme 2 (ACE2) as one of the main receptors with CD209L as an alternative receptor, whereas MERS-CoV uses dipeptidyl peptidase 4 (DPP4, also known as CD26) as the predominant receptor.
Recent analysis has reported that the 2019 novel coronavirus (2019-nCoV, the virus that causes COVID-19) has a close evolutionary association with the SARS-like bat coronaviruses.
For this study, scientists carried out in-depth genome annotations on the first three determined genomes of 2019-nCoV (HB01, HB04, and HB05) and compared them to related coronaviruses, including 1,008 human SARS-CoV, 338 bat SARS-like CoV, and 3,131 human MERS-CoV.
The researchers found that the amino acid sequences of 2019-nCoV proteins were quite similar to those of SARS-CoV. They also identified some notable differences, such as:
- The 8a protein is present in SARS-CoV but absent in 2019-nCoV.
- The 8b protein is 84 amino acids long in SARS-CoV, but 121 amino acids long in 2019-nCoV.
- The 3b protein is 154 amino acids long in SARS-CoV, but only 22 amino acids long in 2019-nCoV.
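The size of these length differences is easy to tabulate. The short Python sketch below uses only the figures quoted in the list above. The dictionary layout and the helper name `length_change` are illustrative choices, and the 8a protein is left out because the text gives no length for it.

```python
# Lengths in amino acids, taken from the comparison above.
lengths = {
    "8b": {"SARS-CoV": 84, "2019-nCoV": 121},
    "3b": {"SARS-CoV": 154, "2019-nCoV": 22},
}

def length_change(protein):
    """Return the 2019-nCoV length minus the SARS-CoV length, in amino acids."""
    entry = lengths[protein]
    return entry["2019-nCoV"] - entry["SARS-CoV"]

for name in lengths:
    print(f"{name}: {length_change(name):+d} amino acids")
# 8b: +37 amino acids
# 3b: -132 amino acids
```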
On the basis of phylogenetic analysis of the whole genomes of the various viruses, the researchers found that 2019-nCoV falls in the same Betacoronavirus clade as MERS-CoV, SARS-like bat CoV, and SARS-CoV.
They also found that the 2019-nCoV genome has the highest similarity to SARS-like bat coronaviruses and is less closely related to the MERS-CoVs.
Given the close relationship between 2019-nCoV and SARS-CoVs or SARS-like bat CoVs, identifying the amino acid substitutions in different proteins could shed light on how 2019-nCoV differs structurally and functionally from SARS-CoVs.
In total, there were 380 amino acid substitutions between the amino acid sequences of 2019-nCoV and the corresponding consensus sequences (sequences built from the most common residue at each aligned position) of SARS and SARS-like viruses.
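As a rough illustration of how such a count can be obtained, the Python sketch below builds a consensus from a set of already-aligned reference sequences and counts the positions where a query differs from it. The function names, the short peptide strings, and the choice to skip gap positions are assumptions made for this example. The text does not say how the study handled gaps or ties.

```python
from collections import Counter

def consensus(aligned):
    """Return the most common residue at each column of aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))

def count_substitutions(query, reference, gap="-"):
    """Count aligned positions where both residues are present and differ."""
    if len(query) != len(reference):
        raise ValueError("query and reference must have the same length")
    return sum(
        1 for q, r in zip(query, reference)
        if q != r and gap not in (q, r)  # gaps are not counted as substitutions
    )

# Hypothetical aligned peptide fragments standing in for reference viruses.
references = ["MKTAYIA", "MKTAYLA", "MKSAYIA"]
ref_consensus = consensus(references)
print(ref_consensus)  # MKTAYIA

# Hypothetical query fragment standing in for the new virus.
query = "MKSAYVA"
print(count_substitutions(query, ref_consensus))  # 2
```

Summing such per-protein counts over every protein in the genome would give a total comparable to the figure reported above.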