GENOME AND GENETIC VARIATIONS OF CORONAVIRUSES

Genetic knowledge, obtained with innovative technologies and methodologies that allow multiple genomes of viruses, vectors and hosts to be analyzed and compared across time and space, is highly informative because it helps explain the origin, evolution and geographical dispersion of the disease.


INTRODUCTION
The term genome, introduced in 1920 by Hans Winkler, originally referred to all the hereditary information of an organism encoded in its DNA or RNA or, more simply, to the complete DNA sequence of a set of chromosomes. Today, the concept of genome comprises the information needed to build and maintain an organism and to trace its evolutionary history (Cristescu, 2019).

CORONAVIRUS GENOME
The genome of coronaviruses is a single-stranded, positive-sense RNA molecule whose size ranges from 27 to 32 kb and contains at least six open reading frames (ORFs). The first ORF (ORF1a/b), located at the 5' end, occupies about two-thirds of the genome and encodes polyproteins 1a and 1ab (pp1a, pp1ab). The remaining ORFs are located at the 3' end and encode at least four structural proteins: the spike glycoprotein (S), responsible for recognition of host cell receptors; the membrane protein (M), responsible for the shape of the envelope; the envelope protein (E), responsible for virus assembly and release; and the nucleocapsid protein (N), which is involved in genome packaging and plays a role in pathogenicity as an interferon (IFN) inhibitor. There are also species-specific structural and accessory proteins, such as the HE, 3a/b and 4a/b proteins (Alanagreh et al., 2020).
[Figure caption fragment: ... N, 3a, 6, 7a, 7b, 8) (Source: Fernandez-Rua, 2020)]
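The idea of an open reading frame described above can be illustrated with a minimal sketch. It assumes a simplified definition (an AUG start codon followed in the same frame by the first stop codon) and a made-up toy sequence; the function name `find_orfs` and the `min_codons` parameter are hypothetical, and real coronavirus annotation also has to handle features such as the ribosomal frameshift in ORF1a/b, which this sketch ignores.

```python
# Toy ORF scanner: a sketch under simplifying assumptions, not an annotation tool.
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"


def find_orfs(rna, min_codons=2):
    """Return sorted (start, end) pairs (0-based, end exclusive) for ORFs
    in the three forward reading frames of a positive-sense RNA string.

    An ORF here runs from an AUG to the first in-frame stop codon.
    min_codons counts the coding codons (start included, stop excluded).
    Start codons with no in-frame stop before the end are ignored.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCAUGAAAUUUUAGGG"  # invented sequence, not a viral fragment
    print(find_orfs(toy))  # [(2, 14)]
```

Dividing the length of an ORF by the length of the genome gives the fraction it occupies, which is the kind of calculation behind the statement that ORF1a/b covers about two-thirds of the genome.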

GENETIC VARIATIONS
Despite the small size of the coronavirus genome, adaptive evolution in different host environments and wide geographic dispersion give rise to structural and functional genetic changes that can be recorded. Comparative genetic analyses have shown that coronavirus genomes retain between 50% and 95% similarity. The genome of SARS-CoV-2, which infects humans, appears to contain up to 15 genes very similar to those of the coronaviruses found in Manis javanica (pangolin) and in bats: similarity is highest with the bat betacoronavirus Bat-CoV-RaTG13, at 96.2%, and is 79.2% with SARS-CoV (Zhu et al., 2020).
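Similarity percentages such as those quoted above are typically derived from aligned sequences. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length by an external tool and that positions containing a gap are simply skipped; the function name `percent_identity` is hypothetical, the sequences are invented, and published figures may rest on different conventions for gaps and ambiguous bases.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are excluded
    from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented 10-base sequences differing at one position.
    print(percent_identity("ACGUACGUAC", "ACGUACGAAC"))  # 90.0
```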
Although the similarity of the genomes is remarkable, the S protein subunit of the pangolin virus showed greater similarity with SARS-CoV-2 than with SARS-CoV and Bat-RaTG13.
[...] Group S; and the L84S amino acid substitution, which co-evolves with three other mutations: nsp4-F308Y, ORF3a-G196V and N-S197L. These last three mutations, together with the S197L and P13L substitutions in the N protein, are infrequent; the fourth most frequent mutation is ORF3a-Q57H, found in 734 sequences; this is followed by the N-R203K and N-G204R mutations, found in Europe; and finally, we mention the nsp6-L37F and ORF3a-G251V mutations, corresponding to Group V.
Dorp and Balloux (2020) concluded that the genomic sequences studied share a common ancestor dating to the period when SARS-CoV-2 first infected humans. These authors identified regions of the genome that did not vary and 198 recurrent mutations, approximately 80% of which produce nonsynonymous changes in the S protein, with over 15% of recurrent mutations in the Nsp6, Nsp11 and Nsp13 regions of ORF1ab and in the S protein, indicating convergent evolution of particular interest in the adaptation of SARS-CoV-2 to humans.
Eskier et al. (2020), in studying the effects of RNA-dependent RNA polymerase (RdRp) mutations, in particular the 14408C>T mutation, on the mutation rate and dispersal of the virus, concluded that the 14408C>T mutation increases the mutation rate, while the 15324C>T mutation of RdRp has the opposite effect, suggesting that the 14408C>T mutation may have contributed to the dominance of its co-mutations in different regions.
Vankadari (2020), in an analysis of complete viral genomic sequences from 12 different countries, identified 47 SNPs with an impact on virulence and on the response to antivirals, with the Nsp1, RdRp and spike glycoprotein proteins and the ORF8 region mutated within 3 months of human transmission.
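Nucleotide-level labels such as 14408C>T encode a 1-based genome position, the reference base and the substituted base. The sketch below shows, under stated assumptions, how such labels can be parsed and how substitutions can be tallied across sequences. All function names are hypothetical, the sequences are invented, and the samples are assumed to be already aligned to the reference without insertions or deletions. Counting how many sequences carry a substitution measures its frequency only; it does not by itself show that the mutation arose independently more than once.

```python
import re
from collections import Counter

# Matches labels of the form <position><ref>><alt>, e.g. "14408C>T".
NUC_PATTERN = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def parse_nucleotide_mutation(label):
    """Split a label like '14408C>T' into (position, ref_base, alt_base)."""
    match = NUC_PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    pos, ref, alt = match.groups()
    return int(pos), ref, alt


def call_substitutions(reference, sample):
    """List single-base differences between a sample and the reference
    as labels with 1-based positions. Non-ACGT characters are skipped."""
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference")
    labels = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref, alt) in enumerate(pairs, start=1):
        if ref != alt and ref in "ACGT" and alt in "ACGT":
            labels.append(f"{pos}{ref}>{alt}")
    return labels


def recurrent_mutations(reference, samples, min_count=2):
    """Return substitutions carried by at least min_count samples."""
    counts = Counter()
    for sample in samples:
        counts.update(call_substitutions(reference, sample))
    return {label: n for label, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    ref = "ACGTACGT"  # invented reference
    samples = ["ACGTATGT", "ACGTATGT", "ACTTACGT"]
    print(parse_nucleotide_mutation("14408C>T"))  # (14408, 'C', 'T')
    print(recurrent_mutations(ref, samples))      # {'6C>T': 2}
```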

CONCLUSIONS
Although the coronavirus genome is small and consists of a single-stranded RNA molecule, its positive-sense polarity allows it to be translated immediately after infection of host cells, thus producing the proteins necessary for its replication. In addition, the high infection rate, the high specificity of the spike protein (S) for the host cell receptor, and the wide geographic dispersion are factors linked to the virus's capacity to change structurally and functionally through mutation. The literature reviewed shows a large number of mutations recorded in virus genomes, with varying degrees of importance. They have been classified as recurrent (frequent in space and time) or non-synonymous (point mutations, i.e. single nucleotide substitutions, that change the encoded amino acid), and "hot spot" regions (regions where mutations occur at a higher frequency) have been identified.
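The difference between a synonymous and a non-synonymous point mutation can be made concrete with a short sketch. It assumes the standard genetic code and works on a single codon written in DNA or RNA letters; the function name `classify_substitution` is hypothetical and the codons in the example are illustrative, not taken from a viral genome.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon, position, new_base):
    """Classify a single-base change within one codon.

    position is 0, 1 or 2 (index inside the codon).
    Returns (kind, amino_acid_before, amino_acid_after), where kind is
    'synonymous' if the amino acid is unchanged and 'nonsynonymous'
    otherwise. '*' denotes a stop codon.
    """
    codon = codon.upper().replace("U", "T")
    new_base = new_base.upper().replace("U", "T")
    if position not in (0, 1, 2) or len(codon) != 3:
        raise ValueError("codon must have 3 bases and position must be 0..2")
    mutated = codon[:position] + new_base + codon[position + 1:]
    if mutated == codon:
        raise ValueError("new base is identical to the original base")
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "nonsynonymous"
    return kind, before, after


if __name__ == "__main__":
    print(classify_substitution("TTA", 1, "C"))  # ('nonsynonymous', 'L', 'S')
    print(classify_substitution("CTT", 2, "C"))  # ('synonymous', 'L', 'L')
```

Both examples are single nucleotide substitutions, but only the first changes the encoded amino acid.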

Fig 3 - Genetic similarity and variations of the genomes of SARS-CoV-1, SARS-CoV
Yang et al. (2020) identified six phylogenetic clusters with geographic preference in an analysis of 1932 complete genome sequences of SARS-CoV-2. These authors believe that single nucleotide variations (SNVs) in the genomes underlie these results and can contribute to detection, clinical treatment, drug design and vaccine development against the virus.
2023 VOL 3 Nº 3