MOSCOW, June 9. /TASS/: US epidemiologists analyzed the structure of over 10,000 coronavirus specimens from various regions of the world and concluded that six SARS-CoV-2 varieties are currently spreading across the globe. The research has been published on the bioRxiv website.
"The genomic epidemiology of the 10,422 SARS-CoV-2 isolates studied here show six predominant clonal complexes (CCs) circulated/circulating globally," the researchers say.
Since the early days of the COVID-19 outbreak in China, scientists have sought to understand how and in what direction the coronavirus is evolving. It is now known that the virus accumulates mutations at about the same rate as the flu virus does; however, what these mutations have led to is still a matter of discussion among scientists.
In particular, in early March, Chinese biologists announced that the virus could have split into two subtypes, S and L, which differ in the severity of symptoms and the speed of spread. Other scientists doubted that, noting that changes in the virus’s common gene pool could have been caused by random processes rather than by real differences in the virulence of the SARS-CoV-2 subtypes. In May, scientists from the UK and Australia reported that three different coronavirus varieties were circulating in the human population.
Genetic census of viruses
A group of US epidemiologists led by Paul Planet of the University of Pennsylvania conducted a new, wider analysis of this kind, studying and comparing the genomes of over 10,000 coronavirus specimens collected from recovered people across the globe over the past six months.
To analyze such a vast set of data, the scientists drew on the experience they had gained by conducting wide-scale censuses of various strains and types of bacteria. They used a special algorithm that went through the entire genome database, deleting those parts of the genome and gene fragments that were either entirely the same or identical from a functional standpoint.
Such an approach makes it possible to significantly reduce the volume of the data being analyzed and to single out those variations of bacterial DNA or viral RNA that uniquely change the structure of proteins or signaling molecules and alter the vital functions of their carriers. This method also makes it possible to unite disease-causing agents into large groups, known as clonal complexes (CCs), which include related strains of bacteria and viruses.
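The general idea of discarding identical genome positions and grouping what remains can be illustrated with a minimal Python sketch. It is not the researchers' algorithm: the function names and the toy sequences are hypothetical, the input is assumed to be already aligned to equal length, only literally identical positions are dropped (not functionally equivalent ones), and isolates are grouped only when their remaining profiles match exactly, which is a much cruder notion than a clonal complex.

```python
from collections import defaultdict


def variable_sites(aligned):
    """Return indices of alignment columns where at least two sequences differ."""
    sequences = list(aligned.values())
    length = len(sequences[0])
    return [i for i in range(length) if len({seq[i] for seq in sequences}) > 1]


def group_by_profile(aligned):
    """Drop invariant columns, then group isolates sharing the same profile."""
    sites = variable_sites(aligned)
    groups = defaultdict(list)
    for name, seq in aligned.items():
        # The profile keeps only the positions that vary across the data set.
        profile = "".join(seq[i] for i in sites)
        groups[profile].append(name)
    return sites, dict(groups)


# Hypothetical toy alignment, not real SARS-CoV-2 data.
toy = {
    "isolate_A": "ACGTACGT",
    "isolate_B": "ACGTACGT",
    "isolate_C": "ACGAACGT",
    "isolate_D": "ACGAACCT",
}

sites, groups = group_by_profile(toy)
print(sites)
print(groups)
```

In this toy case only two of the eight positions vary, so each isolate is reduced to a two-letter profile, which shows how removing shared positions shrinks the data before any grouping takes place.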
Using this approach, the scientists processed the genomes of various SARS-CoV-2 strains and combined them into six major CCs, which turned out to be very unevenly distributed geographically. For example, viruses from the CC256 and CC258 groups are mostly present in the US, CC70 and CC225 are typical mostly of Europe, while a "European" CC300 coronavirus is spreading in South America.
Interestingly, this analysis allowed Planet and his colleagues to single out mutations common to a significant share of coronavirus strains and to discover several regions whose structure has hardly changed throughout the entire epidemic. These regions, the scientists believe, are especially important for the spread of SARS-CoV-2. Studying them might help create drugs and vaccines that would work against all coronavirus strains, the researchers conclude.