Opinion: New Covid-19 Origin Data Are Highly Flawed And Don’t Solve Anything.

Shin Jie Yong

And we will forever be skeptical about the origin of Covid-19.


Last month, two major studies on the origin of Covid-19 were published in the leading academic journal Science. Both have been treated as the smoking gun for the natural, or wet market, origin of Covid-19.

I was somewhat convinced too as I read the studies, only to find out later that those studies have serious flaws and still solve nothing about the origin of Covid-19, which may, in fact, never get solved.

Study 1: Geographical analyses

In the first study, Worobey et al. extracted the geographical coordinates of the 155 earliest Covid-19 cases in Wuhan, all from December 2019, from the 2021 China-WHO joint investigation report.

Using those coordinates, they created a map to illustrate the proximity of the early Covid-19 cases to the wet market:

Worobey et al. (2022)

From the map, it’s obvious that cases with no known link to the market (market-unlinked; blue dots) reside closer to the market than cases with a known link (market-linked; orange dots).

Market-unlinked cases are people who did not (i) work at the market, (ii) know anyone who did, or (iii) visit the market recently.

And the median distance between market-unlinked cases and the market (4 km) is significantly shorter than that between market-linked cases and the market (5.74 km). The distance from the center point of market-unlinked cases to the market (0.91 km) is also significantly shorter than that for market-linked cases (2.28 km).

Worobey et al. also showed that the early Covid-19 cases, on average, reside closer to the market than the general population does. Specifically, the general population lives about 8.3 km away from the market, whereas the early Covid-19 cases lived only about 1.95 km to 4.28 km away.

Next, Worobey et al. analyzed data on SARS-CoV-2-positive environmental samples, such as swabs of cages, hair removers, and carts. They showed that (i) distance to the nearest live mammal vendor and (ii) distance to the nearest human Covid-19 case both predict whether an environmental sample tests positive.

Overall, the crux of this paper is that, unexpectedly, market-unlinked cases reside at locations closer to the market than market-linked cases. This suggests that some of the early Covid-19 cases with no known link to the market, whose existence was used as an argument to discredit the natural origin theory, are in fact linked to the market by proximity.

How study 1 is flawed

This study’s findings depend on one fundamental factor: the data on 155 early Covid-19 cases extracted from the 2021 China-WHO joint report.

But this fundamental factor is fundamentally flawed, as identified by Alina Chan, Ph.D., a molecular biologist at the Broad Institute of MIT and Harvard and co-author of the book Viral: The Search for the Origin of COVID-19.

According to the 2021 China-WHO joint report, the early Covid-19 cases were searched for and examined based on their potential link to the market:

Annexure of the 2021 China-WHO joint report, page 125.

This information is in line with what the China CDC reported in 2020:

“On December 29, 2019, a hospital in Wuhan admitted four individuals with pneumonia and recognized that all four had worked in the Huanan Seafood Wholesale Market, which sells live poultry, aquatic products, and several kinds of wild animals to the public. The hospital reported this occurrence to the local center for disease control (CDC), which lead Wuhan CDC staff to initiate a field investigation with a retrospective search for pneumonia patients potentially linked to the market. The investigators found additional patients linked to the market, and on December 30, health authorities from Hubei Province reported this cluster to China CDC.”

Basically, after only four Covid-19 cases were identified from the market, local investigators looked for more cases that might be linked to the market — a clear ascertainment or sampling bias.

“This cannot be stressed enough: Early cases with no links to the market had been identified almost solely through searching the hospitals near the market and the neighborhood of the market,” Chan wrote. “This is why even the unlinked cases look like they cluster around the market.”
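To see how this kind of case search can manufacture a cluster on its own, here is a minimal Python sketch. It is a toy model, not a reconstruction of the Wuhan data: every number in it (a 30 km square city with the market at the center, 10,000 cases scattered uniformly, a 5 km search radius) is a made-up assumption, and the function name `simulate_case_search` is hypothetical. The cases are placed with no relation to the market at all, yet the ones that get found still sit close to it, simply because that is where the search took place.

```python
import math
import random
import statistics


def simulate_case_search(n_cases=10_000, half_width_km=15.0,
                         search_radius_km=5.0, seed=0):
    """Toy model of ascertainment bias around a market at the origin (0, 0).

    All parameters are illustrative assumptions, not real epidemiological data.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    all_distances = []
    found_distances = []
    for _ in range(n_cases):
        # Place each case uniformly at random: no true link to the market.
        x = rng.uniform(-half_width_km, half_width_km)
        y = rng.uniform(-half_width_km, half_width_km)
        distance = math.hypot(x, y)
        all_distances.append(distance)
        # Investigators only look near the market, so only nearby cases are found.
        if distance <= search_radius_km:
            found_distances.append(distance)
    return statistics.median(all_distances), statistics.median(found_distances)


median_all, median_found = simulate_case_search()
print(f"Median distance to market, all cases:   {median_all:.2f} km")
print(f"Median distance to market, found cases: {median_found:.2f} km")
```

In this toy setup, the median distance of the found cases can never exceed the search radius, whatever the true distribution of cases looks like.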

It was only after mid-January 2020 that Covid-19 cases were searched for regardless of any potential link to the market. As the China-WHO joint report stated, “An association with the Huanan market was identified among some of the earliest recognized cases and, for a short period until mid-January 2020, exposure to the Huanan market was included in the case definition.”

But Worobey et al.’s early Covid-19 cases all date from December 2019, which means every one of them was found through a search biased toward a potential link to the market: the same ascertainment, or sampling, bias.

Worobey et al. did admit that sampling bias could be present in their results, but ultimately ruled it out, with the main justification being that “the December 2019 COVID-19 cases we consider here were identified based on reviews of clinical signs and symptoms, not epidemiological factors such as where they resided or links to the Huanan market.”

But those who read the China-WHO joint report can obviously tell that “this statement in their Science paper directly conflicts with the source of their early case information (the China-WHO joint report), and reveals that the authors don’t understand how early cases had been identified,” Chan wrote.

While Worobey et al. did provide more justifications as to why sampling bias was not present in their study, Chan rebutted these in her article, which comprehensively reviews Worobey et al.’s findings.

“In summary, all of the observations that Worobey et al. cite as signs that the market was the outbreak epicenter can be easily explained by the ascertainment bias with which early cases were identified,” Chan ended.

Study 2: Genomic analyses

In the second study, Pekar et al. analyzed the genomic data of early Covid-19 cases detected before February 2020 to decode how SARS-CoV-2 evolved. In the beginning, two lineages of SARS-CoV-2 existed: A and B, first sampled on 30 and 24 December 2019, respectively. Lineage B took hold and dominated, and later split into multiple lineages and variants of SARS-CoV-2.

To pinpoint the origin of lineages A and B, Pekar et al. reconstructed the evolutionary tree of early SARS-CoV-2 genomic sequences. Even with a series of bioinformatic and modeling analyses, they could not pinpoint the most recent common ancestor (MRCA) of lineages A and B, i.e., the SARS-CoV-2 progenitor that split into lineages A and B.

So, Pekar et al. deduced that lineages A and B did not evolve from each other or from one MRCA. Instead, they evolved independently from two separate spillover, or introduction, events (and thus two MRCAs or progenitors), on approximately 18 and 25 November 2019, respectively.

SARS-CoV-2 is unlikely to have spread silently before November 2019, Pekar et al. noted. This is because no probable Covid-related hospitalizations were found from hospital records, and no SARS-CoV-2-positive samples were found among the thousands of blood samples collected, before November 2019.

The modeling analyses suggest that these two spillovers were likely the successful ones out of a total of eight spillover events. So, SARS-CoV-2 likely infected humans eight times in the market, but only two of those infections led to the establishment of lineages A and B, which subsequently sparked the pandemic.

Because two animal-to-human spillover events are much more likely to occur than two lab leaks, this study deals a major blow to the lab origin hypothesis of Covid-19. The lab origin hypothesis relies on a single event: a lab leak. But if the virus entered humans twice, the lab origin would require two similar lab leaks in quick succession in the same place, which is very unlikely. Virus spillover from animals, by contrast, can occur whenever there’s animal-human contact, such as in a market that sold live mammals.

How study 2 is flawed

It was Chan again who highlighted the flaws in Pekar et al.’s study.

Chan argued that because lineages A and B differ by only two mutations, they are not distinct enough from one another to warrant two separate spillover events. So, Chan suggested it’s more plausible that one spillover event split into lineages A and B. And Chan cited that SARS-CoV-2 has a 5–10% chance of picking up two mutations every time it infects a human.

Pekar et al. proposed the two spillover events hypothesis because of one fundamental assumption — that SARS-CoV-2 first emerged from the Huanan wet market in Wuhan, per Worobey et al.

But this assumption is flawed.

Among the early Covid-19 cases in the market, all actually belonged to lineage B. So, to corroborate the wet market origin assumption, lineage A has to be found in the market as well, especially when lineage A likely emerged before lineage B, based on the SARS-CoV-2 evolutionary tree.

To this end, Pekar et al. cited another study in which one glove, out of 1,380 environmental and animal samples, tested positive for SARS-CoV-2 of lineage A, as evidence that lineage A started in the market. But this glove was tested on January 1, 2020, which is rather late for inferring that lineage A started in the market. Plus, the genome isolated from the glove sample also has other mutations, indicating that it’s not from an ‘early’ Covid-19 case.

“However, Worobey and Pekar et al. assert that this non-definitive piece of evidence points to a second spillover of the virus from (still missing) animals to people at the market,” Chan critiqued.

In fact, the evidence is stronger that lineage A emerged outside of the market, Chan pointed out, citing two studies showing that (i) several early patients who did not visit the market and (ii) a family from Shenzhen who visited Wuhan but not the market were infected with SARS-CoV-2 of lineage A.

So, given that (i) all market-linked Covid-19 cases belong to lineage B, which comes after lineage A, but (ii) lineage A is more likely to have emerged outside than inside the market, it’s unreasonable to assume that the market is the origin of Covid-19.

The more plausible scenario is that the SARS-CoV-2 progenitor was introduced to the market, which acted as an amplifier or superspreader site that sparked the pandemic, Chan argued.

And it’s also far-fetched to hypothesize two spillover events when we haven’t even confirmed how the first spillover event occurred, especially when such a hypothesis is based on the flawed assumption of the market origin of Covid-19 (Worobey et al.’s findings) and flawed data (the glove sample).

In fact, another bioinformatics study published this year reconstructed the evolutionary tree of over 68,000 SARS-CoV-2 genomes and concluded that only one common ancestor or progenitor likely exists, not two.

National Geographic consulted several experts on this discrepancy and reported that both findings “should be taken with a grain of salt,” because “these studies used different methods to infer viral evolution based on observed genomic sequences, and both approaches have uncertainties.”

“Ultimately, it should be clarified that there was only 1 strain of the virus that emerged in Wuhan in late 2019,” Chan concluded. “Describing variants that only differ by 2 mutations as 2 different strains has (mis)led people into arguing about whether it is more or less likely for 2 spillovers at a market to occur as opposed to 2 leaks in a laboratory.”

Before I read Chan’s work, I, among others, also thought that Pekar et al.’s findings had shifted the landscape to favor the natural origin of Covid-19. But since Pekar et al.’s study is highly flawed, their results are likely invalid.

The studies still solve nothing

The lab leak, or lab origin, hypothesis could mean two things, of which the first one is more plausible and grounded in reality:

  • Coronaviruses sampled from nature and studied in the lab, with or without gain-of-function experiments, might have accidentally leaked from the lab. Gain-of-function is the artificial enhancement of microbes, such as making them more infectious or deadly, in order to learn how dangerous they can be.
  • A new coronavirus created through bioengineering might have leaked from the lab, accidentally or deliberately. Bioengineering creates or designs new microbes, while gain-of-function modifies or enhances existing ones.

The natural origin of Covid-19, by contrast, means that coronaviruses from bats jumped, or spilled over, to humans (i.e., zoonotic transmission), with or without an intermediate animal host mediating the spillover.

The pair of studies by Worobey et al. and Pekar et al. concluded that (i) the Huanan wet market in Wuhan, which sold live mammals, is the epicenter of the Covid-19 pandemic, and (ii) early lineages A and B of SARS-CoV-2 likely emerged from two independent spillover events in the market. But the studies are so fundamentally flawed that their conclusions are uncertain.

In addition to their flaws, the pair of studies still fail to answer major questions on the origin of Covid-19.

They don’t explain how the SARS-CoV-2 progenitor ended up in the Huanan wet market in Wuhan, especially when the nearest bat cave is over 1,500 km away in Yunnan (bats being the natural reservoir of SARS-related coronaviruses). Bats in Yunnan don’t fly to Wuhan. And the wet market did not sell any bats. In contrast, the Wuhan Institute of Virology, known for its advanced coronavirus gain-of-function research, is just 13 km away from the market.

They also lack the smoking-gun evidence: the identification of the SARS-CoV-2 progenitor that spilled over to humans. That progenitor could have come from an intermediate animal host, like raccoons or minks, or even from a human who was infected either by another animal carrying the virus or by a lab virus.

“They don’t have samples from animals that had the virus. That’s what they’d like to have, and they’d like to be able to trace those animals back to the farms from which they came and see whether people in those farms had been exposed to the virus or viruses,” Jonathan Stoye, a virologist at the Francis Crick Institute in the UK, who was not involved in the research, said.

But such an animal sample is likely never to be found. “The animal or animals that carried coronavirus are almost certainly long dead: shipped off and sold for meat, or killed in one of the mass culls that took place in early 2020 as the Chinese authorities clamped down on the live animal trade,” Wired reported.

Even if such an animal sample harboring the SARS-CoV-2 progenitor is finally found, not everyone will be convinced that it’s the full story.

Skeptics will question whether the Wuhan Institute of Virology (WIV) truly played no role in the emergence of Covid-19, and whether China is being honest rather than planting misleading evidence to shift attention away from the labs.

But will thorough investigations into the WIV solve the Covid-19 origin problem? It’s very improbable.

Not only are such investigations unlikely to be allowed, but nearly three years have also passed since SARS-CoV-2 first emerged in December 2019. That’s plenty of time to get rid of evidence if there’s a need to do so. This means that not everyone will be convinced if an independent investigation into the WIV in the future says that the lab origin hypothesis is false.

So, disproving the lab origin hypothesis may be impossible, since the possibility that evidence has already been erased will always exist. It may already be too late to send independent investigators to labs in Wuhan and expect transparent results that would convince everyone.

All in all, “We Will Forever Be Skeptical About the Origin of Covid-19.”


Published by

MSc Biology | 7x first-author academic papers | 250+ articles on coronavirus | Freelance medical writer

