Gaps in global data hindered scientists' efforts to solve one of the biggest mysteries in New Zealand's Covid-19 saga - where in the world Auckland's August outbreak originated.
Genome sequencing creates a "genetic fingerprint" of a virus that's infected a person, helping public health officials untangle the different cases involved in an outbreak.
In New Zealand's first wave of Covid-19, scientists sequenced the genomes of 649 separate cases to reveal nearly 300 different introductions from different parts of the world.
Sequencing proved just as crucial in the August outbreak, helping pick apart Auckland community cases – and effectively informing the response to the cluster in real time.
In that case, scientists were quickly able to show the fresh Covid-19 cases belonged to a single cluster - and hence had stemmed from a single introduction.
But, according to a new study published before peer review, successfully pinpointing the origin of the outbreak was impeded by "substantial" biases and gaps in global sequencing data.
That was despite what's been a sprawling international effort to harness the power of sequencing against Covid-19.
A full genome of the SARS-CoV-2 virus behind Covid-19 was published within two weeks of its discovery - and since then, more than 160,000 genomes have been shared globally.
Today, around 60 per cent of the 185 countries that have reported Covid-19 cases have sequenced and shared genomes on the global GISAID database, which has played a major role in responses around the world.
But the quality of that data varied widely between countries.
The 42,000 genomes reported from the UK made up nearly 40 per cent of the global dataset, despite Britain recording just one per cent of positive international cases - while India, making up 18 per cent of cases, has contributed only 3 per cent of genomes.
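To put that imbalance in numbers, a rough sketch in Python divides each country's share of global genomes by its share of global cases. The figures are the rounded percentages quoted above ("nearly 40 per cent", "just one per cent"), and the function name and the ratio itself are illustrative assumptions, not a measure used in the study. A value of 1 would mean a country's genomes are in proportion to its cases.

```python
def representation_ratio(genome_share_pct, case_share_pct):
    """Share of global genomes divided by share of global cases.

    Above 1: over-represented in the sequencing data.
    Below 1: under-represented.
    """
    return genome_share_pct / case_share_pct


# (share of global genomes %, share of global cases %), rounded figures
# taken from the article; treat them as approximations.
shares = {
    "UK": (40.0, 1.0),
    "India": (3.0, 18.0),
}

ratios = {
    country: representation_ratio(genomes, cases)
    for country, (genomes, cases) in shares.items()
}

for country, ratio in ratios.items():
    print(f"{country}: {ratio:.2f}x")
```

On those rounded figures, the UK is over-represented roughly 40-fold, while India contributes about a sixth of the genomes its case count would suggest.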
For New Zealand, such disparities mattered when it came to quickly trying to track down the source of infection.
When Auckland's outbreak was discovered, initial sequencing showed the cases belonged to a lineage called B.1.1.1.
Of the countries that had contributed data, 40 per cent had genomes of this lineage - and remarkably, 85 per cent of those genomes were from the UK and had been produced between March and September.
Further analysis showed those identified in Switzerland, South Africa, and England in August were the closest relatives of the viruses linked to that month's flare-up in Auckland.
Although scientists were able to estimate when the outbreak's first transmission happened - at some point between July 26 and August 13 - they concluded it was unlikely the source could be found anywhere within international data.
Genomes sequenced in New Zealand have also highlighted how much genomic diversity was likely missing.
In one dramatic example, 12 genomes taken from people aboard a single flight from India fell across at least four lineages - and represented more genomic mutations than were observed across New Zealand's entire first wave.
The study's authors pointed out the dataset that local labs put together from the August outbreak was one of the most complete ever assembled - providing genomes for about 81 per cent of all the positive cases.
"Real-time genomic sequencing quickly informed track and trace efforts to control the outbreak, setting New Zealand on track to eliminate the virus from the community for the second time," the researchers wrote.
"The rapid genome sequencing of positive samples provided confidence to public health teams regarding links to the outbreak and identified that cases and sub-clusters were linked to a single genomic lineage, resulting from a single introduction event."
The timing and length of Auckland's Level 3 lockdown were partly informed by the data, they said.
Nevertheless, they added, the biased nature of global sampling - including the contribution of very few genome sequences from certain regions - "clearly limited" the power of genomics to tell scientists where the outbreak came from.
The authors said there should be "careful consideration" of these gaps when trying to link outbreaks to source countries - and that any analysis should consider a wide range of evidence.
The study's lead author, Otago University and ESR virologist Dr Jemma Geoghegan, stressed there was now a "ridiculous amount of data" that was easily accessible.
"It's just that some countries have got bigger priorities than sequencing - but I guess I'd argue that sequencing should be a priority."
She said the other big limitation to New Zealand's sequencing effort was not being able to extract enough information from some samples to create full genomes.
That was typically because of low levels of virus RNA - a fragile molecule that can degrade quickly - in such samples.
Scientists now think they have a way to complete these incomplete puzzles: a more sensitive assay that can work on lower concentrations and shorter pieces of RNA, provided the team knows which mutations to look for.