nextclade: first stab at nextclade genome dataset#48

Draft
rneher wants to merge 2 commits into main from nextclade/genome

Conversation

rneher (Member) commented on Aug 15, 2025

This uses the measles full genome workflow as a template for rubella.

rneher marked this pull request as draft on August 15, 2025, 12:02
The reference tree used in this dataset includes sequences for the 28 reference strains, along with (nearly) complete genomes of other representative strains for most genotypes.
This dataset can be used to assign genotypes to any sequence that includes at least 400 bp of the N450 region, including whole genome sequences.
In addition, this dataset implements simple quality control metrics based to the amount of missing sequence, the number of ambiguous nucleotides, frameshifts or stop codons, and clusters of mutations relative to sequences in the reference tree.
The reference tree used in this dataset includes uses a complete rubella virus genome, whole the nomenclature by the WHO is typically defined based on the E1 segment.
Contributor

Suggested change
- The reference tree used in this dataset includes uses a complete rubella virus genome, whole the nomenclature by the WHO is typically defined based on the E1 segment.
+ The reference tree used in this dataset includes uses a complete rubella virus genome, while the nomenclature by the WHO is typically defined based on the E1 segment.

2 participants