The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- Fixed malformed STAR ALIGN command which caused the pipeline to fail when RNASeq data was provided #185
- Nextflow!>=25.04.0
- [email protected]
- Updated the nf-core template to 3.4.1
- Updated modules and sub-workflows from nf-core/modules and GallVp/nxf-components
- Added parameter `--tags` to support nf-shard tags
- Added parameter `--strict_fasta_id_validation` to disable strict Fasta ID validation for NCBI assemblies with Fasta IDs containing periods ('.')
- Fixed an issue where some genes with introns shorter than 10bp were not marked for exclusion correctly #89
- Fixed an issue where Fasta IDs such as `>lcl|Lmh1Chr1` caused the pipeline to fail due to IDs being changed by BRAKER3 #161
- Fixed an issue which caused a pipeline crash where there were empty lines in the protein evidence file #166
- Fixed an issue where the pipeline failed when liftoff Fasta and Gff had the same file name and the Gff extension was `gff3` #160
- Fixed an issue where STAR_ALIGN failed on large genomes in sorting the output BAM #163
- Nextflow!>=25.04.0
- [email protected]
| Tool | Old Version | New Version |
|---|---|---|
| agat | 1.4.2 | 1.5.1 |
| ltr_harvest_parallel | 1.1 | 1.2 |
| braker3 | v3.0.7.6 | v3.0.7.5 |
| busco | 5.8.3 | 6.0.0 |
| diamond | 2.1.8 | 2.1.12 |
| fastp | 0.24.0 | 1.0.1 |
| multiqc | 1.28 | 1.32 |
| htslib | 1.21 | 1.22.1 |
| samtools | 1.21 | 1.22.1 |
| sortmerna | 4.3.6 | 4.3.7 |
| umi_tools | 1.1.5 | 1.1.6 |
- Now using `agat_sp_complement_annotations.pl` to merge Liftoff and BRAKER models to avoid creation of iso-forms due to overlap of separate genes #153
- Nextflow!>=24.04.2
- [email protected]
- Gene models from BRAKER with invalid ORF(s) are now removed #151
- Demoted nf-schema to 2.2.0 to avoid errors with latest Nextflow versions
- Fixed a nextflow syntax issue in `conf/modules.config`
- Nextflow!>=24.04.2
- [email protected]
- Added parameter `append_genome_prefix_to_feature_ids` which allows the user to add genome prefixes defined in the assemblysheet to the final Gff/Fasta files #135
- Updated nf-core template to 3.2.0
- Fixed an issue where `filter_genes_by_aa_length` was not correctly applied when the CDS was shorter than the transcript by replacing `GFFREAD` with `AGAT_SPFILTERBYORFSIZE` #139
- Nextflow!>=24.04.2
- [email protected]
| Tool | Old Version | New Version |
|---|---|---|
| agat | 1.4.0 | 1.4.2 |
| braker3 | v3.0.7.5 | v3.0.7.6 |
| busco | 5.7.1 | 5.8.3 |
| coreutils | 8.30 | 9.5 |
| fastp | 0.23.4 | 0.24.0 |
| multiqc | 1.25.1 | 1.28 |
| seqkit | 2.8.1 | 2.9.0 |
| htslib | 1.18 | 1.21 |
| samtools | 1.18 | 1.21 |
| star | 2.7.10a | 2.7.11b |
- Added cDNA and CDS outputs to <OUTPUT_DIR>/annotations/ directory #118
- Added parameter `add_attrs_to_proteins_cds_fastas`
- Added parameter `filter_genes_by_aa_length` with default set to `24` which allows removal of genes with ORFs shorter than 24 #125
- Fixed an issue where TSEBRA failed because LIFTOFF lifted non-protein coding genes #121
- Switched branch name from `master` to `main` in the GHA CIs
- Fixed an issue in `genepal_report.Rmd` which caused the pangene matrix plot to fail when the number of clusters exceeded 65536 #124
- Fixed an issue where `GENEPALREPORT` process failed due to OOM kill signal from SLURM #123
- Fixed an issue where Gff merge after liftoff failed when one of the Gff files did not contain any genes
- Fixed an issue where `gxf_fasta_agat_spaddintrons_spextractsequences` crashed due to short introns #89
- Nextflow!>=24.04.2
- [email protected]
- Removed parameter `add_attrs_to_proteins_fasta`
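The `filter_genes_by_aa_length` entry above describes a length cut-off on ORFs. The following is only a minimal sketch of that idea, not the pipeline's implementation (which the changelog says relies on `AGAT_SPFILTERBYORFSIZE`). The function name `filter_by_aa_length`, the dictionary input, and the choice to ignore a trailing `*` stop symbol are all assumptions made for illustration.

```python
def filter_by_aa_length(proteins, min_aa=24):
    """Keep only proteins with at least `min_aa` amino acids.

    `proteins` maps a gene ID to its protein sequence (hypothetical input
    shape). A trailing '*' stop symbol is not counted; this is an assumption.
    """
    return {
        gene_id: seq
        for gene_id, seq in proteins.items()
        if len(seq.rstrip("*")) >= min_aa
    }


example = {"geneA": "M" * 24, "geneB": "M" * 23 + "*"}
kept = filter_by_aa_length(example)
```

With the default threshold of 24, `geneA` is retained and `geneB` is dropped because it has only 23 residues before the stop symbol.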
- Added MultiQC #65
- Updated nf-core template to 3.0.2 #66
- Integrated nf-test into pipeline CI #68
- Updated the flowchart #87
- Added a large test dataset for the `test_full` profile #90
- Now `.gff.gz` and `.gff3.gz` inputs are also allowed for the `benchmark` column in `--input`
- Now removing liftoff genes with any intron shorter than 10bp #89
- Now also removing `rRNA` and `tRNA` after liftoff as the downstream logic in the pipeline cannot correctly handle these
- Now skipping FastQC by default #98
- Added an HTML report #44
- Added content type as text/html for the MultiQC and genepal reports
- Added sra-tools for RNASeq data download #102
- Now using `${meta.id}_trim` as prefix for `FASTQC` files
- Updated citations to include DOIs
- Fixed a bug where FASTQ versions were not correctly captured
- Now using the correct out channel from `STAR_ALIGN`. This bug was introduced by a module update during the development of this version #74
- Fixed OrthoFinder results copy failure on AWS #108
- Nextflow!>=24.04.2
- [email protected]
- Resource parameters have been removed: `max_memory`, `max_cpus`, `max_time`
- Removed a number of unnecessary parameters: `monochromeLogs`, `config_profile_contact`, `config_profile_url`, `validationFailUnrecognisedParams`, `validationLenientMode`, `validationSchemaIgnoreParams`, `validationShowHiddenParams`, `validate_params`
- Removed `extra_fastp_args` and replaced it with `fastp_extra_args`
- Removed and replaced `skip_fastp` and `skip_fastqc` with `fastp_skip` and `fastqc_skip` #82
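The entry above about removing liftoff genes with any intron shorter than 10bp (#89) can be illustrated with a small sketch. It assumes exons are given as 1-based inclusive `(start, end)` pairs, as in GFF; the helper names `intron_lengths` and `has_short_intron` are hypothetical and do not come from the pipeline.

```python
def intron_lengths(exons):
    """Return intron lengths for one transcript.

    `exons` is a list of (start, end) tuples in 1-based inclusive
    coordinates. An intron spans the gap between consecutive exons.
    """
    ordered = sorted(exons)
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def has_short_intron(exons, min_intron=10):
    """True if any intron is shorter than `min_intron` bases."""
    return any(length < min_intron for length in intron_lengths(exons))
```

For example, exons `(1, 100)` and `(106, 200)` leave a 5 bp intron, so a gene carrying that transcript would be marked for removal under a 10 bp threshold.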
- Added `orthofinder_annotations` param
- Added `FASTA_GFF_ORTHOFINDER` sub-workflow
- Added evaluation by BUSCO #41
- Included common tax ids for eggnog mapper #27
- Implemented hierarchical naming scheme: geneI.tJ, geneI.tJ.exonK, geneI.tJ.cdsK #19, #34
- Now sorting list of bam and list of fastq before cat to avoid resume cache misses
- Allowed BAM files for RNA evidence #3
- Added `GXF_FASTA_AGAT_SPADDINTRONS_SPEXTRACTSEQUENCES` sub-workflow for splice type statistics #11
- Changed `orthofinder_annotations` from FASTA/GFF to protein FASTA #43
- Added param `enforce_full_intron_support` to turn on/off strict model purging by TSEBRA #21
- Added param `filter_liftoff_by_hints` to evaluate liftoff models with TSEBRA to make sure they have the same level of evidence as BRAKER #28
- Added a script to automatically check module version updates
- Reduced `BRAKER3` threads to 8 #55
- Now the final annotations are stored in the `annotations` folder #53
- Now a single `fasta` file can be directly specified for `protein_evidence`
- `eggnogmapper_db_dir` is not a required parameter anymore
- `eggnogmapper_tax_scope` is now set to 1 (root div) by default
- Added a `test` profile based on public data
- Added parameter `add_attrs_to_proteins_fasta` to enable/disable addition of decoded gff attributes to proteins fasta #58
- Added a check for input assemblies. If an assembly is smaller than 1 MB (or 300KB in zipped format), the pipeline errors out before starting the downstream processes #47
- Now `REPEATMASKER` GFF output is saved via `CUSTOM_RMOUTTOGFF3` #54
- Added `benchmark` column to the input sheet and used `GFFCOMPARE` to perform benchmarking #63
- Added `SEQKIT_RMDUP` to detect duplicate sequence and wrap the fasta to 80 characters
- Updated parameter section labels for annotation and post-annotation filtering #64
- Updated modules and sub-workflows
- Fixed BRAKER spellings #36
- Fixed liftoff failure when lifting off from a single reference #40
- Added versions from GFF_STORE sub-workflows #33
- NextFlow!>=23.04.4
- nf-validation=1.1.3
- Renamed `external_protein_fastas` param to `protein_evidence`
- Renamed `fastq` param to `rna_evidence`
- Renamed `braker_allow_isoforms` param to `allow_isoforms`
- Moved liftoffID from gene level to mRNA/transcript level
- Moved `version_check.sh` to `.github/version_checks.sh`
- Removed dependency on https://github.com/kherronism/nf-modules.git for `BRAKER3` and `REPEATMASKER` modules which are now installed from https://github.com/GallVp/nxf-components.git
- Removed dependency on https://github.com/PlantandFoodResearch/nxf-modules.git
- Now the final annotations are not stored in the `final` folder
- Now BRAKER3 outputs are not saved by default #53 and saved under `etc` folder when enabled
- Removed `local` profile. Local executor is the default when no executor is specified. Therefore, the `local` profile was not needed.
- Removed `CUSTOM_DUMPSOFTWAREVERSIONS`
- `pipeline_info/software_versions.yml` has been replaced with `pipeline_info/genepal_software_mqc_versions.yml`
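The hierarchical naming scheme listed above (geneI.tJ, geneI.tJ.exonK, geneI.tJ.cdsK; #19, #34) can be sketched as follows. This is an illustration only: the function name `hierarchical_ids`, the 1-based counters, and the input shape are assumptions, and the pipeline's actual ID assignment may differ.

```python
def hierarchical_ids(gene_index, transcripts):
    """Build feature IDs following the geneI.tJ[.exonK|.cdsK] pattern.

    `transcripts` is a list of (n_exons, n_cds) tuples, one per transcript
    of the gene (hypothetical input shape). Counters start at 1, which is
    an assumption.
    """
    ids = []
    for j, (n_exons, n_cds) in enumerate(transcripts, start=1):
        transcript_id = f"gene{gene_index}.t{j}"
        ids.append(transcript_id)
        ids.extend(f"{transcript_id}.exon{k}" for k in range(1, n_exons + 1))
        ids.extend(f"{transcript_id}.cds{k}" for k in range(1, n_cds + 1))
    return ids


ids = hierarchical_ids(1, [(2, 1)])
```

Because each child ID embeds its parent's ID, the parent of any exon or CDS feature can be recovered by dropping the last dot-separated component.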
- Added a stub test to evaluate the case where an assembly is soft masked but has no annotations
- Fixed a bug where `is_masked` was ignored by the pipeline
- Fixed a bug in param validation which allowed specification of `braker_hints` without `braker_gff3`
- NextFlow!>=23.04.4
- nf-validation=1.1.3
- Increased time limit for REPEATMODELER_REPEATMODELER to 5 days
- Now removing comments from fasta file before feeding it to BRAKER, and added tests for the perl one liner
- Fixed CHANGELOG version check failure in `version_check.sh`
- Increased the SLURM job time limit to 14 days
- NextFlow!>=23.04.4
- nf-validation=1.1.3
- Increased time limit for REPEATMODELER_REPEATMODELER to 3 days, REPEATMASKER to 2 days, EDTA_EDTA to 7 days, BRAKER3 to 7 days and EGGNOGMAPPER to 1 day
- NextFlow!>=23.04.4
- nf-validation=1.1.3
- Added changelog and semantic versioning
- Changed license to MIT
- Updated `.editorconfig`
- Moved .literature to test/ branch
- Renamed `genepal_local` to `local_genepal`
- Renamed `genepal_pfr` to `pfr_genepal`
- Added versioning checking
- Updated github workflow to use pre-commit instead of prettier and editorconfig check
- Added central singularity cache dir for pfr config
- Added `SORTMERNA_INDEX` before `SORTMERNA`
- Fixed sample contamination bug introduced by `file.simpleName`
- Now using empty files for stub testing in CI
- Now BRAKER can be skipped by including BRAKER outputs from previous runs in the `target_assemblies` param
- Added `gffcompare` to merge liftoff annotations
- Renamed `samplesheet` param to `fastq`
- Now using assemblysheet in combination with nf-validation for assembly input
- Added nextflow_schema.json
- Now using nf-validation to validate fastqsheet provided by params.fastq
- Moved `manifest.config` and `reporting_defaults.config` content to `nextflow.config`
- Now using a txt file for `params.external_protein_fastas`
- Now using nf-validation for `params.liftoff_annotations`
- Now using nf-validation for all the parameters
- Added `PURGE_BRAKER_MODELS` sub-workflow
- Added `GFF_EGGNOGMAPPER` sub-workflow
- Now using a custom version of `GFFREAD` which supports `meta` and `fasta`
- Now using TSEBRA to purge models which do not have full intron support from BRAKER hints
- Added params `eggnogmapper_evalue` and `eggnogmapper_pident`
- Added `PURGE_NOHIT_BRAKER_MODELS` sub-workflow
- Now merging BRAKER and liftoff models before running eggnogmapper
- Added `GFF_MERGE_CLEANUP` sub-workflow
- Now using `description` field to store notes and textual annotations in the gff files
- Now using `mRNA` in place of `transcript` in gff files
- Now `eggnogmapper_purge_nohits` is set to `false` by default
- Added `GFF_STORE` sub workflow
- `external_protein_fastas` and `eggnogmapper_db_dir` are not mandatory parameters
- Added contributors
- Added a document for the pipeline parameters
- Updated `pfr_genepal` and `pfr/profile.config`
- Now using local tests/stub files for GitHub CI
- Now removing iso-forms left by TSEBRA using `AGAT_SPFILTERFEATUREFROMKILLLIST`
- Added `pyproject.toml`
- Now using PFAMs from eggnog if description is '-'
- Removed liftoff models with `valid_ORF=False`
- Updated license text to include 'Copyright (c) 2024 The New Zealand Institute for Plant and Food Research Limited'
- NextFlow!>=23.04.4
- nf-validation=1.1.3