Commit c871bbc

Merge pull request #62 from bcgsc/l_polish_mode
Allow -l to be used in polish mode
2 parents ab94419 + 6882ab6 commit c871bbc

2 files changed

Lines changed: 23 additions & 20 deletions

README.md

Lines changed: 13 additions & 12 deletions
@@ -153,13 +153,14 @@ optional arguments:
  ```
  run-ntedit polish --help
  usage: run-ntedit polish [-h] --draft DRAFT --reads READS [-i {0,1,2,3,4,5}] [-d {0,1,2,3,4,5,6,7,8,9,10}] [-x X] [--cap CAP] [-m {0,1,2}] [-a {0,1}] -k K
- [--cutoff CUTOFF] [--solid] [-t T] [-z Z] [-y Y] [-j J] [-X X] [-Y Y] [-v] [-V] [-n] [-f]
+ [-l L] [--cutoff CUTOFF] [--solid] [-t T] [-z Z] [-y Y] [-j J] [-X X] [-Y Y] [-v] [-V] [-n] [-f]

  optional arguments:
  -h, --help show this help message and exit
  --draft DRAFT Draft genome assembly. Must be specified with exact FILE NAME. Ex: --draft myDraft.fa (FASTA, Multi-FASTA, and/or gzipped compatible),
  REQUIRED
- --reads READS Prefix of reads file(s). All files in the working directory with the specified prefix will be used for polishing (fastq, fasta, gz), REQUIRED
+ --reads READS Prefix of reads file(s). All files in the working directory with the specified prefix will be used for polishing (fastq, fasta, gz),
+ REQUIRED
  -i {0,1,2,3,4,5} Maximum number of insertion bases to try, range 0-5, [default=5]
  -d {0,1,2,3,4,5,6,7,8,9,10}
  Maximum number of deletions bases to try, range 0-10, [default=5]
@@ -169,16 +170,17 @@ optional arguments:
  overall (suggestion that you reduce i and d for performance)
  -a {0,1} Soft masks missing k-mer positions having no fix (1 = yes, default = 0, no)
  -k K k-mer size, REQUIRED
+ -l L input VCF file with annotated variants (e.g., clinvar.vcf)
  --cutoff CUTOFF The minimum coverage of k-mers in output Bloom filter [default=2, ignored if solid=True]
- -t T Number of threads [default=4]
  --solid Output the solid k-mers (non-erroneous k-mers), [default=False]
+ -t T Number of threads [default=4]
  -z Z Minimum contig length [default=100]
  -y Y k/y ratio for the number of edited k-mers that should be present, [default=9.000]
- -v Verbose mode, [default=False]
  -j J controls size of k-mer subset. When checking subset of k-mers, check every jth k-mer [default=3]
  -X X Ratio of number of k-mers in the k subset that should be missing in orderto attempt fix (higher=stringent) [default=0.5, if -Y is
  specified]
- -Y Y Ratio of number of k-mers in the k subset that shouldbe present to accept an edit (higher=stringent) [default=0.5, if -X is specified]
+ -Y Y Ratio of number of k-mers in the k subset that should be present to accept an edit (higher=stringent) [default=0.5, if -X is specified]
+ -v Verbose mode, [default=False]
  -V, --version show program's version number and exit
  -n, --dry-run Print out the commands that will be executed
  -f, --force Run all ntEdit steps, regardless of existing output files
@@ -187,7 +189,7 @@ optional arguments:
  ### Running ntEdit in SNV mode
  ```
  run-ntedit snv --help
- usage: run-ntedit snv [-h] --reference REFERENCE [--reads READS] [--genome GENOME [GENOME ...]] [-l L] -k K [--cutoff CUTOFF] [--solid] [-t T] [-z Z] [-y Y]
+ usage: run-ntedit snv [-h] [--reference REFERENCE] [--reads READS] [--genome GENOME [GENOME ...]] -k K [-l L] [--cutoff CUTOFF] [--solid] [-t T] [-z Z] [-y Y]
  [-j J] [-X X] [-Y Y] [-v] [-V] [-n] [-f]

  optional arguments:
@@ -198,18 +200,17 @@ optional arguments:
  polishing (fastq, fasta, gz)
  --genome GENOME [GENOME ...]
  Genome assembly file(s) for detecting SNV on --reference
- -l L input VCF file with annotated variants (e.g., clinvar.vcf)
  -k K k-mer size, REQUIRED
+ -l L input VCF file with annotated variants (e.g., clinvar.vcf)
  --cutoff CUTOFF The minimum coverage of k-mers in output Bloom filter [default=2, ignored if solid=True]
- -t T Number of threads [default=4]
  --solid Output the solid k-mers (non-erroneous k-mers), [default=False]
+ -t T Number of threads [default=4]
  -z Z Minimum contig length [default=100]
  -y Y k/y ratio for the number of edited k-mers that should be present, [default=9.000]
- -v Verbose mode, [default=False]
  -j J controls size of k-mer subset. When checking subset of k-mers, check every jth k-mer [default=3]
- -X X Ratio of number of k-mers in the k subset that should be missing in orderto attempt fix (higher=stringent) [default=0.5, if -Y is
- specified]
- -Y Y Ratio of number of k-mers in the k subset that shouldbe present to accept an edit (higher=stringent) [default=0.5, if -X is specified]
+ -X X Ratio of number of k-mers in the k subset that should be missing in orderto attempt fix (higher=stringent) [default=0.5, if -Y is specified]
+ -Y Y Ratio of number of k-mers in the k subset that should be present to accept an edit (higher=stringent) [default=0.5, if -X is specified]
+ -v Verbose mode, [default=False]
  -V, --version show program's version number and exit
  -n, --dry-run Print out the commands that will be executed
  -f, --force Run all ntEdit steps, regardless of existing output files

run-ntedit

Lines changed: 10 additions & 8 deletions
@@ -46,9 +46,7 @@ def main():
     parser_snv.add_argument("--genome",
                             help="Genome assembly file(s) for detecting SNV on --reference",
                             nargs="+")
-    parser_snv.add_argument("-l",
-                            help="input VCF file with annotated variants (e.g., clinvar.vcf)",
-                            type=str)
+
     # Arguments for polishing only
     parser_polishing.add_argument("--reads",
                                   help="Prefix of reads file(s). "
@@ -85,6 +83,9 @@ def main():
         subparser.add_argument("-k",
                                help="k-mer size, REQUIRED",
                                required=True, type=int)
+        subparser.add_argument("-l",
+                               help="input VCF file with annotated variants (e.g., clinvar.vcf)",
+                               type=str)
         subparser.add_argument("--cutoff",
                                help="The minimum coverage of k-mers in output Bloom filter "
                                     "[default=2, ignored if solid=True]",
@@ -196,6 +197,12 @@ def main():
         command += " v=1"
     else:
         command += " v=0"
+
+    if args.l:
+        if not os.path.isfile(args.l):
+            raise FileNotFoundError(f"VCF file {args.l} not found")
+        intro_string.append(f"\t-l {args.l}")
+        command += f" l={args.l}"

     # Polishing-specific parameter logs
     if args.mode == "polish":
@@ -211,11 +218,6 @@ def main():

     # SNV / Ancestry-specific parameter logs
     if args.mode == "snv":
-        if args.l:
-            if not os.path.isfile(args.l):
-                raise FileNotFoundError(f"VCF file {args.l} not found")
-            intro_string.append(f"\t-l {args.l}")
-            command += f" l={args.l}"
         if args.reads and args.genome:
             raise argparse.ArgumentTypeError("Please specify --reads OR --genome")
         if not args.reads and not args.genome:
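
For readers skimming the diff, the net effect is that `-l` moves from an SNV-only argument to one registered on every mode's subparser, and its file check runs before the mode-specific branches. The following is a minimal, standalone sketch of that pattern, not the actual `run-ntedit` code. The names `build_parser`, `vcf_option`, and the `run-ntedit-sketch` program name are hypothetical, and the loop over subparsers is an assumption inferred from the `subparser` variable in the diff.

```python
import argparse
import os


def build_parser():
    """Build a two-mode CLI in which both modes share a common option set."""
    parser = argparse.ArgumentParser(prog="run-ntedit-sketch")  # hypothetical name
    subparsers = parser.add_subparsers(dest="mode", required=True)
    parser_polishing = subparsers.add_parser("polish")
    parser_snv = subparsers.add_parser("snv")

    # Options added inside this loop are accepted by every mode
    # (assumed structure; the real script may organise this differently).
    for subparser in (parser_polishing, parser_snv):
        subparser.add_argument("-k", help="k-mer size, REQUIRED",
                               required=True, type=int)
        subparser.add_argument("-l",
                               help="input VCF file with annotated variants "
                                    "(e.g., clinvar.vcf)",
                               type=str)
    return parser


def vcf_option(args):
    """Return the ' l=<path>' command fragment, or '' when -l was not given."""
    if not args.l:
        return ""
    if not os.path.isfile(args.l):
        raise FileNotFoundError(f"VCF file {args.l} not found")
    return f" l={args.l}"
```

With this layout, `build_parser().parse_args(["polish", "-k", "25", "-l", "clinvar.vcf"])` parses in the same way as the `snv` equivalent, and `vcf_option` raises `FileNotFoundError` in either mode when the path does not exist.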
