kircherlab
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
Lines changed: 2 additions & 0 deletions
diff --git a/.release-please-manifest.json b/.release-please-manifest.json
Lines changed: 1 addition & 1 deletion
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 28 additions & 0 deletions
diff --git a/Dockerfile b/Dockerfile
Lines changed: 15 additions & 0 deletions
diff --git a/config/example_assignment_pbmm2.yaml b/config/example_assignment_pbmm2.yaml
Lines changed: 20 additions & 0 deletions
diff --git a/docs/1_getting_started/config.rst b/docs/1_getting_started/config.rst
Lines changed: 9 additions & 5 deletions
diff --git a/docs/2_workflows/assignment.rst b/docs/2_workflows/assignment.rst
Lines changed: 9 additions & 1 deletion
diff --git a/profiles/default/config.yaml b/profiles/default/config.yaml
Lines changed: 2 additions & 0 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 2 additions & 0 deletions
diff --git a/resources/long_read/CEBPRE_10k.bam b/resources/long_read/CEBPRE_10k.bam
4.34 MB
@@ -40,7 +40,9 @@ jobs:
           VALIDATE_YAML: true
           YAML_CONFIG_FILE: .yamllint.yml
           VALIDATE_SNAKEMAKE_SNAKEFMT: true
+          SNAKEMAKE_SNAKEFMT_CONFIG_FILE: pyproject.toml
           VALIDATE_R: true
+          SAVE_SUPER_LINTER_SUMMARY: true
 
   Linting:
     runs-on: ubuntu-latest
 
@@ -1,3 +1,3 @@
 {
-  ".": "0.5.9"
+  ".": "0.6.0"
 }
@@ -1,5 +1,33 @@
 # Changelog
 
+## [0.6.0](https://github.com/kircherlab/MPRAsnakeflow/compare/v0.5.9...v0.6.0) (2026-02-11)
+
+
+### ⚠ BREAKING CHANGES
+
+* renaming output files to use dots instead of underscores as file separators ([#239](https://github.com/kircherlab/MPRAsnakeflow/issues/239))
+* assignment adapter removal by length.  Assignment config, adapter and forward read (now FWD), changed ([#237](https://github.com/kircherlab/MPRAsnakeflow/issues/237))
+
+### Features
+
+* assignment adapter removal by length.  Assignment config, adapter and forward read (now FWD), changed ([#237](https://github.com/kircherlab/MPRAsnakeflow/issues/237)) ([521735e](https://github.com/kircherlab/MPRAsnakeflow/commit/521735e9b6911114bb8382fc4e7bac4dcab89b5f))
+* configurable bwa ([#244](https://github.com/kircherlab/MPRAsnakeflow/issues/244)) ([9550e22](https://github.com/kircherlab/MPRAsnakeflow/commit/9550e223bf66f3aeeef0c249148962b571a2f61b))
+* enhance trimming functionality and update config schema for adapter specifications ([798cebb](https://github.com/kircherlab/MPRAsnakeflow/commit/798cebb8ad24b6d2b38816522fe86072d1b0df04))
+* experiment adapter trimming and option to do BC (also UMI if available) selection from end of the read (FWD only) ([#238](https://github.com/kircherlab/MPRAsnakeflow/issues/238)) ([04dd683](https://github.com/kircherlab/MPRAsnakeflow/commit/04dd6831d243bf22508b390b7c1926f744eb4759))
+* fastq-join as option for merging reads (assignment workflow) ([#243](https://github.com/kircherlab/MPRAsnakeflow/issues/243)) ([093e288](https://github.com/kircherlab/MPRAsnakeflow/commit/093e288fd9a6f38df2be2f7e25ae18ecba0d3f7a))
+* implement adapter trimming functionality in experiment rules ([9fd32ce](https://github.com/kircherlab/MPRAsnakeflow/commit/9fd32cee97dc69217f88f994249ea92ed0dd5b5e))
+
+
+### Bug Fixes
+
+* correct parameter name in check_version function ([7cd50a5](https://github.com/kircherlab/MPRAsnakeflow/commit/7cd50a52f3eaa8baae4d2d6937219b58c492d8d7))
+* snakemake reverted default value handling ([#236](https://github.com/kircherlab/MPRAsnakeflow/issues/236)) ([fa5109b](https://github.com/kircherlab/MPRAsnakeflow/commit/fa5109baacc8252c9f407c9cc54e080ca72e32e4))
+
+
+### Code Refactoring
+
+* renaming output files to use dots instead of underscores as file separators ([#239](https://github.com/kircherlab/MPRAsnakeflow/issues/239)) ([0546082](https://github.com/kircherlab/MPRAsnakeflow/commit/0546082a2edca83566dcb2283db61c423533524f))
+
 ## [0.5.9](https://github.com/kircherlab/MPRAsnakeflow/compare/v0.5.8...v0.5.9) (2026-01-07)
 
 
 
@@ -192,6 +192,20 @@ COPY workflow/envs/quarto.yaml /conda-envs/b933cc1aa7c25db04635e7ec0e37f80e/envi
 RUN mkdir -p /conda-envs/1891509f8d9a8a89487739b14cd6dbef
 COPY workflow/envs/mpralib.yaml /conda-envs/1891509f8d9a8a89487739b14cd6dbef/environment.yaml
 
+# Conda environment:
+#   source: workflow/envs/pbmm2_pysam.yaml
+#   prefix: /conda-envs/2308b21c334f9613fdb840777a17d2b9
+#   ---
+#   channels:
+#       - conda-forge
+#       - bioconda
+#   dependencies:
+#       - pbmm2
+#       - pysam
+#       - biopython
+#       - python>=3.10
+RUN mkdir -p /conda-envs/2308b21c334f9613fdb840777a17d2b9
+COPY workflow/envs/pbmm2_pysam.yaml /conda-envs/2308b21c334f9613fdb840777a17d2b9/environment.yaml
 
 # Step 2: Generate conda environments
 
@@ -214,6 +228,7 @@ RUN conda env create --no-default-packages --prefix /conda-envs/a4e1b935cbca52df
 RUN conda env create --no-default-packages --prefix /conda-envs/b933cc1aa7c25db04635e7ec0e37f80e --file /conda-envs/b933cc1aa7c25db04635e7ec0e37f80e/environment.yaml
 RUN conda env create --no-default-packages --prefix /conda-envs/ae3e37bf43cbb30416a885168e10c552 --file /conda-envs/ae3e37bf43cbb30416a885168e10c552/environment.yaml
 RUN conda env create --no-default-packages --prefix /conda-envs/1891509f8d9a8a89487739b14cd6dbef --file /conda-envs/1891509f8d9a8a89487739b14cd6dbef/environment.yaml
+RUN conda env create --no-default-packages --prefix /conda-envs/2308b21c334f9613fdb840777a17d2b9 --file /conda-envs/2308b21c334f9613fdb840777a17d2b9/environment.yaml
 
 # cleanup when version changed
 ARG VERSION
 
@@ -0,0 +1,20 @@
+---
+version: "0.6"
+
+assignments:
+  exampleLongRead:
+    bc_length: 15
+    long_read_input: resources/long_read/CEBPRE_10k.bam
+    design_file: resources/long_read/CEBPRE_reference.fasta
+    alignment_tool:
+      tool: pbmm2
+      configs:
+        preset: SUBREAD
+        min_concordance: 0.9
+        alignment_start: 1
+        sequence_length: 303
+    linker: GCAAAGTGAACACATCGCTAAGCGAAAGCTAAG # linker sequence in the read after we expect the BC
+    configs:
+      test:
+        min_support: 1
+        fraction: 0.51
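
Review note on the new example config: the following is a minimal Python sketch of how the pbmm2 options above might be resolved. It assumes the defaults documented in `config.rst` (`preset: SUBREAD`, `min_concordance: 0.9`) and the documented rule that long-read assignments require `linker`. The helper name `resolve_pbmm2_assignment`, the plain-dict input, and the choice to treat `alignment_start` and `sequence_length` as mandatory are assumptions for illustration only and are not taken from MPRAsnakeflow.

```python
# Hypothetical helper, NOT part of MPRAsnakeflow: applies the documented pbmm2
# defaults and enforces that long-read assignments define a linker.
PBMM2_DEFAULTS = {"preset": "SUBREAD", "min_concordance": 0.9}


def resolve_pbmm2_assignment(assignment: dict) -> dict:
    """Return a copy of one assignment entry with pbmm2 defaults filled in."""
    tool = assignment.get("alignment_tool", {})
    if tool.get("tool") != "pbmm2":
        raise ValueError("this sketch only handles alignment_tool.tool == 'pbmm2'")
    # config.rst: linker is required for long-read data.
    if "long_read_input" in assignment and not assignment.get("linker"):
        raise ValueError("long-read assignments require 'linker'")
    # User-supplied values override the defaults.
    configs = {**PBMM2_DEFAULTS, **tool.get("configs", {})}
    # Assumption: both keys are mandatory for pbmm2, as in the example config.
    for key in ("alignment_start", "sequence_length"):
        if key not in configs:
            raise ValueError(f"missing alignment_tool.configs.{key}")
    resolved = dict(assignment)
    resolved["alignment_tool"] = {"tool": "pbmm2", "configs": configs}
    return resolved


# Mirrors config/example_assignment_pbmm2.yaml, with the optional keys left out.
example = {
    "bc_length": 15,
    "long_read_input": "resources/long_read/CEBPRE_10k.bam",
    "design_file": "resources/long_read/CEBPRE_reference.fasta",
    "alignment_tool": {
        "tool": "pbmm2",
        "configs": {"alignment_start": 1, "sequence_length": 303},
    },
    "linker": "GCAAAGTGAACACATCGCTAAGCGAAAGCTAAG",
}
resolved = resolve_pbmm2_assignment(example)
print(resolved["alignment_tool"]["configs"])
```

In the workflow itself, this kind of check would more naturally live in the config schema; the sketch only makes the interplay of defaults and required keys explicit.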
@@ -45,13 +45,13 @@ For each assignment you want to process, you must give it a name like :code:`exa
     :split_number:
         To parallelize mapping for assignment, the reads are split into :code:`split_number` files. For example, setting it to 300 means that the reads are split into 300 files, and each file is mapped in parallel. This is only useful when using a cluster. When running the workflow on a single machine, the default value should be used. The default is set to :code:`1`. (For technical reasons, when multiple assignments are defined, all will be set to the maximum defined in the config.)
     :tool:
-        Alignment tool that is used. Currently, :code:`bbmap`, :code:`bwa`, :code:`bwa-additional-filtering`, and :code:`exact` are supported. Default is :code:`bbmap`.
+        Alignment tool that is used. Currently, :code:`bbmap`, :code:`bwa`, :code:`bwa-additional-filtering`, :code:`exact`, and :code:`pbmm2` are supported. Default is :code:`bbmap`.
     :configs:
         Configurations of the alignment tool selected.
 
-        :sequence_length (exact, bbmap):
+        :sequence_length (exact, bbmap, pbmm2):
             Defines the :code:`sequence_length`, which is the length of a sequence alignment to an oligo in the design file. Only one length design is supported.
-        :alignment_start (exact, bbmap):
+        :alignment_start (exact, bbmap, pbmm2):
             Defines the start of the alignment in an oligo. When using adapters, you must set the length of the adapter. Otherwise, 1 will be the choice for most cases.
         :sequence_length (bwa, bwa-additional-filtering):
             Defines the :code:`min` and :code:`max` of a :code:`sequence_length` specification. :code:`sequence_length` is the length of a sequence alignment to an oligo in the design file. Because there can be insertions and deletions, we recommend varying it slightly around the exact length (e.g., ±5). This option enables designs with multiple sequence lengths.
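
Review note on `sequence_length` and `alignment_start`: the documentation describes two shapes for `sequence_length`, a single value (exact, bbmap, pbmm2) and a `min`/`max` pair (bwa, bwa-additional-filtering). A minimal sketch of a filter that accepts both shapes is shown below. The function name `passes_alignment_filter` is hypothetical, and how the bwa modes treat `alignment_start` is not shown in this hunk, so the sketch assumes a single 1-based start value throughout.

```python
from typing import Union

# Either a fixed length (int) or a {"min": ..., "max": ...} mapping.
LengthSpec = Union[int, dict]


def passes_alignment_filter(
    start: int, length: int, alignment_start: int, sequence_length: LengthSpec
) -> bool:
    """Check an alignment's 1-based start and aligned length against the config."""
    if start != alignment_start:
        return False
    if isinstance(sequence_length, dict):
        # min/max style: tolerates small insertions and deletions.
        return sequence_length["min"] <= length <= sequence_length["max"]
    # Fixed-length style: only one design length is supported.
    return length == sequence_length


fixed_ok = passes_alignment_filter(1, 303, 1, 303)
ranged_ok = passes_alignment_filter(1, 300, 1, {"min": 298, "max": 308})
wrong_start = passes_alignment_filter(16, 303, 1, 303)
print(fixed_ok, ranged_ok, wrong_start)
```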
@@ -69,15 +69,19 @@ For each assignment you want to process, you must give it a name like :code:`exa
             (Optional) Threshold of mismatches we investigate if we should try to rescue. Default is :code:`3`.
         :verbose (bwa-additional-filtering):
             (Optional) Print which alignments were rescued and which could not be rescued. Default is :code:`false`.
+        :preset (pbmm2):
+            (Optional) Preset for pbmm2 alignment. Default is :code:`SUBREAD`.
+        :min_concordance (pbmm2):
+            (Optional) Minimum concordance for pbmm2 alignment. Default is :code:`0.9`.
 
 :bc_length:
     Length of the barcode. Must match the length of :code:`BC`.
 :BC_rev_comp:
     (Optional) If set to :code:`true`, the barcode is reverse complemented. Default is :code:`false`.
 :linker_length:
-    (Optional) Length of the linker. Only needed if you don't have a barcode read and the barcode is in the forward read with the structure: BC+Linker+Insert. The fixed length is used for the linker after a fixed length of BC. The recommended option is :code:`linker` by defining the exact linker sequence and using cutadapt for trimming.
+    (Optional) Length of the linker. Only needed if you don't have a barcode read and the barcode is in the forward read with the structure: BC+Linker+Insert. The fixed length is used for the linker after a fixed length of BC. The recommended option is :code:`linker` by defining the exact linker sequence and using cutadapt for trimming.
 :linker:
-    (Optional) Length of the linker. Only needed if you don't have a barcode read and the barcode is in the forward read with the structure: BC+Linker+Insert. Uses cutadapt to trim the linker to get the barcode as well as the start of the insert.
+    (Required for long read, otherwise optional) The exact linker between BC and oligo. *Short read data:* Only needed if you don't have a barcode read and the barcode is in the forward read with the structure: BC+Linker+Insert. Uses cutadapt to trim the linker to get the barcode as well as the start of the insert. *Long read data:* Required! BC will be taken after the linker.
 :FWD:
     List of forward-read files in gzipped fastq format. The full or relative path to the files should be used. The same order in FWD, BC, and REV is important.
 :REV:
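
Review note on the long-read `linker` description: "BC will be taken after the linker" can be read as taking the `bc_length` bases that directly follow the linker in the read. The sketch below illustrates that reading only. It assumes an exact, forward-strand linker match on a plain sequence string; the function name `barcode_after_linker` is hypothetical, and the actual `assignment_mapping_pbmm2_getBCs` rule may work differently (for example, by using alignment coordinates).

```python
from typing import Optional


def barcode_after_linker(read: str, linker: str, bc_length: int) -> Optional[str]:
    """Return the bc_length bases following the first exact linker match, or None."""
    pos = read.find(linker)
    if pos == -1:
        return None  # linker not present in this read
    start = pos + len(linker)
    barcode = read[start:start + bc_length]
    # Reject reads that end before a full-length barcode is available.
    return barcode if len(barcode) == bc_length else None


linker = "GCAAAGTGAACACATCGCTAAGCGAAAGCTAAG"  # from the example config
read = "ACGT" + linker + "AAACCCGGGTTTAAA" + "TTTT"  # toy read, not real data
barcode = barcode_after_linker(read, linker, 15)
print(barcode)
```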
 
@@ -64,7 +64,12 @@ Example of an assignment file using exact matches and read 1 with BC, linker, an
 .. literalinclude:: ../../config/example_assignment_exact_linker.yaml
    :language: yaml
 
-If you want to use the strand sensitivity option (e.g., testing enhancers in both directions), you can add the following to the config file: :code:`strand_sensitive: {enable: true}`. Otherwise, MPRAsnakeflow will give you an error because it cannot handle the same sequences in both sense and antisense directions. This is an issue with the mappers because they do not consider the strand and will always call your read ambiguous due to multiple matches.
+Example of an assignment file using long read data with pbmm2 mapping:
+
+.. literalinclude:: ../../config/example_assignment_pbmm2.yaml
+   :language: yaml
+
+If you want to use the strand sensitivity option (e.g., testing enhancers in both directions), you can add the following to the config file: :code:`strand_sensitive: {enable: true}`. Otherwise, MPRAsnakeflow will give you an error because it cannot handle the same sequences in both sense and antisense directions. This is an issue with the mappers because they do not consider the strand and will always call your read ambiguous due to multiple matches. **Not available for long read data.**
 
 Snakemake
 ============================
@@ -118,6 +123,9 @@ Rules run by Snakemake in the assignment utility:
 - **assignment_mapping_bwa_ref**: Create mapping reference for BWA from design file.
 - **assignment_mapping_exact**: Map the reads to the reference and sort using exact match.
 - **assignment_mapping_exact_reference**: Create reference to map the exact design
+- **assignment_mapping_pbmm2_align**: Align long reads (BAM or FASTA) to reference using pbmm2.
+- **assignment_mapping_pbmm2_getBCs**: Extract barcodes from aligned long reads. Produces the standard barcode TSV for downstream collection and filtering.
+- **assignment_mapping_pbmm2_index**: Create pbmm2 index from design reference.
 - **assignment_merge_NGmerge**: Merge the FWD, REV and BC fastq files into one using NGmerge.
 - **assignment_merge_fastqjoin**: Merge the FWD, REV and BC fastq files into one using fastq-join.
 - **assignment_preprocessing_adapter_remove**: Remove adapter sequence from the reads (3' or 5'). Uses cutadapt to trim adapters based on the primer direction.
 
@@ -42,6 +42,8 @@ set-resources:
   assignment_mapping_bbmap:
     runtime: 240
     mem: 10G
+  assignment_mapping_pbmm2_align:
+    runtime: 240
   assignment_collect:
     runtime: 2160
     mem: 10G
 
@@ -0,0 +1,2 @@
+[tool.snakefmt]
+line_length = 127