tacular-omics
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
Lines changed: 214 additions & 0 deletions
diff --git a/.github/workflows/draft-pdf.yml b/.github/workflows/draft-pdf.yml
Lines changed: 29 additions & 0 deletions
diff --git a/.github/workflows/python-package.yml b/.github/workflows/python-package.yml
Lines changed: 53 additions & 0 deletions
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
Lines changed: 37 additions & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,214 @@
+
+
+
+# ProForma Notation - Basic Summary
+
+1 - Never make summary documentation unless specifically asked.
+2 - Check the Makefile for commands.
+
+## Documentation & Comments
+
+### Docstring Format
+
+Use **Google-style docstrings** but keep them minimal - type hints handle the rest.
+
+**Simple function:**
+```python
+def calculate_mass(sequence: str, charge: int = 1) -> float:
+    """Calculate the mass-to-charge ratio of a peptide."""
+```
+
+**When you need more detail:**
+```python
+def find_isotopes(mz: float, tolerance: float = 0.01) -> list[Peak]:
+    """Find isotopic peaks within the tolerance window.
+    
+    Uses a greedy algorithm to identify the most intense peaks first,
+    then searches for their isotopic patterns.
+    """
+```
+
+**Classes:**
+```python
+class Peptide:
+    """Represents a peptide sequence with ProForma modifications."""
+```
+
+### What to Document
+
+- **One-line summary** for all public functions/classes
+- **Additional details** only when the implementation is non-obvious
+- **Don't repeat** what's already in type hints
+- **Private functions** (`_name`) can skip docstrings if obvious
+
+### Building Docs
+```bash
+cd docs
+make html
+# View at docs/_build/html/index.html
+```
+
+See **proforma.schema.json** for the full ProForma 2.0 JSON object specification.
+
+## What is ProForma?
+
+ProForma is a **standardized text notation for representing peptides and proteins with modifications**. It's designed to be both human-readable and machine-parsable, allowing scientists to precisely describe modified peptide sequences in mass spectrometry data.
+
+## Core Concept
+
+Think of it as a way to write: **"amino acid sequence + where modifications are located + what those modifications are"**
+
+## Basic Examples
+
+### 1. Simple Unmodified Peptide
+```
+PEPTIDE
+```
+Just amino acids using standard one-letter codes (A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y)
+
+### 2. Peptide with Modification
+```
+PEM[Oxidation]TIDE
+```
+- Methionine (M) is oxidized
+- Modifications go in square brackets `[]` right after the modified amino acid
+
+### 3. Multiple Modifications
+```
+PEM[Oxidation]TIS[Phospho]DE
+```
+- M is oxidized
+- S is phosphorylated
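
The bracket convention in the examples above can be illustrated with a small sketch. This is not the project's parser: `split_residues` is a hypothetical helper, and it assumes only residue-level tags, with no nested brackets, terminal modifications, or advanced features.

```python
import re

# One residue letter, optionally followed by a single [tag].
# Assumption: the input contains only residue-level modifications.
_RESIDUE = re.compile(r"([A-Z])(?:\[([^\]]+)\])?")


def split_residues(proforma: str) -> tuple[str, dict[int, str]]:
    """Return the bare sequence and a map of 0-based residue index to tag."""
    sequence: list[str] = []
    mods: dict[int, str] = {}
    for match in _RESIDUE.finditer(proforma):
        residue, tag = match.groups()
        if tag is not None:
            # The tag belongs to the residue immediately before the bracket.
            mods[len(sequence)] = tag
        sequence.append(residue)
    return "".join(sequence), mods


sequence, mods = split_residues("PEM[Oxidation]TIS[Phospho]DE")
print(sequence, mods)
```

Indices are 0-based here purely as an illustrative choice; the notation itself does not number residues.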
+
+### 4. Terminal Modifications
+```
+[Acetyl]-PEPTIDE
+[iTRAQ4plex]-PEPTIDE-[Amidated]
+```
+- N-terminal modifications: `[mod]-` before sequence
+- C-terminal modifications: `-[mod]` after sequence
+
+## Ways to Specify Modifications
+
+ProForma supports multiple ways to describe the same modification:
+
+```
+EM[Oxidation]TIDE              # By name (Unimod)
+EM[UNIMOD:35]TIDE              # By accession number
+EM[+15.995]TIDE                # By mass change
+EM[Formula:O]TIDE              # By chemical formula
+```
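
The four forms above can be told apart from the tag text alone. The following is a minimal sketch under that assumption; `classify_tag` and its return labels are hypothetical, and it treats any `PREFIX:value` other than `Formula:` as an accession, which is a simplification.

```python
def classify_tag(tag: str) -> str:
    """Classify the text inside a modification bracket.

    Returns one of "mass", "formula", "accession", or "name".
    """
    if tag.startswith(("+", "-")):
        float(tag)  # raises ValueError when the mass change is not numeric
        return "mass"
    prefix, sep, _value = tag.partition(":")
    if sep:
        # Assumption: only "Formula" is special; other prefixes are accessions.
        return "formula" if prefix == "Formula" else "accession"
    return "name"


for tag in ("Oxidation", "UNIMOD:35", "+15.995", "Formula:O"):
    print(tag, "->", classify_tag(tag))
```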
+
+## Key Advanced Features
+
+### Ambiguous Modification Position
+When you know a modification exists but not exactly where:
+```
+[Phospho]?PEPTIDE              # Phospho is somewhere, location unknown
+```
+
+### Multiple Possible Sites
+```
+PEPT[Phospho#g1]IS[#g1]DE     # Phospho is on either T or S
+```
+
+### Labile Modifications
+Modifications that fall off during fragmentation:
+```
+{Glycan:Hex}PEPTIDE            # Glycan present but lost in MS2
+```
+
+### Cross-linked Peptides
+This is partially handled at the parsing level but will not be implemented in the codebase. Don't worry about this too much.
+```
+PEPTK[#XL1]IDE//SEQK[#XL1]    # Two peptides linked together
+```
+
+### Chimeric Spectra
+Multiple peptides in the same spectrum.
+This is partially handled at the parsing level but will not be implemented in the codebase. Don't worry about this too much.
+```
+PEPTIDE+SEQUENCE               # Two co-eluting peptides
+```
+
+### Charge States
+```
+PEPTIDE/2                      # Charge state +2
+```
+
+### Charge Adducts
+```
+PEPTIDE/[Na+:z+1]              # Sodium adduct with +1 charge
+PEPTIDE/[Na+:z+1^2]            # added 2 times (total charge: +2)
+EPT[Formula:Zn:z+2]IDE/[Na:z+1^2] # total +4
+```
+
+A charge state and a charge adduct cannot both be specified on the same sequence.
+
+```
+PEPTIDE/[Na+:z+1^2]              # Sodium adduct with +1 charge, applied twice (total charge: +2)
+```
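
The total charge contributed by an adduct is its per-adduct charge multiplied by the `^count` repeat. The sketch below works that out for the bracket forms shown in these examples only; the pattern and the `adduct_charge` name are assumptions inferred from the examples, not taken from the specification or the codebase.

```python
import re

# Assumed shape, inferred from the examples: [<formula>:z<signed charge>^<count>]
_ADDUCT = re.compile(
    r"\[(?P<formula>[^:\]]+):z(?P<charge>[+-]\d+)(?:\^(?P<count>\d+))?\]"
)


def adduct_charge(adduct: str) -> int:
    """Return the total charge of one adduct bracket, e.g. "[Na+:z+1^2]"."""
    match = _ADDUCT.fullmatch(adduct)
    if match is None:
        raise ValueError(f"Unrecognized adduct: {adduct!r}")
    count = int(match["count"] or 1)  # a missing ^count means a single adduct
    return int(match["charge"]) * count


print(adduct_charge("[Na+:z+1^2]"))
```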
+
+
+
+## Compliance Levels
+
+ProForma has different levels of complexity:
+
+1. **Base-ProForma** - Simple sequences with basic modifications
+2. **Level 2-ProForma** - Adds ambiguity, formulas, delta masses
+3. **Extensions** - Specialized features for:
+   - Top-down proteomics
+   - Cross-linking
+   - Glycoproteomics
+   - Advanced complexity
+
+## Common Use Cases
+
+### Bottom-up Proteomics
+```
+[Acetyl]-EM[Oxidation]EVTSES[Phospho]PEK
+```
+Typical tryptic peptide with PTMs
+
+### Top-down Proteomics
+```
+<[Oxidation]@M>FULLPROTEINSEQUENCE...
+```
+Full protein with fixed modifications
+
+### Glycopeptide
+```
+NEEYN[Glycan:Hex5HexNAc4]K
+```
+N-glycosylation site
+
+### Cross-linking
+```
+PEPTK[XLMOD:02001#XL1]IDE//SEQK[#XL1]
+```
+DSS cross-link between two lysines
+
+## Why ProForma?
+
+**Before ProForma:** Everyone used different formats to describe modified peptides
+- Hard to share data
+- Hard to write software that works with different tools
+- Ambiguous representations
+
+**With ProForma:** Standard notation means:
+- Data can be easily exchanged between labs
+- Software tools can interoperate
+- Unambiguous communication of results
+- Integration with databases (Unimod, PSI-MOD, etc.)
+
+## Key Design Principles
+
+1. **Human readable** - Scientists can read and understand it
+2. **Machine parsable** - Software can reliably parse it
+3. **Extensible** - Can add new features as needs evolve
+4. **Precise** - Captures uncertainty and ambiguity when present
+5. **Standards-based** - Uses controlled vocabularies (Unimod, PSI-MOD, etc.)
+
+
@@ -0,0 +1,29 @@
+name: Draft PDF
+on:
+  push:
+    branches:
+      - paper
+    paths:
+      - 'paper/**'
+  workflow_dispatch:
+jobs:
+  paper:
+    runs-on: ubuntu-latest
+    name: Paper Draft
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Build draft PDF
+        uses: openjournals/openjournals-draft-action@master
+        with:
+          journal: joss
+          # This should be the path to the paper within your repo.
+          paper-path: paper/paper.md
+      - name: Upload
+        uses: actions/upload-artifact@v4
+        with:
+          name: paper
+          # This is the output path where Pandoc will write the compiled
+          # PDF. Note, this should be the same directory as the input
+          # paper.md
+          path: paper/paper.pdf
@@ -0,0 +1,53 @@
+# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python
+
+name: Python package
+
+on:
+  push:
+    paths:
+      - 'src/**'
+      - 'tests/**'
+  workflow_dispatch:
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.12"]
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v5
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install uv
+      uses: astral-sh/setup-uv@v4
+    - name: Install just
+      uses: extractions/setup-just@v2
+    - name: Install dependencies
+      run: just install-all
+    - name: Lint with ruff
+      run: just lint
+    - name: Type check with ty
+      run: just check
+    - name: Test with pytest
+      run: just test-cov codecov-tests
+    - name: Upload coverage reports to Codecov
+      uses: codecov/codecov-action@v5
+      with:
+        token: ${{ secrets.CODECOV_TOKEN }}
+        slug: tacular-omics/paftacular
+        fail_ci_if_error: false
+    - name: Upload test results to Codecov
+      if: ${{ !cancelled() }}
+      uses: codecov/codecov-action@v5
+      with:
+        token: ${{ secrets.CODECOV_TOKEN }}
+        slug: tacular-omics/paftacular
+        report_type: test_results
+        fail_ci_if_error: false
@@ -0,0 +1,37 @@
+# This workflow will upload a Python Package using Twine when a release is created
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries
+
+# This workflow uses actions that are not certified by GitHub.
+# They are provided by a third-party and are governed by
+# separate terms of service, privacy policy, and support
+# documentation.
+
+name: Upload Python Package
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  contents: read
+
+jobs:
+  deploy:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python
+      uses: actions/setup-python@v5
+      with:
+        python-version: '3.x'
+    - name: Install uv
+      uses: astral-sh/setup-uv@v4
+    - name: Build package with uv
+      run: uv build
+    - name: Publish package
+      uses: pypa/gh-action-pypi-publish@release/v1
+      with:
+        user: __token__
+        password: ${{ secrets.PYPI_API_TOKEN }}
@@ -1,3 +1,47 @@
+# Python-generated files
+__pycache__/
+*.py[oc]
+build/
+dist/
+wheels/
+*.egg-info
+
+# IDEs and editors
+.idea/
+.vscode/
+.zed/
+
+# Virtual environments
+.venv
+
+docs/_build/*
+
+# Testing and coverage reports
+try*.py
+.coverage
+.ruff_cache/
+.mypy_cache/
+.pytest_cache/
+.ipynb_checkpoints/
+htmlcov/
+uv.lock
+
+# Other
+coverage.xml
+htmlcov/
+.coverage
+*.cover
+.pytest_cache/
+junit.xml
+
+
+*GNOme.obo
+*PSI-MOD.obo
+*UNIMOD.obo
+*XLMod.obo
+
+*try_*.py
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -113,3 +157,11 @@ dmypy.json
 # editors
 .vscode/
 .idea/
+
+# Other
+coverage.xml
+htmlcov/
+.coverage
+*.cover
+.pytest_cache/
+junit.xml