
Configurable Transaction Date Filtering for Engines A/B/C (Validated, Documented, and Tested)#84

Merged
manuel-reyes-ml merged 17 commits into main from feature/date-filter-engines-abc
Jan 9, 2026

@manuel-reyes-ml
Owner

✅ PR Summary

🎯 Objective

What problem does this PR solve?
Add config-driven transaction date filtering (range + month selections) so Engines A/B/C consistently process only in-scope records, with centralized validation and guardrail filtering.

Expected output / deliverable
Filters are configurable via DateFilterConfig, applied in cleaning + engines, and documented with tests and notebook examples.


📌 Scope

In scope

  • Config validation for date ranges and month filters (including empty month‑list behavior).
  • Apply date filtering in Matrix/Relius cleaning and engine guardrails (A/B/C).
  • Documentation + notebook examples for date filter usage.
  • Tests for range/month filtering and missing-date handling.

Out of scope

  • UI/CLI additions beyond config‑driven filtering.

🧩 Implementation Plan (What changed)

Files changed / added

  • src/core/validators.py (date filter normalization).
  • src/core/normalizers.py (date filter helper).
  • src/cleaning/clean_matrix.py, src/cleaning/clean_relius.py (filter application).
  • src/engines/match_planid.py, src/engines/age_taxcode_analysis.py, src/engines/roth_taxable_analysis.py (guardrail filtering).
  • README.md, docs/matching_logic.md (usage docs).
  • notebooks/... (date filter examples).
  • tests/... (coverage for filtering and missing dates).

High-level approach

  1. Normalize/validate DateFilterConfig and convert months/ranges to canonical values.
  2. Apply inclusive date filtering in cleaning and add guardrails in engines A/B/C.
  3. Document usage and validate with unit tests and notebooks.
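Step 1 above can be sketched roughly as follows. This is a minimal illustration, not the code in src/core/validators.py: the names `NormalizedDateFilter`, `normalize_months`, and `normalize_date_filter` are hypothetical, and treating an empty month list the same as "All" is an assumption about the empty-list behavior.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

_MONTH_NAMES = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)
# Accept full names and three-letter abbreviations, case-insensitively.
_MONTH_LOOKUP = {name: i for i, name in enumerate(_MONTH_NAMES, start=1)}
_MONTH_LOOKUP.update({name[:3]: i for i, name in enumerate(_MONTH_NAMES, start=1)})


@dataclass(frozen=True)
class NormalizedDateFilter:
    """Hypothetical canonical form; None means 'no restriction'."""
    date_start: Optional[date] = None
    date_end: Optional[date] = None
    months: Optional[frozenset] = None


def normalize_months(months) -> Optional[frozenset]:
    """Convert month names/numbers to a frozenset of 1..12, or None for all months."""
    if months is None:
        return None
    if isinstance(months, (str, int)):
        months = [months]
    result = set()
    for item in months:
        if isinstance(item, str):
            key = item.strip().lower()
            if key == "all":
                return None
            if key.isdigit():
                number = int(key)
            elif key in _MONTH_LOOKUP:
                number = _MONTH_LOOKUP[key]
            else:
                raise ValueError(f"Unrecognized month: {item!r}")
        else:
            number = int(item)
        if not 1 <= number <= 12:
            raise ValueError(f"Month out of range: {item!r}")
        result.add(number)
    # Assumption: an empty list means "no month restriction".
    return frozenset(result) or None


def _to_date(value) -> Optional[date]:
    """Accept None, date, datetime, or an ISO 'YYYY-MM-DD' string."""
    if value is None:
        return None
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    return date.fromisoformat(value)


def normalize_date_filter(date_start=None, date_end=None, months="All") -> NormalizedDateFilter:
    start, end = _to_date(date_start), _to_date(date_end)
    if start is not None and end is not None and start > end:
        raise ValueError("date_start must not be after date_end")
    return NormalizedDateFilter(start, end, normalize_months(months))
```

Returning `None` for "no restriction" keeps downstream checks simple: a filter is active only if at least one field is not `None`.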

🧠 Data + Logic Notes

Business rules implemented / updated

  • Rule(s): Filter rows by date_start, date_end, and/or months (intersection), with missing dates excluded when filters are active.
  • Threshold(s): Inclusive date bounds; month names/numbers supported.
  • Exclusions / locks: Rows with invalid/missing dates are excluded when filtering is active.
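As an illustration of the rules above, a row-level predicate might look like the sketch below. The function name `row_in_scope` and its signature are hypothetical; the actual helper in src/core/normalizers.py may be structured differently.

```python
from datetime import date
from typing import Collection, Optional


def row_in_scope(
    txn_date: Optional[date],
    date_start: Optional[date] = None,
    date_end: Optional[date] = None,
    months: Optional[Collection[int]] = None,
) -> bool:
    """Return True if a row passes the range AND month filters (intersection)."""
    filter_active = date_start is not None or date_end is not None or bool(months)
    if not filter_active:
        return True  # no filter configured: every row is in scope
    if txn_date is None:
        return False  # missing date is excluded once any filter is active
    if date_start is not None and txn_date < date_start:
        return False  # before the inclusive lower bound
    if date_end is not None and txn_date > date_end:
        return False  # after the inclusive upper bound
    if months and txn_date.month not in months:
        return False  # outside the selected months
    return True
```

For example, with a range of 2025-01-01 through 2025-06-30 and months {1, 2}, a transaction dated 2025-03-15 falls inside the range but is still excluded by the month filter.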

Canonical schema impact

  • New columns added: n/a
  • Columns modified: n/a
  • No schema change: [x]

Data quality considerations

  • Join keys: plan_id + ssn (+ gross_amt for Engine A).
  • Null-handling: Missing/invalid dates are excluded when filters are active; missing date columns bypass filtering in cleaning.
  • Type enforcement: Dates normalized to datetime/date before filtering.
  • Idempotence: Re-running with the same filter produces the same filtered set.
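The null-handling, type-enforcement, and idempotence notes above can be illustrated with a small pandas sketch. It assumes the cleaning modules work on pandas DataFrames; `apply_date_filter` is a hypothetical name, and normalizing timestamps to midnight is an assumption made here so that the end bound is inclusive by calendar day.

```python
import pandas as pd


def apply_date_filter(df, date_col, date_start=None, date_end=None, months=None):
    """Keep rows whose date_col falls inside the inclusive range and month set."""
    if date_col not in df.columns:
        return df  # missing date column: bypass filtering
    if date_start is None and date_end is None and not months:
        return df  # no filter active: keep every row, including missing dates

    # Coerce to datetime; invalid values become NaT. Drop the time component
    # so comparisons against the bounds are by calendar day.
    dates = pd.to_datetime(df[date_col], errors="coerce").dt.normalize()

    mask = dates.notna()  # missing/invalid dates are excluded once a filter is active
    if date_start is not None:
        mask &= dates >= pd.Timestamp(date_start)
    if date_end is not None:
        mask &= dates <= pd.Timestamp(date_end)
    if months:
        mask &= dates.dt.month.isin(list(months))
    return df.loc[mask].copy()
```

Because the function only removes rows and never rewrites values, applying it a second time with the same arguments returns the same rows.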

🧪 Validation (Local)

Smoke checks

  • python -c "import src" passes
  • Key module import(s) run without error
  • Notebook cell(s) run without error

Data quality checks

  • No duplicate keys where uniqueness is required
  • Expected columns exist in canonical schema
  • Dtypes verified (dates/Int64/Float64)

Validation (executed)

python -m pytest tests/pipelines -k date_filter
python -m pytest tests/roth_taxable -k date_filter
python -m pytest tests/validators -k date_filter

Results: All tests passed successfully.

Rule verification (recommended)

  • Added/updated notebook examples for key scenarios:
    • Attained 59½ within txn_year
    • Under 59½ with term_date (attained 55 within term_year)
    • Under 59½ without term_date (attained 55 within txn_year)
    • Rollover normalization cases (B+G, G+blank, blank+G)
    • “tax-code locked” rows still evaluated for taxable/basis/year logic

Export checks (if applicable)

  • Output opens in Excel and columns populate correctly
  • Template headers found (no misalignment)

✅ Acceptance Criteria

  • AC1: Config supports date_start, date_end, months with default “All.”
  • AC2: Engine A filters Relius exported_date and Matrix txn_date.
  • AC3: Engines B/C filter Matrix txn_date.

🧯 Risks / Edge Cases

  • Potential risk: Missing dates are excluded when filtering is active.
  • Edge cases covered: Empty month configs and missing date columns in cleaning.
  • Mitigation: Validation + tests; date filter bypass when date column is absent in cleaning.

📎 Reviewer Notes

What to focus on

  • Correctness of date-filter validation and intersections (range + month).
  • Guardrail filtering in engines without duplicating cleaning behavior.
  • Documentation clarity and tests for empty/missing date cases.

Screenshots / sample outputs (optional)

pytest run output (local): [Screenshot, 2026-01-09 at 2:44:06 PM]

🔗 Linking

@manuel-reyes-ml self-assigned this on Jan 9, 2026
@manuel-reyes-ml added the following labels on Jan 9, 2026:

  • type: feature (New functionality)
  • type: docs (Documentation updates)
  • type: test (Tests / pytest coverage)
  • priority: P2 (Normal priority)
  • engine: A-reconcile (Relius ↔ Matrix reconciliation engine)
  • engine: B-age (Age-based tax code engine)
  • engine: C-roth-taxable (Roth taxable + basis engine)
  • area: config (Config/schema mappings)
  • area: cleaning (Cleaning/normalization modules)
  • area: notebooks (Notebooks / walkthroughs)

@manuel-reyes-ml
Owner Author

@codex, review!

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70fc4b19ed

Comment thread src/core/normalizers.py Outdated
Comment thread src/core/validators.py
@manuel-reyes-ml manuel-reyes-ml merged commit 9429fdb into main Jan 9, 2026
4 checks passed
@manuel-reyes-ml manuel-reyes-ml deleted the feature/date-filter-engines-abc branch January 9, 2026 20:31

Development

Successfully merging this pull request may close these issues.

Add configurable transaction date-range & month filters across Engines A/B/C (docs + tests included)
