Evaluation doesn't filter test data by observing run from config #14

@jericho-cain

Description


Current Behavior

The evaluation step in run_clean_pipeline.py loads all CWT files from data/processed/ regardless of the runs parameter specified in the config file:

# Line 440 in run_clean_pipeline.py
cwt_files = list(processed_dir.glob("*.npy"))  # Loads everything

The manifest is consulted only to determine segment_type (signal vs noise); it never filters by observing run. This means:

  • The config setting signals.runs: [O4a, O4b] only affects downloading
  • Evaluation uses whatever is in data/processed/, including stale O1/O2/O3 data left over from previous runs
  • Users can inadvertently test on multi-run data even with an O4-only config (see the sketch below)
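
For concreteness, a minimal sketch of this selection logic (the manifest path, schema, and field names below are assumptions based on the description, not the project's actual code):

from pathlib import Path
import json

# Sketch only: manifest location and layout are assumed.
processed_dir = Path("data/processed")
manifest = json.loads(Path("data/manifest.json").read_text())  # path assumed

# The glob picks up every .npy file ever processed, including leftovers
# from earlier O1/O2/O3 downloads.
cwt_files = list(processed_dir.glob("*.npy"))

# The manifest is consulted only for segment_type (signal vs noise);
# nothing here looks at the observing run.
signal_files = [f for f in cwt_files
                if manifest.get(f.stem, {}).get("segment_type") == "signal"]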

Problem

This confused a user who had specified O4-only runs in their config but was still testing on O1-O4 signals, because old processed data from previous runs remained in the directory.

Their workaround was to manually delete old processed data and modify the downloader code.

Proposed Enhancement

Add optional run-filtering to the evaluation code so it respects the config's runs parameter (a sketch follows the list):

  1. Store observing run metadata in the manifest during download
  2. During evaluation, filter signal_files by the runs specified in config
  3. Make this behavior optional (e.g., pipeline.filter_by_config_runs: true)
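
A minimal sketch of how steps 2-3 could look, assuming the manifest gains a per-entry observing-run field and the config is a nested dict. The field name observing_run and the manifest layout are illustrative; pipeline.filter_by_config_runs and signals.runs come from this issue:

# Illustrative sketch; observing_run and the manifest layout are hypothetical.
def filter_by_runs(signal_files, manifest, config):
    """Drop signal files whose manifest entry is outside the configured runs."""
    if not config.get("pipeline", {}).get("filter_by_config_runs", False):
        return signal_files  # opt-in: default keeps current behavior

    allowed = set(config.get("signals", {}).get("runs", []))
    return [f for f in signal_files
            if manifest.get(f.stem, {}).get("observing_run") in allowed]

Making the filter opt-in keeps existing pipelines unchanged while letting O4-only configs evaluate on O4-only data.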

Workaround (Current)

Users must manually clear data/processed/ before downloading with a new run configuration:

rm -rf data/processed/*

Priority

Low - the workaround is simple and the project achieved its research goals. This would be a quality-of-life improvement for future users.

