ci: add Python data tools workflow #714
tianjianjiang wants to merge 2 commits into refactor/py3_migration
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test data files
        run: make check
        working-directory: Source/Data
Check warning
Code scanning / CodeQL
Workflow does not contain permissions
Copilot Autofix
AI 6 months ago
To fix this problem, add an explicit permissions block to limit the default permissions of the GITHUB_TOKEN used by the workflow.
The best way is to add this block at the top level of the workflow (the YAML root), so it applies to all jobs unless overridden. A minimal starting point for most workflows is contents: read, which suffices for read-only steps such as checking out and running code without modifying repository contents or interacting with issues or pull requests.
You should add:
    permissions:
      contents: read
immediately after the name: field and before the on: field, to be clear and conventional.
No changes to imports, methods, or definitions are required since this is GitHub workflow YAML.
    @@ -1,4 +1,6 @@
     name: Python Data Tools
    +permissions:
    +  contents: read

     on:
       push:
PR Reviewer Guide 🔍
Here are some key observations to aid the review process:

PR Code Suggestions ✨
Explore these optional code suggestions:
Code Review: Python Data Tools Workflow

Overview
This PR introduces a dedicated CI workflow for Python data tools, which is a good optimization strategy to avoid triggering expensive macOS runners for Python-only changes. The implementation is straightforward and follows GitHub Actions best practices.

Positive Aspects
✅ Clear separation of concerns: Python-only changes now have their own workflow

Critical Issues
1. Missing Python Setup Step
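A minimal sketch of how a Python setup step might be added to the `test` job, assuming the data tools need a specific interpreter rather than whatever the runner image ships. The `python-version` value is illustrative and not taken from this PR; the remaining steps mirror the diff excerpt shown earlier.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumption: pin the interpreter explicitly instead of relying on the runner default.
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"  # illustrative version, not specified in the PR
      - name: Test data files
        run: make check
        working-directory: Source/Data
```

If `make check` depends on packages declared in `pyproject.toml`, an install step would likely also be needed between setup and the check; the exact command depends on how the project is packaged.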
zonble left a comment
The unit tests are failing.
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
fbef247 to 8912cb4
This commit migrates the Source/Data directory to Python 3 standards and modern tooling:

Python 3 Migration:
- Converted all scripts to Python 3 (shebangs, unichr→chr, print functions)
- Fixed string comparisons (is → ==)
- Removed coding declarations (unnecessary in Python 3)
- Migrated to modern type hints (dict/list/tuple instead of Dict/List/Tuple)
- Applied Ruff auto-fixes (IOError→OSError, formatting)

Code Refactoring:
- Completely refactored frequency_builder.py with functions and type hints
- Rewrote main_compiler.py with structured functions and documentation
- Modernized plain_bpmf_compiler.py with proper function decomposition
- Refactored analyze_data.py with type aliases and clean structure
- Improved count_occurrences.py with better organization
- Rewrote map_bpmf.py to be more pythonic
- Created audit_encoding.py (Python replacement for Swift version)
- Converted score_validator.py to f-strings

Modern Tooling:
- Added Ruff configuration for linting/formatting
- Replaced pycodestyle with Ruff in dev dependencies
- Improved .gitignore with official Python template
- Documented Ruff usage in README.md

File Handling Modernization:
- Replaced while True: readline() with for line in file
- Ensured all file operations use with statements
- Converted string paths to pathlib.Path throughout
- Improved error handling with specific exceptions

Documentation:
- Consolidated 3 documentation files into single README.md
- Updated AGENTS.md navigation to Source/Data/README.md
- Added Ruff documentation and usage examples
- Fixed console script naming (mcbpmf_audit_encoding→mcbpmf-audit-encoding)

Legacy Cleanup:
- Deleted entire bin_legacy/ directory (29 files)
- Removed obsolete DEPRECATED.md, old README files
- Removed curation/README.md (merged into main README.md)

All changes maintain backward compatibility with existing workflows and Makefile targets. Code formatted with Ruff for consistency.

Generated with Claude Code (https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Create dedicated workflow for Python dictionary data tools that runs independently from Xcode builds. This ensures Python-only changes get proper CI validation without triggering expensive macOS runners.

Triggers when:
- Python files in Source/Data/ change
- curation/ package files change
- Makefile or pyproject.toml change

Runs on:
- Ubuntu (cost-effective)
- Only tests data file validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
8912cb4 to 96db181
1ef9514 to 2203bf3
2203bf3 to c1b7f61
User description
Summary
Creates dedicated workflow for Python dictionary data tools that runs independently from Xcode builds.
Problem
Python data tool changes currently trigger expensive macOS Xcode runners unnecessarily. Need separate workflow for Python-only changes.
Changes
Python Data Tools Workflow (NEW: python-data-check.yml)
Triggers when:
- Source/Data/ change
- curation/ package files change
- Makefile or pyproject.toml change
Runs on:
- make check in Source/Data directory
Benefits
Status
DRAFT - Needs fixing. Current make check may have issues that need to be resolved.
🤖 Generated with Claude Code
PR Type
Enhancement, Tests
Description
- Add CI workflow for Python data tools
- Trigger on Data, curation, scripts changes
- Run make check on Ubuntu runners
- Separate from macOS Xcode builds
File Walkthrough
python-data-check.yml
Add Python data CI workflow
.github/workflows/python-data-check.yml