refactor: enhance security and improve error handling in training and frontend components#14
refactor: enhance security and improve error handling in training and frontend components#14cristofima merged 8 commits intomainfrom
Conversation
Modified CORS settings for S3 buckets to use specific allowed origins instead of wildcard. Introduced a new variable for allowed origins to enhance security and maintain flexibility for development and production environments.
- Handled empty target in problem type detection in EDA - Improved error handling for feature importance extraction in Model Trainer - Guard against empty target in problem type detection in Preprocessor - Used timezone-aware datetime for report timestamp
…tion Refactored Docker command generation into a reusable function to avoid duplication in the ResultsPage component. This improves maintainability and readability of the code.
There was a problem hiding this comment.
Pull request overview
This PR enhances security and robustness across the AutoML platform by implementing specific CORS origins, improving error handling in training pipelines, and tightening IAM permissions for CI/CD workflows. The changes focus on production-readiness through better edge case handling and security hardening.
Key Changes:
- Replace wildcard CORS origins (
*) with specific allowed origins based on Amplify domain and environment - Improve error handling by replacing bare exception catches with specific exception types and logging
- Add guards against empty datasets and NaN values in training preprocessing
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
infrastructure/terraform/s3.tf |
Updated CORS configuration to use specific origins (Amplify domain + localhost) instead of wildcard |
infrastructure/terraform/variables.tf |
Added cors_allowed_origins variable for custom CORS configuration |
.github/SETUP_CICD.md |
Replaced wildcard IAM permissions with specific API Gateway and Batch actions |
scripts/predict.py |
Added NaN fallback for median imputation and improved exception handling with specific types |
backend/training/preprocessor.py |
Added guard against empty target series in problem type detection |
backend/training/eda.py |
Added guards for empty targets and removed unused type imports |
backend/training/model_trainer.py |
Improved exception handling with specific exception types |
backend/training/training_report.py |
Fixed timezone-aware datetime generation using timezone.utc |
frontend/next.config.ts |
Enhanced comment explaining trailingSlash configuration |
frontend/app/results/[jobId]/page.tsx |
Refactored duplicate Docker command strings into reusable function |
| allowed_origins = length(var.cors_allowed_origins) > 0 ? var.cors_allowed_origins : concat( | ||
| local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [], | ||
| var.environment == "dev" ? ["http://localhost:3000"] : [] |
There was a problem hiding this comment.
The CORS configuration may result in an empty allowed_origins list in production when Amplify is disabled. When var.cors_allowed_origins is empty, local.amplify_enabled is false (no GitHub integration), and var.environment != "dev", the concat() will produce an empty list [], which is invalid for CORS configuration.
Recommendation: Add a fallback to prevent empty origins list:
allowed_origins = length(var.cors_allowed_origins) > 0 ? var.cors_allowed_origins : (
length(concat(
local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [],
var.environment == "dev" ? ["http://localhost:3000"] : []
)) > 0 ? concat(
local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [],
var.environment == "dev" ? ["http://localhost:3000"] : []
) : ["*"] # Fallback to wildcard if no specific origins configured
)Or require cors_allowed_origins to be set for production deployments without Amplify.
| allowed_origins = length(var.cors_allowed_origins) > 0 ? var.cors_allowed_origins : concat( | |
| local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [], | |
| var.environment == "dev" ? ["http://localhost:3000"] : [] | |
| allowed_origins = length(var.cors_allowed_origins) > 0 ? var.cors_allowed_origins : ( | |
| length(concat( | |
| local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [], | |
| var.environment == "dev" ? ["http://localhost:3000"] : [] | |
| )) > 0 ? concat( | |
| local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [], | |
| var.environment == "dev" ? ["http://localhost:3000"] : [] | |
| ) : ["*"] |
| allowed_origins = length(var.cors_allowed_origins) > 0 ? var.cors_allowed_origins : concat( | ||
| local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [], | ||
| var.environment == "dev" ? ["http://localhost:3000"] : [] | ||
| ) |
There was a problem hiding this comment.
The CORS configuration may result in an empty allowed_origins list in production when Amplify is disabled. When var.cors_allowed_origins is empty, local.amplify_enabled is false (no GitHub integration), and var.environment != "dev", the concat() will produce an empty list [], which is invalid for CORS configuration.
Recommendation: Add a fallback to prevent empty origins list or require cors_allowed_origins to be set for production deployments without Amplify.
| allowed_origins = length(var.cors_allowed_origins) > 0 ? var.cors_allowed_origins : concat( | ||
| local.amplify_enabled ? ["https://${aws_amplify_app.frontend[0].default_domain}"] : [], | ||
| var.environment == "dev" ? ["http://localhost:3000"] : [] | ||
| ) |
There was a problem hiding this comment.
The CORS configuration may result in an empty allowed_origins list in production when Amplify is disabled. When var.cors_allowed_origins is empty, local.amplify_enabled is false (no GitHub integration), and var.environment != "dev", the concat() will produce an empty list [], which is invalid for CORS configuration.
Recommendation: Add a fallback to prevent empty origins list or require cors_allowed_origins to be set for production deployments without Amplify.
This PR enhances security and robustness across the AutoML platform by implementing specific CORS origins, improving error handling in training pipelines, and tightening IAM permissions for CI/CD workflows. The changes focus on production-readiness through better edge case handling and security hardening.
Key Changes: