Open
lucidtronix (Collaborator) approved these changes on Apr 18, 2025 and left a comment:
Thanks JK! Glad to get these into the repo! Over time, I think we can integrate this code more fully with the rest of the ml4h library...
import pandas as pd
import smart_open


ECG_REST_LEADS = {
Collaborator
This is OK for now, but since these constants are already defined in ml4h/defines.py, it would be better to include ml4h and import them from there. That, however, would make the Docker image much bigger.
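One possible middle ground, sketched here only as an illustration: import the constant from ml4h when the package happens to be installed, and fall back to a local copy otherwise, so the image does not have to ship ml4h. The helper name `load_constant` and the two-entry fallback mapping are hypothetical placeholders, not the values defined in ml4h/defines.py or in this PR.

```python
import importlib


def load_constant(module_name, attr, fallback):
    """Return `attr` from `module_name` if it can be imported, else `fallback`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed (e.g. in a slim image): use the local copy.
        return fallback
    return getattr(module, attr, fallback)


# Placeholder mapping for illustration only; the real lead-to-index mapping
# should be copied from the project's own definition.
LOCAL_ECG_REST_LEADS = {"strip_I": 0, "strip_II": 1}

ECG_REST_LEADS = load_constant("ml4h.defines", "ECG_REST_LEADS", LOCAL_ECG_REST_LEADS)
```

The trade-off is that the local copy can drift from the library's definition, so this only helps if the two are kept in sync.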
2e0d467 to 63831d7
This pull request introduces a new processing pipeline for the ECG2AF model deployment. The changes include the creation of a Docker image, implementation of data preparation and finalization scripts, and the addition of necessary dependencies. These updates enable the preprocessing of ECG data and the generation of risk scores from model predictions.
Docker Image Setup:
Dockerfile to define the environment for the processing pipeline, using a Python 3.9 slim base image, installing dependencies, and setting up the entry point for Python scripts. (model_zoo/ECG2AF/deployment/v1/processing_image/Dockerfile, R1-R7)
Data Preparation:
prepare.py to process raw ECG data into normalized HDF5 tensor format, supporting various storage backends (e.g., GCS, Azure, local). (model_zoo/ECG2AF/deployment/v1/processing_image/prepare.py, R1-R53)
Data Finalization:
finalize.py to convert model predictions into a structured CSV format, including risk score calculations and validation of input-output consistency. (model_zoo/ECG2AF/deployment/v1/processing_image/finalize.py, R1-R44)
Dependency Management:
requirements.txt file specifying dependencies such as pandas, numpy, h5py, and smart-open[gcs] for the pipeline. (model_zoo/ECG2AF/deployment/v1/processing_image/requirements.txt, R1-R4)
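To make the finalization step concrete, here is a minimal sketch of an input-output consistency check followed by a CSV write. It is not the code in finalize.py: the function name `predictions_to_csv`, the column names `sample_id` and `af_risk`, and the sample values are assumptions for illustration, and the risk score calculation itself is left out because the description does not specify it. The sketch uses the standard library `csv` module to stay dependency-free, whereas the PR lists pandas.

```python
import csv
import io


def predictions_to_csv(sample_ids, predictions, out):
    """Write one CSV row per input sample, rejecting mismatched inputs and outputs."""
    if len(sample_ids) != len(predictions):
        raise ValueError(
            f"expected {len(sample_ids)} predictions, got {len(predictions)}"
        )
    writer = csv.writer(out)
    writer.writerow(["sample_id", "af_risk"])  # hypothetical column names
    for sample_id, score in zip(sample_ids, predictions):
        writer.writerow([sample_id, float(score)])


# Usage with made-up identifiers and scores.
buffer = io.StringIO()
predictions_to_csv(["ecg_001", "ecg_002"], [0.12, 0.87], buffer)
print(buffer.getvalue())
```

Failing loudly on a length mismatch keeps a dropped or duplicated prediction from being silently attached to the wrong sample.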