
arjunsawhney1/scalable-ML

 
 


Scaleable-Ml

This project predicts the delinquency rate in the Freddie Mac dataset using distributed computation with Dask and PySpark.

About

In this repo, I build a LogisticRegression prediction model with Dask and PySpark and initialize an AWS EMR cluster to run the entire pipeline.
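
As a rough illustration of the PySpark half of such a pipeline, here is a minimal sketch, not the repository's actual code. The column names (`credit_score`, `orig_interest_rate`, `orig_ltv`, `orig_dti`, `delinquent`) and the helper names `build_pipeline` and `train_and_score` are hypothetical, chosen only for this example. It also assumes the input DataFrames are already cleaned and numeric.

```python
# Hypothetical column names; the repository's real schema may differ.
FEATURE_COLS = ["credit_score", "orig_interest_rate", "orig_ltv", "orig_dti"]
LABEL_COL = "delinquent"  # assumed binary label: 1 = delinquent, 0 = current


def build_pipeline(feature_cols=FEATURE_COLS, label_col=LABEL_COL):
    """Return an unfitted PySpark ML pipeline (feature assembly + logistic regression).

    Must be called while a SparkSession is active, because PySpark ML
    stages are backed by JVM objects.
    """
    # Imported lazily so this module can be loaded without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    # Pack the numeric feature columns into the single vector column Spark ML expects.
    assembler = VectorAssembler(inputCols=list(feature_cols), outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol=label_col, maxIter=20)
    return Pipeline(stages=[assembler, lr])


def train_and_score(train_df, test_df):
    """Fit the pipeline on train_df and return (model, area under ROC on test_df)."""
    from pyspark.ml.evaluation import BinaryClassificationEvaluator

    model = build_pipeline().fit(train_df)
    predictions = model.transform(test_df)
    evaluator = BinaryClassificationEvaluator(
        labelCol=LABEL_COL, metricName="areaUnderROC"
    )
    return model, evaluator.evaluate(predictions)
```

On an EMR cluster, `train_df` and `test_df` would typically come from `spark.read` on data stored in S3, followed by `randomSplit`. The storage location and split ratio are not specified here, so they are left to the caller.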


Languages

  • Python 94.5%
  • Shell 5.5%