Welcome to my Customer Churn Prediction Platform project! This project demonstrates a complete data science workflow — from raw data ingestion, transformation, and model training to deployment as an interactive web application.
Customer churn — when users stop subscribing or using a service — is a critical problem for subscription-based businesses. Losing customers not only reduces revenue but also increases the cost of acquiring replacements.
This project aims to build an automated system that can:
- Identify high-risk customers before they churn.
- Provide an interactive tool for business teams to perform "what-if" analysis on various customer profiles.
This project is built using a modern data stack, separating each phase of the data lifecycle. The workflow is as follows:
- Ingestion: Raw data is uploaded to Google Cloud Storage (GCS), acting as the data lake.
- Warehousing: Data is loaded into Google BigQuery for structured storage and querying.
- Transformation: dbt (data build tool) is used to clean, transform, and shape the raw data into analytics-ready models.
- Modeling: A Jupyter notebook pulls the transformed data from BigQuery, trains a classification model using Scikit-learn, and saves it as a model artifact.
- Deployment: An interactive web app is built using Streamlit, loading the trained model and serving predictions to end-users.
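The Modeling step above can be sketched in a few lines. In the real notebook the frame comes from BigQuery (e.g. via `pandas-gbq`); here a tiny synthetic frame with illustrative column names keeps the sketch self-contained:

```python
# Sketch of the Modeling step: pull analytics-ready rows, train a
# classifier, and persist the artifact for the Streamlit app.
# Table and column names are illustrative stand-ins for the dbt output.
import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# In the real notebook this frame is pulled from BigQuery, e.g.:
#   df = pd.read_gbq("SELECT * FROM churn_analytics.customer_features")
df = pd.DataFrame({
    "tenure_months":  [1, 3, 24, 36, 2, 48, 5, 60],
    "monthly_charge": [70, 85, 30, 25, 90, 20, 80, 15],
    "churned":        [1, 1, 0, 0, 1, 0, 1, 0],
})

X = df.drop(columns="churned")
y = df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Train and evaluate a simple baseline classifier.
model = LogisticRegression().fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.2f}")

# Save the trained model so the web app can load it later.
joblib.dump(model, "churn_model.joblib")
```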
- Cloud Provider: Google Cloud Platform (GCP)
- Data Storage: Google Cloud Storage, Google BigQuery
- Data Transformation: dbt (data build tool)
- Machine Learning: Python, Pandas, Scikit-learn
- Web Application: Streamlit
- Orchestration & Automation (Local): Python Scripts, Jupyter Notebook
Follow these steps to run the project on your local machine:

1. **Clone the Repository**

   ```bash
   git clone https://github.com/bestoism/ChurnPredictionPlatform.git
   cd ChurnPredictionPlatform
   ```

2. **Install Dependencies**

   Ensure Python 3.8+ and pip are installed, then run:

   ```bash
   pip install -r requirements.txt
   ```

3. **Set Up Google Cloud Authentication**

   ```bash
   gcloud auth application-default login
   ```

4. **Run dbt Transformations**

   ```bash
   cd churn_analytics
   dbt run
   cd ..
   ```

5. **Launch the Streamlit App**

   ```bash
   streamlit run app.py
   ```

   The app will be available at http://localhost:8501.
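Behind the Streamlit UI sits a small prediction helper that scores one "what-if" profile at a time. A minimal sketch of that path (feature names and the artifact filename are hypothetical, not the repo's actual ones; in `app.py` the model would come from `joblib.load` rather than being trained inline):

```python
# Sketch of the prediction path behind the web app: given a trained
# model, score a single customer profile. Feature names are illustrative.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Stand-in training so the sketch runs without the real artifact;
# in app.py this would instead be: model = joblib.load("churn_model.joblib")
_train = pd.DataFrame({
    "tenure_months":  [1, 3, 24, 36, 2, 48],
    "monthly_charge": [70, 85, 30, 25, 90, 20],
    "churned":        [1, 1, 0, 0, 1, 0],
})
model = LogisticRegression().fit(
    _train.drop(columns="churned"), _train["churned"]
)

def predict_churn_risk(model, tenure_months: float, monthly_charge: float) -> float:
    """Return the predicted churn probability for one customer profile."""
    profile = pd.DataFrame(
        [{"tenure_months": tenure_months, "monthly_charge": monthly_charge}]
    )
    return float(model.predict_proba(profile)[0, 1])

risk = predict_churn_risk(model, tenure_months=2, monthly_charge=95)
print(f"churn risk: {risk:.0%}")
```

In the app, Streamlit widgets (sliders, select boxes) would supply the two arguments, making the "what-if" analysis interactive.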
- The model achieved around 80% accuracy in predicting churn on the test dataset.
- Biggest Challenge: Handling data type inconsistencies between BigQuery, Pandas, and Scikit-learn — a reminder that explicit, consistent data validation pays off at every hand-off between tools.
- This project reinforced the value of modular data architecture and using the right tool for the right task (e.g., dbt for transformations, Streamlit for interactivity).
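The data-type lesson can be made concrete: BigQuery downloads often arrive as object or nullable-integer columns that Scikit-learn rejects, so casting and checking explicitly before training avoids silent failures. A minimal sketch, with hypothetical column names:

```python
# Sketch of explicit dtype validation at the BigQuery -> Pandas hand-off.
# Column names and expected dtypes are illustrative.
import pandas as pd

EXPECTED_DTYPES = {"tenure_months": "int64", "monthly_charge": "float64"}

def validate_features(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce columns to the expected dtypes, failing loudly on bad data."""
    out = df.copy()
    for col, dtype in EXPECTED_DTYPES.items():
        if col not in out.columns:
            raise ValueError(f"missing feature column: {col}")
        # errors="raise" surfaces unparseable values instead of hiding them.
        out[col] = pd.to_numeric(out[col], errors="raise").astype(dtype)
    return out

# BigQuery result sets can hand back numbers as strings:
raw = pd.DataFrame({"tenure_months": ["12", "3"], "monthly_charge": ["29.9", "80.0"]})
clean = validate_features(raw)
print(clean.dtypes.to_dict())
```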
Feel free to fork this project, open an issue, or contribute! 🚀
🚧 This project is under construction — stay tuned for more updates!
