Telco Churn Prediction with Advanced SQL & CatBoost

This project implements an end-to-end churn prediction pipeline. It simulates a telecom company's relational database to perform feature engineering using SQL Window Functions and trains a CatBoost classifier to identify high-risk customers.

The primary goal is to demonstrate data transformation at the database level (ELT), rather than relying solely on in-memory processing, followed by gradient-boosted tree modeling with CatBoost.

📋 Project Overview

Data Simulation: Generates a relational database (SQLite) with realistic patterns for customers, call logs, and complaints.
SQL Feature Engineering: Uses CTEs, Aggregations, and Window Functions (e.g., calculating trend changes over time) to extract features directly from the raw database tables.
Machine Learning: Trains a CatBoost Classifier to predict customer churn, optimizing for Recall to capture as many potential churners as possible.
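As a rough illustration of the SQL feature engineering step, the sketch below uses a CTE plus the LAG window function to compute a month-over-month change in call counts per customer. The `calls` table, its columns, and the sample rows are hypothetical and chosen only for this example; the actual schema produced by `db_generator.py` and the queries in `feature_store.py` may differ. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calls (customer_id INTEGER, call_date TEXT, duration_min REAL);
INSERT INTO calls VALUES
  (1, '2024-01-10', 5.0), (1, '2024-01-20', 7.0),
  (1, '2024-02-05', 4.0), (1, '2024-02-15', 6.0), (1, '2024-02-25', 3.0),
  (1, '2024-03-12', 2.0),
  (2, '2024-01-08', 3.0),
  (2, '2024-02-11', 4.0),
  (2, '2024-03-03', 5.0), (2, '2024-03-21', 6.0);
""")

FEATURE_SQL = """
WITH monthly AS (
    -- Aggregate raw call logs to one row per customer per month.
    SELECT customer_id,
           strftime('%Y-%m', call_date) AS month,
           COUNT(*) AS n_calls
    FROM calls
    GROUP BY customer_id, month
),
trend AS (
    -- Compare each month with the same customer's previous month.
    SELECT customer_id,
           month,
           n_calls,
           n_calls - LAG(n_calls) OVER (
               PARTITION BY customer_id ORDER BY month
           ) AS change_vs_prev_month
    FROM monthly
)
SELECT customer_id, month, n_calls, change_vs_prev_month
FROM trend
ORDER BY customer_id, month;
"""

rows = conn.execute(FEATURE_SQL).fetchall()
for row in rows:
    print(row)
```

A negative `change_vs_prev_month` marks a drop in usage; the first month for each customer has no previous row, so the value is NULL (None in Python). The result could be loaded into Pandas with `pandas.read_sql_query(FEATURE_SQL, conn)` before modeling.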

🛠 Tech Stack

Language: Python 3.x
Database: SQLite
Data Manipulation: SQL (Window Functions, Joins), Pandas
Machine Learning: CatBoost, Scikit-learn
Version Control: Git

📊 Key Results

On the simulated data, the model separates loyal customers from churning customers well.

ROC-AUC Score: 0.9367
Recall (Churn Class): 0.78 (Correctly identified 78% of leaving customers)
Precision (Churn Class): 0.70 (70% of customers flagged as churners actually left)
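To make the recall and precision figures concrete, here is a minimal sketch of how both are computed for the churn class, and how lowering the decision threshold trades precision for recall. The README does not state how recall was optimized, so threshold adjustment is an assumption shown as one common option. The labels and probabilities are toy values, not project results, and the helper names are hypothetical.

```python
def predict_at_threshold(probs, threshold):
    """Label a customer as churn (1) when the predicted probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive (churn) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 4 churners (1) and 6 loyal customers (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.45, 0.2, 0.6, 0.35, 0.3, 0.1, 0.05, 0.02]

results = {}
for threshold in (0.5, 0.3):
    y_pred = predict_at_threshold(probs, threshold)
    results[threshold] = precision_recall(y_true, y_pred)
    precision, recall = results[threshold]
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, moving the threshold from 0.5 to 0.3 catches one more churner (recall rises from 0.50 to 0.75) while flagging two more loyal customers (precision falls from 0.67 to 0.50). With Scikit-learn, `precision_score` and `recall_score` return the same quantities.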

Top Predictive Features:

calls_last_30_days (Derived via SQL: Significant drop in usage)
total_complaints (Customer dissatisfaction signal)
total_calls (General usage volume)

📂 Project Structure

├── data/                  # Stores generated database and model artifacts
├── src/
│   ├── db_generator.py    # Simulates customers, calls, and complaints data
│   ├── feature_store.py   # Extracts features using complex SQL queries
│   └── train_model.py     # Trains, evaluates, and saves the CatBoost model
├── requirements.txt       # Project dependencies
└── README.md              # Project documentation
