DESCRIPTION
Binary classification model predicting credit card customer churn on a Kaggle competition dataset. Started with a baseline decision tree, then improved with XGBoost and GridSearchCV hyperparameter tuning.
TOOLS
TIMELINE
July 2025
The dataset had 7,088 training records with a 16% churn rate, which makes this a moderately imbalanced classification problem. EDA showed that transaction volume and revolving balance were the strongest predictors of churn, while categorical features like education level and card category carried little signal.
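To make the imbalance concrete, here is a small sketch of what a trivial "never churns" classifier would score on a dataset of this shape. The churner count is an approximation derived from the rounded 16% rate, not a figure taken from the dataset, and the helper name `f1_from_counts` is just for illustration.

```python
def f1_from_counts(tp, fp, fn):
    """F1 from confusion-matrix counts, returning 0.0 when it is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


n_records = 7088
churn_rate = 0.16  # rounded rate, so the counts below are approximate
n_churn = round(n_records * churn_rate)
n_stay = n_records - n_churn

# A classifier that always predicts "no churn": it never flags a churner,
# so every churner becomes a false negative.
tp, fp, fn, tn = 0, 0, n_churn, n_stay

accuracy = (tp + tn) / n_records
f1 = f1_from_counts(tp, fp, fn)

print(f"approx. churners: {n_churn}")
print(f"majority-class accuracy: {accuracy:.3f}")
print(f"majority-class F1: {f1:.3f}")
```

Accuracy looks respectable for a model that finds no churners at all, which is why F1 on the churn class is the more telling number here.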
The baseline decision tree achieved an F1 of 0.813. Switching to XGBoost with 5-fold cross-validation and a grid search over tree depth, learning rate, and estimator count raised F1 to 0.893, with 0.967 accuracy. That was a solid improvement, though the competition leaderboard kept me humble.
