Customer Churn Prediction in the Banking Sector Using Logistic Regression and Random Forest Models

Authors

  • Ely Mufida Universitas Bina Sarana Informatika
  • Doni Andriansyah Universitas Bina Sarana Informatika
  • Hylenarti Hertyana Universitas Bina Sarana Informatika

DOI:

https://doi.org/10.31294/coscience.v5i1.7576

Keywords:

Customer Churn, Logistic Regression, Random Forest, Unsupervised Learning

Abstract

Customer churn harms banks because it reduces revenue and raises the cost of acquiring new customers. This study compares two models, Logistic Regression and Random Forest, for predicting customer churn on a dataset from Kaggle. Preprocessing includes z-score normalization, and the dataset is split into training data (70%) and testing data (30%). Both models were evaluated with a confusion matrix and the accuracy, precision, recall, and F1-score derived from it. Logistic Regression achieved 76.85% accuracy, 79% precision, 94% recall, and an 86% F1-score, which is reasonably good performance but leaves the model susceptible to false positives. Random Forest performed better, with 83.12% accuracy, 84% precision, 96% recall, and a 90% F1-score. Because it detects potential churners more reliably, Random Forest suits problems where high recall is required. To improve model performance further, the authors recommend hyperparameter optimization and feature importance analysis. The churn prediction model is expected to help banks reduce churn and increase customer retention.
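The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`zscore`, `classification_metrics`, `f1_from`) and the confusion-matrix counts are hypothetical and chosen only for illustration, and the z-score here uses the population standard deviation, which the abstract does not specify. The only figures taken from the abstract are the reported precision and recall, used to check that the reported F1-scores are consistent with them.

```python
from statistics import mean, pstdev


def zscore(values):
    """Z-score normalization: subtract the mean, divide by the standard deviation.

    Assumption: population standard deviation; the abstract does not say which
    variant was used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, NOT taken from the paper.
example = classification_metrics(tp=8, fp=2, fn=1, tn=9)

# Consistency check using the precision and recall reported in the abstract.
lr_f1 = f1_from(0.79, 0.94)  # Logistic Regression
rf_f1 = f1_from(0.84, 0.96)  # Random Forest

if __name__ == "__main__":
    print(example)
    print(round(lr_f1, 2), round(rf_f1, 2))
```

Rounded to two decimals, the two computed F1 values are 0.86 and 0.90, matching the 86% and 90% reported in the abstract.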

Published

2025-01-31