Credit Card Fraud Detector
🎯 BRIEF
The goal of the project was to develop an accurate and efficient machine learning model for detecting fraudulent credit card transactions in real time. The work has broader relevance for the financial industry, where fraud detection is a critical part of financial security and risk management. It also addresses several challenges of imbalanced datasets: bias toward the majority class, misleadingly high accuracy, poor generalization, and inappropriate evaluation metrics.
🔧 TOOLS
Python, pandas, re, scikit-learn, imbalanced-learn (imblearn), seaborn, Matplotlib
🤝 CONTRIBUTION
Built an effective ML classifier for detecting fraudulent transactions in a highly imbalanced dataset
Applied oversampling (SMOTE) and undersampling (NearMiss) when training models such as Logistic Regression, Support Vector Classifier, and KNN
Achieved a recall of 0.94 through hyperparameter tuning with GridSearchCV and RandomizedSearchCV
🏆 TAKEAWAYS
Oversampling can leak information from the validation folds into the training folds if it is applied before the cross-validation split; it should be applied to each training fold only.
Metrics should be chosen carefully: on imbalanced data, accuracy can be misleading because it tends to favor the majority class.
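A small worked example makes the metric point concrete. The labels below are hypothetical (1,000 transactions with 10 frauds) and the "model" simply predicts the majority class every time; neither comes from the project's data.

```python
# Hypothetical labels: 1000 transactions, 10 of them fraudulent (1 = fraud).
y_true = [1] * 10 + [0] * 990
# A "classifier" that always predicts the majority class (legitimate).
y_pred = [0] * 1000


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


print(accuracy(y_true, y_pred))  # 0.99
print(recall(y_true, y_pred))    # 0.0
```

The model scores 99% accuracy while catching no fraud at all, which is why recall (or a precision/recall trade-off) is the more informative target here.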
📷 SCREENSHOT
ROC Curve