Data science professional with a passion for creative problem solving through data analysis. Quick learner with extensive analytical skills, strong attention to detail, and a strong ability to work in team environments. Highly accurate and adept at collecting, analyzing, and interpreting large datasets, developing new forecasting models, and performing data management tasks that support data-driven decisions and products.
Currently seeking internships and full-time opportunities. Graduated with a Master of Engineering degree in Data Science from the University of Toronto.
View My LinkedIn Profile
GitHub Repository Link: https://github.com/mikonguyen/Fraud-Detection
The goal of this project was to develop a fraud detection system using machine learning. Two models, a Decision Tree and a Random Forest, were developed to detect fraudulent transactions.
The models were evaluated on both the training and test sets using the Precision-Recall (PR) and area under the ROC curve (AUC) metrics.
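As a rough illustration of what these two metrics measure, here is a minimal pure-Python sketch. It is not the project's code: the function names `roc_auc` and `average_precision` and the toy labels and scores are made up for this example, and average precision is used as one common single-number summary of the PR curve, which may differ from how the project computed its PR metric.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen fraud case is scored above a randomly chosen legitimate one
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average of the precision values measured at the rank of each true
    fraud case, after sorting samples from highest to lowest score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return total / hits


# Hypothetical toy data: 1 = fraud, 0 = legitimate; scores are model outputs.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 0]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.05, 0.60]

print("ROC AUC:", round(roc_auc(toy_labels, toy_scores), 4))
print("Average precision:", round(average_precision(toy_labels, toy_scores), 4))
```

Because fraud is usually a rare class, the PR-based metric tends to be the more sensitive of the two: it ignores the large pool of correctly rejected legitimate transactions that can make ROC AUC look high.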
Of the two classifiers, the decision tree has better results on both the Precision-Recall (PR) and area under the ROC curve (AUC) metrics.
The two confusion matrices show that the main difference between the models is that the random forest classifier has more false positives, with 16 samples predicted as fraud that are not actually fraud. Overall, the decision tree classifier has slightly better results. In general, both models are effective for fraud detection, with high PR and AUC metrics.
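For readers less familiar with confusion matrices, the sketch below shows how false positives are counted from true and predicted labels. The helper name `confusion_counts` and the toy labels are assumptions made for this illustration; they are not the project's data or results.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for binary labels (1 = fraud)."""
    cells = Counter(zip(y_true, y_pred))
    return {
        "tn": cells[(0, 0)],  # legitimate, predicted legitimate
        "fp": cells[(0, 1)],  # legitimate, predicted fraud (false alarm)
        "fn": cells[(1, 0)],  # fraud, predicted legitimate (missed fraud)
        "tp": cells[(1, 1)],  # fraud, predicted fraud
    }


# Hypothetical toy labels, not the project's results.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 1, 0, 0, 1, 1]

cm = confusion_counts(y_true, y_pred)
precision = cm["tp"] / (cm["tp"] + cm["fp"])
recall = cm["tp"] / (cm["tp"] + cm["fn"])

print(cm)
print("precision:", precision, "recall:", round(recall, 4))
```

More false positives lower precision while leaving recall unchanged, which is one way a model with extra false alarms can end up with a lower PR score.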