May 17, 2020
- Loaded and viewed the confidential dataset, whose contributor has anonymized the feature names.
- Read this blog to better understand the anonymized features.
- Handled all missing values, since they can hurt the performance of a machine learning model if left unchanged.
- Preprocessed the data in three main steps: converted the non-numeric data to numeric, split the data into train and test sets, and scaled the feature values to a uniform range.
- Fitted a logistic regression model (a generalized linear model), evaluated it on the test set in terms of classification accuracy, and summarized its performance with a confusion matrix.
- Performed GridSearchCV over a grid of values for two hyperparameters, ‘tol’ and ‘max_iter’, to improve the model’s ability to predict credit card approvals.
- Summarized the best model score achieved, 85%, and the corresponding best parameters.
- Link to GitHub Repository
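
The steps above can be sketched roughly as follows with pandas and scikit-learn. This is only an illustration, not the code from the repository: the dataset is confidential, so the example builds a small synthetic table, and the column names (`f0`, `f1`, `f2`, `approved`), the mean/most-frequent imputation, the 33% test split, and the grid values for `tol` and `max_iter` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the confidential dataset (all values are made up).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "f0": rng.normal(30, 10, n),      # hypothetical numeric feature
    "f1": rng.choice(["a", "b"], n),  # hypothetical categorical feature
    "f2": rng.normal(5, 2, n),        # hypothetical numeric feature
})
# Hypothetical target: approval loosely depends on f0 and f1.
signal = df["f0"] + 10 * (df["f1"] == "a") + rng.normal(0, 5, n)
df["approved"] = (signal > 35).astype(int)

# Inject a few missing values so there is something to handle.
df.loc[rng.choice(n, 15, replace=False), "f0"] = np.nan
df.loc[rng.choice(n, 15, replace=False), "f1"] = np.nan

# Handle missing values (assumed strategy: mean for numeric columns,
# most frequent value for non-numeric columns).
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.select_dtypes(exclude="number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

# Step 1: convert non-numeric columns to integer codes.
for col in cat_cols:
    df[col] = df[col].astype("category").cat.codes

# Step 2: split into train and test sets.
X = df.drop(columns="approved").to_numpy(dtype=float)
y = df["approved"].to_numpy()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# Step 3: scale features to [0, 1], fitting the scaler on the training set only.
scaler = MinMaxScaler(feature_range=(0, 1))
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Fit logistic regression, then evaluate accuracy and the confusion matrix.
logreg = LogisticRegression()
logreg.fit(X_train_s, y_train)
accuracy = logreg.score(X_test_s, y_test)
cm = confusion_matrix(y_test, logreg.predict(X_test_s))

# Grid search over the two hyperparameters named in the post
# (the candidate values here are assumptions).
param_grid = {"tol": [0.01, 0.001, 0.0001], "max_iter": [100, 150, 200]}
grid = GridSearchCV(LogisticRegression(), param_grid=param_grid, cv=5)
grid.fit(X_train_s, y_train)
best_score, best_params = grid.best_score_, grid.best_params_

print("Test accuracy:", accuracy)
print("Confusion matrix:\n", cm)
print("Best CV score:", best_score, "with", best_params)
```

The numbers this prints come from the synthetic data and will not match the 85% reported above. Fitting the scaler on the training set alone keeps information from the test set out of the model, which is one reason to split before scaling.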