- Loaded and viewed the dataset, whose feature names were anonymized by the contributor because the data is confidential
- Read this blog to better understand the anonymized features
- Handled all missing values, as they can degrade the performance of a machine learning model if left untreated
- Preprocessed the data in three main steps: converted the non-numeric data to numeric, split the data into train and test sets, and scaled the feature values to a uniform range
- Fitted a logistic regression model (a generalized linear model), evaluated it on the test set by classification accuracy, and summarized its performance with a confusion matrix
- Ran GridSearchCV over a grid of values for two hyperparameters, ‘tol’ and ‘max_iter’, to improve the model’s ability to predict credit card approvals
- Summarized the best model score achieved, 85%, and the corresponding best parameters
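
The steps above can be sketched end to end with pandas and scikit-learn. This is a minimal illustration rather than the project's actual code: the synthetic data, the column names (`f0`, `f1`, `f2`, `approved`), the imputation strategy (mean for numeric columns, most frequent value for the rest), the 33% test split, and the grid values are all assumptions made for the example, so its scores will not match the 85% reported above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Hypothetical stand-in for the anonymized dataset: two numeric features,
# one categorical feature, and a "+"/"-" approval label.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "f0": rng.normal(size=n),
    "f1": rng.normal(size=n),
    "f2": rng.choice(["a", "b", "c"], size=n),
})
noise = rng.normal(scale=0.5, size=n)
df["approved"] = np.where(df["f0"] + 0.5 * df["f1"] + noise > 0, "+", "-")

# Punch a few holes in the features to simulate missing values.
df.loc[rng.choice(n, size=15, replace=False), "f0"] = np.nan
df.loc[rng.choice(n, size=15, replace=False), "f2"] = np.nan

# 1. Handle missing values (assumed strategy: mean / most frequent value).
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())
for col in df.columns.difference(num_cols):
    df[col] = df[col].fillna(df[col].mode()[0])

# 2a. Convert non-numeric data to numeric.
X = df.drop(columns="approved")
y = (df["approved"] == "+").astype(int)  # 1 = approved, 0 = denied
for col in X.columns.difference(num_cols):
    X[col] = LabelEncoder().fit_transform(X[col])

# 2b. Split into train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# 2c. Scale features to [0, 1]; the scaler is fitted on the train set only.
scaler = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 3. Fit logistic regression and evaluate on the test set.
logreg = LogisticRegression()
logreg.fit(X_train_scaled, y_train)
accuracy = logreg.score(X_test_scaled, y_test)
cm = confusion_matrix(y_test, logreg.predict(X_test_scaled), labels=[0, 1])
print("Test accuracy:", accuracy)
print("Confusion matrix:\n", cm)

# 4. Grid search over tol and max_iter (grid values are assumptions).
param_grid = {"tol": [0.01, 0.001, 0.0001], "max_iter": [100, 150, 200]}
grid = GridSearchCV(LogisticRegression(), param_grid, cv=5)
grid.fit(X_train_scaled, y_train)
print("Best CV score:", grid.best_score_)
print("Best parameters:", grid.best_params_)
```

In the confusion matrix, rows are true classes and columns are predicted classes, so the diagonal counts the correct predictions. `grid.best_score_` is the mean cross-validated accuracy of the best parameter combination, which is a different quantity from the single test-set accuracy printed in step 3.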
Link to GitHub Repository