Below you will find pages that utilize the taxonomy term “Prediction”
Post
Predicting Credit Card Approvals
Loaded and viewed the confidential dataset, whose feature names the contributor has anonymized. Read this blog to better understand the anonymized features. Handled all missing values, since they hurt the performance of a machine learning model if left untreated. Preprocessed the data in three steps: converted the non-numeric data to numeric, split the data into train and test sets, and scaled the feature values to a uniform range. Fitted a logistic regression model (a generalized linear model), evaluated its classification accuracy on the test set, and summarized its performance with a confusion matrix. Ran GridSearchCV over a grid of values for two hyperparameters, ‘tol’ and ‘max_iter’, to improve the model’s ability to predict credit card approvals. Summarized the best model score achieved, 85%, and the corresponding best parameters. Link to GitHub Repository
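The preprocessing, fitting, and tuning steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the data is a synthetic stand-in generated with scikit-learn, and the split ratio, scaler choice, and grid values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the anonymized dataset (assumption: 15 numeric features).
X, y = make_classification(n_samples=600, n_features=15, random_state=42)

# Split first, then fit the scaler on the training set only to avoid leakage.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
scaler = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Baseline logistic regression, scored on the held-out test set.
logreg = LogisticRegression()
logreg.fit(X_train_scaled, y_train)
test_accuracy = logreg.score(X_test_scaled, y_test)
cm = confusion_matrix(y_test, logreg.predict(X_test_scaled))

# Grid search over the two hyperparameters named above (grid values are assumed).
param_grid = {"tol": [0.01, 0.001, 0.0001], "max_iter": [100, 150, 200]}
grid = GridSearchCV(LogisticRegression(), param_grid, cv=5)
grid.fit(X_train_scaled, y_train)
best_score, best_params = grid.best_score_, grid.best_params_
```

In the confusion matrix, rows are the true classes and columns the predicted ones, so the diagonal counts the correct predictions.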
Post
Predict Data Science Salary
Developed a tool that estimates salaries for data science jobs posted on glassdoor.com in the United States, with a mean absolute error (MAE) of about $13k. Scraped 1,000 job listings from Glassdoor using Python and Selenium. Performed exploratory data analysis to uncover the underlying structure of the Glassdoor job dataset. Engineered features from the text of each job description to quantify the value companies put on Python, SQL, R, Spark, AWS, and Excel. Optimized linear, lasso, and random forest regressors using GridSearchCV to reach the best model. Created a client-facing API using Flask. Link to GitHub Repository
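One plausible way to engineer the skill features described above is a binary flag per keyword. The sketch below assumes a pandas DataFrame with a hypothetical `job_description` column and hypothetical `*_yn` output columns; the sample rows are made up for illustration.

```python
import re

import pandas as pd

# Made-up job descriptions standing in for the scraped listings.
jobs = pd.DataFrame(
    {
        "job_description": [
            "Strong Python and SQL skills; AWS a plus.",
            "Excel reporting and R Studio experience.",
            "Spark pipelines on AWS using python.",
        ]
    }
)

skills = ["python", "sql", "r", "spark", "aws", "excel"]

for skill in skills:
    # Word boundaries keep the one-letter skill "r" from matching inside other words.
    pattern = rf"\b{re.escape(skill)}\b"
    jobs[f"{skill}_yn"] = (
        jobs["job_description"].str.contains(pattern, case=False, regex=True).astype(int)
    )
```

Each flag can then be used directly as a model input or averaged to see how often a skill is requested.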
Post
Predict Admit Chance to Graduate Programs
Performed exploratory data analysis to examine the correlations and distributions of the various parameters. Applied a linear regression model to predict the target variable ‘Chance of Admit’. Identified the most important parameter for a high chance of admission using feature importances from a random forest regressor. Link to GitHub Repository
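The linear regression and feature importance steps might look like the sketch below. The column names (`cgpa`, `gre_score`, `research`) and the synthetic data are assumptions made for illustration; the target is deliberately constructed so that `cgpa` dominates.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 400

# Hypothetical applicant features (not the real dataset).
applicants = pd.DataFrame(
    {
        "cgpa": rng.uniform(6.0, 10.0, n),
        "gre_score": rng.uniform(290, 340, n),
        "research": rng.integers(0, 2, n),
    }
)

# Synthetic target built so that cgpa contributes most of the variation.
chance_of_admit = (
    0.15 * (applicants["cgpa"] - 6.0)
    + 0.002 * (applicants["gre_score"] - 290)
    + 0.02 * applicants["research"]
    + rng.normal(0, 0.01, n)
)

# Linear regression for prediction.
linear_model = LinearRegression().fit(applicants, chance_of_admit)

# Random forest used only to rank the features by importance.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(applicants, chance_of_admit)
importances = pd.Series(
    forest.feature_importances_, index=applicants.columns
).sort_values(ascending=False)
top_feature = importances.index[0]
```

The impurity-based importances returned by `feature_importances_` sum to 1, so each value can be read as a share of the total.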
Post
Forecast Medical Insurance Cost
Cleaned the data and conducted exploratory data analysis. Transformed hashable and comparable non-numerical labels to numerical labels using LabelEncoder. Applied polynomial and linear regression models to predict the target variable ‘Insurance Cost’. Link to GitHub Repository
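As a rough sketch of the encoding and regression steps above, the example below encodes a hypothetical `smoker` column with LabelEncoder and fits both a linear and a degree-2 polynomial model. The column names, the polynomial degree, and the synthetic data are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, PolynomialFeatures

rng = np.random.default_rng(0)
n = 60

# Synthetic policies standing in for the real insurance data.
policies = pd.DataFrame(
    {
        "age": rng.integers(18, 65, n),
        "bmi": rng.uniform(18.0, 40.0, n),
        "smoker": rng.choice(["yes", "no"], n),
    }
)
policies["insurance_cost"] = (
    250 * policies["age"]
    + 300 * policies["bmi"]
    + 20000 * (policies["smoker"] == "yes")
    + rng.normal(0, 500, n)
)

# LabelEncoder maps the sorted classes to integers: "no" -> 0, "yes" -> 1.
encoder = LabelEncoder()
policies["smoker_code"] = encoder.fit_transform(policies["smoker"])

features = policies[["age", "bmi", "smoker_code"]]
target = policies["insurance_cost"]

# Plain linear regression and a degree-2 polynomial regression on the same inputs.
linear_model = LinearRegression().fit(features, target)
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(features, target)
n_poly_terms = poly_model.named_steps["polynomialfeatures"].n_output_features_
```

scikit-learn documents LabelEncoder as intended for target labels; for input features with more than two categories, OrdinalEncoder or OneHotEncoder is usually the safer choice.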