Developed expertise in machine learning by completing core AI specializations on Coursera, with further coursework in progress.
Projects
House Price Prediction with Improved ML Techniques
· Estimated residential prices in Beijing and London from property attributes, without relying on price data from previous years.
· Trained models using ensemble methods as well as hybrid and stacked-generalization techniques, reaching a 15% error.
· Cleaned the data by replacing all Chinese symbols with English words, ran exploratory analysis to identify outliers, engineered spatial features to capture how price changes with distance from the city centre, and identified statistically significant features through correlation analysis.
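The distance-to-centre feature described above could be derived roughly as follows. This is a minimal sketch, not the project's actual code: it assumes each listing carries latitude and longitude, and the names `haversine_km`, `add_distance_feature`, and the `CITY_CENTRE` coordinates are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Assumed reference point for central Beijing (illustrative only).
CITY_CENTRE = (39.9042, 116.4074)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_distance_feature(listings, centre=CITY_CENTRE):
    """Return copies of the listing dicts with a 'dist_to_centre_km' key.

    Each listing is assumed to be a dict with 'lat' and 'lon' keys.
    """
    enriched = []
    for row in listings:
        new_row = dict(row)
        new_row["dist_to_centre_km"] = haversine_km(
            row["lat"], row["lon"], centre[0], centre[1]
        )
        enriched.append(new_row)
    return enriched
```

The resulting column can then be correlated with price alongside the other attributes.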
Statistical Analysis with Python using National Health & Nutrition Examination Survey Data
· Applied inferential statistics, constructing confidence intervals for the difference between two population proportions (male and female smokers) and two population means (male and female Body Mass Index).
· Conducted a hypothesis test (at the 0.05 level) of the null hypothesis that the proportion of women who smoke equals the proportion of men who smoke.
· Applied statistical modelling techniques, including linear and logistic regression, multilevel and marginal models, and Bayesian inference, to different features of NHANES and other datasets.
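The two-proportion test and confidence interval described above can be sketched with the standard library alone. This is an illustrative sketch under the usual large-sample normal approximation; the function names are assumptions, and any counts passed in would be placeholders rather than NHANES figures.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2, using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


def diff_proportion_ci(x1, n1, x2, n2, z_crit=1.96):
    """Approximate 95% CI for p1 - p2, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se
```

The null hypothesis is rejected at the 0.05 level when the returned p-value falls below 0.05, which corresponds to a 95% interval that excludes zero in most large-sample cases.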
Loan Predictions with Deep Learning
· Screened thousands of student loan applications from IPEDS (Integrated Postsecondary Education Data System) data with a shallow neural network built in Keras, predicting successful applications with 87% accuracy.
Image Search Application with OCR, OpenCV, and Tesseract
· Built an application for character recognition and object detection in images.
· Returned a contact sheet of images for the searched keywords.
Sentiment Classifier with Python
· Designed a sentiment classifier that computes a net sentiment score for hundreds of positive and negative tweets.
· Parsed text and CSV files to extract only the tweets, using core Python language features and built-in data types.
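A net score of this kind is commonly computed as the count of positive words minus the count of negative words. The sketch below assumes that definition; the tiny word lists and the `net_score` name are placeholders, not the project's actual lexicon or code.

```python
import string

# Placeholder lexicons; a real classifier would load much larger word lists.
POSITIVE_WORDS = {"good", "great", "love", "happy"}
NEGATIVE_WORDS = {"bad", "awful", "hate", "sad"}


def clean_words(tweet):
    """Lowercase the tweet, split on whitespace, strip surrounding punctuation."""
    return [word.strip(string.punctuation) for word in tweet.lower().split()]


def net_score(tweet):
    """Positive-word count minus negative-word count for one tweet."""
    words = clean_words(tweet)
    positives = sum(1 for w in words if w in POSITIVE_WORDS)
    negatives = sum(1 for w in words if w in NEGATIVE_WORDS)
    return positives - negatives
```

A tweet with a score above zero would be labelled positive, below zero negative, and zero neutral.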
Clustering Model with Foursquare API
· Explored tourist and other popular locations in Manhattan and Downtown Toronto using the Foursquare API.
· Clustered those locations based on foot traffic activity in the respective neighborhoods.
Energy Consumption in the Netherlands, a Nonlinear Regression Analysis
· Implemented the full pipeline, covering ETL, data cleansing, feature engineering, model design (examining a non-linear relationship between the variables), and evaluation, to predict energy consumption for 2019.
Accident Risk Places, an Analysis of US Traffic Data
· Identified specific locations in different boroughs where the accident rate is high.
· Predicted accident risk with a neural network, supported by text mining and data visualization.
IoT Data with Apache Spark, Node-RED & NoSQL
· Configured an IBM Node-RED application with a NoSQL database to collect data from IoT sources such as a mobile sensor, a washing machine, and wheel bearings.
· Analyzed the thousands of stored records with Apache Spark.
Kaggle Machine Learning Competitions
· Ranked 234th of 4,245 in Jane Street Market Prediction, a competition that scores models against future real market trading data. Treated it as a classification problem, omitting high-volatility days (outliers) and training a LightGBM model.
· Placed in the top 4% in predicting prices of residential homes in Ames, Iowa, using gradient boosting evaluated by mean absolute error, in a competition for Kaggle Learn users.
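For reference, mean absolute error averages the absolute differences between predicted and actual prices. A minimal standard-library sketch follows; the function name and any inputs are illustrative, not competition data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over paired observations."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    total = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    return total / len(y_true)
```

Because the errors are not squared, a few badly mispriced homes influence this metric less than they would influence root mean squared error.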