PROFESSIONAL PROJECTS
Aesthetics Market Model
Guidepoint • Python, SQL | Snowflake, Tableau | pandas, re, scikit-learn, Matplotlib
The Aesthetics Market Model is a proprietary model that provides insight into the aesthetics market, covering trends, demand, patient preferences, and the competitive landscape. It is built on proprietary point-of-sale and survey data and also incorporates demographic data, economic indicators, and regulatory changes.
Aesthetics Directory
Guidepoint • Python, SQL | Snowflake, Tableau | pandas, re, BeautifulSoup
The Aesthetics Directory is a database of aesthetics facilities across the US. For each facility it records the name, address, website, contact details, treatments offered, affiliated doctors, facility type, and more. The directory supports market analysis by surfacing trends, competitive landscapes, and high-demand areas, so companies can optimize marketing, tailor sales approaches, and plan expansions in the competitive aesthetics industry.
ACADEMIC PROJECTS
COVID-19 Literature Clustering
Python | pandas, re, scikit-learn, Matplotlib, seaborn, spaCy, Bokeh
This project organizes and visualizes COVID-19 literature using k-means clustering, t-SNE dimensionality reduction, and Latent Dirichlet Allocation (LDA) topic modeling. K-means and t-SNE independently reveal relationships between papers, while LDA characterizes each cluster by identifying its most prevalent keywords. The results were evaluated by examining the plots, reviewing papers within clusters, and testing a classification model.
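As a rough illustration of the k-means step only, here is a minimal sketch assuming TF-IDF features from scikit-learn. The six toy "abstracts", the choice of two clusters, and the variable names are made up for the example and do not come from the project's dataset or code.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy abstracts: three about vaccines, three about transmission.
abstracts = [
    "vaccine trial antibody response dose",
    "antibody response after vaccine dose",
    "vaccine dose trial shows antibody response",
    "airborne transmission indoor ventilation masks",
    "masks reduce airborne transmission indoor",
    "indoor ventilation and masks limit airborne transmission",
]

# Turn each abstract into a TF-IDF vector (English stop words removed).
vectorizer = TfidfVectorizer(stop_words="english")
features = vectorizer.fit_transform(abstracts)

# Group the vectors into two clusters; k=2 is an assumption for this toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

for label, text in zip(labels, abstracts):
    print(label, text)
```

On real abstracts the number of clusters would need to be chosen empirically, and a t-SNE projection of the same feature matrix could be colored by `labels` to inspect the grouping visually.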
TasteAI: Restaurant Review Analyzer and Recommender
Python | pandas, gensim, nltk, numpy, pyLDAvis, seaborn, Matplotlib
The project categorizes and analyzes restaurant reviews using Latent Dirichlet Allocation (LDA) topic modeling. By extracting the underlying themes, it aims to give restaurants actionable insights for self-evaluation, improving customer experience, and staying competitive. The same themes also enable personalized recommendations for customers.
Maple Syrup Production Predictor
Python | statsmodels, numpy, pandas, seaborn, Matplotlib
The goal of the project was to build a multivariate time series forecaster that predicts the next year's maple syrup production from factors that affect it: daily precipitation, daily soil moisture, daily temperature, and eight-day NDVI (Normalized Difference Vegetation Index). Several metrics were used to assess the causality and stationarity of the individual time series and to choose the most suitable vector autoregressive (VAR) model.
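For reference, the general form of a vector autoregressive model of order p is shown below. This is the textbook definition, not the project's fitted model; the lag order and the exact set of series stacked into the vector are not stated here and are left symbolic.

```latex
\[
\mathbf{y}_t = \mathbf{c} + \sum_{i=1}^{p} A_i \, \mathbf{y}_{t-i} + \boldsymbol{\varepsilon}_t
\]
```

Here \(\mathbf{y}_t\) is the vector of all K series at time t (for example, production together with the weather and vegetation variables), \(\mathbf{c}\) is a vector of intercepts, each \(A_i\) is a K-by-K coefficient matrix for lag i, and \(\boldsymbol{\varepsilon}_t\) is a white-noise error term. The model assumes the series are stationary, which is why stationarity is checked before fitting.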
Credit Card Fraud Detector
Python | pandas, re, scikit-learn, imblearn, seaborn, Matplotlib
The goal of the project was to develop an accurate and efficient machine learning model for detecting fraudulent credit card transactions in real time. Fraud detection is a critical part of financial security and risk management, so the work has broader relevance to the financial industry. The project addresses several challenges of imbalanced datasets: model bias toward the majority class, misleadingly high accuracy, poor generalization, and inappropriate evaluation metrics.
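The accuracy pitfall mentioned above can be shown with a short, dependency-free sketch. The class ratio (10 frauds in 1,000 transactions) and the "always predict legitimate" baseline are invented for illustration and are not the project's data or model.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (fraud) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    # Share of actual frauds that were caught; 0.0 if there are no frauds.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced data: 10 frauds (1) among 1,000 transactions.
y_true = [1] * 10 + [0] * 990

# A useless baseline that labels every transaction as legitimate (0).
always_legit = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, always_legit)
baseline_recall = recall(y_true, always_legit)
print(f"accuracy={baseline_accuracy:.2f}, recall={baseline_recall:.2f}")
```

The baseline scores high on accuracy while catching no fraud at all, which is why recall, precision, or similar class-aware metrics are the usual choice for data like this.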
Emojifier
Python | TensorFlow, Keras, Matplotlib
The project uses an LSTM (Long Short-Term Memory) network to analyze a large dataset of sentences and assign an appropriate emoji to each one, taking word order into account. The model is trained to capture the context and sentiment of a sentence and to select the emoji that best reflects its meaning. The project illustrates how LSTMs capture long-term dependencies in input sequences.