Data Scientist focused on Sports Analytics, Local Tourism, AI Safety, and Supply Chain Optimization.
Featured Project
Using AI to predict the outcomes of NBA games with a streamlined, model-focused approach.
• Data Pipeline: Complete PBP → GameStates → Features → Predictions pipeline processing multiple seasons of NBA data in an optimized SQLite database.
• Prediction Engines: Multiple models including Ridge Regression, XGBoost, and PyTorch MLP, with ensemble predictions and ongoing development of deep learning and GenAI engines.
• Web App: Flask-based platform for accessing game predictions and live scores with real-time updates.
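As a rough illustration of two of the steps listed above (turning play-by-play events into game states, and combining several engines into an ensemble prediction), here is a minimal sketch. It is not the project's actual code: the event tuple layout, the `GameState` fields, the function names, and the simple unweighted average are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GameState:
    """Running score after one play-by-play event (hypothetical schema)."""
    seconds_elapsed: int
    home_score: int
    away_score: int


def build_game_states(pbp_events):
    """Fold play-by-play events into a list of cumulative game states.

    Each event is assumed to be a (seconds_elapsed, team, points) tuple,
    where team is "home" or "away".
    """
    states = []
    home, away = 0, 0
    # Sort by elapsed time so the running totals are accumulated in order.
    for seconds, team, points in sorted(pbp_events):
        if team == "home":
            home += points
        else:
            away += points
        states.append(GameState(seconds, home, away))
    return states


def ensemble_predict(engine_predictions):
    """Average (home_score, away_score) predictions across engines.

    engine_predictions maps an engine name to its predicted final scores.
    An unweighted mean is assumed here; a real ensemble might weight engines.
    """
    home = mean(p[0] for p in engine_predictions.values())
    away = mean(p[1] for p in engine_predictions.values())
    return {
        "home_score": home,
        "away_score": away,
        # Ties are resolved in favor of the away team in this sketch.
        "predicted_winner": "home" if home > away else "away",
    }
```

Calling `ensemble_predict` with a dictionary such as `{"ridge": (110.0, 105.0), "xgboost": (108.0, 109.0)}` returns the averaged scores and the side with the higher average, so a single disagreeing engine does not decide the pick on its own.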
Past Work
(See NBA AI for the newer version.) Uses data analytics and machine learning to build a comprehensive and profitable system for predicting the outcomes of NBA games.
• Data acquisition architecture that leverages Scrapy, Airflow, and RDS Postgres to analyze and store data on NBA teams, players, and games from a diverse range of data sources.
• Data modeling setup employing AutoML for quick iteration and Deep Learning Transformer models for optimized performance.
• Public-facing web application and dashboard to showcase predictions and results.
Predicting Yelp Review Quality
Using the Yelp Open Dataset, this project predicts review quality with the goal of improving user engagement and satisfaction. It uses Apache Spark for ETL processing and AWS RDS for database hosting, and combines feature engineering with machine learning models built on text and sentiment analysis of the reviews. The results offer insight into user behavior and support data-driven decisions on Yelp.
In this project, I implemented machine learning and natural language processing techniques to predict fraudulent events from transaction data. The results were visualized through an intuitive Flask web application, deployed on AWS. This project highlights my ability to transform complex data into actionable insights.