I build machine learning models and data pipelines that extract meaning from messy, real-world data — from NLP and deep learning to RF signal classification.
About me
I'm a data analytics student with a focus on applied machine learning — building systems that work on real, noisy data. My projects span natural language processing, time-series deep learning, and RF signal intelligence.
I enjoy the full pipeline, from wrangling raw data and engineering features to training models, evaluating performance, and communicating results clearly.
Skills
Projects
Built an end-to-end NLP pipeline on the Kaggle Movie Review dataset to classify sentiment across five levels. Implemented full text preprocessing (tokenization, stopword removal, and lemmatization via spaCy), then engineered bag-of-words features and trained a Naive Bayes classifier, using 5-fold cross-validation to measure precision, recall, and F-measure. Also benchmarked against Logistic Regression and VADER for a broader comparison.
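As a rough illustration of the evaluation step in the project above, here is a minimal sketch of per-class precision, recall, and F-measure for a multi-class sentiment task. It is not the project's code: the function names `per_class_prf` and `macro_average` are hypothetical, the labels are assumed to be integers, and macro averaging is an assumption about how the per-class scores might be combined.

```python
from collections import Counter


def per_class_prf(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} for each label (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 across classes (an assumed choice)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))
```

In a 5-fold setup, one option is to call `per_class_prf` on each fold's held-out predictions and average the fold-level results.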
Designed and trained an LSTM neural network to predict Apple (AAPL) stock prices from 60-day historical sequences. Scaled closing prices with MinMaxScaler, built a two-layer LSTM with dropout regularization, and trained it with early stopping and learning rate reduction callbacks. Evaluated on held-out test data with RMSE, MAE, R², and MAPE to quantify prediction accuracy.
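To make the 60-day sequence idea above concrete, here is a minimal sketch of min-max scaling followed by sliding-window construction. It is an illustration under assumptions, not the project's code: it uses a plain-Python stand-in for MinMaxScaler, and the helper names `min_max_fit`, `min_max_transform`, and `make_windows` are hypothetical.

```python
def min_max_fit(values):
    """Return the (min, max) used for scaling; a stand-in for fitting MinMaxScaler."""
    return min(values), max(values)


def min_max_transform(values, lo, hi):
    """Scale values linearly so that lo maps to 0 and hi maps to 1."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant series: map everything to 0
    return [(v - lo) / span for v in values]


def make_windows(series, window=60):
    """Pair each run of `window` consecutive values with the value that follows it."""
    inputs, targets = [], []
    for i in range(window, len(series)):
        inputs.append(series[i - window:i])  # the previous `window` days
        targets.append(series[i])            # the next day's value
    return inputs, targets
```

One design choice worth noting: fitting the scaling bounds on the training portion only, then reusing them on the test portion, keeps information about future prices out of the training inputs.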
Classified radio frequency signals from raw I/Q data using CatBoost and Random Forest. Tackled a key data engineering challenge: parsing complex I/Q arrays from string cells, standardizing sequence lengths to 100 samples, and normalizing signals to unit power. Applied ITU frequency band labeling to map signals to VHF sub-bands and technologies, used SMOTE to handle class imbalance, and evaluated with precision-recall curves and confusion matrices.
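The I/Q preparation steps described above can be sketched as follows. This is a hedged illustration rather than the project's code: the cell format (a Python-style list of complex literals such as `[(0.1+0.2j), (0.3-0.4j)]`) is an assumption, zero-padding short sequences is an assumption, and the function names are hypothetical.

```python
import ast
import math

TARGET_LEN = 100  # sequence length stated in the project description


def parse_iq(cell):
    """Parse a string cell of complex literals (assumed format) into a list of complex samples."""
    return [complex(v) for v in ast.literal_eval(cell)]


def fix_length(samples, target_len=TARGET_LEN):
    """Truncate long sequences and zero-pad short ones to target_len (padding is an assumption)."""
    clipped = list(samples[:target_len])
    return clipped + [0j] * (target_len - len(clipped))


def normalize_power(samples):
    """Scale samples so the mean of |x|^2 equals 1; all-zero input is returned unchanged."""
    power = sum(abs(s) ** 2 for s in samples) / len(samples)
    if power == 0:
        return list(samples)
    scale = math.sqrt(power)
    return [s / scale for s in samples]


def prepare(cell):
    """Run the three steps in order: parse, standardize length, normalize to unit power."""
    return normalize_power(fix_length(parse_iq(cell)))
```

Because normalization runs after padding in this sketch, the unit-power condition holds over all 100 positions, including any padded zeros; normalizing before padding would be an equally plausible ordering.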
Resume
Contact