INFO 5368-030: Practical Applications in Machine Learning
Cornell Tech
Spring 2026, 2025, 2024, 2023
Course Description
This course provides hands-on experience developing and deploying foundational machine learning algorithms on real-world datasets for practical applications including predicting housing prices, document retrieval, product recommendation, and image classification using deep learning. Students will learn the end-to-end machine learning pipeline, including dataset creation, pre- and post-processing, preparation for machine learning, and training and evaluating multiple models. Students will focus on real-world challenges at each stage of the ML pipeline while handling bias in models and datasets.
Prerequisites: CS 2800 or equivalent, linear algebra, probability, and experience programming with Python, or permission of the instructor.
Reading: Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. O'Reilly Media, Inc., 2022.
Introduction to PAML
Learning Outcomes
- Prepare datasets for an ML task, and train and evaluate ML models
- Understand core challenges of dataset creation, including missing data and bias, among others
- Visualize features in datasets to be used for ML tasks
- Apply, analyze, and identify key differences in regression, classification, clustering, deep learning, and agentic AI
- Evaluate model quality using appropriate metrics of performance
- Build front- and back-end ML pipelines for analysis of ML performance and tools for ML practitioners
Course Schedule

Introduction to PAML Course
Final Projects, 2026
Students propose a new ML application that addresses a real-world problem and provides a front-end and back-end solution to users for well-justified use cases. Students will select an application area (e.g., robotics, healthcare, social media), search for or collect a dataset to address a problem, build an end-to-end ML pipeline, evaluate the algorithms using standard metrics, create visualization tools to analyze ML performance, and create a front- and back-end application.
Course projects will be done in groups of up to 4 students and consist of the following tasks:
- Application of machine learning to a practical problem of your choice. Improvements to machine learning algorithms.
- Comparison of three or more machine learning methods.
- Evaluation of model performance using two or more metrics.
- Comparison on one or more benchmarks.
- Analysis of machine learning models.
AtmoSound: An ML Framework for Venue-Adaptive Playlist Generation
Abstract—Many venues currently rely on guesswork when selecting music. AtmoSound solves this by creating data-driven playlists using machine learning. Our system analyzes venue data from Google Maps Places API and generates Spotify-style playlists optimized for atmosphere, predicting key audio features like danceability and energy. Trained on a large dataset of Manhattan venues, our model achieved near-perfect accuracy (MSE of 0.000281 and 98.6% genre retrieval). We developed a Streamlit web application that allows venue owners to easily create personalized playlists using their Google Maps URL. This demonstrates the value of venue metadata in music recommendation, offering a more effective approach to improve customer experience and drive loyalty.
Synthetic Emergency Room Triage Level Prediction Using Machine Learning
Abstract—Improving emergency room (ER) triage is crucial for patient safety and efficiency, but challenging in high-volume settings. This research investigates whether machine learning can reliably predict triage urgency levels. We developed and compared three NumPy-based models – multinomial logistic regression, a multilayer perceptron (MLP), and a random forest – using a synthetic triage dataset of 18,000 rows. The MLP achieved the highest macro-F1 (0.9399) and accuracy (0.9478), serving as the basis for a Streamlit application. This application connects patient input to a streamlined triage prediction pipeline, providing both urgency levels and associated probabilities. This project demonstrates a practical, interpretable, and easily deployable machine learning solution for ER triage decision support and education.
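The two metrics reported above, accuracy and macro-F1, can be computed from scratch in a few lines. The sketch below is illustrative only and is not the project's code: the function names and the toy labels are assumptions, and it uses plain Python rather than the NumPy implementation the abstract mentions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical triage levels (1 = most urgent); not data from the project.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.4f}")
```

Macro-F1 weights every triage level equally, so a rare but critical urgency class counts as much as a common one. That is one reason it is often reported alongside accuracy for imbalanced multi-class problems.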
AgentMod: An Agentic AI System for Adaptive Content Moderation on Social Media
Anushka Vijay Kumar Naik, Shubham Manish Gandhi, Aagam Bakliwal, Abhijay Rane, Om Ganesh Kamath
Consultant: Saloni Gandhi
Abstract—Content moderation systems must balance harmful-content detection against false positives on ambiguous language. We present AgentMod, an end-to-end moderation pipeline that combines a from-scratch neural network for toxicity scoring with K-means user-risk clustering. Specialized agents for text toxicity, user risk, and policy are coordinated through a gated fusion mechanism that incorporates user-risk information when the neural network is uncertain. On the Jigsaw Toxic Comment dataset, AgentMod improves full-test F1 from 0.7595 to 0.7655 and reduces false positive rate (FPR) from 0.0316 to 0.0181. On a hard subset of ambiguous comments, AgentMod reduces FPR from 0.3632 to 0.0355 while increasing precision from 0.6593 to 0.9369, with lower recall and F1 due to tuning that penalizes false positives. These results show that contextual behavioral signals can reduce false positives while maintaining competitive overall performance.
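The abstract does not spell out how the gated fusion works, so the following is only one plausible reading, sketched under stated assumptions: the gate opens when the toxicity probability lies within a margin of 0.5, and the fused score is then a weighted blend with a user-risk score in [0, 1]. The names gated_fusion, margin, and risk_weight and their default values are hypothetical. A small false positive rate helper is included because FPR is the metric the abstract emphasizes.

```python
def gated_fusion(toxicity_prob, user_risk, margin=0.15, risk_weight=0.5):
    """Blend in user risk only when the toxicity score is near 0.5.

    margin and risk_weight are illustrative values, not from the project.
    """
    uncertain = abs(toxicity_prob - 0.5) < margin
    if not uncertain:
        # Confident prediction: the user-risk signal is ignored.
        return toxicity_prob
    return (1 - risk_weight) * toxicity_prob + risk_weight * user_risk


def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), with 1 = toxic and 0 = non-toxic."""
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return fp / (fp + tn) if (fp + tn) else 0.0


# Confident score passes through; an ambiguous score is pulled toward
# the (hypothetical) low user-risk value and falls below 0.5.
print(gated_fusion(0.95, user_risk=0.1))
print(gated_fusion(0.55, user_risk=0.1))
print(false_positive_rate([0, 0, 0, 1], [1, 0, 0, 1]))
```

In this sketch an ambiguous comment from a low-risk user is scored below the 0.5 decision line, which is the kind of behavior that would lower FPR at some cost in recall, consistent with the trade-off the abstract reports.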
DiaMetrics: A Diabetes Risk Assessment Tool
Abstract—Type 2 diabetes is a significant public health challenge, and early identification is crucial. This project develops a machine learning pipeline to predict diabetes risk using data from the 2024 BRFSS. We compared Logistic Regression, SVM, and Naive Bayes, finding Logistic Regression and SVM outperformed the baseline. A Streamlit web application was created, incorporating the best-performing model, to allow users to determine their diabetes risk. This project delivers a practical, interpretable screening tool, contributing to research on behavioral predictors of metabolic disease.
Behavioral Archetypes of Prediction Market Traders: Unsupervised Clustering and Classification of Wallet-Level Activity on Polymarket
Abstract—Polymarket’s large prediction market reveals extreme user concentration: the top 1% captures 84% of gains. This research explores the underlying behavioral patterns of traders using unsupervised clustering on wallet activity data. We identified five distinct archetypes – contrarians, conviction buyers, etc. – using K-Means, Hierarchical Clustering, and K-Nearest Neighbors, achieving 83.8% archetype accuracy. A Streamlit application allows researchers to explore wallet behavior and classify new participants. This bottom-up analysis provides a valuable lens on prediction market dynamics and offers a reusable methodology for analyzing anonymous blockchain participants.
Predicting 3D Print Failure Before Execution Using Machine Learning
Bryant Jiang, Hannah Liang, Weicong Hong, Carina Hu, Jully Li, Feier Su
Mentor: Niti Parikh, Director of Learning Spaces & MakerLABs, Cornell Tech
Abstract—Fused Deposition Modeling (FDM) 3D printing failures are common in makerspaces, leading to wasted time and frustration. This project develops a machine learning-based pre-print verification tool to predict print success or failure. We compared Logistic Regression and a Neural Network, trained exclusively on synthetic MakerLAB data, to identify key failure factors. The Neural Network showed promise for capturing nonlinear relationships between printing parameters and geometry. Performance was evaluated using accuracy, precision, recall, and ROC-AUC, prioritizing recall to minimize undetected failures. A Streamlit application allows users to upload G-code, receive success predictions with confidence scores, and obtain actionable feedback. This work improves efficiency and usability in digital fabrication, supporting resource-efficient workflows and reducing material waste.
A Machine Learning Approach to ZIP-Code-Level Traffic Crash Prediction in New York City
Abstract—Traffic accidents remain a critical safety concern in densely populated cities like New York City. This paper presents a fine-grained crash prediction system for NYC, leveraging 2020-2025 data on crashes, traffic, and weather. We developed a two-stage pipeline: first, estimating ZIP code-level traffic volume using Ridge and Bayesian Linear Regression (R² = 0.846, RMSE = 2,153); second, predicting crash counts with Poisson and Negative Binomial Regression, confirming NB as the statistically superior model. The resulting pipeline, deployed as a Streamlit application, provides real-time crash predictions and enables exploration of historical patterns – supporting smarter urban safety decisions.
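The abstract does not say how the Poisson and Negative Binomial models were compared. One common first diagnostic, sketched here as an assumption rather than the authors' procedure, is to check for overdispersion: Poisson regression assumes the variance equals the mean, while the Negative Binomial (NB2) form allows variance = mu + alpha * mu^2. The function names and the toy counts below are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by mean; about 1 under a Poisson model."""
    return variance(counts) / mean(counts)


def nb_alpha_moments(counts):
    """Method-of-moments estimate of alpha in var = mu + alpha * mu**2."""
    mu = mean(counts)
    var = variance(counts)
    # Clamp at 0: a negative value means no overdispersion was detected.
    return max(0.0, (var - mu) / mu ** 2)


# Hypothetical daily crash counts for one ZIP code; not project data.
counts = [0, 1, 0, 2, 9, 0, 1, 12, 0, 3]

print(f"dispersion index = {dispersion_index(counts):.3f}")
print(f"NB alpha (moments) = {nb_alpha_moments(counts):.3f}")
```

A dispersion index well above 1, as in this toy series, suggests a plain Poisson model would understate the variance. A formal comparison would typically fit both regressions and use a likelihood-based criterion.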








