Etsy Product Success Analysis
Analyzed an Etsy product dataset to understand what drives product success, uncovering how pricing, titles, brand scale, and keyword strategies impact engagement.

Project overview
This project explores what drives product success on Etsy, using review count as a proxy for engagement. I analyzed listing features such as price, title length, images, and ratings, along with brand-level performance and keyword usage. Through statistical testing, regression modeling, and text analysis, I found that individual listing features have limited explanatory power, while brand dynamics and keyword specificity play a much larger role. The project highlights how successful listings differentiate themselves through positioning and clarity rather than by optimizing individual attributes alone.
Problem
I wanted to understand what actually drives product success on Etsy. Specifically, I was interested in whether listing-level features like price, images, and titles meaningfully impact performance, or if success is driven more by broader factors like brand strength and positioning.
Approach / process
I started by cleaning and preparing the dataset, engineering new features such as title length, description length, and photo count. I then performed exploratory analysis to identify trends, followed by statistical testing and regression modeling to validate those findings. From there, I expanded the analysis to include brand-level performance and keyword analysis, comparing high-performing listings against the rest of the dataset.
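The feature engineering step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the project used R and the tidyverse, while this sketch uses plain Python. The sample listings, the field names `name`, `description`, `images`, and `reviews`, the choice to measure length in characters, and the definition of `log_reviews` as log(1 + reviews) are all assumptions.

```python
import math

# Hypothetical raw listings; field names and values are illustrative only.
listings = [
    {"name": "Handmade ceramic mug", "description": "Stoneware mug, 12 oz.",
     "images": ["a.jpg", "b.jpg"], "reviews": 120},
    {"name": "Mug", "description": "", "images": [], "reviews": 0},
]

def engineer_features(listing):
    """Return a copy of the listing with derived columns added."""
    features = dict(listing)
    # Length measured in characters (an assumption; word counts would also work).
    features["name_length"] = len(listing["name"])
    features["description_length"] = len(listing["description"])
    features["photo_count"] = len(listing["images"])
    # log1p keeps zero-review listings defined, since log(0) is undefined.
    features["log_reviews"] = math.log1p(listing["reviews"])
    return features

prepared = [engineer_features(row) for row in listings]
```

Review counts on marketplaces are typically heavily right-skewed, which is the usual reason for modeling a log-transformed count rather than the raw value.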
Implementation details
- Cleaned and transformed raw Etsy listing data using R and the tidyverse
- Engineered features such as photo_count, name_length, and log_reviews
- Used boxplots, scatterplots, and summary tables for exploratory analysis
- Performed t-tests to compare top-performing products against others
- Built a regression model to evaluate the impact of multiple variables
- Conducted keyword analysis using tokenization and relative frequency (lift)
- Compared title and description effectiveness for keyword signaling
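To make the keyword lift idea concrete, here is a small sketch in plain Python (the project itself used R). It assumes lift is the share of top-performing titles containing a token divided by the share of all other titles containing it. The tokenizer, the additive smoothing constant, the function names, and the sample titles are illustrative assumptions, not the project's actual definitions or data.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def title_counts(titles):
    """Count how many titles contain each token (once per title)."""
    counts = Counter()
    for title in titles:
        counts.update(set(tokenize(title)))
    return counts

def keyword_lift(top_titles, other_titles, smoothing=0.5):
    """Smoothed share of top titles containing a token, divided by the
    smoothed share of other titles containing it. Smoothing avoids
    division by zero for tokens absent from one group (an assumption)."""
    top, other = title_counts(top_titles), title_counts(other_titles)
    n_top, n_other = len(top_titles), len(other_titles)
    lift = {}
    for token in set(top) | set(other):
        top_share = (top[token] + smoothing) / (n_top + 2 * smoothing)
        other_share = (other[token] + smoothing) / (n_other + 2 * smoothing)
        lift[token] = top_share / other_share
    return lift

# Hypothetical titles, for illustration only.
top_titles = ["Personalized leather wallet", "Personalized name necklace"]
other_titles = ["Leather wallet", "Silver necklace", "Gold necklace", "Wallet"]

lift = keyword_lift(top_titles, other_titles)
ranked = sorted(lift.items(), key=lambda item: item[1], reverse=True)
```

A lift above 1 means a token is relatively more common among top listings, and a lift near 1 means it is about equally common in both groups. This is how a popular keyword such as "necklace" in the sample can carry little signal while a more specific one such as "personalized" stands out.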
Results / outcomes
- Lower-priced products tend to receive more engagement
- Longer, more descriptive titles are associated with higher performance
- Photo count initially appeared important but was not significant in regression
- Brand-level factors (scale and efficiency) play a major role in success
- Keyword specificity (not just popularity) differentiates top listings
- Listing features alone explain only a small portion of performance
Related projects

Demand Forecasting for Regional Distribution Centers
Built a time series forecasting system in R that compared five models on 2.5 years of weekly distribution center data, then deployed the winning Prophet model as a production-ready Dockerized REST API.

CineNiche Streaming Platform
A full-stack movie discovery platform built during BYU’s INTEX 2025 that enables users to browse, rate, and organize films while receiving personalized recommendations through content-based and collaborative filtering.
