Over the next few days, we’ll be publishing posts from several of our model submitters explaining how they built their models and how those models have performed so far.
The second model in this series is from Chayacore. You can read more about this model from its creator below.
RuPaul's Predict-a-Looza: data science + drag queens + fierce competition that will obviously lead to a lifetime of bragging rights and a lucrative career as a part-time drag statistician? SIGN ME UP. This is very on-brand for me and I'm super glad the nerds at Data for Progress thought it was a good idea. Thanks nerds!
Building the Model
I started with Shira Mitchell's example model as a good excuse to teach myself more about Stan. I've done some Bayesian inference but wanted to expand my toolkit, and Shira's code is well-commented with excellent citations. Shira's model starts with age, past wins, and a coefficient for each contestant in a multi-level conditional logistic regression model, and uses Stan to apply Markov Chain Monte Carlo methods for Bayesian inference.
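For readers who have not met conditional logistic regression before, here is a minimal sketch of the core idea in Python: each queen gets a linear predictor (her covariates times their coefficients, plus her own intercept), and within a single episode those scores are turned into probabilities that sum to one. The function names and the dictionary layout are assumptions made for illustration; the real model is written in Stan and estimates the coefficients rather than taking them as inputs.

```python
import math

def linear_predictor(covariates, coefficients, alpha):
    """Queen-level score: her own intercept plus weighted covariates.

    covariates and coefficients are dicts keyed by covariate name
    (hypothetical layout, chosen for this sketch).
    """
    return alpha + sum(coefficients[name] * value
                       for name, value in covariates.items())

def episode_probabilities(scores):
    """Softmax over the queens competing in one episode.

    scores maps queen name -> linear predictor. Subtracting the max
    before exponentiating avoids overflow without changing the result.
    """
    top = max(scores.values())
    exps = {queen: math.exp(s - top) for queen, s in scores.items()}
    total = sum(exps.values())
    return {queen: e / total for queen, e in exps.items()}
```

Because the probabilities are normalized within the episode, only the differences between queens' scores matter, which is what makes the model "conditional" on who is competing that week.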
In checking out the base dataset I knew I'd want to add more variables about the queens. Like a good scientist, I of course leaned on the prior literature on this topic, Alex Hanna's survival analysis from 2013. And being a Drag Race fan, I had my own theories about what would contribute to a queen's success.
For my time-invariant covariates I gathered data on race/ethnicity and specifically whether the queen is Puerto Rican, whether the queen is plus sized, whether the queen belongs to a drag house (and if so, which one), whether the queen had drag family who had competed on previous seasons, whether the queen is trans or non-binary, and a personal curiosity: whether the queen is from New York City (from over here it seems like that drag community is TIGHT). All of my data collection for these variables came from the RuPaul's Drag Race Wiki, surely a high-quality source of unimpeachable facts.
To enable maximum flexibility in model selection, I generated the following time-variant covariates based on the existing dataset: number of times the queen has been safe, in the top but not won, lipsynced but not gone home, in the bottom but not lipsynced, as well as the past_wins covariate already defined by Shira's example code. I also generated a couple of aggregate measures: number of times the queen was in the top or won, and number of times the queen was in the bottom or lipsynced.
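A minimal sketch of how running counts like these could be built, assuming one row per queen per episode with a placement label. The labels (WIN, HIGH, SAFE, LOW, BTM) and the row layout are assumptions for illustration and may not match how the actual dataset is coded. The key detail is that each count only covers earlier episodes, so the current episode's result never leaks into its own predictors.

```python
from collections import defaultdict

# Assumed placement labels mapped to the covariate names used in this post.
COLUMNS = {
    "WIN": "past_wins",
    "HIGH": "past_high",      # in the top but did not win
    "SAFE": "past_safe",
    "LOW": "past_low",        # in the bottom but did not lipsync
    "BTM": "past_lipsync",    # lipsynced but did not go home
}

def add_past_counts(rows):
    """Return copies of rows with counts of each queen's earlier placements.

    rows: list of dicts with 'episode', 'queen', and 'placement' keys.
    """
    seen = defaultdict(lambda: dict.fromkeys(COLUMNS, 0))
    enriched_rows = []
    for row in sorted(rows, key=lambda r: r["episode"]):
        counts = seen[row["queen"]]
        enriched = dict(row)
        for label, column in COLUMNS.items():
            enriched[column] = counts[label]
        # Aggregate measures: top-or-won and bottom-or-lipsynced.
        enriched["past_top"] = counts["WIN"] + counts["HIGH"]
        enriched["past_bottom"] = counts["LOW"] + counts["BTM"]
        enriched_rows.append(enriched)
        # Update only after recording, so counts stay strictly in the past.
        if row["placement"] in counts:
            counts[row["placement"]] += 1
    return enriched_rows
```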
I then indulged in a great deal of sport fishing to land upon a final (or final-ish) model. Because running the full model with leave-future-out cross-validation takes about 4 hours on my laptop, I ran a number of quick-fit logistic regression models and correlation matrices to see what seemed promising, and by "a number of" I mean at last count I fished through 30 models. Would I do this if it were for science? No. But this isn't science, this is Recreational Statistics(tm) and this is a competition, this isn't RuPaul's Best Friends Race. I had a lot of fun indulging all my p-hacking whims in search of the best possible model. As Silky Nutmeg Ganache says, "When business comes, I'm gonna handle business. But when it's time for pleasure, I'mma have pleasure."
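For anyone unfamiliar with leave-future-out cross-validation, the splitting logic can be sketched roughly like this: fit on everything up to a given episode, predict the next one, then roll forward. This is only an illustration of the general idea; the function name and the min_train parameter are hypothetical, and the actual cross-validation code may split the data differently (for example, by season).

```python
def leave_future_out_splits(episodes, min_train=1):
    """Yield (training_episodes, test_episode) pairs in time order.

    episodes: iterable of sortable episode identifiers.
    min_train: how many episodes must be available before the first test.
    """
    ordered = sorted(set(episodes))
    for i in range(min_train, len(ordered)):
        # Train on everything strictly before episode i, test on episode i.
        yield ordered[:i], ordered[i]

for train, test in leave_future_out_splits([1, 2, 3, 4], min_train=2):
    print(f"fit on {train}, predict episode {test}")
```

Each step refits the model from scratch on a slightly larger training set, which is a big part of why a full run gets expensive.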
Some patterns emerged: being Asian showed a weak positive effect for episode winners, while being Latino and/or Puerto Rican showed a stronger effect for episode losers. Specific drag houses were completely equivocal, but belonging to a drag house at all had a positive effect for episode winners. I saw no real effects for age, plus size, or whether the queen was white or Black. Surprisingly to me, there were no notable effects for whether a queen's drag family had competed before or whether she was from NYC. Past performance was predictive, but not necessarily in the ways I expected: wins did not predict future wins, but past lipsyncs did predict future losses.
The Rough Formula
Ultimately, I ran a few models through the full cross-validation and landed upon this one for my week 2 predictions:
placement ~ asian + latino + drag_house_ind + past_top + past_bottom + alpha
...where drag_house_ind is a binary indicator of a queen's membership in an established drag house, past_top is the number of times the queen was in the top or won, past_bottom is the number of times the queen was in the bottom or lipsynced, and alpha is the coefficient for the queen herself. To get a better sense of the predictive power of the model, I separated out the accuracy for winners and losers, and this model correctly predicted the winners 19% of the time and correctly predicted the losers 34% of the time. Pretty good!
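Scoring winners and losers separately could look something like the sketch below. It assumes the model makes a single pick for winner and a single pick for loser in each held-out episode, and that a pick counts as correct if that queen is among the actual winners (or losers) for the episode. Both the scoring rule and the record layout are assumptions for illustration, not necessarily how the numbers above were computed.

```python
def split_accuracy(results):
    """Return (winner_accuracy, loser_accuracy) over held-out episodes.

    results: list of dicts with keys 'predicted_winner', 'actual_winners',
    'predicted_loser', 'actual_losers'. The 'actual_*' values are sets,
    so episodes with more than one winner or loser are handled too.
    """
    if not results:
        raise ValueError("need at least one scored episode")
    winner_hits = sum(r["predicted_winner"] in r["actual_winners"] for r in results)
    loser_hits = sum(r["predicted_loser"] in r["actual_losers"] for r in results)
    return winner_hits / len(results), loser_hits / len(results)
```

Keeping the two rates apart matters here because they can diverge quite a bit, and a single blended accuracy would hide which half of the prediction is doing the work.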
My week 2 predictions: Plastique Tiara wins, and Nina West goes home. A curious tension in having a RuPaul's Drag Race model: I want the machines to get it right, but the machines don't always align with my personal hopes and desires. Love you Nina West! Glad the machines were wrong!
For my week 3 predictions, I engaged in a ridiculous amount of Recreational Statistics(tm) and played around with model specifications A LOT. Because the full cross-validation takes 4 hours, I was selecting a model spec and then setting it running before bed every night, then checking to see how it did in the morning. Here's the thing: I didn't really get much better. I did make a perhaps unconventional change to the data though: Shira's model filters out atypical episodes, meaning those that have more than one winner or more than one loser. I added those back in to see how we'd do. I used the same model as week 2, but with the atypical episodes included in the priors, and got 14% accuracy for winners and 27% accuracy for losers. My week 3 predictions: Scarlet Envy wins and Mercedes Iman Diamond goes home.
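To make the "atypical episode" rule concrete, here is a small sketch of a filter that drops any episode with more than one winner or more than one loser. The placement labels WIN and ELIM and the dictionary layout are assumptions for illustration. Note that this sketch treats an episode with no winner or no loser as typical, since the rule as described only mentions "more than one"; the original filter may handle that case differently.

```python
from collections import Counter

def is_atypical(placements, win_label="WIN", loss_label="ELIM"):
    """True if an episode has more than one winner or more than one loser."""
    counts = Counter(placements)
    return counts[win_label] > 1 or counts[loss_label] > 1

def filter_typical(episodes):
    """Keep only typical episodes.

    episodes: dict mapping episode id -> list of placement labels,
    one label per queen who competed in that episode.
    """
    return {ep: labels for ep, labels in episodes.items()
            if not is_atypical(labels)}
```

Adding the atypical episodes back in amounts to skipping filter_typical and passing the full set of episodes to the model.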
PARTIAL VICTORY!! The week 3 winnerS (!) were Scarlet Envy and Yvie Oddly, and there were six, count 'em, six, lip syncers. This made me more certain about leaving the atypical episodes in, and for week 4 I made sure to include those results in my priors.
The other change I made to the model in week 4 is that I separated out all the past performance covariates. Previously I'd been concerned about collinearity, but with the atypical episodes included I could look at episode placement as relatively independent. I made a small change to the race/ethnicity covariates: dropped Asian, dropped Latino, and added back in Puerto Rican (on the theory that the Latino variable was mostly absorbing the effects of the Puerto Rican variable).
My week 4 model looked like this:
placement ~ puerto_rican + drag_house_ind + past_wins + past_high + past_safe + past_low + past_lipsync + alpha
This model predicted the winners correctly 17% of the time and the losers correctly 29% of the time. My week 4 predictions: Vanessa Vanjie Mateo wins, and Mercedes Iman Diamond goes home.
ANOTHER PARTIAL VICTORY!! Time to start putting money on these predictions. Maybe just the losers.
As of week 4, I was thinking the current iteration of the Chayacore model may be predicting what makes good reality television more than anything else – if a queen has been in the bottom or lipsyncing for a while, she's likely to go home, and if she won last week she's probably not going to win this week, gotta spread that drama around. My model fitting shows this too: past wins have a slightly negative effect on current episode wins, but past high performance has a positive effect. Past lipsyncs are strongly predictive of episode losses.
My week 5 predictions were Silky Nutmeg Ganache to win and Ra'Jah D. O'Hara to go home - since Silky just won week 4, it seemed extra unlikely that she'd win week 5. My new theory: disaggregating episode placement types was a mistake, and now the model is weighting Silky's 3 "safe" placements too heavily.
Similarly, for week 6, my model predicts Brooke Lynn Hytes to win and Ra'Jah D. O'Hara to go home. I love Brooke Lynn but I still don't think she's going to win two weeks in a row, even if she is the first queen this season to have two wins under her garter belt. I ran some model diagnostics to see where I might be leaking but nothing obvious came up - I even re-aggregated the "high" and "safe" placements together and took out wins entirely, and the model is still pretty sure about those predictions, so I'm just going to go with it for now and see what happens. Running through the cross-validation, the model as it is predicted the winners correctly 13% of the time and the losers correctly 32% of the time (and none of the alternatives I tried did any better).
Overall, after the last round of model diagnostics, I feel more-or-less good about the current model and am unlikely to change it significantly before the end of the season, but I do have some thoughts about what I would do in a universe where data availability was no issue. I'd collect data about the strengths and talents of each queen (singing? dancing? sewing? a killer Cher impression?) – this is tricky though: if we rely on self-reporting, we might wind up with a scenario like Episode 4's, with Ra'Jah D. O'Hara claiming dance/choreo experience that turned out to be 15 years old. To go with this, I'd try to collect data about the upcoming episode's maxi challenge – this would probably mean trawling Reddit for spoilers. I'm also super curious about the Puerto Rican effect and have some thoughts about an analysis that looks at what kinds of challenges Puerto Rican queens are most likely to be eliminated on - in the meantime I hope Miss Vanjie sidesteps this effect because I am here for her expensive showgirl realness and I REALLY want to see her Snatch Game.
For the record and just between us squirrel friends: if you ask me and not the machines, I stan for Yvie Oddly, 100%. I am always here with all the love for the weirdos and freakshows and look forward to Yvie shaking down this crown. (The machines will catch up eventually.)