Model Spotlight: Bayes the House Down
Over the next few days, we’ll be publishing posts from several of our model submitters explaining how they built their models and how those models have performed so far.
The third and final model in this series is from Bayes the House Down. You can read more about this model from its creator below.
Model Overview
I decided to use a Naïve Bayes Classifier (NBC), implemented in R with the e1071 package. I chose the NBC because it is simple to implement and runs very quickly. This was my first experience working with a Naïve Bayes Classifier, and I found this blog post to be very helpful.
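To make the idea concrete, here is a minimal sketch of a categorical Naïve Bayes classifier. It is written in plain Python rather than the R/e1071 setup described above, so it is not the model's actual code. The class name `TinyNaiveBayes`, the add-one smoothing, and the toy rows are all assumptions made for illustration; the real model's settings and data may differ.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Categorical Naive Bayes with add-one (Laplace) smoothing (illustrative only)."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[label][feature_index][value] -> number of training rows
        self.value_counts = defaultdict(lambda: defaultdict(Counter))
        # Distinct values seen per feature, used in the smoothing denominator.
        self.feature_values = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
                self.feature_values[i].add(value)
        return self

    def log_scores(self, row):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            # Start from the log prior, then add one log likelihood per feature.
            # Treating features as independent given the class is the "naive" part.
            score = math.log(n / total)
            for i, value in enumerate(row):
                count = self.value_counts[label][i][value]
                score += math.log((count + 1) / (n + len(self.feature_values[i])))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Invented toy rows: (age bucket, past-win flag). Not real contestant data.
toy_rows = [
    ("20s", "has_win"),
    ("20s", "has_win"),
    ("30s", "no_win"),
    ("30s", "no_win"),
    ("20s", "no_win"),
]
toy_labels = ["win", "win", "loss", "loss", "loss"]

model = TinyNaiveBayes().fit(toy_rows, toy_labels)
print(model.predict(("20s", "has_win")))
print(model.predict(("30s", "no_win")))
```

Because each feature contributes its own independent term to the score, one strongly skewed feature can dominate a prediction when the training set is small.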
For features, the model considers a Queen’s age, home state, past wins, and past losses (thank you Data For Progress for all the great Google Sheets!). For training the classifier, I considered all High/Top performances to be wins, and all Low/Bottom2 performances to be losses. Additionally, for the first few episodes, the model trains only on past episodes that occurred at the beginning of a season (I chose this training strategy because challenges tend to differ between the beginning and the end of a season). The model is not trained on any social media data: since Mama Ru makes all of the decisions, I think that social media is more reactive than predictive of a Queen’s performance.
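The labeling and episode-filtering rules above can be sketched roughly as follows. This is an illustration in Python, not the model's R code. The placement strings, the `EARLY_EPISODE_CUTOFF` value, the record layout, and the choice to drop every other placement are all assumptions, since the post does not specify them.

```python
# Hypothetical placement strings; the real spreadsheet may spell these differently.
WIN_PLACEMENTS = {"HIGH", "TOP"}
LOSS_PLACEMENTS = {"LOW", "BOTTOM2"}

# Hypothetical cutoff for what counts as the "beginning of a season".
EARLY_EPISODE_CUTOFF = 4


def label_for(placement):
    """Map a placement string to 'win', 'loss', or None (not used for training)."""
    key = placement.strip().upper()
    if key in WIN_PLACEMENTS:
        return "win"
    if key in LOSS_PLACEMENTS:
        return "loss"
    return None  # assumption: any other placement is left out of training


def build_training_rows(records, early_only=True):
    """Keep labeled rows, optionally restricted to early-season episodes."""
    rows = []
    for record in records:
        if early_only and record["episode"] > EARLY_EPISODE_CUTOFF:
            continue
        label = label_for(record["placement"])
        if label is None:
            continue
        rows.append((record["queen"], record["episode"], label))
    return rows


# Invented placeholder records, not real results.
records = [
    {"queen": "Queen A", "episode": 1, "placement": "Top"},
    {"queen": "Queen B", "episode": 1, "placement": "Bottom2"},
    {"queen": "Queen C", "episode": 2, "placement": "Safe"},
    {"queen": "Queen A", "episode": 9, "placement": "High"},
]

early_rows = build_training_rows(records)
all_rows = build_training_rows(records, early_only=False)
print(early_rows)
print(all_rows)
```

Calling `build_training_rows` with `early_only=False` is one way to picture the later switch to training on mid-season episodes as well.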
Weekly Performance
Team Bayes the House Down is proud to report that the algorithm correctly predicted that Brooke Lynn Hytes would be the winner of the first episode. However, I have been out of luck since then, aside from a few close calls. I think that the classifier might be putting too much negative weight on age, because for a while it kept predicting Nina West as the loser. This week is the first week I'll be training the classifier on "mid-season" episodes. For this week, the model predicts that Yvie Oddly will win, and that Ra'jah D. O'Hara will sashay away.
You can learn more about the algorithm and see all of its predictions and performance over at my website.