The Watson Challenge

The Netflix Challenge has become a permanent part of machine learning lore.  In exchange for giving data gurus a smidgeon of anonymized user data and $1 million, Netflix received not only an incredible set of recommendation algorithms from the winning team but also the reverence of every AI and data mining guru out there, for creating one of the first opportunities to turn machine learning into an exciting event.

Watson on "Jeopardy". From Wired Magazine.

I think IBM should steal a page from Netflix’s book and hold a contest to build the best algorithm for their product.  And they now have the perfect product for such a challenge: Watson, the question-answering machine famous for beating Ken Jennings and Brad Rutter in Jeopardy.  (The episode doesn’t air until February 14th, but it seems that news of Watson’s victory has already spread far and wide.)

Why would IBM want to do such a thing?  One word: exposure.  Very few people understand the power behind machine learning and natural language processing, especially in answering sophisticated questions of the kind shown in Jeopardy.  In one easy step, IBM could grab the world’s attention far more powerfully than a single Jeopardy session could.  And it’s a perfectly natural move for IBM, since the company is already well known for its commitment to the open source community.

How would the contest work?  Just like the Netflix Challenge, the goal would be to improve the existing algorithm by 10% in some key metric.  In this case, I believe the ‘key metric’ would simply be Watson’s final score in a random game of Jeopardy.  Here are the details, as I would imagine them:

  • IBM provides participants with a web API to a Watson virtual machine.  The web API hides the specifics of Watson’s internal architecture from the public, but allows participants to test their algorithms on training inputs drawn from real Jeopardy sessions.
  • IBM sets the victory condition for the contest: improve Watson’s final score by 10% or more in a virtual Jeopardy match that uses a randomized set of questions not shown to participants.
  • Participant teams begin their work.  As the contest continues, IBM reveals the contest progress, specifically how far the leading teams’ Watsons are outscoring the original.
  • When the 10% threshold is broken, the contest begins its final phase: in 30 days, the team with the best performance wins the contest, to the tune of $1M (or possibly more, since I’m sure IBM has significantly more cash available than Netflix did in 2007).
  • The winning team must explain how they did it: IBM posts the winning Watson’s algorithm and architecture for the world to see.  Better yet, have the machine participate in an actual session of Jeopardy!
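To make the rules above concrete, here is a minimal Python sketch of how such a leaderboard might be scored.  Everything in it is hypothetical: the `Leaderboard` class, its method names, and the example scores are my own illustration, not anything IBM has published.  It assumes the baseline Watson's score is positive, that the final phase starts the first time any submission beats the baseline by 10%, and that the contest closes 30 days later.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, Optional

IMPROVEMENT_THRESHOLD = 0.10  # the 10% victory condition proposed above
FINAL_PHASE_DAYS = 30         # length of the final phase proposed above


def relative_improvement(baseline_score: float, team_score: float) -> float:
    """Fractional gain of a team's score over the baseline Watson's score."""
    if baseline_score <= 0:
        raise ValueError("this sketch assumes a positive baseline score")
    return (team_score - baseline_score) / baseline_score


@dataclass
class Leaderboard:
    """Hypothetical contest state; not an actual IBM interface."""
    baseline_score: float
    best_scores: Dict[str, float] = field(default_factory=dict)
    final_phase_start: Optional[date] = None

    def is_closed(self, day: date) -> bool:
        """True once the 30-day final phase has run out."""
        if self.final_phase_start is None:
            return False
        return day > self.final_phase_start + timedelta(days=FINAL_PHASE_DAYS)

    def submit(self, team: str, score: float, day: date) -> None:
        """Record a team's hidden-match score; late submissions are ignored."""
        if self.is_closed(day):
            return
        self.best_scores[team] = max(score, self.best_scores.get(team, score))
        gain = relative_improvement(self.baseline_score, score)
        if self.final_phase_start is None and gain >= IMPROVEMENT_THRESHOLD:
            # First team to break the threshold starts the final phase.
            self.final_phase_start = day

    def leader(self) -> Optional[str]:
        """Team with the best score so far, or None if nobody has submitted."""
        if not self.best_scores:
            return None
        return max(self.best_scores, key=self.best_scores.get)

    def winner(self, day: date) -> Optional[str]:
        """The leader once the contest has closed; None while it is running."""
        return self.leader() if self.is_closed(day) else None


if __name__ == "__main__":
    board = Leaderboard(baseline_score=20000)          # made-up baseline
    board.submit("team_a", 21000, date(2011, 3, 1))    # +5%: no final phase yet
    board.submit("team_b", 22500, date(2011, 3, 2))    # +12.5%: final phase begins
    board.submit("team_a", 23000, date(2011, 3, 20))   # team_a retakes the lead
    print(board.winner(date(2011, 4, 2)))
```

Note that breaking the threshold first does not guarantee the prize: as in the Netflix contest, it only starts the clock, and whoever leads when the 30 days run out wins.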

So come on, IBM: dare us to build a better Jeopardy player than Watson.  If we can’t, then you’ve proved something truly impressive: that IBM’s machine learning group is the best in the world.  But if we do, you’ve got a better algorithm, and together we’ve built a formidable AI machine that would perhaps be more aptly named ‘Sherlock’.

