This month marks the 10-year anniversary of the IBM Watson computer beating two human players on the American television game show Jeopardy!
Back then, I wrote a series of blog posts, starting with [Watson: What is the Smartest Machine on Earth?], explaining some of the components. To help my readers appreciate the complexity of the problem, I provided [Homework Assignment: Understanding Jeopardy!] along with my own [Results and Observations].
Many people had assumed that IBM Watson was accessing the Internet. It was not. To be fair to the other two players, it could only answer from information it had already learned. This was an important distinction, because IBM Watson was targeted at large corporations that wanted to search and access their proprietary, internal data sources. An IBM Watson computer that relied on publicly available information on the Internet would not provide the level of competitive advantage our clients would pay for.
The information stored to answer all questions is called the "corpus", and I was surprised that it was less than 1TB in size for the Jeopardy! event. I wrote a blog post on [what was in that 1TB] to cover all the information the IBM Research team collected for this event.
The [televised event] spanned three days. On Monday, I helped host a [Jeopardy Watch Event] at the Tucson Executive Briefing Center for local politicians and other dignitaries. I presented how IBM Watson worked before the show started, then we all watched the first episode live together.
Note: This was not the first time IBM dabbled in Artificial Intelligence, or "AI" for short. [IBM Deep Blue computer beat world chess champion Garry Kasparov] 25 years ago! Unlike Deep Blue's chess-playing prowess, the [Business Intelligence, Data Retrieval and Text Mining] of IBM Watson had real-life applications. Today, we are familiar with Siri, Alexa, Bixby, and other "computer voice assistants", but most of these didn't exist back then, and the concept of having a conversation with a computer was the realm of science fiction movies.
The last post in my series explained [How to replicate Watson hardware and systems design for your own use in your basement]. The instructions were downloaded nearly 300,000 times, ranking it among my top 10 blog posts.
My work explaining IBM Watson led to traveling all over the world to present this technology, being interviewed by the media, and becoming the #1 blogger at IBM. I am still [mentioned by name in Wikipedia] in connection with IBM Watson! I consider this one of the key achievements of my career.
A lot has happened over the past 10 years.
A few years later, the IBM Cloud team asked me to help a group of graduate students at [University of Toronto (Canada)] build their own "Watson, Jr." from scratch on the IBM Cloud platform. They envisioned this would be a fun educational exercise for a semester or year-long course. Management approved my participation since I was basing all of the effort on publicly available information and open source code. After working all summer, our team was able to get a working version that processed a single-page corpus, and everyone agreed that this was perhaps too difficult for the general student population.
Fun fact: One of the funniest mistakes Watson made during day 2 of the three-day event was its response in the category "U.S. Cities". The clue was "Its largest airport is named for a World War II hero; its second largest, for a World War II battle". Watson responded "What is Toronto?????", with five question marks to indicate it was not sure of its answer. While the most famous "Toronto" is in Canada, there are actually U.S. cities named "Toronto" in Kansas, South Dakota and Ohio. Both of the human contestants, Ken Jennings and Brad Rutter, of course, [responded correctly].
A lot of companies were excited about the IBM Watson technology, but did not realize you can't just purchase a machine and have it immediately answer questions for your own industry-specific area of knowledge. It took a team of 25 IBM researchers nearly four years to train IBM Watson to play Jeopardy!
The General Parallel File System (GPFS) used in IBM Watson is now called IBM Spectrum Scale, and is used in many of the world's fastest supercomputers, including [Summit at the Oak Ridge National Laboratory (ORNL) in Tennessee], and [Sierra at the Lawrence Livermore National Laboratory in California]. IBM Spectrum Scale is considered the "gold standard" in [Storage for AI and Big Data] applications.
Last summer, motivated by the protests in the United States, I participated in IBM's [Call for Code for Racial Justice]. My team's project [Legit Info], one of the five selected finalists, used IBM Watson to analyze the text of U.S. federal and state legislation to help people find laws that have a particular impact on their community.
Today, IBM is transitioning to become the world's leading "Hybrid Cloud and AI" company. The IBM Watson technology has been deployed to help with cancer diagnoses, income tax preparation, and weather forecasting. IBM Watson is now available as a set of services on the IBM Cloud for everyone to use.
I am proud to be part of IBM's history!