For two weeks at the end of every summer, more than 700,000 people make the journey to Flushing Meadows in Queens, New York, to watch the best tennis players in the world compete in the US Open Tennis Championships.
It is one of the most highly attended sporting events in the world.
But more than 10 million global tennis fans follow the tournament through the US Open app and website. And to keep them coming back for more, year after year, the United States Tennis Association (USTA) has worked side-by-side with IBM Consulting™ for more than three decades, developing and delivering a world-class digital experience that constantly advances its features and functionality.
To help the US Open stay on the cutting edge of customer experience, IBM Consulting worked closely with the USTA to develop generative AI models that transform tennis data into insights and original content on the US Open app and website. IBM® watsonx, a next-generation AI and data platform, was used to build and manage the entire lifecycle of the AI models that produce key app features such as Match Insights and the new AI Commentary for US Open highlight reels.
The USTA asked IBM to add spoken commentary to the video highlight reels that are produced for every singles match throughout the tournament. To do it, the IBM Consulting team built a generative AI solution, based on a powerful large language model called Sandstone, available through watsonx.ai. Sandstone already understands the English language, but it needed to be trained, or “tuned,” on tennis data in order to translate tennis scenes into complete sentences.
The team used watsonx.data to connect and curate the USTA’s trusted data sources. The curation process includes de-duping and filtering the foundational data that informs the large language model, as well as the USTA’s proprietary data. The process eliminates things like profanity or abusive language and manages data usage compliance with privacy regulations like the General Data Protection Regulation (GDPR) standard.
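The curation step described above, de-duplicating records and filtering out unwanted language, can be illustrated with a minimal Python sketch. This is not the watsonx.data API: the `BLOCKED_TERMS` set, the `normalize` and `curate` helpers, and the sample records are all hypothetical and exist only to show the idea.

```python
import re

# Hypothetical block list; a real pipeline would rely on a maintained lexicon.
BLOCKED_TERMS = {"darn", "heck"}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical records compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def curate(records: list[str]) -> list[str]:
    """Drop duplicates and records containing a blocked term, preserving order."""
    seen: set[str] = set()
    kept: list[str] = []
    for record in records:
        key = normalize(record)
        if not key or key in seen:
            continue  # skip empty or already-seen records
        seen.add(key)
        words = set(re.findall(r"[a-z']+", key))
        if words & BLOCKED_TERMS:
            continue  # skip records with blocked language
        kept.append(record.strip())
    return kept


sample = [
    "Ace down the T on break point.",
    "ace down the T  on break point.",   # duplicate after normalization
    "That was a darn lucky net cord.",   # contains a blocked term
    "Forehand winner ends a 14-shot rally.",
]
curated = curate(sample)
print(curated)
```

Only the first and last sample records survive. Compliance checks such as GDPR handling are outside the scope of this sketch.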
The AI Commentary model is then trained to translate the metadata attached to video clips into sentences. It generates dozens of different options before choosing the best sentence to describe the action, taking care to vary the sentence structure from clip to clip, so as to avoid repetition. Text-to-speech capabilities are then used to give voice to the words.
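The generate-then-select idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: fixed templates stand in for the language model's candidate sentences, "shortest sentence" stands in for whatever quality score the real system uses, and the `ClipMetadata` fields are hypothetical. The one behavior it does mirror from the description is that consecutive clips never reuse the same sentence structure.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ClipMetadata:
    """Hypothetical metadata attached to a highlight clip."""
    player: str
    shot: str           # e.g. "forehand winner"
    score_context: str  # e.g. "break point"


# Hypothetical templates standing in for model-generated candidates.
TEMPLATES = [
    "{player} hits a {shot} on {score_context}.",
    "On {score_context}, {player} finds a {shot}.",
    "A {shot} from {player} settles {score_context}.",
]


def candidates(meta: ClipMetadata) -> list[tuple[int, str]]:
    """Render every template for this clip, tagged with its template index."""
    fields = asdict(meta)
    return [(i, t.format(**fields)) for i, t in enumerate(TEMPLATES)]


def choose(meta: ClipMetadata, last_template: Optional[int]) -> tuple[int, str]:
    """Pick the shortest candidate whose structure differs from the previous clip's."""
    options = [(i, s) for i, s in candidates(meta) if i != last_template]
    return min(options, key=lambda pair: len(pair[1]))


def commentate(clips: list[ClipMetadata]) -> list[str]:
    """Produce one sentence per clip, varying structure from clip to clip."""
    lines: list[str] = []
    last: Optional[int] = None
    for meta in clips:
        last, sentence = choose(meta, last)
        lines.append(sentence)
    return lines


clip = ClipMetadata("Player A", "forehand winner", "break point")
lines = commentate([clip, clip, clip])
for line in lines:
    print(line)
```

Even with three identical clips, adjacent sentences differ because the previously used template is excluded from the next selection.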
The operation of the model is monitored and managed using elements of watsonx.governance, which ensures the AI is performant, compliant, and operating as expected. And the same suite of watsonx tools was used to build and manage the AI models that power the Match Insights feature with IBM Watson®. These models rely on AI-powered fact sheets that use sophisticated data analytics and natural language processing to distill millions of data points into meaningful insights about every singles match.
For example, tennis fans can see which players have the most momentum in the tournament by checking out the IBM Power Index. And prior to each match, they can see analysis from IBM Watson on which player has the highest likelihood to win. They can even see the relative difficulty of each player’s draw with the new AI Draw Analysis feature.
To develop new capabilities every year—like the ones found in the AI Commentary and Match Insights features—with IBM Watson, the USTA needs to move with speed and purpose. The process starts the week after the US Open concludes, when IBM Consulting kicks off work using the IBM Garage™ Methodology, a highly collaborative approach to co-creation.
In order to transform new ideas into digital reality, IBM Consulting built a platform of innovation for the US Open, capable of processing structured and unstructured data, and integrating technology from a variety of sources.
The raw material of any digital experience is data, and the US Open tournament produces a lot of it. For starters, each US Open singles competition features 128 men and 128 women, with seven rounds in each draw. Each tennis player comes with his or her own data set, including age, height, weight, world ranking and recent performance. But that’s just the beginning.
Over the course of the tournament, more than 125,000 points will be played. And each one of those points generates its own data set: serve direction, speed, return shot type, winner shot type, rally count and even ball position. All told, more than seven million data points are generated during the tournament.
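As a rough sanity check on these figures, the arithmetic can be worked through in a short Python sketch. It assumes single-elimination singles draws of 128 players and treats the "more than" figures as round lower bounds, so the ratio at the end is indicative only.

```python
import math

players_per_draw = 128

# A single-elimination draw halves the field each round.
rounds = int(math.log2(players_per_draw))   # 7 rounds
matches_per_draw = players_per_draw - 1     # every match eliminates one player
singles_matches = 2 * matches_per_draw      # men's and women's draws combined

points_played = 125_000      # lower bound quoted in the article
data_points = 7_000_000      # lower bound quoted in the article
data_points_per_point = data_points / points_played

print(rounds, matches_per_draw, singles_matches, data_points_per_point)
```

Under these assumptions, each draw has 127 matches (254 singles matches in total), and each point played contributes on the order of 56 data points.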
But to add more texture and context to the US Open digital experience, the team wanted to go beyond the numbers. So they are using AI to analyze the language and sentiment of millions of articles from hundreds of thousands of different sources to develop insights that are unique and informative, like the IBM Power Index.
To streamline this process, IBM Consulting built automated workflows that integrate and orchestrate the flow of data through the various applications and AI models needed to produce the digital experience. These workflows are made possible by a hybrid cloud architecture and the containerized apps running on Red Hat® OpenShift®. The US Open hybrid multicloud architecture is made up of four public and three private clouds, drawing on data from a variety of sources and integrating features and capability from a variety of partners.
By containerizing the applications, the team can write them once and run them anywhere, ensuring the right data gets to the right application on the right cloud. And to keep the entire operation running smoothly, the team uses IBM Instana™ Observability technology, which constantly monitors application performance and surfaces issues in less than three seconds, so the team can take swift action and avoid any downtime.
Over the course of the tournament, it’s not unusual for the US Open digital platforms to be on the receiving end of more than 40 million security incidents. The type of threat varies, but most are looking for a crack in the armor and are not serious.
Defending the platform starts months before the tournament begins. Using the IBM Security® Randori Recon solution, the team conducts a comprehensive attack surface analysis, scanning the entire network for vulnerabilities, including third-party or adjacent networks. Following this security reconnaissance, IBM Security Randori then ranks those vulnerabilities by their attractiveness to hackers, allowing the team to prioritize its response.
Once the tournament begins, the US Open uses the IBM Security QRadar® Suite to assess the severity of each security event, evaluating threats, ignoring the insignificant ones and passing along only the most urgent issues to the security analysts. It then correlates that activity with threat intelligence from external sources, like the IBM X-Force® Exchange solution, looking for any activity that might be part of a more coordinated, global attack. And finally, the IBM Security QRadar Suite serves up recommendations to IBM analysts on how best to deal with the threat.
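The triage flow described above (score each event, drop the insignificant ones, and correlate the rest with external threat intelligence) can be sketched as follows. This is not the QRadar API: the `SecurityEvent` type, the 1 to 10 severity scale, the threshold, and the `known_bad_ips` set are all hypothetical, and the addresses come from reserved documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityEvent:
    """Hypothetical security event record."""
    source_ip: str
    severity: int  # assumed scale: 1 (low) to 10 (critical)


def triage(
    events: list[SecurityEvent],
    known_bad_ips: set[str],
    threshold: int = 7,
) -> list[SecurityEvent]:
    """Escalate events that are severe or that match external threat intelligence.

    Everything else is dropped; results are sorted most severe first.
    """
    escalated = [
        e for e in events
        if e.severity >= threshold or e.source_ip in known_bad_ips
    ]
    return sorted(escalated, key=lambda e: e.severity, reverse=True)


events = [
    SecurityEvent("192.0.2.10", 2),     # low severity, unknown source: ignored
    SecurityEvent("198.51.100.7", 9),   # high severity: escalated
    SecurityEvent("203.0.113.5", 3),    # low severity but known-bad source: escalated
]
threat_intel = {"203.0.113.5"}
urgent = triage(events, threat_intel)
print([e.source_ip for e in urgent])
```

In this sketch, a low-severity event is still escalated when its source appears in the threat-intelligence set, which mirrors the idea of spotting activity that may be part of a broader coordinated attack.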
QUESTION I: How could one build a model to predict the US Open 2023 winner?
QUESTION II: Since when has IBM been working with the United States Tennis Association?
REFERENCE: IBM Open Article, 12 Hilariously Wrong Tech Predictions