Please join the IBM Data Science Community for our March 2022 virtual meetup.
10:45am – Doors open
11:00am – Welcome
11:02am – IBM Community announcements
11:08am – Lightning talk (5'): Exploring the Manning liveProject tutorial for IBM Project Debater API (Paco Nathan)
11:15am – Main talk (40'): Trustworthy Machine Learning (Kush R. Varshney) plus Q&A
11:55am – Mingle
12:00pm – Official event ends (the IBM community team and speakers may stay a little longer if there are more questions)
💬 Lightning talk (5') – Exploring the Manning liveProject tutorial for IBM Project Debater API
Suppose you're on a data science team that handles lots of text: customer responses, news analysis, medical transcriptions, bills of materials, patent applications, or other similar cases. How do you manage data quality within the stream of input data? How do you determine which elements of the surrounding text support the arguments that get extracted by ML models? How do you expand the phrases your customers use so that they link to more commonly used phrases?
While there have been incredible advances in the use of language models for predicting sequences of text, other areas of NLP have also been advancing. This “liveProject” tutorial at Manning provides goal-oriented, hands-on coding examples of how to leverage the IBM Project Debater API to analyze a Kaggle dataset about medical transcriptions. Milestones in this tutorial include data preparation techniques in Pandas, use of the Key Point Analysis, Argument Quality, and Term Wikifier API services, data visualization in Seaborn, and building an interactive dashboard of the results in Streamlit.
👨🏻‍💻 Speaker bio
Paco Nathan is Managing Partner at Derwen, Inc. Known as a “player/coach”, with core expertise in data science, cloud computing, natural language, and graph technologies; ~40 years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Advisor for Amplify Partners, Recognai, and KUNGFU.AI. Lead committer on PyTextRank and kglab. Formerly: Director, Community Evangelism @ Databricks and Apache Spark.
💬 Main talk (40') – Trustworthy Machine Learning
Accuracy is not enough when you’re developing machine learning systems for consequential application domains. You also need to make sure that your models are fair, have not been tampered with, will not fall apart in different conditions, and can be understood by people. Your design and development process has to be transparent and inclusive. You don’t want the systems you create to be harmful, but to help people flourish in ways they consent to. All of these considerations beyond accuracy that make machine learning safe, responsible, and worthy of our trust have been described by many experts as the biggest challenge of the next five years. This talk aims to equip you with the thought process to meet this challenge.
👨🏻‍💻 Speaker bio
Kush R. Varshney is a distinguished research staff member and manager with IBM Research at the Thomas J. Watson Research Center, Yorktown Heights, NY, where he leads the machine learning group in the Foundations of Trustworthy AI department. He was a visiting scientist at IBM Research - Africa, Nairobi, Kenya in 2019. He is the founding co-director of the IBM Science for Social Good initiative. He applies data science and predictive analytics to human capital management, healthcare, olfaction, computational creativity, public affairs, international development, and algorithmic fairness, and he conducts academic research on the theory and methods of trustworthy machine learning. He self-published a book entitled 'Trustworthy Machine Learning' (http://www.trustworthymachinelearning.com) in 2022.