We had another fun virtual meetup in March. In case you missed it, the full recording is now available.
To kick things off, we heard a lightning talk from Paco Nathan, Managing Partner at Derwen, Inc., on “Exploring the Manning liveProject tutorial for IBM Project Debater API” (starts around the 7:18 mark):
Suppose you're on a data science team that handles lots of text: customer responses, news analysis, medical transcriptions, bills of materials, patent applications, or other similar cases. How do you manage data quality within the stream of input data? How do you determine which elements of the surrounding text support the arguments that get extracted by ML models? How do you expand the phrases your customers use so that they link to more commonly used phrases?
While there have been incredible advances in the use of language models for predicting sequences of text, other areas of NLP have also been advancing. This “liveProject” tutorial at Manning provides goal-oriented, hands-on coding examples of how to leverage the IBM Project Debater API to analyze a Kaggle dataset about medical transcriptions. Milestones in this tutorial include data preparation techniques in Pandas, use of the Key Point Analysis, Argument Quality, and Term Wikifier API services, data visualization in Seaborn, and building an interactive dashboard of the results in Streamlit.
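To give a flavor of the data preparation milestone, here is a minimal sketch in Pandas. It is not taken from the liveProject itself: the column names (`medical_specialty`, `transcription`), the `prepare_transcriptions` helper, the `min_chars` threshold, and the sample rows are all assumptions made up for illustration. The idea is simply to clean the text before sending it to any downstream analysis service.

```python
import pandas as pd


def prepare_transcriptions(df, text_col="transcription", min_chars=20):
    """Hypothetical helper: drop unusable rows and normalize whitespace."""
    # Remove rows where the text is missing entirely.
    out = df.dropna(subset=[text_col]).copy()
    # Collapse runs of whitespace (including newlines) into single spaces.
    out[text_col] = (
        out[text_col]
        .astype(str)
        .str.replace(r"\s+", " ", regex=True)
        .str.strip()
    )
    # Drop placeholder-like entries that are too short to analyze.
    out = out[out[text_col].str.len() >= min_chars]
    # Drop texts that are identical after normalization.
    out = out.drop_duplicates(subset=[text_col])
    return out.reset_index(drop=True)


# Made-up sample rows, standing in for a real transcription dataset.
sample = pd.DataFrame(
    {
        "medical_specialty": ["Cardiology", "Cardiology", "Neurology", "Neurology"],
        "transcription": [
            "Patient  presents with chest pain\nand shortness of breath.",
            "Patient presents with chest pain and shortness of breath.",
            None,
            "N/A",
        ],
    }
)

clean = prepare_transcriptions(sample)
print(clean)
```

In this toy sample only one row survives: the first two rows become identical once whitespace is normalized, the third has no text, and the fourth is too short to be useful.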
Stay tuned for the release of the latest IBM-sponsored learning offering on Manning liveProject!
Our main talk featured Kush R. Varshney, distinguished research staff member and manager with IBM Research at the Thomas J. Watson Research Center in Yorktown Heights, NY, who spoke about trustworthy ML.
Accuracy is not enough when you’re developing machine learning systems for consequential application domains. You also need to make sure that your models are fair, have not been tampered with, will not fall apart in different conditions, and can be understood by people. Your design and development process has to be transparent and inclusive. You want the systems you create not to be harmful, but to help people flourish in ways they consent to. All of these considerations beyond accuracy that make machine learning safe, responsible, and worthy of our trust have been described by many experts as the biggest challenge of the next five years. This talk aims to equip you with the thought process to meet this challenge.
You can grab Kush’s recent book entitled “Trustworthy Machine Learning” on his website.
If you have any feedback or unanswered questions related to our talks, please leave a note on the forum.
Please join us on April 20 for our April virtual meetup and learn about R Shiny dashboards in real life.
#GlobalAIandDataScience #GlobalDataScience