Introducing AI Factsheets on Cloud Pak for Data as a Service: Automate collection of model facts across the AI Lifecycle

By Shashank Sabhlok posted Sun January 23, 2022 09:11 PM


Today’s top-performing enterprises are achieving growth and revenue goals by being data-driven and successfully adopting AI and ML technologies. While most organizations see the value in AI, developing organizational trust in the underlying data, models and processes can be daunting. Organizations are challenged to maintain their corporate responsibility and brand while addressing fairness, safety and privacy at the same time. Implementing Trustworthy AI requires that the data used is of appropriate quality and relevance, and that the models are continuously monitored for bias, explainability, and drift. Equally important is automated governance at each stage of the model lifecycle to ensure trust in the process.

A typical model lifecycle in an enterprise

AI Governance includes processes that trace and document the origin of data, models (including associated metadata) and pipelines for audits. The documentation should include the techniques that trained each model, the hyperparameters used, and the metrics from testing phases. The result of this documentation is increased transparency into the model’s behaviour throughout the lifecycle, the data that was influential in its development, and the possible risks.
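
To make the shape of such documentation concrete, here is a minimal sketch of an audit record holding the items listed above (training technique, hyperparameters, test metrics, and data origin). The `ModelFacts` class, its field names, and all the values are hypothetical placeholders chosen for illustration; this is not the AI Factsheets schema.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ModelFacts:
    """Hypothetical audit record; not the AI Factsheets schema."""
    model_name: str
    technique: str                 # the technique that trained the model
    training_data_source: str      # origin of the data, for traceability
    hyperparameters: dict = field(default_factory=dict)
    test_metrics: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Sort keys so repeated exports of the same facts are identical
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# Placeholder values, made up for this example
facts = ModelFacts(
    model_name="credit-risk-v1",
    technique="gradient boosting",
    training_data_source="warehouse.loans_2021",
    hyperparameters={"n_estimators": 200, "learning_rate": 0.05},
    test_metrics={"accuracy": 0.91, "auc": 0.95},
)
print(facts.to_json())
```

Serializing with sorted keys is one way to keep the record stable between exports, which makes it easier to compare versions during an audit.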

Current practices and tools implemented across the enterprise are not optimized for AI. Documenting model inputs and behaviour requires manual work, and most tools and platforms used for model development and deployment today do not disclose metadata. This is where AI Factsheets comes in.

Introducing AI Factsheets

I am excited to announce the release of AI Factsheets as part of Watson Knowledge Catalog (WKC) on Cloud Pak for Data-as-a-Service. It allows AI-mature organizations to implement AI Governance, enabling them to monitor AI activities across the enterprise and make smart decisions faster.

AI Factsheets captures model metadata across the model development lifecycle, facilitating subsequent enterprise validation or external regulation. The automated collection of model metadata frees data scientists and ML engineers to focus on model building instead of writing lengthy model documentation.

Through its Python SDK, AI Factsheets is able to capture various types of metadata related to the model, including training scores and the input schema.
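
As a rough illustration of what capturing training scores and an input schema can involve, the sketch below derives a simple schema from sample training rows and bundles it with the scores. It deliberately does not use the AI Governance Facts SDK; the function names `infer_input_schema` and `capture_training_facts`, the "mixed" type label, and the sample values are all assumptions made for this example.

```python
from typing import Any


def infer_input_schema(rows: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Derive a field/type listing from sample training rows (hypothetical helper)."""
    schema: dict[str, str] = {}
    for row in rows:
        for name, value in row.items():
            type_name = type(value).__name__
            # Label a column "mixed" when rows disagree on its type
            if schema.setdefault(name, type_name) != type_name:
                schema[name] = "mixed"
    return [{"name": n, "type": t} for n, t in schema.items()]


def capture_training_facts(rows: list[dict[str, Any]],
                           scores: dict[str, float]) -> dict[str, Any]:
    """Bundle the inferred schema with the training scores (hypothetical helper)."""
    return {
        "input_schema": infer_input_schema(rows),
        "training_scores": dict(scores),
    }


# Placeholder rows and scores, made up for this example
sample_rows = [
    {"age": 42, "income": 55000.0},
    {"age": 31, "income": 48000.5},
]
training_facts = capture_training_facts(sample_rows, {"accuracy": 0.93})
print(training_facts)
```

Inferring the schema from the data itself, rather than typing it out by hand, is the kind of step that automated metadata collection is meant to take off a data scientist's plate.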

Furthermore, model validators need details from the model development, testing and validation phases to approve models for production use. AI Factsheets enables validators and approvers to get an accurate, always up-to-date view of the model lifecycle details.

Up-to-date views of the model lifecycle ensure accuracy in reporting, especially for compliance audits

AI Factsheets is powered by the AI Governance Facts Python SDK, which can persist model metadata into Watson Machine Learning and can also relay metadata to WKC from AWS SageMaker and Azure Machine Learning via Watson OpenScale. This integration with Watson Machine Learning and Watson OpenScale also captures deployment metadata and introduces critical monitors for bias detection and quality.

In addition to model and deployment metadata, integration with Watson OpenScale allows for the model’s quality, fairness and drift to be closely monitored

If you are already an IBM Cloud Pak for Data-as-a-Service customer, visit the Model Inventory to get started and take the tour. If not, quickly sign up for the trial account here to get started!

Learn more about the advantages of Trustworthy AI, and how it drives responsible business transformation here