A Framework For Integrating Governed MLOps In Financial Services
Dr. Joseph N. Kozhaya1
E-mail :
kozhaya@us.ibm.com
CSM Architect – US Industry, IBM Master Inventor,
Member – IBM Academy of Technology
|
Souva Majumder2
E-mail :
souva.majumder@apollonius.in
Director of Strategic AI,
Apollonius Computational Business Solutions OPC Pvt. Ltd
(An IBM Business Partner)
|
Anushree Bhattacharjee3
E-mail :
Anushree.bhattacharjee@apollonius.in
Executive Director,
Apollonius Computational Business Solutions OPC Pvt. Ltd
(An IBM Business Partner)
|
Abstract
Artificial Intelligence (AI) systems have become increasingly prevalent in financial enterprises, where they are often used to support human decision-making. These systems have grown increasingly complex and capable, and AI holds the promise of uncovering valuable insights across a wide range of such applications. Wide adoption of AI systems, however, requires that humans trust their output. Typical AI systems to date have been black-box models: data is fed into the model and results come out without any meaningful explanation. To trust a decision made by an algorithm, humans need to know that it is fair and reliable, and they need assurances that the AI model cannot be tampered with and that the system itself is secure. In this paper, we present an approach for AI governance to ensure that AI models for Financial Services are fair, robust, and explainable. AI governance consists of a set of tools and methods designed to deliver trust in AI so business leaders can confidently embed AI models in their business processes and customer interactions. It enables data science teams and business leaders to operationalize, at scale, AI models developed in heterogeneous environments while adhering to compliance and regulatory requirements by enforcing approval steps and recorded facts at every stage of the AI model life cycle.
Keywords:
Responsible AI, Governed MLOps, Financial Services
Introduction
Artificial intelligence (AI) is now an integral part of every digital transformation. However, AI adoption and its integration with legacy systems is more complicated in highly regulated industries such as finance. With data security, privacy, and customer safety paramount, financial businesses need to understand the rapidly evolving regulatory structures that will make or break their AI initiatives, because a breach of stakeholders’ trust can lead to severe consequences, including legal fines and reputational damage. Such breaches can range from biased treatment in loan decisions to preferential selection of customers based on their profile. Traditionally, AI practitioners, including researchers, developers, and decision-makers, have considered system performance (i.e., accuracy) to be the main metric in their workflows, but building user trust in such models is now of the highest importance. Various aspects of AI systems beyond system performance should be considered to improve their trustworthiness, such as their robustness, algorithmic fairness, explainability, and transparency. Most active academic research on AI trustworthiness has focused on the algorithmic properties of models, but
advancements in algorithmic research alone are insufficient for building Responsible AI products. From a practitioner's perspective, the life cycle of an AI product consists of multiple stages, including data preparation, algorithmic design, development, and deployment, as well as operation, monitoring, and governance. Improving trustworthiness in any single metric (e.g., robustness) involves efforts at multiple stages of the AI life cycle, e.g., data cleansing, robust algorithms, anomaly monitoring, and risk auditing. Conversely, a breach of trust in any single metric can undermine the trustworthiness of the entire AI model. Therefore, AI trustworthiness should be established and assessed systematically throughout the life cycle of an AI system. In addition to taking a holistic view of the trustworthiness of AI systems over all stages of their life cycle, it is important to understand the big picture of the different aspects of AI trustworthiness. Beyond establishing requirements for each specific aspect, we call attention to the combination of and interaction between these aspects, which are important and underexplored topics for Responsible AI systems. For instance, data privacy might interfere with the desire to explain the system output in detail, and the pursuit of algorithmic fairness may affect the accuracy and robustness experienced by some groups. These facts suggest that a systematic approach is necessary to shift the current paradigm toward holistic Responsible AI. This requires awareness and cooperation from multi-disciplinary stakeholders who work on different aspects of trustworthiness at different stages of the development life cycle. The lack of such a holistic approach to AI governance and Responsible AI has delayed the deployment of AI models in production and the adoption of such models by business leaders.
Almost all organizations are experimenting with developing AI models: their data scientists collect data and train models, but most of these models never get deployed in production. One of the key reasons is a lack of trust in the models. In this paper, we propose a holistic approach to AI governance to infuse trust and accelerate adoption of AI models in production for Financial Services. (Kozhaya, 2023)
Landscape for AI Based Financial Services in India
Financial Services are engaged in multiple digital transformation initiatives involving the integration of innovative technologies such as AI, blockchain, and cloud computing. These advanced technologies have opened new revenue streams and business opportunities for the industry, so it is crucial for Financial Services to focus on a long-term AI adoption strategy. Areas that are becoming crucial for Financial Services include personal banking, personal financial management, investment decisions, robo financial advisory, mobile wallets, crowdfunding, peer-to-peer (P2P) lending, Mobile Point of Sale (MPOS) services, and consumer and business loans (Bachinskiy 2019). Additionally, the following are some key applications of AI in the financial sector: credit decisions, risk management, fraud protection, algorithmic trading, and process automation (Maruti TechLabs 2020).
Table 1 summarizes the landscape of AI implementation opportunities in Financial Services with respect to market size, AI need, AI adoption, product development, operations, and end-user experience (Kumar R. & Guptha S. 2020). The table highlights the potential and criticality of AI adoption in Financial Services; it is therefore of paramount importance to understand what hinders AI adoption by Financial Services at the enterprise level. In the next section, we discuss these challenges in detail.
Table 1: Landscape of AI opportunities in Financial Services
Dimensions                    | Financial Services
Market Size                   | $247.4 billion by 2026, CAGR of 38%
AI Need                       | Very high
AI Adoption                   | Very high
Financial Product Development | Moderate
Operations                    | Very high
In this paper, we look at AI applications in financial institutions, where better AI hardware, software, solutions, and services are creating many opportunities. Data integrity, privacy policies, decision system guidelines, and holistic regulations are continuously evolving in these industries. This ecosystem is now ripe for service providers and system integrators to play their parts, with AI adoption achieving an appreciable return on investment. Key applications of AI in this space include optimizing operational efficiency, assuring robustness of systems, data and image interpretation, and human-augmented decision-making. Other applications include automation of processes and workflows, better compliance, improved performance and reliability of platforms, unmanned derivative systems (in finance), and digital and virtual assistants.
AI adoption challenges in Financial Services
There are three main reasons why financial organizations struggle to adopt AI in their mission-critical processes:
1. Lack of confidence in AI models in operations
Many organizations struggle in adopting AI in production due to the following factors:
- Inability to access the right kind of data
- Manual processes that introduce risk and make it hard to scale
- Multiple unsupported tool sets for building and deploying AI models
- Platforms and practices that are not optimized for AI
2. Challenges of managing risk due to AI integration
Customers, policy makers, and key stakeholders expect organizations to use AI responsibly. No one wants to be in the news for the wrong reasons over its use of AI. Increasingly, we also see companies making social and ethical responsibility a key strategic proposition.
3. Scaling with growing AI regulations
With the growing number of AI regulations across various spheres, compliance when developing and deploying AI models is a growing challenge, especially for Financial Services, which are governed by diverse requirements and regulators such as the Reserve Bank of India (RBI). Failure to meet government regulations can lead to stringent intervention in the form of regulatory audits or fines, which damage the organization’s reputation with shareholders and customers and cause revenue loss. In particular, such detrimental actions will affect the financial-industry start-up ecosystems that are the key players in this technology-driven financial revolution. The next important aspect that Financial Services companies must consider along with AI adoption is integrating Responsible AI, which we discuss in the next section. (Kumar.R et.al 2020)
What is Responsible AI
Responsible AI is a term used to describe AI that is reliable, regulatory compliant, and technically explainable. It is based on the idea that AI will reach its full potential when trust can be established at every stage of its development life cycle, from conception to development, deployment, and use.
According to Gartner, “54% of models are stuck in pre-production because there is not an automated process to manage these pipelines and there is a need to ensure the AI models can be trusted.” In order to integrate Responsible AI into their processes and applications, Financial Services must consider the following key objectives:
- Privacy of the customers: Ensuring full privacy of customers and protecting their data requires data governance, access, and control mechanisms. These need to take into account the whole system life cycle, from training to production of the model, covering both the personal data initially provided by the user and the information generated about the user over the course of their interaction with the AI system.
- Robustness of the AI model: AI models should be resilient and secure. They must be accurate, able to handle exceptional cases, perform well over time, and be reproducible. Another important aspect of robustness is safeguards against adversarial threats and attacks. An attack on an AI model could target the user’s data, the model, or the underlying infrastructure. In such attacks, the user’s data as well as system behavior can be changed, leading the system to produce different or erroneous outcomes and, in extreme cases, to shut down completely. Robust AI systems must therefore be developed with a preventative approach to risk, aiming to minimize and prevent harm.
- Explainability: Understanding the underlying algorithm is an important aspect of developing trust. It is important to understand how a specific AI system makes decisions and which features were important to the decision-making process for each decision. Explanations are necessary to enhance understanding and allow all involved stakeholders to make judicious decisions. AI models and their predictions are often described as “black box models” due to the difficulty of understanding their behavior or output, even by experts. The stakeholders involved in and interacting with an AI system should be able to understand why the AI arrived at a decision and at which point the model could have behaved differently (what-if analysis). In the use case of AI-based risk assessment and fraud detection models, when investigating an alert for a transaction or login, there is a need to open the “black box” and understand why an event was flagged as fraud in a way that can be interpreted by a human. This can help fraud analysts decide whether to take further action. As explainable AI systems are designed, AI engineers must decide what level of stakeholder understanding is necessary. Financial Services need to make their AI platforms explainable to their engineers, legal team, compliance officers, and auditors.
- Fairness: AI models should be fair, unbiased, and accessible to all. Hidden biases in AI model pipelines could lead to discrimination against and exclusion of underrepresented or vulnerable groups. AI systems should include proper safeguards against bias and discrimination to deliver more equal outcomes for all users and stakeholders.
- Transparency: The data, AI model, and financial business model should be transparent. Users should be aware when they are interacting with an AI system and when they are interacting with human agents. The capabilities and limitations of an AI system should also be made clear to users and the regulatory authorities. Ultimately, transparency will contribute to more effective traceability, auditability, and accountability for Financial Services. Assessing potential risks in AI models requires the involvement of critical stakeholders across government, industry, and academia to ensure effective regulation and standardization. (Kozhaya, 2023)
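The fairness objective above can be made concrete with a simple check. The sketch below computes the disparate impact ratio between a monitored group and a reference group on synthetic loan-approval outcomes; the "four-fifths" threshold is a common rule of thumb, not a regulatory requirement:

```python
# Sketch of a disparate impact check between a monitored group and a
# reference group. Outcomes are synthetic; 1 = favorable (e.g., loan approved).

def favorable_rate(outcomes):
    return sum(outcomes) / len(outcomes)

def disparate_impact(monitored, reference):
    """Ratio of favorable-outcome rates; values below ~0.8 (the common
    'four-fifths' rule of thumb) suggest potential bias."""
    return favorable_rate(monitored) / favorable_rate(reference)

monitored_outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]   # 30% favorable
reference_outcomes = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]   # 70% favorable

ratio = disparate_impact(monitored_outcomes, reference_outcomes)
print(f"disparate impact ratio: {ratio:.2f}")  # 0.43, below the 0.8 threshold
```

In a production setting this kind of check would run continuously over the deployed model's scored transactions rather than a fixed sample.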
Hurdles in Implementing Responsible AI in Financial Services
Let’s look at the specific hurdles related to Responsible AI, particularly in a banking environment.
§ Data strategy: Data collection starts with a comprehensive data strategy that’s predicated on collaboration with key stakeholders including data engineering and the business. Without a strategy, data scientists waste their time hunting for quality, reliable data to address the business objective.
§ Data collection: It’s a challenge to aggregate disparate data from internal silos, external sources, and in various formats. Algorithms and models are only as good as the data that is used to create them, and many teams lack access to the quantity and quality of data necessary to build and train high performing, unbiased AI models.
§ Data access, pipelines, and preparation: You can introduce error and add time when you manually move high volumes of data to a target repository. Data pipelines can fail and are difficult to investigate if not instrumented properly for observability. Challenges arise with CI/CD version control, scheduling, and orchestration. With weak controls for multiple personas, and the inability to track data lineage, compliance concerns rise. Manual processes can result in access to low-quality data, creation of biased models, and development delays.
§ Model building and deployment: Data scientists desire integrated tools to build, deploy, and train models at scale. Standalone tools can lack integration functionality which can lead to wasted time and costly errors. Informal “one-off” tools can be hard to manage and monitor, weakening governance and security. Without proper tools, it’s a challenge to deploy quickly, and collaboration suffers.
§ Model monitoring, performance tracking and retraining: Once a model is deployed, the data scientists and operations team must continue to monitor the data and models across the entire AI pipeline, including compliance, with integrated, automated tools to track and record degradation, drift, bias, human error, and historical performance. It is also important to provide tools for model explainability so stakeholders and employees can easily understand and defend outcomes and corresponding decisions. (Kozhaya, 2023)
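The explainability tooling mentioned in the last hurdle can be illustrated with a minimal, self-contained sketch. The fraud scorer below is a hypothetical linear model; its feature names, weights, and bias are invented for illustration and not drawn from any product:

```python
# Minimal sketch of a local, per-transaction explanation for a linear
# fraud-scoring model. Feature names, weights, and the transaction are
# illustrative only.

WEIGHTS = {"amount_zscore": 1.8, "foreign_ip": 2.5, "account_age_years": -0.6}
BIAS = -4.5

def score(tx):
    """Linear score: bias plus sum of weight * feature value."""
    return BIAS + sum(WEIGHTS[f] * tx[f] for f in WEIGHTS)

def explain(tx):
    """Per-feature contributions, largest absolute impact first."""
    contribs = {f: WEIGHTS[f] * tx[f] for f in WEIGHTS}
    return sorted(contribs.items(), key=lambda kv: -abs(kv[1]))

tx = {"amount_zscore": 2.1, "foreign_ip": 1, "account_age_years": 0.5}
print(score(tx))          # positive score: transaction flagged as suspicious
for feat, contrib in explain(tx):
    print(f"{feat:>18}: {contrib:+.2f}")

# What-if analysis: the same transaction from a domestic IP.
tx_whatif = dict(tx, foreign_ip=0)
print(score(tx_whatif))   # flipping one feature flips the decision here
```

Real black-box models need model-agnostic techniques (e.g., perturbation-based attribution) rather than reading weights directly, but the output an analyst sees, ranked feature contributions plus a what-if comparison, has the same shape.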
Responsible AI Strategies that Indian Financial Services need to adopt
To enable Responsible AI, Financial Services need to adopt the following practices:
§ Data governance mechanism: For AI to be trustworthy, it is important that governance is applied across the full pipeline starting with data. This includes consideration for how objectives for the model are set, how the model is trained, what privacy and security safeguards are needed, what data is used, and what the implications are for the end user.
§ AI model monitoring: Financial Services are accountable for the development, deployment and usage of AI technologies and systems in their process. Therefore, these models need to be continuously assessed and monitored as they perform their tasks to ensure biases don't creep in over time or the models drift. Using additional resources (e.g., IBM's AI Fairness 360) to examine and inspect models can help with testing, tracing, and documenting their development, making them easier to validate.
§ Third party integration: In addition to developing in-house AI solutions, Financial Services also procure AI models from external business partners and third-party providers. In such situations, there should be a commitment from all parties involved to ensure that the system is trustworthy and in compliance with current laws and regulations of India. To ensure transparency in the integration of third-party models into the organization's systems, AI model monitoring needs to be adopted. (Irfan Saif & Beena Ammanath, 2020).
Proposed framework for integrating Responsible AI for Financial Services
In this section, we present an AI governance framework for Financial Services to achieve Responsible AI. AI governance is a framework that uses a set of human-controlled tasks together with automated MLOps processes, methodologies, and tools to manage an organization’s use of AI. Consistent principles guiding the design, development, deployment, and monitoring of models are critical in driving Responsible AI. These principles include:
· Model Transparency: Model transparency starts with the automatic capture of information on how the AI model was developed and deployed. This includes capturing of the metadata, tracking provenance and documenting the entire model life cycle. Model transparency promotes Responsible AI by driving trusted results that build customer confidence, promote safer practices, and facilitate further AI adoption.
· Building Trust in the AI Model: Complying with regulations requires well defined and automatically enforced company policies, standards, and roles. Manual manipulation of data and models leads to costly errors with far-reaching business consequences. In addition, the automation of enforcement rules for validation drives model retraining and reliability to address drift over time.
· Fairness of AI models: Transparent and explainable AI requires the automation of the analysis of model performance against KPIs while continuously monitoring real-time usage for bias, fairness, and accuracy. The ability to track and share model facts and documentation across the organization provides backup for analytic decisions. Having this backup is crucial when addressing customers and concerns from regulators.
We believe AI governance is the responsibility of every organization and will help businesses to build more Responsible AI that is transparent, explainable, fair, and robust. (Majumder,2023)
Why Financial Services should look ahead for AI Governance
Imagine a team of explorers venturing into uncharted territory. To accomplish their objectives safely and efficiently, the team needs a leader. Their leader must be trustworthy, resourceful, adaptable, and proficient with their tools: maps, compasses and other navigational instruments used to chart their course through the unknown.
The rest of the team is responsible for working together to support the leader by using their unique skills and areas of subject matter expertise. The team ensures their leader has the tools and resources to accomplish their shared objectives. When working together, the team is a cohesive whole, able to navigate new challenges.
Data science and machine learning operations (MLOps) are tools analogous to those used by explorers in a wild land. Just as explorers use various instruments and techniques to gather information about a new place, data scientists use data science and MLOps to create predictive models that help organizations make informed decisions. Unfortunately, without a solid data science and MLOps foundation, many valuable AI projects struggle to leave the lab.
The increasing acceptance of AI in Financial Services is reflected in increased spending on AI projects and hiring in related fields. TechTarget reports that 83% of organizations have increased their AI budgets, with the average number of data scientists employed rising by 76%. However, the time required to deploy a model has also grown, with 64% of organizations taking months or longer, indicating that the foundation of AI model production, specifically data collection and analysis, requires improvement. According to Gartner, 53% of AI and ML projects remain in pre-production phases, and most machine learning models never make it to production. (https://www.ibm.com/blog/ibm-leads-in-data-science-and-mlops/) In addition, one-third of enterprises analyze less than half of the data they generate, highlighting the challenges in data collection and analysis that many organizations face. In this changing landscape, the data scientist, our expedition leader, has become mired: manual tools introduce errors, poorly documented processes complicate the job, and AI models drift over time. It is important to provide a framework, together with supporting tools, for data scientists to better develop Responsible AI.
The quality and quantity of data used to build AI and ML models, as well as algorithms, are critical to success. Incomplete, inaccurate or biased data sets can lead to faulty algorithms and skewed analytic outcomes. In addition, organizations face challenges aggregating disparate data sets from silos across the enterprise, from external sources and various deployments, including both on-premises and the cloud. A model that uses insufficient data can introduce risk to your business operations—wasting both time and resources—and must be retrained or discarded entirely. Many experiments die on the vine without ever achieving business value.
CEOs, CDOs, and CIOs worldwide are overseeing a great deal of experimentation with no insight into when the models will pay off. Data science and machine learning investments are critical for ensuring accurate, unbiased models and for effectively collecting and using high-quality data.
Introduction to AI Governance
As enterprises invest in developing and adopting AI in their business, they need a framework or an approach to follow so they can extract value from their AI investment. The AI Ladder, described in Figure 1, is a prescriptive approach to achieving business value from AI and consists of four rungs:
- Collect: Collect data of every type (structured or unstructured) regardless of where it resides, enabling access to ever-changing data sources in a hybrid cloud environment where data may exist on-premises, in a public cloud platform, or even as third-party data.
- Organize: Organize all data into a trusted, business-ready foundation to meet enterprise governance and compliance requirements.
- Analyze: Build and scale AI models with trust and transparency.
- Infuse: Operationalize AI throughout the business by infusing AI models in business processes and customer interactions.
Figure 1: The AI Ladder - a prescriptive approach for organizations to transform their business by connecting data and AI (Kozhaya, 2023)
Cloud Pak for Data is IBM’s Data and AI integrated platform that enables organizations to execute on the AI Ladder approach and build AI models powered by a solid data foundation. It is composed of several containerized software services that integrate to deliver Responsible AI via MLOps automation and AI governance.
MLOps is a set of processes, best practices, and technologies to enable the development, deployment, maintenance, and management of machine learning (ML), and more generally AI, models in production reliably and efficiently.
AI governance is emerging as the top priority for organizations, and it consists of a set of tools and methods applied across the AI life cycle to meet governance and compliance requirements for the enterprise. AI governance is designed to deliver trust in AI so business leaders can confidently embed AI models in their business processes and customer interactions. AI governance enables data science teams and business leaders to operationalize, at scale, AI models developed in heterogeneous environments while adhering to compliance and regulatory requirements by enforcing approval steps and recorded facts at every stage of the AI model life cycle.
Figure 2 outlines a more detailed view of how the various components of Cloud Pak for Data enable enterprises to establish a governed data foundation to leverage for developing and deploying Responsible AI models that can be infused in customer interactions for personalizing customer experience. Left to right, Figure 2 depicts how data is collected from various sources using data integration capabilities (for example, DataStage) and organized in a catalog to meet quality, privacy, lineage, and governance requirements (for example, Watson Knowledge Catalog). At that point, data is consumed to train, deploy, and monitor AI models (for example, Watson Studio, Watson Machine Learning, OpenScale). Lastly, deployed AI models are integrated with other applications such as Watson Assistant to personalize customer experience.
Figure 2: Governed MLOps Architecture (Kozhaya, 2023)
Data is the most critical component for AI and as such, it is important to understand the activities required to access, prepare, and organize data so it is business ready. High quality and well-governed data is critical for training useful and trusted AI models. To best represent that, we consider the roles and tasks of data providers and data consumers:
- Data providers: Data providers focus on collecting and organizing the data so it is business ready. Data providers typically include the roles of data engineer, data steward, and data quality analyst. Data engineers collect data from various data sources. Data stewards define governance artifacts (business terms, data classes, rules, policies, …) to organize and govern the accessed data according to the enterprise regulatory and compliance requirements. Data quality analysts discover and analyze the data to evaluate its quality and readiness for business use. Once data quality meets the required specification, data is published to the catalog where it is available for data consumers to find, access, and leverage for various purposes.
- Data consumers: Data consumers leverage the business ready data from the catalog to gather business insights, train AI models and embed in applications. Data consumers typically include the roles of data scientists, business analysts, and developers. Business analysts leverage cataloged data to create business intelligence dashboards and reports to highlight business insights. Data scientists use the business ready data to train AI models for various purposes. Developers build applications that consume the data for different use cases.
Figure 3 illustrates a typical MLOps implementation procedure for an end-to-end solution delivering trusted and governed AI models.
Figure 3 : MLOps & Trustworthy AI (Kozhaya 2023)
Data Governance and Privacy
Data providers are responsible for collecting and organizing the data so it is business ready for consumption by data scientists, business analysts, and developers. This section describes the tasks involved in collecting and organizing the data, which are typically executed by data providers.
1. Set up Data Governance foundation (Data Providers – Data Steward): Data governance is a critical first step in preparing for AI model development. Before Data Stewards and Data Quality Engineers can start creating governance artifacts and curating data, a data governance foundation needs to be set up. IBM’s data governance tool is Watson Knowledge Catalog (WKC), an intelligent data catalog that powers self-service discovery of data, models, and more. To set up data governance, you must create governance categories, assign Watson Knowledge Catalog roles to users, add users to categories, and set up workflow configurations. Governance is provided by governance artifacts, which are organized by categories. The processes for creating, updating, and deleting governance artifacts are controlled by governance artifact workflow configurations. Governance artifacts are created and assigned to data assets by users who are collaborators in categories.
2. Data Virtualization (Data Providers – Data Engineer): Leverage Data Virtualization capabilities in Cloud Pak for Data to connect to various data sources and virtualize relevant data assets for purposes of data visualization, extracting insights, and training models. Data virtualization empowers data professionals to access, view, and manipulate data from different sources without explicitly copying the data.
3. Data Quality and Data Privacy with Watson Knowledge Catalog (Data Providers – Data Quality Analyst): Collected data assets are governed using Watson Knowledge Catalog capabilities to make sure enterprise governance and compliance rules are enforced. There are use cases where it makes sense to connect to data assets directly and leverage such data for extracting insight and training AI models. However, in general, it is recommended to catalog all data assets and only leverage catalogued data for subsequent tasks such as business intelligence insights and training AI models. Cataloging the data ensures that the enterprise’s governance and compliance rules are being enforced and trusted data is delivered.
AI Governance and Responsible AI
Data is the lifeblood of AI and for AI Governance implementation, it is assumed that the relevant data assets have been identified, cleansed, quality-verified, and stored in data stores/catalogs; ready to be consumed. With that, the AI governance components consist of:
4. AI Governance (Data Consumers – AI Model Owners): This component consists of configuring workflows with well-defined tasks and owners to propose, approve, develop, and deploy AI models to address various business use cases. This is typically done with the help of a governance, risk, and compliance (GRC) tool such as IBM OpenPages with predefined AI governance workflows.
5. Training and deploying machine learning models (Data Consumers – Data Scientist): This component involves capabilities for training and developing AI models, ranging from completely automated solutions such as AutoAI to very flexible tools such as Jupyter notebooks. AutoAI offers automation to speed up the exploration and identification of the best ML algorithms for the given data set and objective, automatically exploring different data processing methods, feature engineering techniques, and machine learning algorithms to find the best-performing models. Jupyter notebooks, on the other hand, offer data scientists the most flexibility in applying different open-source algorithms and customizing them for their needs. Once an AI model is selected, it can be deployed using Watson Machine Learning for batch or online inference, where predictions can be triggered via REST APIs.
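As a sketch of the last step, online inference via REST can be illustrated by constructing a scoring request. The field names, values, and endpoint below are placeholders; the payload shape ("input_data" with "fields" and "values") follows the general convention of Watson Machine Learning's scoring API, but the exact endpoint, version parameter, and authentication flow should be taken from the product documentation:

```python
import json

# Sketch of an online-scoring request body for a deployed model.
# Field names and values are placeholders for illustration.

def build_scoring_payload(fields, rows):
    """Assemble the scoring request body: one input_data entry with the
    column names and the rows to score."""
    return {"input_data": [{"fields": fields, "values": rows}]}

payload = build_scoring_payload(
    fields=["age", "income", "loan_amount"],
    rows=[[42, 75000, 12000], [29, 48000, 30000]],
)
print(json.dumps(payload, indent=2))

# The request itself would be an authenticated POST (not executed here), e.g.:
# POST https://<host>/ml/v4/deployments/<deployment_id>/predictions?version=<date>
#   Authorization: Bearer <token>
#   Content-Type: application/json
#   body: json.dumps(payload)
```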
6. Automation with Watson Pipelines (Data Consumers – Data Scientists / MLOps): This component involves automation and operationalization of AI model development and deployment. Watson Pipelines enable data scientists and AI operations engineers to automate the tasks of shaping data, training models, deploying models in development deployment spaces, and selecting best performing models to propagate to pre-production deployment spaces. Deploying with Watson Pipelines enables automation and scheduling of jobs to run periodically to capture updated data and re-train models accordingly.
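The pipeline stages described above (shape data, train candidate models, select the best performer) can be sketched generically as chained functions. This is not the Watson Pipelines interface, which expresses the same flow as a visual graph of nodes; the function names, toy data, and accuracy scores below are illustrative assumptions.

```python
def shape_data(raw):
    """Data-shaping stage: drop incomplete records before training."""
    return [r for r in raw if None not in r]

def train_candidates(data):
    """Stand-in for training several algorithms; returns (name, score) pairs."""
    return [("logistic_regression", 0.81), ("gradient_boosting", 0.86)]

def select_best(candidates):
    """Promote the highest-scoring model to the next deployment space."""
    return max(candidates, key=lambda c: c[1])

# Running the stages end to end, as a scheduled pipeline job would.
raw = [(1, 0), (None, 1), (0, 1)]
best = select_best(train_candidates(shape_data(raw)))
```

Scheduling this chain to run periodically against refreshed data is what allows models to be re-trained automatically as new data arrives.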
7. Validate and Monitor AI models with Watson OpenScale to build and scale AI trust and explainability (Data Consumers – Data Scientists / MLOps): This component consists of capabilities from IBM Watson OpenScale for validating and monitoring AI models for quality, fairness, robustness (drift detection), and explainability. To deliver trust in AI models, business leaders must be confident that the models are fair and explainable. Fairness describes how evenly the model delivers favorable outcomes between the monitored group (the group potentially subject to bias) and the reference group. Fairness and bias describe the same model behavior in an inverse manner: a model that is fair is not biased, and a model that is biased is not fair. Explainability refers to the ability to describe how the model determined a prediction and which factors contributed most to that prediction for the given transaction. In addition to fairness and explainability, Watson OpenScale measures model quality and drift. After AI models are deployed in production, they may drift over time, signaled by a drop in accuracy or in data consistency, so it is important to detect potential drift early and trigger model re-training as needed.
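The fairness comparison between monitored and reference groups can be made concrete with a disparate impact ratio, one common fairness metric of the kind OpenScale automates. The group outcomes and the 0.8 threshold (the widely used "four-fifths rule") below are illustrative assumptions, not OpenScale's configuration.

```python
def favorable_rate(outcomes):
    """Fraction of favorable (1) outcomes in a list of 0/1 decisions."""
    return sum(outcomes) / len(outcomes)

def disparate_impact(monitored, reference):
    """Ratio of favorable-outcome rates; values near 1.0 indicate fairness."""
    return favorable_rate(monitored) / favorable_rate(reference)

monitored = [1, 0, 0, 0]   # e.g. loan approvals for the monitored group
reference = [1, 1, 0, 1]   # loan approvals for the reference group
ratio = disparate_impact(monitored, reference)
flagged = ratio < 0.8      # flag the model for review below the threshold
```

Here the monitored group's approval rate is 0.25 against the reference group's 0.75, giving a ratio of about 0.33, so this model would be flagged as potentially biased.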
8. Governed MLOps CI/CD (Data Consumers – Data Scientists / MLOps): This component provides utilities for complete automation of machine learning (or, more generally, AI) model CI/CD pipelines. As organizations scale adoption of AI models in production, it becomes increasingly important to automate the testing, validation, and promotion of models from development (dev) to user acceptance testing (UAT, also known as pre-prod, quality assurance, or staging) to production (prod) environments. The cpdctl command-line utility for Cloud Pak for Data can be embedded in any CI/CD pipeline to automate tasks such as executing model training jobs or copying assets (models, data, and so on) from one deployment space (for example, QA) to another (for example, prod). In practice, these environments can reside in the same Cloud Pak for Data cluster or in entirely separate clusters hosted on different cloud platforms. An alternative approach is git-based integration, in which automation pipelines triggered by git actions or similar functionality re-train models and propagate deployments to production.
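The promotion decision inside such a pipeline can be sketched as a simple gate: a candidate model moves from staging to production only if its metrics clear minimum thresholds and it does not regress against the current production model. The metric names and threshold values are assumptions for illustration; in practice a utility such as cpdctl would then perform the actual copy between deployment spaces.

```python
# Hypothetical minimum bars a candidate must clear before promotion.
THRESHOLDS = {"accuracy": 0.80, "fairness": 0.80}

def passes_gate(candidate: dict, production: dict) -> bool:
    """Return True if the candidate clears thresholds and beats production."""
    meets_thresholds = all(candidate[m] >= t for m, t in THRESHOLDS.items())
    no_regression = candidate["accuracy"] >= production["accuracy"]
    return meets_thresholds and no_regression

candidate = {"accuracy": 0.86, "fairness": 0.91}
production = {"accuracy": 0.84, "fairness": 0.88}
promote = passes_gate(candidate, production)
```

Encoding the gate in code makes the compliance check repeatable and auditable: the same criteria are applied on every pipeline run, rather than relying on an ad hoc manual review.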
Use Case: AI Governance for a Bank
A bank’s Chief Data Officer (CDO) is well aware of regulators’ requirements for data controls, and closely monitors governance, stewardship, data quality, and metadata. These heavy compliance requirements, coupled with heightened customer awareness, demand that bank leaders support ethical, explainable AI. AI models and insights can have a direct impact on the bank’s reputation, and CDOs are called upon to strike a fine balance as they act to protect individual privacy while protecting the bank against discrimination. In such endeavors, trust is at the core.
Trust in AI is established through the connection to the right data, coupled with the automation and governance of building, deployment, and monitoring of models. Successful AI outcomes include better customer interactions, shorter time to market, and improved competitive positioning. Conversely, faulty decisions based on inaccurate data, models and processes can result in failed audits and regulatory fines. These faulty decisions can lead to loss of trust and brand reputation, higher expenses, and decreased revenues and profits for the organization.
To successfully build, deploy, and manage AI/ML models, it is important to build on quality data and automated data science tools and processes—an approach that requires a technology platform that can orchestrate data of many types and sources within a hybrid multicloud environment. This can be achieved with the help of a data fabric, an emerging concept that drives better data consolidation for improved AI/ML results. In the IBM team’s discussions, Gartner analysts defined data fabric similarly, as “an emerging data management design for attaining flexible, reusable, and augmented data management (that is, better semantics, integration, and organization of data) through metadata. Metadata drives the fabric design. Compared to traditional approaches, active metadata and semantic inference are key new aspects of a data fabric.” A data fabric provides a strong foundation for MLOps and Responsible AI, helping to ensure that quality data can be accessed by the right people, at the right time, no matter where it resides. Data analysis is augmented with simplified, permission-based, self-service access for multiple personas (such as data scientists, analysts, and business users). Users can access data for their projects with confidence that the data sets are complete and accurate.
Instead of a fragmented group of products that have been stitched together, a data fabric offers a single, holistic solution that is built to work seamlessly. A data fabric connects, governs, and protects siloed data that’s distributed across a hybrid cloud landscape. It’s a good way to bring the promise of your data strategy to life. Gartner predicts that organizations using data fabrics will dynamically connect, optimize, and automate data management processes and reduce time to integrated data delivery by 30%. In the next paragraph, we will discuss how a bank has successfully implemented a data fabric approach to MLOps and Responsible AI.
Financial services firms (banks) clearly recognize the importance of leveraging technology, such as AI-powered solutions, to enhance customer experiences and streamline processes.
Key points from this use case include:
Customer Expectations: Today's customers, including those in the financial sector, have high expectations for seamless omni-channel experiences. Meeting these expectations is crucial for retaining trust and remaining competitive.
Importance of Home Ownership: Home ownership is a significant goal for many individuals, and obtaining a mortgage is often a necessary step in achieving this goal.
Complexity of Mortgage Issuance: Issuing and obtaining a mortgage can be a complex process due to evolving regulations, products, and procedures.
AI-Powered Solution: A bank has developed an AI-powered, cloud-based platform that provides real-time digital mortgage support to home buyers.
Integration with Existing Data: The AI solution is integrated with the bank’s existing data structures and continuously updated with new data, including customer interactions.
Enhancing Employee Support: The bank’s employees in the mortgage call center can access quick digital mortgage support by using keywords in a console when assisting customers.
Personification of Marge: The solution, built on a data fabric, has its own evolving personality, which can create a more relatable and engaging experience for both employees and customers.
Cognitive Enterprise Technology: The use of cognitive enterprise technology empowers bank employees to better support both new and existing home buyers, making the mortgage process smoother and more efficient.
In summary, the developed AI solution represents a forward-thinking approach by the bank to harness AI and cloud technology to improve customer service and streamline mortgage-related processes. This initiative aligns with the broader trend in the financial industry toward digitization and automation to enhance customer experiences and operational efficiency.
Results summary
Since implementing the digital mortgage support tool, the bank has seen:
– 20% improvement in customer NPS
– 10% decrease in call duration
– Better tools to empower employees as they enter the organization’s digital transformation
(Source : https://www.ibm.com/downloads/cas/Y8ROXEMY)
Conclusion
AI technology, products, solutions, and services have created a great deal of momentum in digital transformation, automation, and autonomous initiatives. As AI becomes a requirement in every sector, it has moved from technology-oriented initiatives to framework-based solutions with multiple derivative modules. This demands that financial services firms respect human values when building AI systems and ensure they align with regulatory requirements. Industry-specific value chains can be built without bias by applying Responsible AI principles on a mature technology stack. Before fully autonomous operations can be realized, derivative AI services can be built that factor in risk and guarantees so that human-augmented frameworks mature in the right way. At the same time, AI’s market opportunity is expected to grow at a compound annual growth rate of 43.5% through 2025, which gives companies a strong incentive to pursue a faster AI adoption strategy. Responsible AI must include assurance, security, risk, and safety layers with guaranteed services at various levels of financial-services-specific value chains. A holistic implementation of Responsible AI in technology-driven, innovative financial services therefore requires parallel efforts: regulation tailored to financial services, institutional research focused on Responsible AI, and an ecosystem of organizations building Responsible AI models for the industry’s various use cases.
To conclude, developing an ML or AI model is just a starting point. To bring the technology into production, organizations need to solve various real-world challenges for AI models to be trusted and adopted. Organizations need an end-to-end AI governance framework consisting of workflows that embed compliance gates approved by the right business stakeholders and supported by automation technologies for developing, deploying, and monitoring these AI models. These technologies include a data fabric for accessing and delivering high-quality, business-ready data; a suite of automation tools for training AI models; automated validation of the model; a scalable serving infrastructure; and ongoing operation of the ML infrastructure with monitoring and alerting for key metrics including accuracy, fairness, explainability, and drift.
Declarations of Conflict of Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. The authors are thankful to IBM for its open-source articles, concept papers, and GitHub repositories on Governed MLOps available on the public internet.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
Authors Profile
Dr. Joseph N. Kozhaya1 is a CSM Architect, an IBM Master Inventor, and a Watson Data & AI Subject Matter Expert. His focus is partnering with IBM teams, business partners, and clients to deliver AI-powered solutions using IBM’s portfolio of Data Science and AI offerings including Cloud Pak for Data, Watson Studio, Watson Machine Learning, Watson OpenScale, Watson Assistant, Watson Discovery and all the Watson APIs. Joe has several publications and patents in design automation, software applications, and cognitive computing services and applications and he is a Best of IBM and IBM Corporate award honoree.
Souva Majumder2 is currently serving as Director of Apollonius Computational Business Solutions OPC Pvt. Ltd, Kolkata, India. He obtained his M.Tech in Industrial Engineering & Management from IIT Kharagpur, India, and has more than 8 years of experience in AI and decision-science research and industrial consultancy. His current research interest is Governed MLOps and its applications across industries. He is also an IBM Certified AI/ML Operationalization Engineer.
Anushree Bhattacharjee3 is currently serving as Executive Director of Apollonius Computational Business Solutions OPC Pvt. Ltd. She obtained her M.Tech in Information Technology from RCCIIT, India, and her M.Sc in Statistics from Visva Bharati University. She has more than 5 years of experience in teaching and mentoring students in data science and machine learning, and is an active MLOps developer.
References:
1. Bachinskiy, A. (2019). The Growing Impact of AI in Financial Services: Six Examples. Retrieved from https://towardsdatascience.com/the-growing-impact-of-ai-in-financial-services-six-examples-da386c0301b2
2. Ariwala, P. (2023). 12 Ways AI is Transforming the Finance Industry. Maruti Techlabs. Retrieved from https://marutitech.com/ai-and-ml-in-finance/
3. Gokhale, N., Kaye, R., Gajjaria, A., & Kuder, D. (2019). AI leaders in financial services: Common traits of front runners in the artificial intelligence race. Deloitte. Retrieved from https://www2.deloitte.com/content/dam/insights/us/articles/4687_traits-of-ai-frontrunners/DI_AI-leaders-in-financial-services.pdf
4. Kumar, R., & Guptha, S. (2020). AI trustworthiness in the aerospace, financial institutions, automotive, and healthcare industries. Infosys Knowledge Institute. Retrieved from https://www.infosys.com/about/knowledge-institute/documents/ai-trustworthiness.pdf
5. Saif, I., & Ammanath, B. (2020). Responsible AI is a framework to help manage unique risk. MIT Technology Review.
6. https://www.ibm.com/blog/ibm-leads-in-data-science-and-mlops/
7. https://www.ibm.com/downloads/cas/Y8ROXEMY
8. https://research.ibm.com/topics/trustworthy-ai
9. https://www.pwc.com/gx/en/issues/analytics/assets/pwc-ai-analysis-sizing-the-prize-report.pdf