
CPD 4.0 multi-tenancy support overview

By Hong Wei Jia posted Sun September 12, 2021 09:38 AM

  

Background

One of the important CP4D use scenarios is multi-tenancy. Public cloud providers and large enterprises have a strong demand for using CP4D to provide Data and AI as a service to their tenants. It's important to understand that the concept of a "tenant" itself may vary: in some cases, tenants may be completely different enterprises; in other cases, they may be different departments inside one company; or there may be situations where some tenancy criteria can be relaxed from a cost perspective. In this article, we'll introduce the multi-tenancy mechanisms and the corresponding use cases that CPD 4.0 supports.

Multi-tenancy support overview

At a high level, the multi-tenancy models that CPD 4.0 supports can be classified into two categories: platform-level multi-tenancy and service-level multi-tenancy. Platform-level multi-tenancy differentiates the models from the perspectives of isolation, security, and management. Service-level multi-tenancy introduces specific services or resources on CPD that are available for multi-tenancy support in particular use cases, especially in terms of access control.

Platform level multi-tenancy

Separate CPD instances, each in a dedicated group of namespaces

Use case

A large enterprise named compA plans to use CPD as their cloud service to serve their own clients from different areas. Each of their clients is a separate company, institute, or university.

Each client needs full isolation of their data, resources, network, security, and user management.

 

compA has the following detailed multi-tenancy requirements:

  • Multiple clients in a shared OpenShift cluster
  • Different clients need to be isolated from each other completely
  • Separate monitoring and quota
  • Separate set of add-ons for each tenant
  • Different Authentication (LDAP/AD/SAML2) mechanisms available for different tenants
  • Separate Admins & user roles for each tenant

Topology

Basically, this multi-tenancy model is about tethering a group of OpenShift Projects/namespaces to one tenant.

In the following chart, as an example, the namespace cpd-instance-1 and tethered namespace A form a tethered group for deploying tenant A's services.

 

 

 

A tethered namespace is optional, not required. It is useful when you want to deploy some of your services in isolation from other services for better resource quota management, license monitoring, or metering.
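As an illustration of quota management on a tethered namespace, a cluster administrator can apply a standard Kubernetes ResourceQuota to it. The namespace name and limit values below are examples for illustration, not values prescribed by CPD:

```yaml
# Illustrative sketch: cap the compute that workloads deployed
# into a tethered namespace can request (names/values are examples).
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tethered-namespace-a   # example tethered namespace
spec:
  hard:
    requests.cpu: "16"
    requests.memory: 64Gi
    limits.cpu: "32"
    limits.memory: 128Gi
```

The cluster admin applies this with `oc apply -f quota.yaml`; new workloads in the tethered namespace are then admitted only while the aggregate resource requests stay under these limits.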

 

Note:

Not all services support tethered namespaces.

The key features of this mechanism are as follows.

  • Tenants share the OCP cluster and the same set of Cloud Pak Foundational Services and CPD Operators.
  • Full isolation – nothing in a CPD instance is (typically) shared between tenants.
  • User access control, including the admin for the CPD instance, can be maintained by the tenant.

 

From the tenancy aspects, this multi-tenancy model supports Isolation, Security and Management with the following mechanisms. 

Tenancy Aspects

Mechanism

Isolation

A tethered group of Kube namespaces/RHOS projects for each CPD instance

    1. Users – Authentication, Roles & Authorization
    2. Compute Resources
    3. Access to CPD instance
    4. Storage usage
  • If needed, can support different LDAP/AD or SAML configurations per instance (each CPD instance has its own unique user management service)
  • Namespace Quotas – set by the cluster admin.
  • Unique RHOS DNS based routes for each instance
  • Completely separated PVCs (in different namespaces) with different PVs – no sharing across instances.

Security

RBAC privileges & Service Accounts scoped to that group of namespaces

    1. Restricting Kube Cluster-level privileges
    2. Network Access isolations
    3. Auditing  
  • Service Accounts & Role binding – scoped to RHOS Project namespaces
  • Kube network policies for hardening
  • Per-tenant SIEM forwarding supported
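The network hardening above can be sketched with a standard Kubernetes NetworkPolicy. The policy below (namespace and policy names are examples) admits ingress only from pods in the tenant's own namespace:

```yaml
# Illustrative sketch: isolate a tenant's CPD namespace by allowing
# ingress only from pods in the same namespace (names are examples).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: cpd-instance-1   # example tenant namespace
spec:
  podSelector: {}             # selects every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}     # any pod in this same namespace
```

Note that ingress from the OpenShift router typically also needs to be admitted so the CPD routes keep working; the exact set of policies is cluster-specific.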

Management

Administration privileges scoped to a namespace

    1. Monitoring/Metering
    2. Ops – Backup/DR, Scale, Patch, Upgrade
    3. Serviceability/Diagnosis
    4. Add-on installations
  • Different CPD “Admin” User Roles for different instances. Tenants manage independently
  • Ops Privileges scoped to that RHOS project namespace
  • Serviceability utilities capture only diagnostics appropriate for that CPD instance
  • Different CPD instances can have different sets of add-ons scaled differently

 

Separate CPD instances, each with a dedicated group of worker nodes

Use case

A large enterprise named compB has existing applications running on their OCP cluster. The existing applications could be Cloud Pak for Data, other Cloud Pak applications, or non-IBM applications. Now the customer wants to install Cloud Pak for Data next to the existing applications, and they want to make sure the products don't affect each other.

 

Although deploying Cloud Pak for Data in a dedicated group of namespaces offers separation, this may not be enough. As a further requirement, compB wants to keep the applications' workloads independent, each running on dedicated worker nodes.

Topology

Basically, this multi-tenancy model is about dedicating a group of worker nodes to one namespace or a group of namespaces.

 

In the following chart, as an example, tenant A's CPD instance is deployed in the namespace cpd-instance-1 with the dedicated worker node group Group-1. Group-1 could comprise several worker nodes, such as worker01, worker02, and worker03.

 

The key feature of this mechanism is that Cloud Pak for Data instances and other applications share the OCP cluster but run independently on dedicated worker nodes.
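One common way to realize this on OpenShift (a sketch of one option, not the only mechanism) is the project node selector: label the Group-1 workers, for example with `oc label node worker01 node-group=group-1`, and then annotate the tenant namespace so its pods are scheduled only on those nodes. The label key and value below are assumptions for illustration:

```yaml
# Illustrative sketch: pin all workloads in this namespace to the
# worker nodes labeled node-group=group-1 (label is an example).
apiVersion: v1
kind: Namespace
metadata:
  name: cpd-instance-1
  annotations:
    openshift.io/node-selector: "node-group=group-1"
```

To keep other applications off those nodes as well, the cluster admin can additionally taint the Group-1 nodes and give the CPD workloads matching tolerations; that part is beyond this sketch.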

 

From the tenancy aspects, this multi-tenancy model supports Isolation, Security and Management with the following mechanisms.

Tenancy Aspects

Mechanism

Isolation

A dedicated group of worker nodes for each CPD instance

    1. Users – Authentication, Roles & Authorization
    2. Compute Resources
    3. Access to CPD instance
    4. Storage usage

  • If needed, can support different LDAP/AD or SAML configurations per instance (each CPD instance has its own unique user management service)
  • Dedicated worker nodes and namespace Quotas – set by the cluster admin.
  • Unique RHOS DNS based routes for each instance
  • Completely separated PVCs (in different namespaces) with different PVs – no sharing across instances.

Security

RBAC privileges & Service Accounts scoped to the namespace with a dedicated worker node group

    1. Restricting Kube Cluster-level privileges
    2. Network Access isolations
    3. Auditing

  • Service Accounts & Role binding – scoped to RHOS Project namespaces
  • Kube network policies for hardening
  • Per-tenant SIEM forwarding supported

Management

Administration privileges scoped to a CPD instance

    1. Monitoring/Metering
    2. Ops – Backup/DR, Scale, Patch, Upgrade
    3. Serviceability/Diagnosis
    4. Add-on installations

  • Different CPD “Admin” User Roles for different instances. Tenants manage independently
  • Ops Privileges scoped to that RHOS project namespace
  • Serviceability utilities capture only diagnostics appropriate for that CPD instance
  •  Different CPD instances can have different sets of add-ons scaled differently

 

Note:

Please make sure that you review your license agreements before you deploy multiple CP4D instances on top of an existing OpenShift cluster.

Multi-tenancy with CPD Service instances & Resources

This section complements platform-level multi-tenancy. Some services can provision multiple service instances for different users or groups to support multi-tenancy. And some resources, such as Projects and Catalogs, can support multi-tenancy through access control.

Shared CPD instance via access control

Use case

The Cloud Pak for Data customer compC has several business units/departments. These departments are authenticated by the enterprise's central SAML2 or LDAP service. The customer compC would like their departments to share the same Cloud Pak for Data instance.

 

Because all business units within compC share the same user authentication and authorization, the tenants of compC (the different business units) can be supported through tethered namespaces or separate service instances within the same CPD instance, with access control.

 

For business units that do not need to track metering (resource usage) separately, compC can support them with different WSL projects, WKC catalogs, and/or Streams/Db2/Watson service instances within the same namespace.

For business units that need to track metering (resource usage) separately, tethered namespaces can be used.

 

Topology

In this multi-tenancy model, shared services and resources are access-controlled by tenant scope.

  

In the above chart, tenant A and tenant B share the services and resources available in the same CPD instance, but access control, enabled by access privileges and roles, helps manage access to those services and resources. User groups can be used for easier access control: by assigning roles to user groups, users inherit permissions via their group memberships.

 

In addition, tethered namespaces can support further isolation needs.

Tenancy Aspects

Mechanism

Isolation

Shared CPD instance in one Kubernetes namespace (no isolation!) ⚠️

    1. Users – Authentication, Roles & Authorization
    2. Compute Resources
    3. Access to CPD instance
    4. Storage usage

  • One LDAP/AD/SAML configuration
  • No Quotas for individual users
  • Single OpenShift Route for all users
  • No separation of persistent volumes

Security

Resource level access privileges and roles

    1. Restricting Kube Cluster-level privileges
    2. Network Access isolations

  • No Cluster Level Privileges
  • No Network isolations between user workloads ⚠️

Management

Administration & Ops privileges cannot be granted to individual users ⚠️

    1. Monitoring/Alerting
    2. Operations – Backup/DR
    3. Serviceability/Diagnosis
    4. Add-on installations

  • One CPD “Admin” User Role – “tenants” cannot be granted the Admin role
  • All add-on services in that CPD instance potentially visible to all users
  • Data Virtualization, Transformation & Enterprise Catalog are shared services.
  • Granular access management for Analytics Projects, Databases & Service instances

 

WML-A (Watson Machine Learning Accelerator) is an example of a service that supports this kind of multi-tenancy.

It can be deployed into a tethered namespace.

In addition, for multiple organizations or users, WML-A supports multi-tenancy and resource management, such as:

  • Defining separate resource limits for each organization, user, or workload.
  • Separate storage for each organization or user.
  • Resource reclamation between different organizations or users.
  • Resource metering reports for multiple organizations or users.

 

Note:

Because there is no strong isolation, this topology is not safe or reliable for strict multi-tenancy.

Also use with caution: the Enterprise Catalog, Search and Transformation Projects, and the Data Virtualization service are meant to be shared across all users. Granting access to one tenant user implies that this user may gain visibility into data or other artifacts owned by other tenant users.

Also, not all services support tethered namespaces. Each tethered namespace can be associated with different quotas and security policies, but authentication is still set up as part of the control plane namespace.

Summary

In this article, we introduced several multi-tenancy models that CPD 4.x supports from the perspectives of isolation, security, and management. Each multi-tenancy model has a corresponding use case, and customers can choose the model that suits their needs.

Sometimes, customers may need a hybrid multi-tenancy model composed of two layers:

Layer 1 – Separate CPD instances for different tenants, each in a dedicated group of namespaces.

Layer 2 – Some tenants have business units which share their CPD instance via access control.

Furthermore, the cost of the multi-tenancy models is not discussed in this article. In general, more isolation means higher cost. But if a model caters to the customer's needs, even an expensive option may still be cost-effective.

References

 

https://hanusiak-tomasz.medium.com/configuring-multi-tenant-cloud-pak-for-data-environment-on-openshift-b9c713834345

 

https://www.ibm.com/docs/en/cloud-paks/cp-data/4.0?topic=planning-multitenancy-support

 

https://www.ibm.com/docs/en/cloud-paks/cp-data/4.0?topic=considerations-multitenancy-network-security

Thanks
Thanks to Sriram, Jingdong, and Tomasz for their comments, suggestions, and expertise sharing!

  • Sriram Srinivasan/Cupertino/IBM
  • Jingdong Sun/Rochester/IBM
  • Tomasz Hanusiak/Poland/IBM