CPD 4.0 multi-tenancy support overview
Background
One of the important CP4D use scenarios is multi-tenancy. Public cloud providers and large enterprises have a strong demand for using CP4D to provide Data and AI as a service for their tenants. It's important to understand that the concept of a "tenant" itself may vary: in some cases tenants may be completely different enterprises; in other cases they may be different departments inside one company; and in some situations certain tenancy criteria can be relaxed for cost reasons. In this article we introduce the multi-tenancy mechanisms and the corresponding use cases that CPD 4.0 supports.
Multi-tenancy support overview
At a high level, the multi-tenancy models that CPD 4.0 supports fall into two categories: platform-level multi-tenancy and service-level multi-tenancy. Platform-level multi-tenancy differentiates the models from the perspectives of Isolation, Security, and Management. Service-level multi-tenancy introduces specific CPD services and resources that support multi-tenancy in particular use cases, especially through access control.
Platform level multi-tenancy
Separate CPD instances each in a dedicated group of namespace
Use case
A large enterprise named compA plans to use CPD as their cloud service to serve their own clients from different areas. Each of their clients is a separate company, institute, or university.
Each client needs full isolation of their data, resources, network, security, and user management.
compA has detailed multi-tenancy requirements as follows.
- Multiple clients in a shared OpenShift cluster
- Different clients need to be isolated from each other completely
- Separate monitoring and quota
- Separate set of add-ons for each tenant
- Different Authentication (LDAP/AD/SAML2) mechanisms available for different tenants
- Separate Admins & user roles for each tenant
Topology
Basically, this multi-tenancy model is about tethering a group of OpenShift projects/namespaces to one tenant.
In the following chart, as an example, the namespace cpd-instance-1 and tethered namespace A form the tethered group for tenant A's service deployments.
A tethered namespace is optional, not required. It is useful when you want to deploy some of your services in isolation from other services for better resource quota management, license monitoring, or metering.
Note:
Not all services support tethered namespaces.
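As an illustration of the resource quota management mentioned above, a tethered namespace can be capped with a standard Kubernetes ResourceQuota. This is a minimal sketch; the namespace name and all limit values are hypothetical, not CPD defaults:

```yaml
# Hypothetical sketch: cap the compute and storage objects that a
# tethered namespace can consume. Names and numbers are illustrative.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-tethered-quota
  namespace: tethered-namespace-a   # hypothetical tethered namespace
spec:
  hard:
    requests.cpu: "16"
    requests.memory: 64Gi
    limits.cpu: "32"
    limits.memory: 128Gi
    persistentvolumeclaims: "20"
```

The cluster admin applies one such quota per tethered namespace, which is what makes per-tenant metering and capacity management possible.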
The key features of this mechanism are as follows.
- Share the OCP cluster and the same set of Cloud Pak Foundational Services and CPD Operators.
- Full isolation - nothing of CPD instance is (typically) shared between tenants.
- User access control for the CPD instance, including the admin role, can be maintained by the tenant.
From the tenancy aspects, this multi-tenancy model supports Isolation, Security, and Management through the following mechanisms.
Isolation
Mechanism: a tethered group of Kube namespaces/RHOS projects for each CPD instance
Covers: users (authentication, roles & authorization); compute resources; access to the CPD instance; storage usage
- If needed, different LDAP/AD or SAML configurations per instance (each CPD instance has its own user management service)
- Namespace quotas, set by the cluster admin
- Unique RHOS DNS-based routes for each instance
- Completely separated PVCs (in different namespaces) with different PVs; no sharing across instances

Security
Mechanism: RBAC privileges & service accounts scoped to that group of namespaces
Covers: restricting Kube cluster-level privileges; network access isolation; auditing
- Service accounts & role bindings scoped to RHOS project namespaces
- Kube network policies for hardening
- Per-tenant SIEM forwarding supported

Management
Mechanism: administration privileges scoped to a namespace
Covers: monitoring/metering; ops (backup/DR, scale, patch, upgrade); serviceability/diagnosis; add-on installations
- Different CPD "Admin" user roles for different instances; tenants manage independently
- Ops privileges scoped to that RHOS project namespace
- Serviceability utilities capture only the diagnostics appropriate for that CPD instance
- Different CPD instances can have different sets of add-ons, scaled differently
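The "Kube network policies for hardening" mechanism above can be sketched as a default-deny ingress policy that only admits traffic from namespaces in the tenant's own tethered group. The policy name and the `tenant` namespace label below are assumptions for illustration, not CPD-defined names:

```yaml
# Hypothetical sketch: in tenant A's CPD namespace, deny all ingress
# except from namespaces labeled as part of tenant A's group.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-tenant-a-only
  namespace: cpd-instance-1
spec:
  podSelector: {}            # applies to every pod in this namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              tenant: tenant-a   # label assumed to be set by the cluster admin
```

An equivalent policy would be applied in each namespace of the tethered group, so traffic from other tenants' namespaces is dropped by default.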
Separate CPD instances each with a dedicated group of worker nodes
Use case
A large enterprise named compB has existing applications running on their OCP cluster. The existing applications could be Cloud Pak for Data, other Cloud Pak applications, or non-IBM applications. The customer now wants to install Cloud Pak for Data next to the existing applications and make sure the products don't affect each other.
Although deploying Cloud Pak for Data in a dedicated group of namespaces offers separation, this may not be enough. As a further requirement, compB wants to keep the applications' workloads independent, each running on dedicated worker nodes.
Topology
Basically, this multi-tenancy model is about dedicating a group of worker nodes to one namespace or a group of namespaces.
In the following chart, as an example, tenant A's CPD instance is deployed in the namespace cpd-instance-1 with the dedicated worker node group Group-1. Group-1 can comprise several worker nodes, such as worker01, worker02, and worker03.
The key feature of this mechanism is that Cloud Pak for Data instances and other applications share the OCP cluster but run independently on dedicated worker nodes.
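In OpenShift, pinning a namespace's workloads to a dedicated node group is commonly done by labeling the nodes and setting the project node selector annotation. This is a minimal sketch; the `node-group` label key and value are hypothetical:

```yaml
# Hypothetical sketch: pin all workloads in cpd-instance-1 to the nodes
# of Group-1. Assumes the cluster admin has labeled the nodes first, e.g.:
#   oc label node worker01 node-group=group-1
#   oc label node worker02 node-group=group-1
#   oc label node worker03 node-group=group-1
apiVersion: v1
kind: Namespace
metadata:
  name: cpd-instance-1
  annotations:
    openshift.io/node-selector: "node-group=group-1"
```

With the annotation in place, the scheduler places every pod created in cpd-instance-1 only on nodes carrying that label, keeping tenant A's workloads off the nodes used by other applications.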
From the tenancy aspects, this multi-tenancy model supports Isolation, Security, and Management through the following mechanisms.
Isolation
Mechanism: a dedicated group of worker nodes for each CPD instance
Covers: users (authentication, roles & authorization); compute resources; access to the CPD instance; storage usage
- If needed, different LDAP/AD or SAML configurations per instance (each CPD instance has its own user management service)
- Dedicated worker nodes and namespace quotas, set by the cluster admin
- Unique RHOS DNS-based routes for each instance
- Completely separated PVCs (in different namespaces) with different PVs; no sharing across instances

Security
Mechanism: RBAC privileges & service accounts scoped to the namespace with a dedicated worker node group
Covers: restricting Kube cluster-level privileges; network access isolation; auditing
- Service accounts & role bindings scoped to RHOS project namespaces
- Kube network policies for hardening
- Per-tenant SIEM forwarding supported

Management
Mechanism: administration privileges scoped to a CPD instance
Covers: monitoring/metering; ops (backup/DR, scale, patch, upgrade); serviceability/diagnosis; add-on installations
- Different CPD "Admin" user roles for different instances; tenants manage independently
- Ops privileges scoped to that RHOS project namespace
- Serviceability utilities capture only the diagnostics appropriate for that CPD instance
- Different CPD instances can have different sets of add-ons, scaled differently
Note:
Please make sure that you review your license agreements before you deploy multiple CP4D instances on top of an existing OpenShift cluster.
Multi-tenancy with CPD Service instances & Resources
This part complements the platform-level multi-tenancy models. Some services can provision multiple service instances for different users or groups to support multi-tenancy, and some resources, such as Projects and Catalogs, support multi-tenancy through access control.
Shared CPD instance via access control
Use case
The Cloud Pak for Data customer compC has several business units/departments. These departments are authenticated by the enterprise's central SAML2 or LDAP service. compC would like to have their departments share one and the same Cloud Pak for Data instance.
Because all business units within this client share the same user authentication and authorization, the tenants of compC (the different business units) can be supported through tethered namespaces or different service instances within the same CPD instance, using access control.
For business units that do not need to track metering (resource usage) separately, compC can support them with different WSL projects, WKC catalogs, and/or Streams/Db2/Watson service instances within the same namespace.
For business units that do need to track metering (resource usage) separately, a tethered namespace can be used.
Topology
In this multi-tenancy model, shared services and resources are access-controlled by tenant scope.
In the above chart, tenant A and tenant B share the services and resources available in the same CPD instance, but access control through access privileges and roles helps manage who can reach which services and resources. User groups can be used for easier access control: by assigning roles to user groups, users inherit permissions via their group memberships.
In addition, tethered namespaces can support additional isolation needs.
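The group-based role assignment described above can be sketched in a few lines of Python. The role, group, and permission names here are illustrative placeholders, not actual CPD role names; the point is only the resolution logic (a user's effective permissions are the union over their groups' roles):

```python
# Minimal sketch of group-based access resolution: users inherit roles,
# and therefore permissions, through user-group membership.
# All names below are hypothetical, not real CPD roles or permissions.

ROLE_PERMISSIONS = {
    "admin": {"manage_users", "create_project", "view_catalog"},
    "data_engineer": {"create_project", "view_catalog"},
    "viewer": {"view_catalog"},
}

GROUP_ROLES = {
    "dept-a-engineers": {"data_engineer"},
    "dept-b-analysts": {"viewer"},
}

USER_GROUPS = {
    "alice": {"dept-a-engineers"},
    "bob": {"dept-b-analysts"},
}

def effective_permissions(user: str) -> set[str]:
    """Union of permissions granted through every group the user belongs to."""
    perms: set[str] = set()
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            perms |= ROLE_PERMISSIONS.get(role, set())
    return perms

print(sorted(effective_permissions("alice")))  # ['create_project', 'view_catalog']
```

Note that in this model a user who joins a second group gains that group's permissions automatically, which is exactly why group membership must be curated carefully in a shared CPD instance.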
Isolation
Mechanism: shared CPD instance in one Kubernetes namespace (no isolation!) ⚠️
Covers: users (authentication, roles & authorization); compute resources; access to the CPD instance; storage usage
- One LDAP/AD/SAML configuration
- No quotas for individual users
- Single OpenShift route for all users
- No separation of persistent volumes

Security
Mechanism: resource-level access privileges and roles
Covers: restricting Kube cluster-level privileges; network access isolation
- No cluster-level privileges
- No network isolation between user workloads ⚠️

Management
Mechanism: administration & ops privileges cannot be granted to individual users ⚠️
Covers: monitoring/alerting; operations (backup/DR); serviceability/diagnosis; add-on installations
- One CPD "Admin" user role; "tenants" cannot be granted the Admin role
- All add-on services in that CPD instance are potentially visible to all users
- Data Virtualization, Transformation & Enterprise Catalog are shared services
- Granular access management for analytics projects, databases & service instances
WML-A is an example of a service that supports this kind of multi-tenancy.
It can be deployed into a tethered namespace.
In addition, for multiple organizations or users, WML-A supports multi-tenancy and resource management capabilities such as:
- Defining multiple resource limits per organization, user, or workload
- Separate storage for each organization or user
- Resource reclamation between different organizations or users
- Resource metering reports for multiple organizations or users
Note:
Because there is no strong isolation, this topology is not yet safe or reliable for strict multi-tenancy.
Also use with caution: the Enterprise Catalog, Search and Transformation Projects, and the Data Virtualization service are meant to be shared across all users. Granting access to one tenant user implies that this user may gain visibility into data or other artifacts owned by other tenant users.
Furthermore, not all services support tethered namespaces. Each tethered namespace can be associated with different quotas and security policies, but authentication is still set up as part of the control plane namespace.
Summary
In this article, we introduced several multi-tenancy models that CPD 4.X supports from the perspectives of Isolation, Security, and Management. Each multi-tenancy model has a corresponding use case, and customers can choose the model that suits their needs accordingly.
Sometimes a customer may need a hybrid multi-tenancy model comprised of two layers:
Layer 1 - Separate CPD instances for different tenants, each in a dedicated group of namespaces.
Layer 2 - Some tenants have business units that share a CPD instance via access control.
Finally, the cost of these multi-tenancy models is not discussed in this article. In general, more isolation means higher cost, but if a model caters to the customer's needs, even an expensive option may be cost-effective.
References
https://hanusiak-tomasz.medium.com/configuring-multi-tenant-cloud-pak-for-data-environment-on-openshift-b9c713834345
https://www.ibm.com/docs/en/cloud-paks/cp-data/4.0?topic=planning-multitenancy-support
https://www.ibm.com/docs/en/cloud-paks/cp-data/4.0?topic=considerations-multitenancy-network-security
Thanks
Thanks Sriram, Jingdong and Tomasz for their comments, suggestions and expertise sharing!
#CloudPakforDataGroup