IBM Cloud Pak System

Move your existing workloads between multiple IBM Cloud Pak Systems

By Hendrik van Run posted 19 days ago

  
Originally published as IBM Developer Recipe here on 4 August 2021 by Hendrik van Run, Jonathan Deberdt, Rahul Nema and Vaibhav Gadge.

Introduction

On 19 July 2021, IBM Cloud Pak System 2.3.3.3 Interim Fix 1 was released. This release comes with built-in support for workload mobility. Refer to the blog post What’s new in IBM Cloud Pak System 2.3.3.3 Interim Fix 1 to learn more about this release. This tutorial focuses on the new workload mobility feature.

In IBM Cloud Pak System 2.3.3.3 and earlier, workloads deployed on an IBM Cloud Pak System could not be moved out of the appliance. IBM Cloud Pak System 2.3.3.3 Interim Fix 1 now provides a way to move an existing Virtual System Instance from one IBM Cloud Pak System to another IBM Cloud Pak System. For example, this new feature is especially useful if you want to rebalance your workload between systems or are in the process of moving to newer systems.

Figure 1: Move your existing workloads between two IBM Cloud Pak System appliances

This workload migration is performed in an automated fashion, without downtime and without re-instantiating virtual machines (no data loss on the virtual machines), through VMware Cross vCenter vMotion. Note that there might be a short cutover time (typically less than one second) during which the workload may be unresponsive or unavailable as it is being moved. If you were to ping your virtual machine's IP address during the move, you might notice the loss of one ICMP packet during the cutover. We recommend that you familiarize yourself with VMware Cross vCenter vMotion.

Scenarios

Inter-system workload mobility

IBM Cloud Pak System users with multiple systems are likely to move some of their existing workloads between their systems for a variety of reasons. Capacity management is an obvious one, but there could be other drivers. The IP groups on the source and target IBM Cloud Pak System appliances must have the same VLAN ID, subnet, and gateway, but they can have different names, Environment Profiles, and Cloud Groups. Consequently, workload mobility helps you re-organize (group or separate) your virtual system instances. You can use the feature to consolidate several cloud groups into bigger ones or, on the contrary, to isolate some virtual system instances that were previously grouped.

Figure 2: Moving Virtual System instances from one IBM Cloud Pak System appliance to another one to manage capacity

By using the feature twice on the same virtual system instance, you can move a virtual system instance to another environment profile, cloud group, or IP group, or a combination of the three, which is not possible otherwise. To do so, first temporarily move the virtual system instance to another IBM Cloud Pak System and then move it back to the original IBM Cloud Pak System (using the new Environment Profile/Cloud Group/IP Groups).

Inter-generational workload mobility

Workload mobility also allows you to move virtual system instances from a Gen3 IBM Cloud Pak System (i.e. W3500 and W3550 models) to a new Gen4 IBM Cloud Pak System (i.e. W4600 model) without having to redeploy.

Figure 3: Inter-generational workload mobility

Planned outage on IBM Cloud Pak System or datacenter

If there is a planned outage (for example, power maintenance) on the datacenter or on one IBM Cloud Pak System, you can move the most business-critical workloads out of the impacted systems in advance to keep them available during the outage. Remember that workload mobility is performed without downtime.

Prerequisites

This article assumes you are familiar with the following aspects:

  • IBM Cloud Pak System
  • Virtual System Patterns

The workload mobility feature is available for all Intel-based models of IBM Cloud Pak System: W3500, W3550, and W4600.

Since workload migration for IBM Cloud Pak System relies on Cross vCenter vMotion, you must make sure that the following network requirements are met:

  • The IPv4 subnets of the virtual machines you want to move are available on both IBM Cloud Pak System appliances and are using the same VLAN ID.
  • The vCenter and compute nodes in IBM Cloud Pak System must have an external IPv4 address configured.
  • The IBM Cloud Pak System administration networks and their interconnections must meet the requirements as documented by VMware (latency, bandwidth, etc.). In particular, you must have the following connectivity:


Figure 4: Typical network design

Workload migration allows you to move your virtual system instances to another IBM Cloud Pak System that has a different environment profile, cloud group, compute nodes, and IP Groups. However, the network configuration of the virtual machines remains the same. The virtual machines keep the same IP addresses, hostnames, subnet, gateway, DNS, VLAN ID, routes, and so on. Consequently, end users are not impacted by the move. This does, however, require that the VLAN ID and subnets used by your virtual machines are available on both the source and target IBM Cloud Pak System appliances.
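
The network constraint described above can be expressed as a small pre-check. The following Python sketch is only an illustration: the IpGroup class and the mismatches function are hypothetical helpers, not part of any IBM Cloud Pak System API, and the VLAN ID, subnet, and gateway values in the usage note are made-up sample values that you would fill in by hand from both systems.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IpGroup:
    """Hypothetical, simplified view of an IP group (not a product data model)."""
    name: str
    vlan_id: int
    subnet: str   # CIDR notation, for example "172.25.56.0/24"
    gateway: str


def mismatches(source: IpGroup, target: IpGroup) -> list[str]:
    """Return the reasons why two IP groups cannot be paired (empty if compatible)."""
    problems = []
    if source.vlan_id != target.vlan_id:
        problems.append(f"VLAN ID differs: {source.vlan_id} vs {target.vlan_id}")
    if ip_network(source.subnet) != ip_network(target.subnet):
        problems.append(f"subnet differs: {source.subnet} vs {target.subnet}")
    if ip_address(source.gateway) != ip_address(target.gateway):
        problems.append(f"gateway differs: {source.gateway} vs {target.gateway}")
    # Names are deliberately not compared: they may differ between systems.
    return problems
```

For example, mismatches(IpGroup("RacktoRack", 100, "172.25.56.0/24", "172.25.56.1"), IpGroup("AnotherName", 100, "172.25.56.0/24", "172.25.56.1")) returns an empty list, because only the names differ.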

Moreover, there are currently some prerequisites and limitations regarding the workload that can be migrated and the configuration for the systems. Review the following material to prepare your systems and make sure you can run workload migrations:

Moving your first deployed workload

Before proceeding with a workload migration, review the prerequisites to make sure the procedure can be applied. The Virtual System Instance corresponding to the workload you intend to move must be in running state. All virtual machines of the Virtual System Instance must be up and running during the migration. The process updates the configuration of the installed tooling that the Cloud Pak System uses to communicate with and manage virtual machines.

On a production system, it is advised to perform a planned reboot of the Virtual System Instance before the workload migration to make sure that the Virtual System Instance restarts properly without any change. As already stated, this is a live migration of the Virtual System Instance, meaning that it stays available for end users. However, any other change (CPU/memory/disk addition, virtual machine stop/start, scaling, snapshot) must be prevented during the migration. Do not perform operations such as "Mark", "Stop", "Delete", "Manage", or "Maintain" on a deployment that is being migrated, on either the source or the target system.

1. Collect required information

Since workload migration is only available through the REST API, you’ll need to gather some information to build the request’s payload. Let’s take the following Virtual System Instance:

Figure 5: Virtual System on source IBM Cloud Pak System

Note the "Deployment ID" at the top of the screen.

Collect the following information:

  • target_System_IP_address: (required) This is the IP address or FQDN of the target IBM Cloud Pak System appliance to which the Virtual System Instance will be moved.
  • source_deployment_id: (required) Log into the source IBM Cloud Pak System appliance and look at the Virtual System Instance. Then, get the ID that is available at the top of the page. For reference, see the ID highlighted in the previous screenshot.
  • target_environment_profile_name: (required, case-sensitive) This is the name of the environment profile on the target IBM Cloud Pak System appliance. The target environment profile does not need to have the same name as the source environment profile. If the target environment profile name contains special characters, such as spaces, square or curly brackets, backslash, ampersand, percentage, question mark, or period, you can use the attribute target_environment_profile_id instead of target_environment_profile_name. To find the ID of the environment profile, which is a numeric value, request https://<target_System_IP_address>/resources/environmentProfiles.
  • target_cloud_group_name: (required, case-sensitive) This is the name of the cloud group on the target IBM Cloud Pak System appliance. The target cloud group does not need to have the same name as the source cloud group. Make sure that the cloud group is available and is declared in the target environment profile.
  • source_rack_ipaddress: (required) This is the IP address or FQDN of the source IBM Cloud Pak System appliance that is initially hosting the Virtual System Instance.
  • source_admin_user and source_admin_password: (required) These are the credentials of a user that has administrator privileges on the source IBM Cloud Pak System appliance.
  • target_ip_group: (required, case-sensitive) This mandatory parameter maps the network interfaces of each virtual machine to an IP Group on the target system. Each virtual machine of the Virtual System Instance has a node name and ordered network interfaces. Note that the workload migration supports up to 7 network interfaces. You can get the node name and the order of the network interfaces in the virtual system instance page > "virtual machine perspective" section. The "Network Interface 0" is the internal IPv6 interface of the VM managed by IBM Cloud Pak System and is not assigned to any IP Group, so consider from "Network Interface 1" onwards.
Figure 6: Virtual machine perspective

Figure 7: Network interfaces in the virtual machine perspective

In the screenshots, you can see that the node name is 'OS_Node' and that network interface 1 has been assigned the IP address 172.25.56.250. Also, collect the name of the IP Groups for each NIC on the target system. To do so, you can search for the IP addresses in the "IP Usage" page of the "Reports" menu. With this information, build a list using the following format:

"target_ip_group": 
"['node_name_1':['eth1':'IP_Group_1','eth2':'IP_Group_2'],
'node_name_2':['eth1':'IP_Group_3','eth2':'IP_Group_4']]"

Note that the label for each NIC must use the format ethX, where X is a number that starts with 1. It is just a way to order the NICs and does not reflect the name of the network interface on the virtual machine's operating system. Using the same screenshot as an example, we will write the following:

"target_ip_group": "['OS_Node':['eth1':'RacktoRack']]"
  • vm_user_credentials: (optional but recommended) It is a best practice to provide this parameter. Use the same node names as in the "target_ip_group" parameter and provide administrative user credentials for each virtual machine of the instance by using the following format:
"vm_user_credentials": 
"['OS_Node':['user':'someusername','password':'somepassword'],
'OS_Node_1':['user':'anotherusername','password':'anotherpassword']]"
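
Writing these bracketed strings by hand is error-prone, so it can help to generate them. The Python sketch below is a minimal illustration of the format shown above. The function names format_node_map and target_ip_group are hypothetical helpers, and the sketch assumes that names and values contain no single quotes, because no escaping rules are described here.

```python
def format_node_map(nodes: dict[str, dict[str, str]]) -> str:
    """Render {'node': {'key': 'value'}} in the bracketed, single-quoted
    form used by target_ip_group and vm_user_credentials."""
    rendered_nodes = []
    for node_name, entries in nodes.items():
        pairs = ",".join(f"'{key}':'{value}'" for key, value in entries.items())
        rendered_nodes.append(f"'{node_name}':[{pairs}]")
    return "[" + ",".join(rendered_nodes) + "]"


def target_ip_group(ip_groups_by_node: dict[str, list[str]]) -> str:
    """Label each node's IP groups eth1, eth2, ... in NIC order.

    The list for a node starts at "Network Interface 1"; interface 0 is
    not assigned to any IP Group and must not be included.
    """
    labelled = {
        node: {f"eth{index}": group for index, group in enumerate(groups, start=1)}
        for node, groups in ip_groups_by_node.items()
    }
    return format_node_map(labelled)
```

With the values from the screenshots, target_ip_group({"OS_Node": ["RacktoRack"]}) produces ['OS_Node':['eth1':'RacktoRack']], and format_node_map({"OS_Node": {"user": "someusername", "password": "somepassword"}}) produces the matching vm_user_credentials value.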

2. Other request parameters

  • name: (optional) This is the name of the migration job.
  • state: (required) This is a required parameter and must always be set to start.
  • premigration_validate: (optional) If set to yes (default value) or true, IBM Cloud Pak System runs some validation checks and, if they are successful, starts the migration. Set it to VALIDATE_ONLY to run the validation tests without triggering the migration. The values n, no, or false skip the validation tests and directly start the migration.
  • migration_timeout_min: (optional) It is expressed in minutes, default value is 60. The operation waits either until the migration is complete or the specified time is reached.
  • disable_multithread_vmrelocate: (optional) The default is false. Set it to true if you want to make sure that virtual machines of the same virtual system instance are moved sequentially and not in parallel.
  • deletion_timeout_min: (optional) It is expressed in minutes, default value is 45. In some cases, the deletion tasks on the source system may take some extra time. Use this parameter to increase the timeout of these deletion operations.
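
The three behaviours of premigration_validate can be summarized in code. The following Python sketch is only an illustration: ValidationMode and validation_mode are hypothetical names, and the sketch assumes that values must match the spellings listed above exactly. How the appliance treats other spellings is not documented here, so the sketch rejects them.

```python
from enum import Enum
from typing import Optional


class ValidationMode(Enum):
    VALIDATE_THEN_MIGRATE = "run the checks, then migrate if they pass"
    VALIDATE_ONLY = "run the checks without triggering the migration"
    SKIP_VALIDATION = "start the migration without running the checks"


def validation_mode(premigration_validate: Optional[str] = None) -> ValidationMode:
    """Map a premigration_validate value to the behaviour described above."""
    # An omitted parameter behaves like the default value, yes.
    if premigration_validate is None or premigration_validate in ("yes", "true"):
        return ValidationMode.VALIDATE_THEN_MIGRATE
    if premigration_validate == "VALIDATE_ONLY":
        return ValidationMode.VALIDATE_ONLY
    if premigration_validate in ("n", "no", "false"):
        return ValidationMode.SKIP_VALIDATION
    # Assumption: anything else is treated as a mistake by this sketch.
    raise ValueError(f"unexpected premigration_validate value: {premigration_validate!r}")
```

A dry run before the real move would therefore use "premigration_validate":"VALIDATE_ONLY" in the payload.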

3. Build the REST request

Workload mobility is only available through the REST API, so you'll need a REST client to perform the operation, such as cURL or SoapUI.

Before submitting the workload mobility request, make sure that no workload mobility is already in progress. You can check this by requesting the URL http://<CPS_IP_address>/admin/resources/migrate_deployment?state=running on both the source and target Cloud Pak System appliances. Alternatively, look at the Job Queue in the CPS GUI (menu "Troubleshooting" -> "Job Queue") and filter by name "migrate_deployment" and state "Running". Also, on the target cloud groups, avoid running other tasks, such as deploy, delete virtual machine/virtual instance, or resize. These tasks can cause issues with the placement engine and allocation calculations.
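
The check on both appliances can be scripted. The Python sketch below only builds the URL given above and leaves the HTTP call to a caller-supplied fetch function. Both function names are hypothetical, and the sketch assumes that the response decodes to a list of jobs that is empty when nothing is running. The response format is not described in this article, so verify it on your own systems first.

```python
from typing import Callable, Iterable


def running_jobs_url(cps_address: str) -> str:
    """URL used to list running workload mobility jobs on one appliance."""
    return f"http://{cps_address}/admin/resources/migrate_deployment?state=running"


def systems_with_running_migration(
    addresses: Iterable[str],
    fetch: Callable[[str], list],
) -> list[str]:
    """Return the appliances that report at least one running job.

    `fetch` takes a URL and returns the decoded response.
    Assumption: a non-empty list means a migration is in progress.
    """
    return [address for address in addresses if fetch(running_jobs_url(address))]
```

Submit the workload mobility request only if the returned list is empty for both the source and the target appliance.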

Build the request as follows (filled with sample values):
Method:

POST

Headers (note that the value of the Authorization header depends on the credentials you use to logon to IBM Cloud Pak System):

Content-Type: application/json
X-IBM-PureSystem-API-Version: 1.0
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

URI:

https://target_System_IP_address/admin/resources/migrate_deployment

Payload:

{
"source_deployment_id":"d-4dc5-b45f-b2a932486385",
"target_environment_profile_name":"ep_ip",
"target_cloud_group_name":"TestCG",
"source_rack_ipaddress":"9.9.9.9",
"source_admin_user":"cps_admin",
"source_admin_password":"be@ut1fulPasswd!",
"name":"Testmigration",
"state":"start",
"target_ip_group":"['Standalone_Node':['eth1':'AS_IP_Group'],'Database_Node':['eth1':'DB_IP_Group','eth2':'BackUp_IP_Group']]",
"vm_user_credentials":"['Standalone_Node':['user':'someusername','password':'somepassword'], 'Database_Node':['user':'anotherusername','password':'anotherpassword']]",
"migration_timeout_min":"120",
"premigration_validate":"true",
"disable_multithread_vmrelocate":"false",
"deletion_timeout_min":"60"
}

This REST call may take a few seconds before responding, especially if the premigration_validate parameter is set to true/yes (which is the default).

As an example, below is the payload of the request sent to move the virtual system instance shown on the screenshots:

{
"source_deployment_id":"d-15a57078-0b35-48c0-be1e-e6985a8f76f3",
"target_environment_profile_name":"Shared",
"target_cloud_group_name":"nagacg2ip",
"target_ip_group":"['OS_Node':['eth1':'RacktoRack']]",
"source_rack_ipaddress":"172.26.56.32",
"source_admin_user":"admin",
"source_admin_password":"passw0rd",
"name":"prajjob1",
"state":"start",
"deletion_timeout_min":"120",
"vm_user_credentials":"['OS_Node':['user':'root','password':'passw0rd']]",
"premigration_validate":"true"
}
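
If you prefer a script to a REST client such as cURL or SoapUI, the request described above can be assembled with the Python standard library. This is a minimal sketch: basic_auth and build_migration_request are hypothetical helper names, and the sketch only builds the request object without sending it.

```python
import base64
import json
import urllib.request


def basic_auth(user: str, password: str) -> str:
    """Value of the Authorization header for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_migration_request(
    target_address: str, user: str, password: str, payload: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request for a workload migration."""
    return urllib.request.Request(
        url=f"https://{target_address}/admin/resources/migrate_deployment",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-IBM-PureSystem-API-Version": "1.0",
            "Authorization": basic_auth(user, password),
        },
    )
```

The request can then be sent with urllib.request.urlopen. The user and password are the credentials you use to log on to the target IBM Cloud Pak System, and the payload is a dictionary with the same keys as the JSON examples above.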

4. What's going on then?

If the premigration_validate parameter is set to true or yes or is omitted, then the target IBM Cloud Pak System will perform several checks:

  • Request the source IBM Cloud Pak System to get information on the virtual system instances
  • Check for the presence and requirements of the Environment profile, cloud group, and IP Groups on target IBM Cloud Pak System
  • Check whether the Virtual System Pattern that was used to deploy the virtual system instance on the source IBM Cloud Pak System is also present on the target IBM Cloud Pak System

If one of these checks fails, the request returns an HTTP 400 Bad Request error.
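
A script that submits the request should therefore treat HTTP 400 as a validation failure rather than as a crash. The Python sketch below shows one way to do this with the standard library. The submit function is a hypothetical helper, and because the content of the error body is not described in this article, the sketch simply returns it as text.

```python
import urllib.error
import urllib.request


def submit(request, urlopen=urllib.request.urlopen):
    """Send a prepared request and return (status_code, body_text).

    A 400 Bad Request (failed pre-migration check) is returned to the
    caller; any other HTTP error is re-raised unchanged.
    """
    try:
        with urlopen(request) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        if error.code == 400:
            return error.code, error.read().decode("utf-8", errors="replace")
        raise
```

The urlopen parameter exists only so that the function can be exercised without a real appliance. In normal use, call submit(request) with a request object for the migrate_deployment URI.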

Otherwise, the target IBM Cloud Pak System creates a new "migrate_workload" job that will perform the following tasks:

  • Enable DRS (Distributed Resource Scheduler) on VMware vCenter for the target cloud group.
  • Create a new virtual system instance on the target without creating virtual machines.

 

Figure 8: Virtual System Instance registered on target

  • Trigger a VMware Cross-vCenter-vMotion of the virtual machine(s) between the source and the target vCenters.

Figure 9: Migration jobs in progress

  • After the virtual machines have been moved to the target vCenter, the target IBM Cloud Pak System connects to the virtual machines and reconfigures the agents (Maestro, Activation Engine, Tivoli Monitoring). These agents, which run on the virtual machines, now point to the target IBM Cloud Pak System. During this step, the status of the virtual machines and the virtual system instance might be inconsistent (for example, in error or launching state, even if the VMs are running fine). Script packages and software components are not re-run during the process.
  • After everything looks fine on the target IBM Cloud Pak System, the virtual system instance and the related objects (additional volumes, IP addresses) are deleted from the source IBM Cloud Pak System. Note that only metadata gets deleted from the source IBM Cloud Pak System, as the actual virtual machines have already been moved and are no longer known by the source vCenter.

After the workload migration is complete, you can use the virtual system instance normally.

The deployment ID is different for the newly created virtual system instance object on the target IBM Cloud Pak System. See the difference between figures 5 and 8. The history of the instance before the migration is lost. In the history of the virtual system instance, you can notice the following message: "Instance is migrated from source to target system successfully."

Figure 10: Migration completed

5. Post-migration actions

Reboot the Windows virtual machines

A known issue related to snapshots affects Windows virtual machines. You need to reboot the Windows virtual machines before you can create a new snapshot on the target IBM Cloud Pak System.

ACLs

Access permissions on the virtual system instance are not migrated. The user specified in the REST call during migration becomes the owner of the instance on the target system. Manually grant access to users or groups, possibly including the original owner, based on the privileges set on the source IBM Cloud Pak System.

Figure 11: ACL ("Access granted to" section) on a Virtual System Instance

Licences

Licensing parts that were added after the instance was deployed on the source IBM Cloud Pak System are not retained on the target. You must manually add them again.

Disable DRS

If you use Virtual Manager Cloud Groups, you need to run the following action. It is not needed in other cases, unless requested by IBM Support. As written above, DRS is enabled at the beginning of the process, but it is not disabled at the end of the workload mobility process. An internal job runs regularly to disable DRS if needed, but this job does not exist for Virtual Manager Cloud Groups. Consequently, and only in this case or if requested by IBM Support, you need to disable DRS after the workload migrations are complete. To do so, send the following REST call to the target CPS, specifying the relevant cloud group:

Method:

POST

Headers (note that the value of the Authorization header depends on the credentials you use to logon to IBM Cloud Pak System):

Content-Type: application/json
X-IBM-PureSystem-API-Version: 1.0
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

URI:

https://target_System_IP_address/admin/resources/migrate_deployment

Payload:

{  "target_cloud_group_name": "CloudGroup_Name", "disable_drs" : "true"}

Conclusion

This article provided the know-how to leverage the workload mobility feature of IBM Cloud Pak System 2.3.3.3 Interim Fix 1. For example, you can use this feature in the following circumstances:

  • Re-balance your workload between two IBM Cloud Pak System appliances
  • Move your workloads to a new IBM Cloud Pak System appliance without redeploying them
  • Move existing business critical workloads from an IBM Cloud Pak System ahead of a planned outage