Cloud Pak for Data

Import of sub sequencers

1. Import of sub sequencers

Tapas Pradhan
Posted Sun July 21, 2024 11:02 PM

Dear All,

While working on a migration from 11.7 to CP4D, I noticed a behavior in CP4D and would like to confirm whether it is expected.

For this explanation, let's assume I have a sequencer A calling a sub-sequencer B through a Job Activity stage. As per my migration strategy, I import assets individually through dsjob migrate using ISX files. Assume B was already migrated. After migrating sequence A without any dependencies, I can see that CP4D identifies the sub-sequencer node that links to B as "Run DataStage Job" type rather than "Run Pipeline Job".

When I try to open node B from A using view asset, CP4D cannot fetch it (404 Not Found CDIWA0078E Pipeline JSON not found for flow), but when I view it as a job, it takes me to the B.Datastage Sequencer page.

In my opinion, when migrating independent sequencer jobs from 11.7 to CP4D, nodes in a sequencer that call other sequencers should not be converted to the "Run DataStage Job" type.

However, when we export both A and B together into one ISX file and migrate it into CP4D, the node that calls B is correctly identified as the "Run Pipeline Job" type.

Kindly let me know whether we always need to include the dependencies in the ISX file when migrating for the first time to avoid such issues, or whether there is a different way to do this. The problem with this approach is that common components called by organization-wide sequencers have to be exported repeatedly to avoid the issue.

Thanks for your help.

Best,

Tapas

------------------------------
Tapas Pradhan
------------------------------
2. RE: Import of sub sequencers

YONG LI
Posted Mon July 22, 2024 08:04 AM

Hi Tapas,

On the first point: importing sub-sequence B first and then sequence A, with B appearing as `Run DataStage job`, is a defect. It should be treated as `Run Pipeline Job`. We will fix this ASAP.

In general, when you import an ISX file, you should import it with full dependencies. A missing dependency, especially a missing ParameterSet, PROJDEF, shared container, or nested sequence, is a common problem that can lead to frustration. If your project is small to medium size (5000 jobs or fewer), exporting the whole project as one ISX file with full dependencies is a much faster way to manage your migration. If it is bigger, you may want to manage the import by partitioning the workload by folder. Even then, it is much better to include full dependencies as well. Granted, this makes your ISX file a little bigger, but the migration service can resolve the difference properly this way.
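
As a rough illustration of partitioning by folder while still keeping full dependencies, the sketch below groups assets by folder and then adds every direct and indirect dependency to each batch. It is not an IBM tool or API: the asset names, the `folders` and `deps` inventories, and both helper functions are hypothetical, and it assumes you already have a list of each asset's direct dependencies.

```python
from collections import defaultdict

def closure(asset, deps):
    """Return the asset plus everything it depends on, directly or indirectly."""
    seen, stack = set(), [asset]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, ()))
    return seen

def batches_by_folder(folders, deps):
    """Group assets by folder, then extend each batch with its full dependency set."""
    grouped = defaultdict(set)
    for asset, folder in folders.items():
        grouped[folder].add(asset)
    return {
        folder: set().union(*(closure(asset, deps) for asset in assets))
        for folder, assets in grouped.items()
    }

# Hypothetical inventory: asset -> folder, and asset -> direct dependencies.
folders = {"SeqA": "/Finance", "JobX": "/Finance", "SeqB": "/Common"}
deps = {"SeqA": ["SeqB", "PSetCommon"], "JobX": ["PSetCommon"], "SeqB": []}

batches = batches_by_folder(folders, deps)
for folder in sorted(batches):
    print(folder, sorted(batches[folder]))
```

Here the `/Finance` batch also picks up `SeqB` and `PSetCommon` even though they live elsewhere, which is the "full dependency" behavior described above at the cost of a slightly larger export per folder.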

------------------------------
YONG LI
------------------------------

3. RE: Import of sub sequencers

Tapas Pradhan
Posted Mon July 22, 2024 07:58 PM
Edited by Tapas Pradhan Mon July 22, 2024 07:59 PM

Thank you, Yong, for acknowledging this as a defect.

I agree with your point that we should import with full dependencies. On a full-fledged 11.7 project, we usually use the incdep option of istool, starting from the master (top-level) sequencer, to work out the complete dependency hierarchy. However, there are two notable issues with this approach when the DataStage project has a huge repository:

1) istool typically scans the whole repository for each dependency discovery. This takes around 3 to 5 minutes (benchmarked on a project with 17k jobs) to work out the dependencies of a single sequencer, which makes our migration slow when dealing with thousands of sequencers.

2) istool also sometimes times out when working out larger dependency sets. We tried increasing the istool timeout settings, but that didn't make much difference. Because istool fails under certain conditions, the process becomes unreliable when we try to script and automate the migration.
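
To make the cost concern concrete: if the call graph could be extracted once (an assumption, since the thread does not say how to obtain it outside istool), the dependency set of every master sequencer could then be computed in memory with cached results, so a shared sub-sequencer is resolved only once. The graph contents and function names below are hypothetical.

```python
from functools import lru_cache

# Hypothetical call graph, assumed to have been extracted once from the project.
CALLS = {
    "MasterSeq1": ("SubSeqB", "JobLoad"),
    "MasterSeq2": ("SubSeqB",),
    "SubSeqB": ("JobCommon",),
    "JobLoad": (),
    "JobCommon": (),
}

@lru_cache(maxsize=None)
def dependencies(asset):
    """All assets reachable from `asset`; assumes the call graph has no cycles."""
    found = set()
    for child in CALLS.get(asset, ()):
        found.add(child)
        found |= dependencies(child)  # cached, so shared children are walked once
    return frozenset(found)

def export_set(master):
    """The master sequencer together with its complete dependency hierarchy."""
    return {master} | dependencies(master)

for master in ("MasterSeq1", "MasterSeq2"):
    print(master, sorted(export_set(master)))
```

The second lookup reuses the cached result for `SubSeqB` instead of walking it again, which is the saving a per-sequencer repository scan cannot offer.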

If there is another reliable way to do this much faster, please guide us. Would you mind sharing some links where I can learn more about "import using partitioning the workload by folder method"?

I will wait for your reply on which product release you plan to fix the original issue in. Thanks for taking the time to read my post and reply.

------------------------------
Tapas Pradhan
------------------------------

4. RE: Import of sub sequencers

Victoria Rickmann
Posted Mon July 22, 2024 02:20 PM

Hello Tapas,

If you migrate each DataStage job on its own, how should the migration service know that B is a job sequence? Maybe you as a developer know this because of the naming convention you follow. The migration service only sees a job activity without further information.

The CP4D documentation says: Make sure that the ISX file export includes any dependencies, such as parameter sets and table definitions.

https://www.ibm.com/docs/en/cloud-paks/cp-data/4.8.x?topic=data-migrating-datastage-jobs

Hope this helps.

Rgds Victoria

------------------------------
Victoria Rickmann
------------------------------

5. RE: Import of sub sequencers

Tapas Pradhan
Posted Mon July 22, 2024 08:43 PM
Edited by Tapas Pradhan Mon July 22, 2024 08:47 PM

Hi Victoria,

You are correct. As developers, we follow naming conventions that help us identify the kind of job we are dealing with without looking at it.

In my opinion, the migration service should have the intelligence (either through its own logic or through user-defined configuration) to figure out whether a Job Activity calls a job or a sequencer, and should not convert every Job Activity stage to Run DataStage Job when a sequencer is migrated standalone.

Let me explain further using the same example. Job sequencer B has already been migrated individually to CP4D. Since every asset name is unique in CP4D, when master sequencer A is migrated without dependencies in the ISX file, the service should ideally identify B as a sequencer (as the metadata is already present in CP4D) and accordingly migrate the "Job Activity" stage to "Run Pipeline job".

Let's consider another scenario where sequencer A is migrated first and sequencer B has not been migrated yet. In this scenario, I can think of a couple of options for what the service could do, though it need not take exactly these approaches:

1) It should error out and ask for the missing dependency.
2) Job Activity should not resolve to either Run Pipeline job or Run DataStage job, but rather wait for B to be migrated before deciding which stage to resolve to (most likely not feasible!)
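
The behavior proposed in these two scenarios can be sketched as a simple lookup. This only illustrates the suggestion and is not how the migration service actually works: the function, the exception, the `"sequence"` type label, and the `migrated_assets` mapping (asset name to the type it was migrated as) are all hypothetical.

```python
class MissingDependencyError(Exception):
    """Raised when a Job Activity points at an asset that is not in the project yet."""

def resolve_job_activity(target, migrated_assets):
    """Pick the node type for a Job Activity based on assets already migrated."""
    kind = migrated_assets.get(target)
    if kind is None:
        # Option 1 above: fail loudly and name the missing dependency.
        raise MissingDependencyError(f"Import '{target}' first or include it in the ISX file")
    # A target already migrated as a sequence becomes a pipeline node.
    return "Run Pipeline job" if kind == "sequence" else "Run DataStage job"

# Hypothetical project state: B was migrated earlier as a sequence.
migrated = {"B": "sequence", "LoadCustomers": "parallel"}

print(resolve_job_activity("B", migrated))
print(resolve_job_activity("LoadCustomers", migrated))
```

With B already present, the lookup settles on the pipeline node; with B absent, the import stops instead of silently choosing the wrong node type.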

I highlighted a few issues with ISX file exports in my previous reply. Also, if common components are called by each master sequencer, exporting and importing the same common components repeatedly with every master sequencer move adds overhead and reduces the efficiency of the migration process.

Please let me know your views on this. Thank you for taking the time to respond to the issue.

------------------------------
Tapas Pradhan
------------------------------

6. RE: Import of sub sequencers

Ralf Martin

IBM Champion
Posted Mon July 29, 2024 11:00 AM

Hi Tapas,

I agree with you that it would be nice to be able to migrate incrementally from legacy to CP4D, but I also understand IBM's point of view. As already commented in December, IBM designed it not to be an incremental process, and they decided to invest their resources in CP4D itself rather than in features for incremental migration. So export your whole legacy project (you don't have to check "include related objects" as long as you include everything) and import it in one step. Then run CP4D and legacy in parallel for some weeks or months, and if everything is fine, switch off legacy.

I guess you are an IBM customer. If so, IBM will help you with the migration process, as they are doing with great effort for my client.

KR Ralf

------------------------------
Ralf Martin
Principal Consultant
infologistix GmbH
Bregenz
------------------------------
