PASE performance question

1. PASE performance question
Glenn Robinson
Posted Tue June 07, 2022 08:02 AM

I have a bash script which manipulates some text files using many grep, awk and sed statements.

When I run this on my V7R3 LPAR it takes around 6 minutes to run the script.

When I run the same script on my RHEL ppc64le LPAR it takes just 5.8 seconds.

Both LPARs are connected to the same FS7200 storage.

Both LPARs have the same CPU and memory assigned.

I'm guessing that when I use pipes, or execute commands as input to loops, this kicks off additional IBM i jobs, which is why it takes so long compared to RHEL.

Any suggestions/comments on how I can make PASE more performant so that I can get closer to RHEL performance?

Thanks
Glenn

------------------------------
Glenn Robinson
------------------------------
2. RE: PASE performance question
Satid Singkorapoom
Posted Tue June 07, 2022 08:51 AM
Edited by Satid Singkorapoom Tue June 07, 2022 09:14 AM

Dear Glenn

Was your job the only job running in the IBM i LPAR when you performed the task? If not, how much other workload was there at that time? It may not be easy to answer your question if many other jobs were also running.

Are all the LUNs from FS7200 allocated to both IBM i and RHEL LPARs carved out of the same storage pool? If not, then we need to know if IBM i LPAR has as good a disk response time as RHEL LPAR or not.

How many additional jobs were launched by your loop? If more than a few, then this is one major cause of the performance drag. In such a case, make sure the *BASE memory pool (pool number 2) in your IBM i has sufficient memory allocated, and set its MAX ACTIVE parameter to a value of at least 1000. Please use the WRKSYSSTS command, press F10 every 10 seconds or so, and observe the memory faulting rate of pool 2 while your script is running (please ignore the "Pages" value). If the faulting rate is consistently 500 or more during the script run, this can be an issue, and you may try adding more memory to pool 2 to see whether it reduces the faulting rate.

Did you run the Bash script from an SSH client? A tutorial article on running the Bash shell in IBM i PASE recommends this, as you can read here: https://www.itjungle.com/2014/09/17/fhg091714-story01/

A few more articles on using Bash in IBM i, though I am not sure whether they will be useful: https://jbh.github.io/categories/ibm%20i/

------------------------------
Satid Singkorapoom
------------------------------

3. RE: PASE performance question
Glenn Robinson
Posted Tue June 07, 2022 09:27 AM

Satid,

My apologies, I should have made this clearer.

There are no other users or batch jobs running on the IBM i LPAR apart from my own ssh client session and default IBM jobs. In fact, this LPAR runs native IBM i workloads without any performance problems.

Neither the IBM i nor the RHEL LPAR has any obvious performance constraints. Both LPARs share the same storage pool and the FS7200 shows blazing fast volume response times.

I just tried running the script with the same text files on an AIX 7.1 LPAR on the same Power9 system as the RHEL and IBM i LPAR and that ran in about 8 seconds.

I can't see anything in the docs you kindly sent that I haven't read before.

What I can see on WRKACTJOB is that the script generated hundreds of short-lived QP0ZSPWP jobs on the system.

------------------------------
Glenn Robinson
------------------------------

4. RE: PASE performance question
Glenn Robinson
Posted Tue June 07, 2022 09:40 AM

I have just made the changes to SSHD to use the QP0ZSPWT prestart jobs as described here

This has reduced the run time by about 1 minute but it's still significantly slower than RHEL or AIX.

------------------------------
Glenn Robinson
------------------------------

5. RE: PASE performance question
Satid Singkorapoom
Posted Tue June 07, 2022 08:14 PM
Edited by Satid Singkorapoom Tue June 07, 2022 08:37 PM

Dear Glenn

Hundreds of short-lived QP0ZSPWP jobs are indicative of the run-time performance issue, and I suspect this is by design in IBM i's support for spawn() and fork() when running a UNIX shell script in PASE. One basic and crucial thing you must do is to make sure that, in the WRKSYSSTS screen, you set MAX ACT for pool 2 to not less than 1000 (because a lot of jobs need a high "activity level" in the memory pool in which they run) and allocate a sufficient amount of memory to it. Also allocate memory to pool 1 so that it has at least twice the amount shown in its "Reserved Size".

When you run ADDPJE for QP0ZSPWP jobs, you also need to set its Initial Number of Jobs to a high value (such as 100) and Additional Number of Jobs to a high value as well (such as 30). If you did not do this, use CHGPJE command to change it.

If you run your script 3 times consecutively in IBM i, is the resulting run time the same for each run? If the first run takes the longest, that would be indicative of the nature of PASE support in IBM i.

By the way, what is the IFS directory path where you put your files in IBM i? I just want to check whether, by any chance, you put your files in an improper IFS file system such as /QDLS. The safe default should be under the /QOpenSys file system.

Another thing you can try is to change the script to reduce spawn() and fork() calls, but I do not know enough to be more specific on this. There are many IBM i functions running in PASE, but definitely not as shell scripts. I suspect IBM i PASE is optimized for running programs and procedures rather than shell scripts. I hope an IBM i developer who takes care of deploying PASE-based functions will respond to your question. If none does, you can open a PMR with IBM i WW Support and ask your question.

------------------------------
Satid Singkorapoom
------------------------------

6. RE: PASE performance question
Glenn Robinson
Posted Tue June 07, 2022 09:40 PM

Satid,

Yeah, I did think about changing the subsystem memory and ALs but this is a script which will be used very infrequently so I don't want to configure work management based upon the requirements of this script.

My script is running in my home directory at present, definitely not in /QDLS :-0

As I mentioned, the script has many grep, cat, sort etc commands executed for each line in my text file so I need to reduce the number of sub shells created by making the code more bash efficient or rewrite in python.

The performance issue was more of an observation than anything else so I just wanted to see if there was a quick fix to improving performance in bash scripts under PASE.

It's a shame as I'm a big fan of OSS on IBM i, but it looks like I may have to abandon using bash scripts for complex string manipulation in PASE and use python, or use bash on AIX or RHEL.

Thanks for your input.

Glenn

------------------------------
Glenn Robinson
------------------------------

7. RE: PASE performance question
Jack Woehr

IBM Champion
Posted Tue June 07, 2022 10:04 PM
Edited by Jack Woehr Tue June 07, 2022 10:08 PM

On 6/7/22 7:40 PM, Glenn Robinson via IBM Community wrote:

> It's a shame as I'm a big fan of OSS on IBM i, but it looks like I may have to abandon using bash scripts for complex string manipulation in PASE and use python, or use bash on AIX or RHEL.

2 thoughts:

Python is not a punishment :) Go ahead and recast your script in Python on the IBM i (yes, it's there) just for the exercise.

Another possibility is to refactor your script. There are almost certain to be suboptimal elements to it from your description. My spider sense is tingling :)

8. RE: PASE performance question
Satid Singkorapoom
Posted Wed June 08, 2022 05:09 AM
Edited by Satid Singkorapoom Wed June 08, 2022 05:11 AM

Dear Mr. Jack

>>>> Another possibility is to refactor your script. There are almost certain to be suboptimal elements to it from your description. <<<<

Having read all Mr. Glenn's posts, I'm quite sure that Mr. Glenn does not see any "suboptimal" element in his script because his script took just 8 seconds to run in both AIX and RHEL LPARs as opposed to 5 minutes in IBM i LPAR. I figure his view of anything suboptimal is that it is in IBM i PASE. I would call this a limitation in IBM i unless someone else could chime in on what else can be done to improve his script run time in IBM i.

------------------------------
Satid Singkorapoom
------------------------------

9. RE: PASE performance question
Glenn Robinson
Posted Wed June 08, 2022 05:27 AM

Satid,

Agree. I can convert to python or I can refactor my bash script. However, if I'd done the same scripting on RHEL or AIX, I would not have felt the need to start a topic on IBM communities, as there would be no performance issue to write about.

------------------------------
Glenn Robinson
------------------------------

10. RE: PASE performance question
Glenn Robinson
Posted Wed June 08, 2022 05:24 AM

Don't get me wrong, I'm a python fan and use it, and Ansible, a lot.

This was one of those simple requirements that then grew into something a little bigger.

I know I can refactor to reduce the use of so many subshells, but it is still very disappointing that the same script runs so much faster on RHEL and AIX on the same P9.

I've been working on IBM i/i5/OS/OS400 for over 30 years and it's issues such as these which give our fantastic system a bad name . . . and it's hard to defend.

I won't make excuses for not writing the most streamlined scripts, but it is difficult to stomach such a significant difference in performance.

As I said previously, this is a rarely used script so I'm more interested in why the hell PASE is so slow compared to RHEL and AIX.

------------------------------
Glenn Robinson
------------------------------

11. RE: PASE performance question
Satid Singkorapoom
Posted Wed June 08, 2022 05:32 AM
Edited by Satid Singkorapoom Wed June 08, 2022 05:55 AM

Mr. Glenn

>>>> As I said previously, this is a rarely used script so I'm more interested in why the hell PASE is so slow compared to RHEL and AIX. <<<<

I think there is an answer to your question as I just found the following statement in this URL https://www.ibm.com/docs/en/i/7.3?topic=i-optimizing-performance
[QUOTE]
If you run an application in PASE for i that performs a large number of fork() operations, it will not run as fast as it runs on AIX®. This is because each PASE for i fork() operation starts a new IBM® i job, which can have a significant impact on performance.
[UNQUOTE]

One lesson I have learned from my 31 years of experience with IBM i is that nothing is perfect. Even Superman is weakened by Kryptonite! So an excessive number of fork() calls is the Kryptonite for PASE.
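
To make the fork() cost concrete, here is a minimal sketch (not taken from the original script) that extracts the first colon-separated field of each line in two ways. The function names and the sample data are hypothetical. The first variant starts a subshell and an external cut for every line, so under PASE each line would cost new IBM i jobs; the second uses only bash parameter expansion and creates no process per line.

```shell
#!/usr/bin/env bash

# Variant 1: a command substitution plus an external cut for every line.
first_field_forking() {
    local line
    while IFS= read -r line; do
        printf '%s\n' "$(printf '%s\n' "$line" | cut -d: -f1)"
    done
}

# Variant 2: parameter expansion only, so no new process per line.
first_field_builtin() {
    local line
    while IFS= read -r line; do
        printf '%s\n' "${line%%:*}"   # drop everything from the first ':' onwards
    done
}

# Hypothetical sample data: N lines of the form keyN:valueN:extra.
sample_lines() {
    local n=${1:-200} i
    for ((i = 1; i <= n; i++)); do
        printf 'key%d:value%d:extra\n' "$i" "$i"
    done
}
```

Running each variant under time, for example `time sample_lines 200 | first_field_forking >/dev/null` and the same with first_field_builtin, should show the gap; how large it is will depend on the system.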

------------------------------
Satid Singkorapoom
------------------------------

12. RE: PASE performance question
Diego KESSELMAN BARRIONUEVO

IBM Champion
Posted Wed June 08, 2022 08:25 AM

What's your DNS configuration?
Check you are using *LOCAL and a valid DNS server address or *NONE.
Also check you have an entry on your hosts table for your hostname.

------------------------------
Diego KESSELMAN BARRIONUEVO
General Manager
ESSELWARE Soluciones, SA de CV
CDMX DIF
------------------------------

13. RE: PASE performance question
Glenn Robinson
Posted Wed June 08, 2022 11:47 AM

Yes, they are all as they should be.

This is purely to do with PASE, bash and subshells.

------------------------------
Glenn Robinson
------------------------------

14. RE: PASE performance question
Diego KESSELMAN BARRIONUEVO

IBM Champion
Posted Wed June 08, 2022 01:41 PM

Are you using the original sed or GNU sed (which can be installed with yum)?

------------------------------
Diego KESSELMAN BARRIONUEVO
General Manager
ESSELWARE Soluciones, SA de CV
CDMX DIF
------------------------------

15. RE: PASE performance question
Diego KESSELMAN BARRIONUEVO

IBM Champion
Posted Wed June 08, 2022 01:43 PM

... or have you tried using BASH built-in functions for string manipulation?

------------------------------
Diego KESSELMAN BARRIONUEVO
General Manager
ESSELWARE Soluciones, SA de CV
CDMX DIF
------------------------------

16. RE: PASE performance question
Glenn Robinson
Posted Wed June 08, 2022 03:17 PM

Well now you're talking :-)

I'm on a mission to:

1. Optimise the script as much as possible to remove as many sub shell invocations as possible.

2. Become a bash script (on IBM i) master!!!!

I'll report back.

------------------------------
Glenn Robinson
------------------------------

17. RE: PASE performance question
Glenn Robinson
Posted Thu June 09, 2022 05:56 AM

For those interested I have made significant progress.

I have replaced pretty much every grep, sed and cut command I was using via pipes with bash built-ins and variable expansion.
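
As a rough illustration of that kind of replacement (a sketch with hypothetical helper names and sample data, not the actual script), the helpers below mirror three common pipe idioms using only bash built-ins. The quoted pattern in each case is treated as literal text.

```shell
#!/usr/bin/env bash
# Each helper stores its answer in the global RESULT instead of printing it,
# so callers do not need $(...), which would itself fork a subshell.

# Roughly: printf '%s\n' "$1" | grep -qF "$2"
contains() {
    [[ $1 == *"$2"* ]]
}

# Roughly: printf '%s\n' "$1" | cut -d"$2" -f"$3"  (single-character delimiter)
field() {
    local -a parts
    IFS=$2 read -r -a parts <<< "$1"
    RESULT=${parts[$3 - 1]}
}

# Roughly: sed "s/$2/$3/g", but with $2 treated as literal text, not a regex
replace_all() {
    RESULT=${1//"$2"/$3}
}

line='member:host_a.example:active'
if contains "$line" 'host_a'; then
    field "$line" ':' 2          # RESULT is now host_a.example
    replace_all "$RESULT" '.' '_'
    printf '%s\n' "$RESULT"      # prints host_a_example
fi
```

These are not drop-in equivalents: cut and sed handle edge cases (missing delimiters, regular expressions) that the helpers ignore, so each replacement needs checking against the real data.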

The good news is that this script now takes 1 minute 38 seconds to run . . . compared to almost 5 minutes - WOW!

So I've pushed the updates to our Github server and then pulled the repo down to my RHEL server. This now runs in 3 seconds compared to 8 seconds.

The main culprit, for IBM i, is a while loop which has the following redirection:

while read -r member; do
    -- snip --
done < <(awk "/${zone}/{ f = 1; next} /zone name/{ f = 0 } f" "${file_zone}")

This is executed around 250 times in the script. It locates a block of rows in a file and returns the rows in that block.

I can't see a more efficient way of doing this, so correct me if I'm wrong.

I tried sed but that added about 20 seconds on to the runtime.
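
One option that might avoid starting awk around 250 times is to read the file once and keep every block in memory. The sketch below is only that: the file layout is an assumption (a header line containing "zone name <zone>", followed by that zone's lines), and load_zones, for_each_member and zone_members are hypothetical names. It also matches the zone name exactly, whereas the awk version matches ${zone} as a regular expression anywhere in a line.

```shell
#!/usr/bin/env bash
# Sketch only. Assumed file layout (not confirmed by the thread): each block
# starts with a header line containing "zone name <zone>", and every line up
# to the next header belongs to that zone. Needs bash 4+ (associative arrays).

declare -A zone_members   # zone name -> newline-terminated member lines

# Read the zone file once and index every block by its zone name.
load_zones() {
    local file=$1 line name current=''
    zone_members=()
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == *"zone name "* ]]; then
            name=${line#*"zone name "}   # text after the first "zone name "
            current=${name%% *}          # keep only the first word
        elif [[ -n $current ]]; then
            zone_members[$current]+="$line"$'\n'
        fi
    done < "$file"
}

# Call fn once per member line of the given zone; no awk, no subshell.
for_each_member() {
    local zone=$1 fn=$2 member
    while IFS= read -r member; do
        [[ -n $member ]] && "$fn" "$member"
    done <<< "${zone_members[$zone]}"
    return 0
}
```

In use, load_zones "${file_zone}" would run once near the top of the script, and each of the 250 or so loops would become for_each_member "${zone}" handle_member, where handle_member is a function holding the former loop body.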

------------------------------
Glenn Robinson
------------------------------

18. RE: PASE performance question
Satid Singkorapoom
Posted Thu June 09, 2022 08:39 AM

Mr. Glenn

Your reported improvement is undoubtedly laudable. You are abundantly endowed with tenacity and ability worthy of admiration and applause !

------------------------------
Satid Singkorapoom
------------------------------

19. RE: PASE performance question
Glenn Robinson
Posted Thu June 09, 2022 08:56 AM

Satid,

Thank you. I've been around a long, long time . . . . never too old to learn :-)

I may well write this in python too . . . just for the comparison.

Glenn

------------------------------
Glenn Robinson
------------------------------

20. RE: PASE performance question
ac
Posted Thu June 09, 2022 10:58 AM
Edited by ac Thu June 09, 2022 10:59 AM

If you really want to do all this in bash plus common utilities like awk, as a pure exercise and PASE stress test, post the code plus data somewhere (e.g. GitHub). Then a PASE expert could suggest something, or anyone who wants to suggest "his" way to code it (using bash plus common unix utilities) could do so objectively. Otherwise it is very difficult to assess and suggest anything in a discussion without the exact code.

------------------------------
ace ace
------------------------------

21. RE: PASE performance question
Glenn Robinson
Posted Wed June 15, 2022 11:23 AM

A final (hopefully) update on this.

I used python3 to replace the hefty BASH script . . . . . . . this runs in just under 3 seconds on i, AIX and RHEL.

My takeaways from this:

1. Use BASH carefully on i, especially when using subshells repeatedly

2. Never be afraid to switch from a scripting language to a 'proper' programming language when something is getting more complex than was originally intended. I'll hold my hand up to being guilty of this crime on this occasion.

------------------------------
Glenn Robinson
------------------------------

22. RE: PASE performance question
ac
Posted Wed June 15, 2022 12:52 PM

Downloading a working copy of a "proper" language like Python or PHP is so easy nowadays, via a couple of "yum" commands, that it is a pity not to use them, e.g. for real production needs. You also get all the accompanying libraries, plus easy and fast access to the local DB2 for proper processing, SQL, and the IBM i toolkit for direct interaction with existing PGMs.

For example we use PHP even as a pure scripting language (no web in this case) i.e. to handle SFTP or FTPS connection transfers in various way, then simply invoked by RPG programs.

Bash tends to get ugly very very fast ;)

------------------------------
ace ace
------------------------------

23. RE: PASE performance question
Edmund Reinhardt
Posted Tue June 21, 2022 05:35 AM

I just wanted to chime in and say that this has been our experience as well. We rewrote large portions of bash in Python 3 in the http://github.com/IBM/ibmi-bob project and it was many orders of magnitude faster, more capable and easier to maintain.

Edmund Reinhardt

IBM i Application Development Tooling Architect

+1 647 403 6195 Mobile

+1 905 413 3125 Office

edmund.reinhardt@ca.ibm.com

IBM

24. RE: PASE performance question
Satid Singkorapoom
Posted Wed June 15, 2022 08:22 PM

Dear Glenn

This last post of yours concludes a very beneficial learning experience for all.

------------------------------
Satid Singkorapoom
------------------------------

25. RE: PASE performance question
ac
Posted Thu June 09, 2022 04:18 AM
Edited by ac Thu June 09, 2022 04:18 AM

We are of course obliged to take into account that process creation in an "i" system is a much heavier operation than in a pure POSIX system (simply because a process/job in i provides more capability/visibility/instrumentation etc.). Maybe, yes, it is also a code path that still needs some optimization, but it will never be the same as a pure (lighter-weight) unix, even with job prestarting...

in any case... which binaries are you using? GNUs?
post your "which sed grep awk bash" ....

------------------------------
ace ace
------------------------------

26. RE: PASE performance question
Glenn Robinson
Posted Thu June 09, 2022 05:57 AM

/QOpenSys/usr/bin/sed
/QOpenSys/pkgs/bin/grep
/QOpenSys/pkgs/bin/awk
/QOpenSys/pkgs/bin/bash
/QOpenSys/pkgs/bin/cut

------------------------------
Glenn Robinson
------------------------------

27. RE: PASE performance question
Kurt Thomas
Posted 9 days ago

This is the real reason, thanks, ace, for posting it. On Linux/Unix a fork "costs" next to nothing in performance terms, in PASE, it involves a much bigger machinery. Frank Soltis mentions this somewhere, in "Fortress", I think.

IBM i even had to prevent the fork jobs issuing QHST messages to keep the performance manageable.

------------------------------
Kurt Thomas
Senior System Engineer
Fortra
------------------------------

28. RE: PASE performance question
ac
Posted 9 days ago
Edited by ac 9 days ago

For their part, native IBM i jobs then have much faster context switches during use, with no TLB/cache thrashing, and that counts a lot in a system that values being transactional and concurrent.

For example, for high throughput need in particular tasks, one would usually prestart X jobs that then read from a fast DTAQ and leave them on. This is a decent (and very fast...) design and the overall throughput can be also reasonably capped and manipulated and thought of, and also easily debugged.

Back to POSIX stuff, and the POSIX view of the world: yes, we have often basically reduced the problem and the machines to a giant continuous string parsing and generating concoction, a practical, useful, at times fun, dance between processes. But the time spent compounds easily for non-trivial problems in the data space, especially in systems where creation and teardown of processes is continuous.

....In any case, in PASE it is worth using PHP or python (etc.) for anything that surpasses 10 lines of code; you get proper syntax and library support.... shell lang was built for shell problems...

------------------------------
--ft
------------------------------
