IBM i Global


PASE performance question

  • 1.  PASE performance question

    Posted Tue June 07, 2022 08:02 AM
    I have a bash script which manipulates some text files using many grep, awk and sed statements.

    When I run this on my V7R3 LPAR it takes around 6 minutes to run the script.

    When I run the same script on my RHEL ppc64le LPAR it takes just 5.8 seconds.


    Both LPARs are connected to the same FS7200 storage.

    Both LPARs have the same CPU and memory assigned.

    I'm guessing that when I use pipes or execute commands as input to loops, this kicks off additional IBM i jobs, which is why it takes so long compared to RHEL.

    Any suggestions/comments on how I can make PASE more performant so that I can get closer to RHEL performance?

    Thanks
    Glenn

    ------------------------------
    Glenn Robinson
    ------------------------------


  • 2.  RE: PASE performance question

    Posted Tue June 07, 2022 08:51 AM
    Edited by Satid Singkorapoom Tue June 07, 2022 09:14 AM

    Dear Glenn

    Was your job the only job running in the IBM i LPAR when you performed the task?  If not, how much other workload was there at that time?  It may not be easy to answer your question if many other jobs were also running.

    Are all the LUNs from FS7200 allocated to both IBM i and RHEL LPARs carved out of the same storage pool?  If not, then we need to know if IBM i LPAR has as good a disk response time as RHEL LPAR or not.

    How many additional jobs were launched by your loop? If more than a few, then this is one major cause of the performance drag.  In that case, make sure the *BASE memory pool (pool number 2) on your IBM i has sufficient memory allocated and that its MAX ACTIVE parameter is set to at least 1000.   Use the WRKSYSSTS command, press F10 every 10 seconds or so, and observe the memory faulting rate of pool 2 while your script is running (ignore the "Pages" value). If the faulting rate is consistently around 500 or more during the script run, this can be an issue, and you may try adding more memory to pool 2 to see whether it reduces the faulting rate.

    Did you run the Bash script from an SSH client?  A tutorial article on running Bash shell in IBM i PASE recommends this as you can read here :  https://www.itjungle.com/2014/09/17/fhg091714-story01/

    A few more articles on using Bash in IBM i but not sure if they will be useful or not : https://jbh.github.io/categories/ibm%20i/

    ------------------------------
    Satid Singkorapoom
    ------------------------------



  • 3.  RE: PASE performance question

    Posted Tue June 07, 2022 09:27 AM
    Satid,

    My apologies, I should have made this clearer.

    There are no other users or batch jobs running on the IBM i LPAR apart from my own ssh client session and default IBM jobs. In fact, this LPAR runs native IBM i workloads without any performance problems.

    Neither the IBM i nor the RHEL LPAR has any obvious performance constraints. Both LPARs share the same storage pool and the FS7200 shows blazing fast volume response times.

    I just tried running the script with the same text files on an AIX 7.1 LPAR on the same Power9 system as the RHEL and IBM i LPAR and that ran in about 8 seconds.

    I can't see anything in the docs you kindly sent that I haven't read before.

    What I can see on WRKACTJOB is that the script generated hundreds of short-lived QP0ZSPWP jobs on the system.
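
    Hundreds of short-lived jobs are what one would expect from a loop that starts several processes for every input line. The sketch below is only an illustration with made-up data (the `name:value` records and the `slow`/`fast` function names are assumptions, not taken from the actual script). It contrasts a per-line pipeline with a single awk pass that produces the same output from one process.

```bash
#!/usr/bin/env bash
# Hypothetical sample data: "name:value" records, one per line.
data=$'alpha:1\nbeta:2\ngamma:3'

# Pattern A: one pipeline per input line. Every iteration starts a command
# substitution plus grep and cut, i.e. several new processes per line.
slow() {
    local line
    while IFS= read -r line; do
        printf '%s\n' "$(printf '%s\n' "$line" | grep ':' | cut -d: -f1)"
    done <<< "$data"
}

# Pattern B: a single awk process handles every line.
fast() {
    awk -F: '/:/ { print $1 }' <<< "$data"
}

slow_out=$(slow)
fast_out=$(fast)
printf '%s\n' "$fast_out"
```

    Both functions return the same text. The difference is only in how many processes are created along the way, which is cheap on Linux or AIX and, according to this thread, expensive in PASE.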

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 4.  RE: PASE performance question

    Posted Tue June 07, 2022 09:40 AM
    I have just made the changes to SSHD to use the QP0ZSPWT prestart jobs as described here

    This has reduced the run time by about 1 minute but it's still significantly slower than RHEL or AIX.

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 5.  RE: PASE performance question

    Posted Tue June 07, 2022 08:14 PM
    Edited by Satid Singkorapoom Tue June 07, 2022 08:37 PM
    Dear Glenn

    Hundreds of short-lived QP0ZSPWP jobs are indicative of the run-time performance issue, and I suspect this is by design in IBM i's support for spawn() and fork() when running UNIX shell scripts in PASE.  One basic and crucial thing you must do is make sure that, on the WRKSYSSTS screen, you set MAX ACT for pool 2 to not less than 1000 (because a lot of jobs need a high "activity level" in the memory pool in which they run) and allocate a sufficient amount of memory to it.  Also allocate memory to pool 1 of at least twice the amount shown in its "Reserved Size".

    When you run ADDPJE for QP0ZSPWP jobs, you also need to set its Initial Number of Jobs to a high value (such as 100) and Additional Number of Jobs to a high value as well (such as 30).     If you did not do this, use CHGPJE command to change it. 

    If you run your script 3 times consecutively on IBM i, is the resulting run time the same for each run?  If the first run takes the longest, then it should be indicative of the nature of PASE support in IBM i.

    By the way, what is the IFS directory path where you put your files on IBM i?  I just want to check whether, by any chance, you put your files in an improper IFS file system such as /QDLS.   The safe default should be under the /QOpenSys file system.

    Another thing you can try is to change the script to reduce spawn() and fork() calls, but I do not know enough to be more specific on this. There are many IBM i functions running in PASE but definitely not as shell scripts.   I suspect IBM i PASE is not optimized for running shell scripts, more for programs and procedures. I hope an IBM i developer who takes care of deploying PASE-based functions will respond to your question. If none does, you can open a PMR with IBM i WW Support and ask your question.

    ------------------------------
    Satid Singkorapoom
    ------------------------------



  • 6.  RE: PASE performance question

    Posted Tue June 07, 2022 09:40 PM
    Satid,

    Yeah, I did think about changing the subsystem memory and ALs but this is a script which will be used very infrequently so I don't want to configure work management based upon the requirements of this script.

    My script is running in my home directory at present, definitely not in /QDLS :-0

    As I mentioned, the script has many grep, cat, sort etc commands executed for each line in my text file so I need to reduce the number of sub shells created by making the code more bash efficient or rewrite in python.

    The performance issue was more of an observation than anything else so I just wanted to see if there was a quick fix to improving performance in bash scripts under PASE.

    It's a shame as I'm a big fan of OSS on IBM i but it looks like I may have to abandon using bash scripts for complex string manipulation in PASE and use python or use bash on AIX or RHEL


    Thanks for your input.

    Glenn

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 7.  RE: PASE performance question

    Posted Tue June 07, 2022 10:04 PM
    Edited by Jack Woehr Tue June 07, 2022 10:08 PM
    On 6/7/22 7:40 PM, Glenn Robinson via IBM Community wrote:
    It's a shame as I'm a big fan of OSS on IBM i but it looks like I may have to abandon using bash scripts for complex string manipulation in PASE and use python or use bash on AIX or RHEL

    2 thoughts:

    1. Python is not a punishment :) Go ahead and recast your script in Python on the IBM i (yes, it's there) just for the exercise.
    2. Another possibility is to refactor your script. There are almost certain to be suboptimal elements to it from your description. My spider sense is tingling :)


  • 8.  RE: PASE performance question

    Posted Wed June 08, 2022 05:09 AM
    Edited by Satid Singkorapoom Wed June 08, 2022 05:11 AM
    Dear Mr. Jack

    >>>> Another possibility is to refactor your script. There are almost certain to be suboptimal elements to it from your description.  <<<<

    Having read all Mr. Glenn's posts, I'm quite sure that Mr. Glenn does not see any "suboptimal" element in his script because his script took just 8 seconds to run in both AIX and RHEL LPARs as opposed to 5 minutes in IBM i LPAR.  I figure his view of anything suboptimal is that it is in IBM i PASE.  I would call this a limitation in IBM i unless someone else could chime in on what else can be done to improve his script run time in IBM i.

    ------------------------------
    Satid Singkorapoom
    ------------------------------



  • 9.  RE: PASE performance question

    Posted Wed June 08, 2022 05:27 AM
    Satid,

    Agree. I can convert to python or I can refactor my bash script. However, if I'd done the same scripting on RHEL or AIX I would not have felt the need to start a topic on IBM communities as there would be no performance issue to write about.

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 10.  RE: PASE performance question

    Posted Wed June 08, 2022 05:24 AM
    Don't get me wrong, I'm a python fan and use it, and Ansible, a lot.

    This was one of these simple requirements but then grew in to something a little bigger.

    I know I can refactor to reduce the use of so many subshells but it is still very disappointing that the same script runs so much faster on RHEL and AIX on the same P9.

    I've been working on IBM i/i5/OS/OS400 for over 30 years and it's issues such as these which give our fantastic system a bad name . . . and it's hard to defend.

    I won't make excuses for not writing the most streamlined scripts but it is difficult to stomach such a significant difference in performance.

    As I said previously, this is a rarely used script so I'm more interested in why the hell PASE is so slow compared to RHEL and AIX.

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 11.  RE: PASE performance question

    Posted Wed June 08, 2022 05:32 AM
    Edited by Satid Singkorapoom Wed June 08, 2022 05:55 AM
    Mr. Glenn

    >>>>  As I said previously, this is a rarely used script so I'm more interested in why the hell PASE is so slow compared to RHEL and AIX. <<<<

    I think there is an answer to your question as I just found the following statement in this URL  https://www.ibm.com/docs/en/i/7.3?topic=i-optimizing-performance
    [QUOTE]
    If you run an application in PASE for i that performs a large number of fork() operations, it will not run as fast as it runs on AIX®. This is because each PASE for i fork() operation starts a new IBM® i job, which can have a significant impact on performance.
    [UNQUOTE]
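
    To see where a bash script forks, one option is to compare `$$` (the PID of the main shell) with `$BASHPID` (the PID of the process actually evaluating the expansion). This is a minimal sketch, not taken from the script under discussion; the sample text and variable names are assumptions. Per the IBM statement quoted above, each of these forks would correspond to a new IBM i job when run under PASE.

```bash
#!/usr/bin/env bash
# $$ is the PID of the main shell; $BASHPID is the PID of the process
# that is actually evaluating the expansion.
main_pid=$$

# Command substitution runs in a forked subshell, so BASHPID differs from $$.
subst_pid=$(echo "$BASHPID")

# The stages of a pipeline also run in their own processes.
pipe_pid=$(echo ignored | { read -r _; echo "$BASHPID"; })

# A parameter expansion stays in the main shell: no new process is created.
text="zone name Z1"
first_word=${text%% *}

echo "main=$main_pid substitution=$subst_pid pipeline=$pipe_pid word=$first_word"
```

    The PIDs printed for the substitution and the pipeline differ from the main PID, while the parameter expansion on the last assignment costs no process at all.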

    One lesson I have learned from my 31 years of experience with IBM i is that nothing is perfect. Even Superman is weakened by Kryptonite!   So an excessive number of fork() calls is the Kryptonite for PASE.

    ------------------------------
    Satid Singkorapoom
    ------------------------------



  • 12.  RE: PASE performance question

    Posted Wed June 08, 2022 08:25 AM
    What's your DNS configuration?
    Check you are using *LOCAL and a valid DNS server address or *NONE.
    Also check you have an entry on your hosts table for your hostname.


    ------------------------------
    Diego KESSELMAN BARRIONUEVO
    General Manager
    ESSELWARE Soluciones, SA de CV
    CDMX DIF
    ------------------------------



  • 13.  RE: PASE performance question

    Posted Wed June 08, 2022 11:47 AM
    Yes, they are all as they should be.

     This is purely to do with PASE, bash and subshells.

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 14.  RE: PASE performance question

    Posted Wed June 08, 2022 01:41 PM
    Are you using the original or GNU (can install with YUM) sed?

    ------------------------------
    Diego KESSELMAN BARRIONUEVO
    General Manager
    ESSELWARE Soluciones, SA de CV
    CDMX DIF
    ------------------------------



  • 15.  RE: PASE performance question

    Posted Wed June 08, 2022 01:43 PM
    ... or have you tried using BASH built-in functions for string manipulation?
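
    As an illustration of what such built-ins look like, here is a short sketch using bash parameter expansion in place of common sed, awk, tr and cut one-liners. The input line is hypothetical (the real file format is not shown in the thread), and the upper-casing expansion assumes bash 4 or later.

```bash
#!/usr/bin/env bash
# Hypothetical input line; the real file format is not shown in the thread.
line='  member pwwn 10:00:00:00:c9:aa:bb:cc  '

# Trim leading and trailing whitespace (instead of sed 's/^ *//;s/ *$//').
trimmed=${line#"${line%%[![:space:]]*}"}
trimmed=${trimmed%"${trimmed##*[![:space:]]}"}

# Take the last space-separated field (instead of awk '{print $NF}').
wwpn=${trimmed##* }

# Delete every colon (instead of sed 's/://g' or tr -d ':').
compact=${wwpn//:/}

# Upper-case the result (instead of tr a-z A-Z); needs bash 4 or later.
upper=${compact^^}

# Take the first 4 characters (instead of cut -c1-4).
prefix=${upper:0:4}

printf '%s\n' "$trimmed" "$wwpn" "$compact" "$upper" "$prefix"
```

    None of these expansions starts a subshell or an external command, so they avoid the per-call process cost discussed earlier in the thread.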

    ------------------------------
    Diego KESSELMAN BARRIONUEVO
    General Manager
    ESSELWARE Soluciones, SA de CV
    CDMX DIF
    ------------------------------



  • 16.  RE: PASE performance question

    Posted Wed June 08, 2022 03:17 PM
    Well now you're talking :-)

    I'm on a mission to:

    1. Optimise the script as much as possible to remove as many sub shell invocations as possible.

    2. Become a bash script (on IBM i) master!!!!

    I'll report back.

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 17.  RE: PASE performance question

    Posted Thu June 09, 2022 05:56 AM
    For those interested I have made significant progress.

    I have removed pretty much every grep, sed and cut command I was using via pipes with bash built ins and variable expansion.
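
    The actual changes were not posted, so the following is only a sketch of the general technique under assumed data: `read` with a custom `IFS` stands in for cut, and `[[ ... ]]` pattern and regex matching stand in for grep. The comma-separated records and the variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical records; the field layout is an assumption for illustration.
records=$'host1,10.0.0.1,up\nhost2,10.0.0.2,down\nhost3,10.0.0.3,up'

up_hosts=()
last_octet=""
# Splitting on IFS=, replaces: echo "$line" | cut -d, -f1 (and -f2, -f3).
while IFS=, read -r name addr state; do
    # Replaces: echo "$line" | grep -q ',up$'
    if [[ $state == up ]]; then
        up_hosts+=("$name")
    fi
    # Replaces a grep -E plus sed capture; the group lands in BASH_REMATCH.
    if [[ $addr =~ ^10\.0\.0\.([0-9]+)$ ]]; then
        last_octet=${BASH_REMATCH[1]}
    fi
done <<< "$records"

printf '%s\n' "${up_hosts[@]}"
```

    The loop body runs entirely inside the main shell, so the number of processes no longer grows with the number of input lines.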

    The good news is that this script now takes 1 minute 38 seconds to run . . . compared to almost 5 minutes - WOW!

    So I've pushed the updates to our Github server and then pulled the repo down to my RHEL server. This now runs in 3 seconds compared to 8 seconds.

    The main culprit, for IBM i, is a while loop which has the following redirection:


    while read -r member
    do
    -- snip --
    done < <(awk "/${zone}/{ f = 1; next } /zone name/{ f = 0 } f" "${file_zone}")

    This is executed around 250 times in the script. It locates a block of rows in a file and returns the rows in that block.

    I can't see a more efficient way of doing this so correct me if I'm wrong.

    I tried sed but that added about 20 seconds on to the runtime.
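
    One possible alternative, sketched here under assumptions, is to read the zone file once and keep the member lines per zone in a bash associative array (bash 4 or later), so that the roughly 250 lookups need neither awk nor process substitution. The file layout below is inferred from the awk pattern and may not match the real data. The sketch also matches the zone name exactly, whereas the awk version treats `${zone}` as a regular expression.

```bash
#!/usr/bin/env bash
# Hypothetical zone file; the layout is inferred from the awk pattern above.
file_zone=$(mktemp)
cat > "$file_zone" <<'EOF'
zone name ZONE_A vsan 10
  member pwwn 10:00:00:00:c9:00:00:01
  member pwwn 10:00:00:00:c9:00:00:02
zone name ZONE_B vsan 10
  member pwwn 10:00:00:00:c9:00:00:03
EOF

# One pass over the file: collect the member lines of each zone in an
# associative array, keyed by zone name.
declare -A zone_members
current=""
while IFS= read -r line; do
    if [[ $line == "zone name "* ]]; then
        read -r _ _ current _ <<< "$line"   # third word is the zone name
    elif [[ -n $current ]]; then
        zone_members[$current]+="${line}"$'\n'
    fi
done < "$file_zone"
rm -f "$file_zone"

# Later lookups are pure shell: no subshell and no external command.
zone="ZONE_A"
count=0
while IFS= read -r member; do
    [[ -n $member ]] || continue            # skip the trailing empty line
    count=$((count + 1))
done <<< "${zone_members[$zone]}"
echo "$zone has $count members"
```

    The setup commands (mktemp, cat, rm) exist only to make the sketch self-contained. In the real script the file already exists and the parsing loop would run once, before the per-zone processing starts.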


    ------------------------------
    Glenn Robinson
    ------------------------------



  • 18.  RE: PASE performance question

    Posted Thu June 09, 2022 08:39 AM
    Mr. Glenn

    Your reported improvement is undoubtedly laudable.  You are abundantly endowed with tenacity and ability worthy of admiration and applause !

    ------------------------------
    Satid Singkorapoom
    ------------------------------



  • 19.  RE: PASE performance question

    Posted Thu June 09, 2022 08:56 AM
    Satid,

    Thank you. I've been around a long, long time . . . . never too old to learn :-)

    I may well write this in python too . . . just for the comparison.

    Glenn

    ------------------------------
    Glenn Robinson
    ------------------------------



  • 20.  RE: PASE performance question

    Posted Thu June 09, 2022 10:58 AM
    Edited by ac Thu June 09, 2022 10:59 AM
    If you really want to do all this in bash plus common utilities like awk, as a pure exercise and PASE stress test, post the code plus data somewhere (e.g. GitHub). Then a PASE expert who is able to suggest something, or anyone who wants to suggest "his" way to code it (using bash plus common Unix utilities), could do so objectively. Otherwise it is very difficult to assess the problem and suggest something without the exact code.

    ------------------------------
    ace ace
    ------------------------------



  • 21.  RE: PASE performance question

    Posted Wed June 15, 2022 11:23 AM

    A final (hopefully) update on this.

    I used python3 to replace the hefty BASH script . . . . . . . this runs in just under 3 seconds on i, AIX and RHEL.

    My takeaways from this:

    1. Use BASH carefully on i, especially when using subshells repeatedly


    2. Never be afraid to switch from a scripting language to a 'proper' programming language when something is getting more complex than was originally intended. I'll hold my hand up to being guilty of this crime on this occasion.



    ------------------------------
    Glenn Robinson
    ------------------------------



  • 22.  RE: PASE performance question

    Posted Wed June 15, 2022 12:52 PM
    Downloading a working copy of a "proper" language like Python or PHP is so easy nowadays via a couple of "yum" commands that it is a pity not to use them for real production needs. All the accompanying libraries are available too, and you also have easy and fast access to the local DB2 for proper processing, SQL, and the ibmi toolkit for direct interaction with existing PGMs.

    For example we use PHP even as a pure scripting language (no web in this case) i.e. to handle SFTP or FTPS connection transfers in various way, then simply invoked by RPG programs.

    Bash tends to get ugly very very fast ;)

    ------------------------------
    ace ace
    ------------------------------



  • 23.  RE: PASE performance question

    Posted Tue June 21, 2022 05:35 AM

    I just wanted to chime in and say that this has been our experience as well.  We rewrote large portions of bash to python 3 in the http://github.com/IBM/ibmi-bob project and it was many orders of magnitude faster, more capable and easier to maintain.

    ------------------------------
    Edmund Reinhardt
    IBM i Application Development Tooling Architect
    IBM
    ------------------------------



  • 24.  RE: PASE performance question

    Posted Wed June 15, 2022 08:22 PM
    Dear Glenn

    This last post of yours concludes a very beneficial learning experience for all.

    ------------------------------
    Satid Singkorapoom
    ------------------------------



  • 25.  RE: PASE performance question

    Posted Thu June 09, 2022 04:18 AM
    Edited by ac Thu June 09, 2022 04:18 AM
    We are of course obliged to take into account that process creation on an "i" system is a much heavier operation than on a pure POSIX system (simply because a process/job in i provides more capability, visibility, instrumentation, etc.). Maybe it is also a code path that still needs some optimization, but it will never be the same as on a pure (lighter-weight) Unix, even with job prestarting...

    in any case... which binaries are you using? GNUs?
    post your "which sed grep awk bash" ....

    ------------------------------
    ace ace
    ------------------------------



  • 26.  RE: PASE performance question

    Posted Thu June 09, 2022 05:57 AM
    /QOpenSys/usr/bin/sed
    /QOpenSys/pkgs/bin/grep
    /QOpenSys/pkgs/bin/awk
    /QOpenSys/pkgs/bin/bash
    /QOpenSys/pkgs/bin/cut

    ------------------------------
    Glenn Robinson
    ------------------------------