DataPower


Maximum Size for datapower file processing

  • 1.  Maximum Size for datapower file processing

    Posted Sun April 19, 2020 11:51 PM
    Hi All, 
What is the maximum size of a non-XML file that can be processed on IDG 7.7.1.3?
    We have files up to 6 GB and would like to transform them using a GatewayScript action. Currently I am able to process only ~10 MB. The maximum message size in the XML manager is set to 0.

    ------------------------------
    Rashmi Chandra
    ------------------------------


  • 2.  RE: Maximum Size for datapower file processing

    Posted Mon April 20, 2020 05:45 AM
    There is no size limitation.


For XML processing, a streamable stylesheet can process arbitrarily large XML documents.

Years back, with the 5.0.0.0 firmware (the first 64-bit firmware), I processed a 25 GB XML file non-streaming. A rule of thumb is that the internal memory representation of an XML file is roughly 3 times the file size. I did that processing on an IDG with 96 GB of memory, so expect to be able to process a 50 GB XML file on an IDG-X2 with 192 GB of RAM. Of course only one at a time, since all transactions share the DataPower RAM.

For non-XML processing, you can process files up to 1 GB with .readAsBuffer(). For files bigger than 1 GB, please use .readAsBuffers().
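    A minimal sketch of the two read styles (illustrative, not from the original post; the error handling and the echo to session.output are placeholders, and you would use one call or the other, not both in the same script):

    // up to 1 GB: the whole payload arrives as one contiguous Buffer
    session.input.readAsBuffer(function (error, buf) {
        if (error) { session.reject('read failed: ' + error); return; }
        session.output.write(buf);  // buf is a single Buffer
    });

    // above 1 GB: readAsBuffers() hands back a Buffers list of chunks
    session.input.readAsBuffers(function (error, bufs) {
        if (error) { session.reject('read failed: ' + error); return; }
        // work chunk by chunk; avoid bufs.toBuffer() or bufs.toString(),
        // which re-materialize the whole payload and run into the
        // 1 GB buffer / 256 MB string limits
        session.output.write(bufs);
    });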

Btw, the 7.7 firmware has been out of support since 2018.4.1 shipped last year.



    ------------------------------
    Hermann Stamm-Wilbrandt
    Compiler Level 3 support & Fixpack team lead
IBM DataPower Gateways
    ------------------------------



  • 3.  RE: Maximum Size for datapower file processing

    Posted Mon April 20, 2020 11:36 AM
Hi Hermann,
    I am using .readAsBuffers() to read the file. The appliance is able to process a ~70 MB file without transformation. The problem starts when I try to transform the data using a for loop: the appliance becomes unreachable for a few minutes and the action never completes. The same script works fine for smaller files. I have attached the GatewayScript. Please let us know if there is a more suitable approach.


    ------------------------------
    Rashmi Chandra
    ------------------------------

    Attachment(s)

test.js (720 B)


  • 4.  RE: Maximum Size for datapower file processing

    Posted Mon April 20, 2020 03:42 PM

I told you about the 1 GB limit for buffers; there is an additional 256 MB limit for strings (you convert the buffers to a single string: "var str = bufs.toString();").

    > var NoOfRecord = str.split(/\r\n|\r|\n/).length;
    >

Do expect long processing times for regular-expression processing on >70 MB strings.

For the next step you should use .readAsBuffer(), which will allow dealing with files up to 1 GB.

Then do the processing with for loops over the single buffer you read, but without converting to a string and without regexps.
    That is likely to work much more quickly.
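    A rough sketch of that approach (hypothetical, not the attached script): counting records by scanning the raw bytes for line terminators instead of splitting a 70 MB string with a regexp:

    session.input.readAsBuffer(function (error, buf) {
        if (error) { session.reject('read failed: ' + error); return; }
        var records = 0;
        for (var i = 0; i < buf.length; i++) {
            if (buf[i] === 0x0A) {            // '\n' (also terminates '\r\n')
                records++;
            } else if (buf[i] === 0x0D &&     // bare '\r' not followed by '\n'
                       (i + 1 === buf.length || buf[i + 1] !== 0x0A)) {
                records++;
            }
        }
        // note: a final record without a trailing newline is not counted here
        session.output.write('records: ' + records);
    });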



    ------------------------------
    Hermann Stamm-Wilbrandt
    Compiler Level 3 support & Fixpack team lead
IBM DataPower Gateways
    ------------------------------



  • 5.  RE: Maximum Size for datapower file processing

    Posted Tue April 21, 2020 08:40 AM
    Hi Rashmi,
In my experience, which is a bit dated, GatewayScript is rather sensitive to large payloads if you want to work with them heavily.
    Back in the old developerWorks DataPower forum, about 4 years ago, we discussed processing of 600 MB to 2 GB payloads with a user.
    That thread got migrated to the CSP forum; you can find it at https://www.ibm.com/mysupport/s/question/0D50z00006AB67W/gatewayscript-and-large-files - just the formatting is maybe a bit ugly and hard to follow.
    As expected, a first challenge is the 256 MB string limit.
    The next limit is the 1 GB for a single Buffer.
    I'm not sure anyone has managed to process anything above 2 GB successfully and quickly. In my testing back then, things slowed down to a matter of minutes beyond 1.5 GB.
    I am not sure traditional DataPower binary processing is the better model for such payloads.

Anyway, you are talking about just 70 MB, so make sure your loops are not too wild and that you stay efficient in what you do, leveraging built-in functions and proper regexes or whatever is needed.

You see I didn't look specifically into your GWS here; I just wanted to give you a broader picture of what to expect.

    ------------------------------
    August Ziesemer
    ------------------------------



  • 6.  RE: Maximum Size for datapower file processing

    Posted Wed April 22, 2020 09:34 AM

    Hello,

For this kind of message, I would go for a file-based approach with an integration solution such as IBM Integration Bus, which can handle records instead of the whole message at once.

    An ETL tool might be even better for specific scenarios.



    ------------------------------
    Pierre Richelle
    WebSphere Client Technical Professional
    IBM
Brussels
    ------------------------------



  • 7.  RE: Maximum Size for datapower file processing

    Posted Thu April 23, 2020 03:42 AM
Thank you, August, for the above information. It helped me decide on a better solution.

    ------------------------------
    Rashmi Chandra
    ------------------------------



  • 8.  RE: Maximum Size for datapower file processing

    Posted Thu April 23, 2020 07:27 AM
    Edited by Rashmi Chandra Tue April 28, 2020 02:21 AM
Thank you, Hermann. I went ahead and analysed my solution again in light of the information shared above.
    We have ~7 GB of data to process, and the appliance has only 8 GB of memory. Please suggest what configuration (firmware, RAM size) of DataPower appliance, or any other option, would work for this processing. Is going with IIB or a B2B DataPower appliance the more suitable choice? Please recommend.
    @Hermann Stamm-Wilbrandt
    ------------------------------
    Rashmi Chandra
    ------------------------------



  • 9.  RE: Maximum Size for datapower file processing

    Posted Mon April 20, 2020 07:13 AM
     
Hi, there is no set limit; it depends on how much memory the system has, what the concurrency is, and how long the message takes to process.

    In general, if you are working with large files, it is good practice to use an SLM policy to set a limit on the maximum concurrency to prevent high memory usage.






  • 10.  RE: Maximum Size for datapower file processing

    Posted Thu April 23, 2020 07:54 AM

    Hello Rashmi,
Can you provide some information on the file structure?

    Is it record-based?
    How do you process the file? Is it record by record, or are there inter-dependencies (record n requires record x)?



    ------------------------------
    Pierre Richelle
    IBM Hybrid Cloud Integration Specialists
    IBM
Brussels
    0474681892
    ------------------------------



  • 11.  RE: Maximum Size for datapower file processing

    Posted Mon April 27, 2020 03:03 AM
    Hi, 
I am processing the file line by line. The problem starts when I try to transform the data using a for loop: the appliance becomes unreachable for a few minutes. The same script works fine for smaller files.


    ------------------------------
    Rashmi Chandra
    ------------------------------

    Attachment(s)

Script.js (720 B)


  • 12.  RE: Maximum Size for datapower file processing

    Posted Mon April 27, 2020 05:51 AM

Did you say the other day that you need to process a request of 7 GB? Natively, that will not fit into 8 GB of RAM unless it just streams through successfully. Maybe start with 32 GB of RAM or higher? (Just thinking about one request at a time and base processing. You need to load test that part yourself; anything else is just guessing.)

Or is that a total amount of data that you can cut down arbitrarily?


So if the large request has too many rows and looping through it line by line takes too much time, an approach could be to front your current service with a pre-processing service that first cuts chunks of x rows (1k, 10k, whatever produces a good balance for you) and then sends the smaller pieces to the current service that does the actual business logic, roughly as sketched below. How do you feel about this idea?
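    A hedged sketch of such a pre-processor (the backend URL 'http://backend:8080/process' and the chunk size are made-up placeholders, and a real implementation would throttle or serialize the posts instead of firing them all at once):

    var urlopen = require('urlopen');
    var LINES_PER_CHUNK = 10000;  // tune: 1k, 10k, whatever balances well

    session.input.readAsBuffer(function (error, buf) {
        if (error) { session.reject('read failed: ' + error); return; }
        var start = 0, lines = 0;
        for (var i = 0; i < buf.length; i++) {
            if (buf[i] === 0x0A && ++lines === LINES_PER_CHUNK) {
                send(buf.slice(start, i + 1));  // one chunk of whole records
                start = i + 1;
                lines = 0;
            }
        }
        if (start < buf.length) { send(buf.slice(start)); }  // trailing chunk
        session.output.write('dispatched');
    });

    function send(chunk) {
        // POST one chunk to the service that does the actual business logic
        urlopen.open({ target: 'http://backend:8080/process',
                       method: 'POST', data: chunk },
                     function (error, response) {
                         if (error) { console.error('chunk POST failed: ' + error); }
                     });
    }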

As said before, other transformation tools like IIB might offer more native help for this use case.



    ------------------------------
    August Ziesemer
    ------------------------------