InfoSphere Optim

  • 1.  Masking large tables (100 million rows plus) on mainframe

    Posted Tue June 12, 2018 01:59 PM
    We recently had a requirement to mask a few tables in a very large DB2 database.  We masked each table that had PII data one at a time, masking all of the rows - no subsetting or relationships were involved.  The approach was to do a DB2 unload, treat the unload file as a legacy table, and run a typical extract-convert-insert process to create a masked copy of the file.  The output was then used with DB2 utilities to reload the entire table.

    In that process, we were able to use tape datasets up to the point of inserting the masked extract file back into another flat file for input to the DB2 load.  Optim created the converted extract file on tape with no problem, but could not consume it in an insert or convert-to-load-format job.  We ended up copying the >huge< files to DASD for that step, which consumed enough space that we had to get our mainframe storage team involved - we needed over 360,000 cylinders for the largest file!  Imagine this in your JCL:
    SPACE=(CYL,(8000,5000),RLSE)
    and Optim requires that the data set have 255 extents or fewer.
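
    For reference, the tape-to-DASD copy itself was an ordinary sequential copy; a sketch of that kind of step is below (data set names, unit, and space values are placeholders, and the utility you use may differ):

    //COPYDASD EXEC PGM=IEBGENER
    //* Copy the tape-resident converted extract file to DASD so the
    //* convert-to-load-format step can read it.  Data set names, unit,
    //* and space values are placeholders only.
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD DUMMY
    //SYSUT1   DD DSN=HLQ.OPTIM.CNVEXT.TAPE,DISP=OLD
    //SYSUT2   DD DSN=HLQ.OPTIM.CNVEXT.DASD,
    //            DISP=(NEW,CATLG,DELETE),
    //            DCB=*.SYSUT1,
    //            UNIT=SYSDA,
    //            SPACE=(CYL,(8000,5000),RLSE)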

    Has anyone else in the group run into this limitation, and how did you work through it?  I raised a PMR and was told that tape is not supported for converted extract files.  I also tried, unsuccessfully, to import the file into the Optim directory.

    ------------------------------
    Regards,
    Dave Lewis
    ------------------------------

    #InfoSphereOptim
    #Optim


  • 2.  RE: Masking large tables (100 million rows plus) on mainframe

    Posted Wed June 13, 2018 08:01 AM
    I don't recall us encountering this limit.  We do mask IMS and VSAM as Legacy files.  I guess they are not huge enough to cause alarm among the DASD group.

    ------------------------------
    Ed Lipson
    BNYMellon
    ------------------------------



  • 3.  RE: Masking large tables (100 million rows plus) on mainframe

    Posted Wed June 13, 2018 06:04 PM
    The limit on the number of extents only applies to PS BASIC (basic format) data sets.

    Optim will take PS LARGE or PSE (extended format) data sets, even on EAVs, and there should be no problem with the size of the data sets or the number of extents.
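
    For example, allocating the converted extract file as a large format sequential data set might look like this (the DD name, data set name, unit, and space values are placeholders; EATTR=OPT only matters if you want the data set to be able to use the extended addressing space on an EAV):

    //* Sketch: DSNTYPE=LARGE lifts the 65,535-track-per-volume limit
    //* of basic format sequential data sets; EATTR=OPT allows the data
    //* set to reside in the extended addressing space of an EAV.
    //CNVEXT   DD DSN=HLQ.OPTIM.CNVEXT.DASD,
    //            DISP=(NEW,CATLG,DELETE),
    //            DSNTYPE=LARGE,
    //            EATTR=OPT,
    //            UNIT=SYSDA,
    //            SPACE=(CYL,(8000,5000),RLSE)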

    Greg C. Optim/z Dev

    ------------------------------
    Greg C.
    ------------------------------



  • 4.  RE: Masking large tables (100 million rows plus) on mainframe

    Posted Thu June 14, 2018 09:21 AM
    Dave,
        We have some large files that we mask on the mainframe, and when I set up the JCL for masking I specify DATACLAS=HOSTCOMP on the Extract/Convert and Insert files.  The control file is the only Optim processing file that cannot be compressed; all of the other files can be.  Check with your tech support area to see whether there are host compression options available for you to use when allocating very large files for processing.  Not only does compression reduce the allocation space needed for the file, it improves run time as well.
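
    For example (the DD name, data set name, and space values here are placeholders; HOSTCOMP is just the name our storage team gave the compressing data class, so yours will likely differ):

    //* Sketch: the DATACLAS (defined by your storage administrators)
    //* turns on host compression for the Optim processing file.
    //EXTRFILE DD DSN=HLQ.OPTIM.EXTRACT,
    //            DISP=(NEW,CATLG,DELETE),
    //            DATACLAS=HOSTCOMP,
    //            UNIT=SYSDA,
    //            SPACE=(CYL,(2000,1000),RLSE)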
      

    ------------------------------
    Lindsey O'Rourke
    ------------------------------



  • 5.  RE: Masking large tables (100 million rows plus) on mainframe

    Posted Thu June 14, 2018 07:33 PM
    There are other approaches worth mentioning:

    1.  If you have DB2 for z/OS High Performance Unload (HPU), then instead of using the DB2 UNLOAD utility you can mask the data with a sample HPU exit program and the Optim ODPP masking libraries while the data is being unloaded from DB2.  This means the data lands masked and ready to be reloaded directly into DB2.  I'm pretty sure HPU allows you to write the masked file directly to tape.

    The Optim HPU sample exit is available in the Optim for z/OS sample PDS.

    However, this requires that you have DB2 HPU.  It also doesn't permit Optim Lua procedures or Optim column map user exits to be used.

    2.  Optim masking can also be called from IBM File Manager.  Again, no Lua procedures or column map user exits.

    ------------------------------
    Peter Costigan
    Offering Manager, Optim Solutions
    ------------------------------



  • 6.  RE: Masking large tables (100 million rows plus) on mainframe

    Posted Fri June 15, 2018 02:54 PM
    You may also just install the ODPP UDFs to do your data masking during the DB2 unload.  The UDFs are handy for this because they are just part of the SQL SELECT.  We use this for several DB2 table unloads from production to non-production environments.  This way the data is already masked in the unload file, and using the UDFs means you don't need any custom exit code (unless you need something beyond what the base ODPP providers offer).

    Just another option to consider.
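
    For anyone curious, a rough sketch of what this can look like with a DSNTIAUL-style unload is below.  The UDF name (OPTIMMASK) and its parameter strings are illustrative only, and the subsystem, plan, table, column, and data set names are placeholders - check how the ODPP UDFs were registered at your site and what parameters your providers expect.

    //UNLDMASK EXEC PGM=IKJEFT01
    //* Sketch: unload with DSNTIAUL, applying the masking UDF in the
    //* SELECT list so the data lands already masked in SYSREC00 and
    //* LOAD control statements are punched to SYSPUNCH.  Add a STEPLIB
    //* or a LIB(...) option per your site standards.
    //SYSTSPRT DD SYSOUT=*
    //SYSPRINT DD SYSOUT=*
    //SYSPUNCH DD DSN=HLQ.CUST.LOADCTL,DISP=(NEW,CATLG,DELETE),
    //            UNIT=SYSDA,SPACE=(TRK,(15,15),RLSE)
    //SYSREC00 DD DSN=HLQ.CUST.MASKED.UNLOAD,DISP=(NEW,CATLG,DELETE),
    //            UNIT=SYSDA,DSNTYPE=LARGE,
    //            SPACE=(CYL,(5000,2500),RLSE)
    //SYSTSIN  DD *
      DSN SYSTEM(DSN1)
      RUN PROGRAM(DSNTIAUL) PLAN(DSNTIAUL) PARMS('SQL')
      END
    //SYSIN    DD *
      SELECT CUST_ID,
             CUST_NAME,
             OPTIMMASK(SSN, 'provider parameters') AS SSN,
             OPTIMMASK(CC_NUM, 'provider parameters') AS CC_NUM
        FROM PROD.CUSTOMER;
    /*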

    -Keith

    ------------------------------
    Keith Tidball
    Progressive Insurance
    ------------------------------