Db2 for z/OS & Db2ZAI

  • 1.  Exact Algorithm behind CSRCESRV

    Posted 3 days ago

    Hi there,

    I would like to expand, in another programming language (in my case Java), a string that was compressed using the CSRCESRV macro. I understand that CSRCESRV applies run-length encoding to save space. Unfortunately, the exact procedure is not documented anywhere, and so far I haven't been successful in reversing the compression myself. I am especially confused by the presence of multiple control bytes in the compressed string (ASCII bytes 0x00 to 0x1F, as well as 0x7F), which I am fairly confident (though not 100% sure) are not part of the original uncompressed string.

    Could you please state the exact algorithm that CSRCESRV uses to compress a String?

    Reference:
     - https://www.ibm.com/docs/en/SSLTBW_3.1.0/com.ibm.zos.v3r1.ieaa700/macro.htm

     - https://www.ibm.com/docs/en/zos/3.1.0?topic=services-provided-by-csrcesrv

    Thank you,

    Til



    ------------------------------
    Til Mohr
    ------------------------------


  • 2.  RE: Exact Algorithm behind CSRCESRV

    IBM Champion
    Posted 3 days ago

    This is actually a simple substitution compression. Suppose you have the hex data:

    80 A0C1 A0C2 A0C3 A0E7 A0E8 A0E9 40

    If the first byte is x'80', a compression block follows.

    If the next byte does not have the high bit set, that is the end of the compression block and normal, uncompressed data follows.

    If the high bit is set, ANDing that byte with x'7F' gives the number of repeated characters, in this case x'A0' AND x'7F' = x'20' (32), and the byte after it is the character to be repeated, here x'C1', 32 times. (Holding the count in seven bits naturally limits one run to 127 bytes.)

    Step over those two bytes and repeat...

    Hope that helps!

    Roy Boxwell

    SOFTWARE ENGINEERING GmbH and SEGUS Inc.
    -Product Development-

  • 3.  RE: Exact Algorithm behind CSRCESRV

    Posted 3 days ago

    The compressed output format comprises a sequence of blocks, as Roy describes. The first byte of the output must be x'80', which indicates that RLE-compressed blocks follow. Each block then starts with a control byte. Its high-order bit determines whether the block encodes a run of a repeated character or carries input characters copied as-is, and its low-order 7 bits form a length value in the range 0 to 127. If the high-order bit is set, the control byte is followed by a single byte holding the character to be repeated, and the length is the repeat count. If the high-order bit is not set, the length is the number of bytes that follow the control byte, copied as-is from the input.
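
    Since the question was about doing the expansion in Java, here is a minimal sketch of a decoder for the block layout described above. It is an illustration based only on this description, not IBM's implementation; the class name CsrcesrvRleSketch and the method expand are made up for the example, and it assumes the whole compressed buffer is already in memory. Note the & 0xFF mask: Java bytes are signed, so control bytes of x'80' and above would otherwise look negative.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative expander for the RLE layout described in this thread (not IBM code). */
final class CsrcesrvRleSketch {

    /** Expands a buffer laid out as: x'80' header byte, then control-byte blocks. */
    static byte[] expand(byte[] in) {
        if (in.length == 0 || (in[0] & 0xFF) != 0x80) {
            throw new IllegalArgumentException("expected a leading x'80' header byte");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 1; // skip the header byte
        while (pos < in.length) {
            int control = in[pos++] & 0xFF; // treat the byte as unsigned
            int length = control & 0x7F;    // low-order 7 bits: 0..127
            if ((control & 0x80) != 0) {
                // High bit set: one byte follows, repeated 'length' times.
                if (pos >= in.length) {
                    throw new IllegalArgumentException("truncated run block");
                }
                byte value = in[pos++];
                for (int i = 0; i < length; i++) {
                    out.write(value);
                }
            } else {
                // High bit not set: 'length' bytes follow, copied as-is.
                if (length > in.length - pos) {
                    throw new IllegalArgumentException("truncated literal block");
                }
                out.write(in, pos, length);
                pos += length;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Header byte plus a run of 33 blanks (A1 40) and 6 literal bytes (06 ...).
        byte[] packed = {
            (byte) 0x80, (byte) 0xA1, 0x40, 0x06,
            (byte) 0xE3, (byte) 0xC8, (byte) 0xC5, (byte) 0xC5, (byte) 0xD5, (byte) 0xC4
        };
        System.out.println(expand(packed).length); // prints 39
    }
}
```

    The expanded bytes are still in whatever encoding the original data used; where that is EBCDIC (x'40' being a blank), turning them into a Java String needs a matching code page, for example new String(bytes, Charset.forName("IBM1047")). Which code page applies to your data is an assumption you would need to check.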

    Here's a worked example, with a file consisting of long sequences of blanks and non-repeating strings, to illustrate how the RLE format looks:

    Input:
            000000: 40404040 40404040 40404040 40404040 |@@@@@@@@@@@@@@@@| |                |
                    --------   same as above   --------
            000120: 40404040 40404040 40404081 82838485 |@@@@@@@@@@@.....| |           abcde|
            000130: 86878889 91929394 95969798 99A2A3A4 |................| |fghijklmnopqrstu|
            000140: A5A6A7A8 A9F1F2F3 F4F5F6F7 F8F9F0C1 |................| |vwxyz1234567890A|
            000150: C2C3C4C5 C6C7C8C9 D1D2D3D4 C1D5D6D7 |................| |BCDEFGHIJKLMANOP|
            000160: D8E2E3E4 E5E6E7E8 E94E617E 81828384 |.........Na~....| |QSTUVWXYZ+/=abcd|
            000170: 85868788 89919293 94959697 9899A2A3 |................| |efghijklmnopqrst|
            000180: A4A5A6A7 A8A9F1F2 F3F4F5F6 F7F8F9F0 |................| |uvwxyz1234567890|
            000190: C1C2C3C4 C5C6C7C8 C9D1D2D3 D4C1D5D6 |................| |ABCDEFGHIJKLMANO|
            0001A0: D7D8E2E3 E4E5E6E7 E8E94E61 7E818283 |..........Na~...| |PQSTUVWXYZ+/=abc|
            0001B0: 84858687 88899192 93949596 979899A2 |................| |defghijklmnopqrs|
            0001C0: A3A4A5A6 A7A8A9F1 F2F3F4F5 F6F7F8F9 |................| |tuvwxyz123456789|
            0001D0: F0C1C2C3 C4C5C6C7 C8C9D1D2 D3D4C1D5 |................| |0ABCDEFGHIJKLMAN|
            0001E0: D6D7D8E2 E3E4E5E6 E7E8E94E 617E8182 |...........Na~..| |OPQSTUVWXYZ+/=ab|
            0001F0: 83848586 87888991 92939495 96979899 |................| |cdefghijklmnopqr|
            000200: A2A3A4A5 A6A7A8A9 F1F2F3F4 F5F6F7F8 |................| |stuvwxyz12345678|
            000210: F9F0C1C2 C3C4C5C6 C7C8C9D1 D2D3D4C1 |................| |90ABCDEFGHIJKLMA|
            000220: D5D6D7D8 E2E3E4E5 E6E7E8E9 4E617E81 |............Na~.| |NOPQSTUVWXYZ+/=a|
            000230: 82838485 86878889 91929394 95969798 |................| |bcdefghijklmnopq|
            000240: 99A2A3A4 A5A6A7A8 A9F1F2F3 F4F5F6F7 |................| |rstuvwxyz1234567|
            000250: F8F9F0C1 C2C3C4C5 C6C7C8C9 D1D2D3D4 |................| |890ABCDEFGHIJKLM|
            000260: C1D5D6D7 D8E2E3E4 E5E6E7E8 E94E617E |.............Na~| |ANOPQSTUVWXYZ+/=|
            000270: 40404040 40404040 40404040 40404040 |@@@@@@@@@@@@@@@@| |                |
                    --------   same as above   --------
            000290: 40E3C8C5 C5D5C4                     |@......         | | THEEND         |
    Output:
            000000: 80FF40FF 40AD407F 81828384 85868788 |..@.@.@.........| |.. . . "abcdefgh|
            000010: 89919293 94959697 9899A2A3 A4A5A6A7 |................| |ijklmnopqrstuvwx|
            000020: A8A9F1F2 F3F4F5F6 F7F8F9F0 C1C2C3C4 |................| |yz1234567890ABCD|
            000030: C5C6C7C8 C9D1D2D3 D4C1D5D6 D7D8E2E3 |................| |EFGHIJKLMANOPQST|
            000040: E4E5E6E7 E8E94E61 7E818283 84858687 |......Na~.......| |UVWXYZ+/=abcdefg|
            000050: 88899192 93949596 979899A2 A3A4A5A6 |................| |hijklmnopqrstuvw|
            000060: A7A8A9F1 F2F3F4F5 F6F7F8F9 F0C1C2C3 |................| |xyz1234567890ABC|
            000070: C4C5C6C7 C8C9D1D2 D3D4C1D5 D6D7D8E2 |................| |DEFGHIJKLMANOPQS|
            000080: E3E4E5E6 E7E8E97F 4E617E81 82838485 |........Na~.....| |TUVWXYZ"+/=abcde|
            000090: 86878889 91929394 95969798 99A2A3A4 |................| |fghijklmnopqrstu|
            0000A0: A5A6A7A8 A9F1F2F3 F4F5F6F7 F8F9F0C1 |................| |vwxyz1234567890A|
            0000B0: C2C3C4C5 C6C7C8C9 D1D2D3D4 C1D5D6D7 |................| |BCDEFGHIJKLMANOP|
            0000C0: D8E2E3E4 E5E6E7E8 E94E617E 81828384 |.........Na~....| |QSTUVWXYZ+/=abcd|
            0000D0: 85868788 89919293 94959697 9899A2A3 |................| |efghijklmnopqrst|
            0000E0: A4A5A6A7 A8A9F1F2 F3F4F5F6 F7F8F9F0 |................| |uvwxyz1234567890|
            0000F0: C1C2C3C4 C5C6C7C8 C9D1D2D3 D4C1D5D6 |................| |ABCDEFGHIJKLMANO|
            000100: D7D8E2E3 E4E5E647 E7E8E94E 617E8182 |.......G...Na~..| |PQSTUVW.XYZ+/=ab|
            000110: 83848586 87888991 92939495 96979899 |................| |cdefghijklmnopqr|
            000120: A2A3A4A5 A6A7A8A9 F1F2F3F4 F5F6F7F8 |................| |stuvwxyz12345678|
            000130: F9F0C1C2 C3C4C5C6 C7C8C9D1 D2D3D4C1 |................| |90ABCDEFGHIJKLMA|
            000140: D5D6D7D8 E2E3E4E5 E6E7E8E9 4E617EA1 |............Na~.| |NOPQSTUVWXYZ+/=~|
            000150: 4006E3C8 C5C5D5C4                   |@.......        | | .THEEND        |
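
    As a cross-check on the example, the control bytes in the output above are x'FF', x'FF', x'AD', x'7F', x'7F', x'47', x'A1' and x'06'. The following sketch (hypothetical class and method names, not part of CSRCESRV) adds up the lengths they encode. The totals come to 663 (x'297') expanded bytes and 344 (x'158') compressed bytes, which match the sizes of the input and output dumps.

```java
/** Illustrative tally of the block lengths in the worked example (names are hypothetical). */
final class RleExampleTally {

    /**
     * Returns {expandedLength, compressedLength} for the given control bytes,
     * counting one extra compressed byte for the leading x'80' header.
     */
    static int[] tally(int[] controlBytes) {
        int expanded = 0;
        int compressed = 1; // the x'80' header byte
        for (int control : controlBytes) {
            int length = control & 0x7F;
            expanded += length;
            if ((control & 0x80) != 0) {
                compressed += 2;          // control byte + the repeated character
            } else {
                compressed += 1 + length; // control byte + the literal bytes
            }
        }
        return new int[] {expanded, compressed};
    }

    public static void main(String[] args) {
        int[] controls = {0xFF, 0xFF, 0xAD, 0x7F, 0x7F, 0x47, 0xA1, 0x06};
        int[] t = tally(controls);
        // prints: expanded=663 (x'297'), compressed=344 (x'158')
        System.out.printf("expanded=%d (x'%X'), compressed=%d (x'%X')%n",
                t[0], t[0], t[1], t[1]);
    }
}
```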



    ------------------------------
    Andrew Mattingly
    ------------------------------