curious about extendable chunks

  • 1.  curious about extendable chunks

    Posted Wed January 27, 2021 03:59 PM
    We're in the process of migrating to 14.10 from a version (never mind how old) that did not have storage pools.  I attended sessions at IIUG about storage pools and the ability to automatically add space as a dbspace fills up.  As I understood it at the time, it would do this by allocating a new chunk from the available space allocated to the storage pool.

    Now that I'm actually working with 14.10 in a test system, I see that it also has the ability to extend an existing chunk, rather than just adding a new one.  I've looked at the Administration Reference, and I've not seen any information on how this actually happens.  From memory, it seems that a chunk was supposed to be a contiguous block of pages, and I'm trying to understand how that works with extendable chunks.

    In the case of cooked files, it makes sense, so long as you use a single OS file for each chunk.  The instance could suddenly extend the chunk, the OS would simply extend the file, and the instance could still reference a page as an offset of some number of bytes from the start of the file, corresponding to (page_number * page size) from the beginning of the chunk.  Of course, the cooked file could be broken up into multiple pieces on the disk, with pieces of other files interwoven, preventing it from being contiguous on the disk, but from the perspective of the file, they'd appear to be contiguous.
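    The offset arithmetic described above can be sketched as follows. This is only an illustration: page_offset is a hypothetical helper, not an Informix utility, and the 2048-byte page size is an assumption.

```shell
# Byte offset of a page within a chunk stored as a single OS file:
#   offset = page_number * page_size
# page_offset is a hypothetical helper, not part of Informix.
page_offset() {
    page_number=$1
    page_size=$2    # bytes; 2048 is assumed below as the page size
    echo $(( page_number * page_size ))
}

page_offset 0 2048         # prints 0
page_offset 500000 2048    # prints 1024000000
```

    Extending the file only raises the highest valid page number; the offset of every existing page stays the same.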

    Where it gets fuzzier is in the case of a raw device, or even a cooked file that contains multiple chunks, each created as some offset from the beginning of the device/file.  In that case, it seems like you would no longer be able to count on a chunk being a single contiguous block of pages.  So, in the past, I might have something like this on a single raw device, where the *_chk2 and *_chk3 are added as the database grows:
                   
    device: /ifmx_links/ifmx_disk1 -> /dev/vg01/rdsk1

    chunk          offset     size
    tbldbs_chk1         0   500000
    idxdbs_chk1    500000   100000
    tbldbs_chk2    600000   500000
    tbldbs_chk3   1100000   500000
    idxdbs_chk2   1600000   100000

    So if you have extendable chunks, assuming that the device above was part of the storage pool, would tbldbs_chk1 simply allocate more pages, similar to what I manually did when I added tbldbs_chk2 in the past, even though those pages would not be contiguous to the existing pages in the chunk?  Is there a performance impact from that situation?  Or are chunks only extendable so long as nothing has been allocated (via the '-o' parameter of onspaces) immediately past the original end point of the chunk, so that those pages do end up being contiguous?
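    To make the layout question concrete, here is a small sketch that computes where each chunk in the listing above ends and which chunk, if any, starts at exactly that offset. chunk_neighbors is a hypothetical helper that only does arithmetic on a "name offset size" listing; it does not query Informix.

```shell
# chunk_neighbors: read "name offset size" lines on stdin, sort them by
# offset, and report where each chunk ends and what starts right there.
chunk_neighbors() {
    sort -k2,2n | awk '
        NR > 1 {
            adj = (prev_end == $2) ? $1 : "(not adjacent)"
            print prev_name, "ends at", prev_end, "next:", adj
        }
        { prev_name = $1; prev_end = $2 + $3 }
        END { if (NR > 0) print prev_name, "ends at", prev_end, "next: (nothing listed)" }
    '
}

chunk_neighbors <<'EOF'
tbldbs_chk1 0 500000
idxdbs_chk1 500000 100000
tbldbs_chk2 600000 500000
tbldbs_chk3 1100000 500000
idxdbs_chk2 1600000 100000
EOF
```

    In this layout every chunk except the last is followed immediately by another chunk, so only idxdbs_chk2 would have room to grow in place.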

    As I said, I tried looking in the manual, but I didn't see where it addressed either the question of "how does it work" or "is there a performance penalty".

    Thanks in advance.


    ------------------------------
    Mark Collins
    ------------------------------


  • 2.  RE: curious about extendable chunks

    Posted Wed January 27, 2021 05:52 PM
    Mark:

    Your understanding is correct. You can only use extendable chunks if you are using COOKED file chunks, not RAW or COOKED devices.

    If you are using RAW devices you can put the devices, with any existing offsets and the device size, into a storage pool entry. If new chunks are needed to expand a dbspace, the storage pool will allocate a chunk from one of the devices in the storage pool.

    Dbspace expansion and chunk extension are performed by a task scheduler task that executes hourly. You can adjust the timing of that task if needed. You can also use an API function to add a chunk or extend a chunk. 

    The default extend/expand size is 10,000 KB (about 10 MB), so for larger and more active dbspaces you will need to adjust that value for those dbspaces using one of the API functions as well.
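    As a rough sketch of that adjustment: the admin API command involved is, as far as I know, "modify space sp_sizes", which also appears later in this thread. The argument order used here (space name, create size, extend size) and the sample values are assumptions to check against the Administrator's Reference for your version. The helper sp_sizes_sql is hypothetical and only prints the statement.

```shell
# sp_sizes_sql: print (do not run) an admin API call that sets the create
# and extend sizes for one space. The argument order is an assumption.
sp_sizes_sql() {
    space=$1
    create_size=$2
    extend_size=$3
    printf 'EXECUTE FUNCTION task("modify space sp_sizes", "%s", "%s", "%s");\n' \
        "$space" "$create_size" "$extend_size"
}

# Hypothetical values: dbspace datdbs01, create size 10, extend size 131072.
sp_sizes_sql datdbs01 10 131072

# To apply it on a live instance, pipe the statement to dbaccess:
#   sp_sizes_sql datdbs01 10 131072 | dbaccess sysadmin -
```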

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 3.  RE: curious about extendable chunks

    Posted Wed January 27, 2021 06:05 PM
    Edited by Mark Collins Wed January 27, 2021 06:46 PM

    Thanks Art.  That clears things up a bit.  I'm only seeing extendable chunks now because I'm playing with a test database using cooked files.  I've always used raw devices on my "real" instances, but we're still trying to figure out raw devices under RH 8.

    You mentioned "raw or cooked devices".  Just to confirm that I'm thinking the same thing you are, raw devices are character special files (devices), and cooked are block special files, with no file systems stored on them, right?  [edit to add - someone I worked with at a previous shop used to call them character raw and block raw, so I may be misinterpreting your "cooked devices" comment.]  And only character raw devices allow you to use KAIO, unless something has changed.  Can cooked devices use DIRECT_IO?  Or do they always go through the OS cache?  I've only ever used cooked files in file systems (for quick tests) and character special files for raw.



    ------------------------------
    Mark Collins
    ------------------------------



  • 4.  RE: curious about extendable chunks

    Posted Wed January 27, 2021 06:58 PM
    All correct Mark. COOKED devices or block devices cannot use either KAIO or DIRECT_IO (which uses KAIO), so they are always going through the OS cache using O_SYNC which is safe but slow.


    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 5.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:05 PM
    Just been corresponding with Art directly, and sent him listings from an actual IDS 12.10.FC12 instance on CentOS 7.8 proving that DIRECT_IO works just fine on Linux block devices these days. Try using them and check that the output of "onstat -d" shows "D" in the last character of the chunk flags column denoting that DIRECT_IO is enabled:

    https://www.ibm.com/support/knowledgecenter/en/SSGU8G_14.1.0/com.ibm.adref.doc/ids_adr_0504.htm

    Raw devices are deprecated!

    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 6.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:14 PM

    Doug,

    Where did that link indicate that raw devices are deprecated?  I did a search for 'raw' and found only this:

    offset
    
    The offset into the file or raw device in base page size


    I also searched for 'deprecated', but as soon as I got 'dep' entered, it showed '0/0 results'.  




    ------------------------------
    Mark Collins
    ------------------------------



  • 7.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:26 PM

    https://en.wikipedia.org/wiki/Raw_device

    Raw device

    Partial quote:

    In Linux kernel, raw devices were deprecated and scheduled for removal at one point, because the O_DIRECT flag can be used instead.[2] However, later the decision was made to keep raw devices support since some software cannot use the O_DIRECT flag.




    ------------------------------
    Vladimir Kolobrodov
    ------------------------------



  • 8.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:35 PM
    Mark:

    RAW ended up not being deprecated in the end. Linus hates raw devices, and Red Hat has removed the supporting utilities from their distributions from one release to another and has always had to add them back in by user request. It is not clear what the future of RAW will be ultimately, but with O_DIRECT support in both the Linux kernel and Informix the overhead of using even cooked filesystem files is only about 5% versus raw. I have not tested block device performance under direct io because the last time I did those performance tests, either Informix or the Linux kernel did not support it (not sure which it was though). I was surprised when Doug let me know that direct io and cooked/block devices do now work together in Informix.

    Art

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 9.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 02:17 PM
    Art, Vladimir,

    Thank you for clearing that up.

    I had heard that Linus was not a fan of raw devices, but never understood why he was so passionately opposed to technology that provided such a benefit to those products capable of using them.  I suppose I should Google that, just out of curiosity.

    ------------------------------
    Mark Collins
    ------------------------------



  • 10.  RE: curious about extendable chunks

    Posted Sat January 30, 2021 04:49 AM

    Question:

    For those using filesystem files - how often do you umount and run fsck to check the filesystem is consistent?

    ------------------------------
    David Williams
    ------------------------------



  • 11.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:14 PM

    Correction:

    "In Linux kernel, raw devices were deprecated and scheduled for removal at one point, because the O_DIRECT flag can be used instead. However, later the decision was made to keep raw devices support since some software cannot use the O_DIRECT flag."

    https://en.wikipedia.org/wiki/Raw_device



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 12.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 03:23 AM
    Hello Mark,

    You don't need a storage pool to extend chunks. You only need to mark the chunk as extendable:

    $ echo 'EXECUTE FUNCTION task("modify chunk extendable on", 4);' | dbaccess sysadmin - # mark chunk 4 as extendable
    $ echo 'EXECUTE FUNCTION task("modify chunk extend", 4, "8GB");' | dbaccess sysadmin - # extend chunk 4 by 8 GB

    Those are the steps to extend chunks manually. If you want Informix to expand dbspaces by itself, you need a few more
    steps:

    $ onmode -wf SP_AUTOEXPAND=1
    $ onmode -wf SP_THRESHOLD=262144
    $ onmode -wf SP_WAITTIME=300

    Now ALL dbspaces whose chunks have the extendable flag set are expanded when they are getting full.
    In online.log you can see that it works:

    05/11/20 18:21:25 Chunk 7 in space 'datdbs01' has been extended by 131072 kb.
    05/11/20 18:38:20 Chunk 7 in space 'datdbs01' has been extended by 8000000 kb.
    05/11/20 19:21:23 Chunk 7 in space 'datdbs01' has been extended by 131920 kb.
    05/11/20 19:25:41 Chunk 7 in space 'datdbs01' has been extended by 8000000 kb.
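    If you want to see how much a chunk has grown in total, messages in the format shown above can be summed per chunk. This is a sketch that assumes the message layout is exactly as in these four lines; extended_kb is a hypothetical helper.

```shell
# extended_kb: sum the "has been extended by N kb." messages per chunk.
# Assumed fields: date time Chunk N in space NAME has been extended by KB kb.
extended_kb() {
    awk '/has been extended by/ {
        total[$4 " " $7] += $(NF - 1)
    }
    END { for (k in total) print k, total[k] }'
}

# The four messages quoted above; on a real system, feed online.log instead.
extended_kb <<'EOF'
05/11/20 18:21:25 Chunk 7 in space 'datdbs01' has been extended by 131072 kb.
05/11/20 18:38:20 Chunk 7 in space 'datdbs01' has been extended by 8000000 kb.
05/11/20 19:21:23 Chunk 7 in space 'datdbs01' has been extended by 131920 kb.
05/11/20 19:25:41 Chunk 7 in space 'datdbs01' has been extended by 8000000 kb.
EOF
```

    For these four lines the output is chunk number, space name and total: 7 'datdbs01' 16262992.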

    This does not work for smart blob spaces or blobspaces, and it does not work while a backup is running:

    05/13/20 09:26:45 Extend chunk 8 failed. System archive in progress. Try again later.
    05/13/20 09:26:45 Extend chunk 4 failed. System archive in progress. Try again later.
    05/13/20 09:26:45 Extend chunk 6 failed. System archive in progress. Try again later.
    05/13/20 09:26:45 Extend chunk 9 failed. System archive in progress. Try again later.

    Since you don't want to expand some dbspaces, such as rootdbs, tmpdbs, llogdbs and plogdbs, you need to set
    these dbspaces to be not expandable:

    $ echo 'EXECUTE FUNCTION task("modify space sp_sizes", "rootdbs", 0);' | dbaccess sysadmin -
    $ echo 'EXECUTE FUNCTION task("modify space sp_sizes", "plogdbs", 0);' | dbaccess sysadmin -
    $ echo 'EXECUTE FUNCTION task("modify space sp_sizes", "llogdbs", 0);' | dbaccess sysadmin -
    $ echo 'EXECUTE FUNCTION task("modify space sp_sizes", "tmpdbs", 0);' | dbaccess sysadmin -

    The task that does this is the sysadmin "mon_low_storage" task.
    You can change how often it runs like this (for example, every 10 minutes):

    $ dbaccess -e sysadmin - <<EOF
    UPDATE ph_task SET tk_start_time = "00:00:00",
    tk_stop_time = "00:00:10",
    tk_frequency = INTERVAL (10) MINUTE TO MINUTE,
    tk_next_execution = ROUND(CURRENT, 'HH')::DATETIME YEAR TO SECOND
    WHERE tk_name = "mon_low_storage";
    UPDATE ph_task SET tk_next_execution = ROUND(CURRENT, 'HH')::DATETIME YEAR TO SECOND
    WHERE tk_name = "mon_low_storage";
    EOF

    This also works with raw devices, each on top of a logical volume. On Linux you need some udev rules to create the raw
    device /dev/raw/rawX and the symlink to the logical volume.

    Cheers,
    Markus

    ------------------------------
    Markus Holzbauer
    ------------------------------



  • 13.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 12:50 PM
    Markus,

    Thank you for the detailed examples.

    I am curious about the udev rules that you mentioned.  We're having some issues getting our raw devices to work, even without worrying about storage pools or expanding dbspaces.

    [informix@sandbox]$ ls -l /dev/raw
    total 0
    crw-rw----. 1 informix informix 253, 11 Jan 28 12:18 dm-11
    crw-rw----. 1 root     disk     162,  0 Jan 14 16:49 rawctl
    [informix@sandbox]$ pwd;ls -l
    /informix/links
    total 0
    lrwxrwxrwx. 1 informix informix 14 Jan 28 12:21 ifmx_raw_1 -> /dev/raw/dm-11
    [informix@sandbox]$ onspaces -a idxdbs -p /informix/links/ifmx_raw_1 -o 0 -s 100000
    Verifying physical disk space, please wait ...
    Error opening file /informix/links/ifmx_raw_1.

    As you can see, the raw device /dev/raw/dm-11 has the correct ownership and permissions, and is a character raw device.  I've got a symbolic link /informix/links/ifmx_raw_1 that points to that character raw device, and yet when I run onspaces to create a dbspace, it fails.  There is no error message in the online.log file, and the return code is 1.  The sysadm has confirmed that there are no messages in syslog, either.

    Can you share the udev rules that you mentioned, so that we can compare to what we have in place?  We're running this on RHEL 8.2, not sure if that matches what you're running on.


    ------------------------------
    Mark Collins
    ------------------------------



  • 14.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:33 PM
    The raw devices should still work with Informix. If there are no other diagnostics, try something like:

    strace -f onspaces -a idxdbs -p /informix/links/ifmx_raw_1 -o 0 -s 100000

    The output will show the OS error code for the failing operation.

    The most likely cause is access that is not aligned on the device block size.
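    To cut the trace down to the interesting part, the failing open calls can be filtered out of the strace output. This is a sketch: failed_opens is a hypothetical helper, and the two sample trace lines are made up to show the format (only the error text matches what the OS reports for ENXIO).

```shell
# failed_opens: keep only open()/openat() calls that returned -1.
failed_opens() {
    grep -E 'open(at)?\(.*\) += -1 '
}

# Illustrative trace lines, not captured from a real run.
failed_opens <<'EOF'
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/informix/links/ifmx_raw_1", O_RDWR) = -1 ENXIO (No such device or address)
EOF

# On a real system, something like:
#   strace -f -e trace=file -o /tmp/onspaces.trace onspaces -a idxdbs \
#       -p /informix/links/ifmx_raw_1 -o 0 -s 100000
#   failed_opens < /tmp/onspaces.trace
```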

    ------------------------------
    Vladimir Kolobrodov
    ------------------------------



  • 15.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:41 PM
    Digging a little deeper, I asked my sysadm for more information on the major/minor numbers for the raw device.  The minor number, 11, matches the device number in 'dm-11'.  The major number is system-assigned, using values from /proc/devices.  That list is:

    [informix@gsvgsandbox02 links]$ cat /proc/devices
    Character devices:
      1 mem
      4 /dev/vc/0
      4 tty
      4 ttyS
      5 /dev/tty
      5 /dev/console
      5 /dev/ptmx
      7 vcs
     10 misc
     13 input
     21 sg
     29 fb
     99 ppdev
    128 ptm
    136 pts
    162 raw
    180 usb
    188 ttyUSB
    189 usb_device
    202 cpu/msr
    203 cpu/cpuid
    226 drm
    243 aux
    244 hidraw
    245 usbmon
    246 bsg
    247 watchdog
    248 ptp
    249 pps
    250 cec
    251 rtc
    252 dax
    253 tpm
    254 gpiochip
    
    Block devices:
      8 sd
      9 md
     11 sr
     65 sd
     66 sd
     67 sd
     68 sd
     69 sd
     70 sd
     71 sd
    128 sd
    129 sd
    130 sd
    131 sd
    132 sd
    133 sd
    134 sd
    135 sd
    253 device-mapper
    254 mdp
    259 blkext

    The major number of our character raw device (above) is 253.  Since we're dealing with a character device, 253 would be 'tpm', which he tells me is the Trusted Platform Module.  I'm guessing here, as I haven't heard back from our server team yet, but could that be because they have configured the VM to use encryption on all disks?  So the OS sees that the VM is providing it an encrypted disk, and onspaces is attempting to access that using some sort of low-level functions that are not supported by encrypted disks?  I'm probably grasping at straws, as it seems like the VM should take any attempt to access the disk and do whatever sort of magic is necessary to make it work with the virtualized disk, but ...

    From the list in /proc/devices, it looks like major 162 is raw, and 244 is hidraw, which may be some variant of raw.  
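    The lookup described above can be scripted. This sketch reads /proc/devices-style text and prints the driver registered for a major number in the chosen section; major_name is a hypothetical helper, and the sample is a trimmed excerpt of the listing above.

```shell
# major_name SECTION MAJOR: read /proc/devices-style text on stdin and print
# the driver name(s) for MAJOR in the "Character" or "Block" section.
major_name() {
    awk -v want="$1" -v major="$2" '
        /^Character devices:/ { section = "Character"; next }
        /^Block devices:/     { section = "Block"; next }
        section == want && $1 == major { print $2 }
    '
}

sample='Character devices:
162 raw
253 tpm

Block devices:
253 device-mapper
'

printf '%s' "$sample" | major_name Character 253    # prints tpm
printf '%s' "$sample" | major_name Block 253        # prints device-mapper
printf '%s' "$sample" | major_name Character 162    # prints raw

# On a live system: major_name Character 253 < /proc/devices
```

    The same major number means different things in the two sections, so it has to be looked up in the section that matches the type of the device node.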

    If anyone has successfully implemented raw devices on RHEL 8.2, please let me know the major/minor numbers that your raw devices show.

    Thank you.

    ------------------------------
    Mark Collins
    ------------------------------



  • 16.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 01:48 PM

    Before going too deep: the following should work, otherwise onspaces is bound to fail:

    dd if=/informix/links/ifmx_raw_1 of=/dev/null bs=2k count=100000 iflag=direct
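    Alongside the dd read, it can help to confirm what the chunk path actually resolves to. The sketch below uses a hypothetical helper, chunk_path_type, built on readlink -f and the shell's file tests. It only reports the node type; a node can be a character device and still fail to open, which is what the dd read catches.

```shell
# chunk_path_type PATH: follow symlinks and report what the path resolves to.
chunk_path_type() {
    target=$(readlink -f -- "$1") || return 1
    if [ -c "$target" ]; then
        echo "character device: $target"
    elif [ -b "$target" ]; then
        echo "block device: $target"
    elif [ -f "$target" ]; then
        echo "regular file: $target"
    else
        echo "other or missing: $target"
        return 1
    fi
}

chunk_path_type /dev/null    # a character device on Linux

# For the chunk in question: chunk_path_type /informix/links/ifmx_raw_1
```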



    ------------------------------
    Vladimir Kolobrodov
    ------------------------------



  • 17.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 02:42 PM

    Well that was enlightening:

    [informix@sandbox]$ ls -lrt /informix/links/ifmx_raw_1
    lrwxrwxrwx. 1 informix informix 14 Jan 28 12:21 /informix/links/ifmx_raw_1 -> /dev/raw/dm-11
    [informix@sandbox]$ ls -l /dev/raw/dm-11
    crw-rw----. 1 informix informix 253, 11 Jan 28 12:18 /dev/raw/dm-11
    [informix@sandbox]$ dd if=/informix/links/ifmx_raw_1 of=/dev/null bs=2k count=100000 iflag=direct
    dd: failed to open '/informix/links/ifmx_raw_1': No such device or address
    


    But clearly there is a file, so ....



    ------------------------------
    Mark Collins
    ------------------------------



  • 18.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 03:14 PM

    Some simple diagnostic steps:

    raw -qa

    It will either show the list of defined raw devices or

    raw: Cannot open master raw device '/dev/raw/rawctl': No such file or directory

    in which case you'll need to load the module:

    modprobe raw

    and (re)define the raw device as follows:

    raw /dev/raw/raw1 /dev/<some-block-device>

    After which, the "raw -qa" will show something, for me -

    # raw -qa
    /dev/raw/raw1: bound to major 7, minor 1

    Different Linux distros may use a different top-level path for the raw devices and the control device, but "raw -qa" will at least give an idea of where to look.
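    To see which block device each binding refers to, the "raw -qa" output can be rewritten into /dev/block/MAJOR:MINOR paths. This is a sketch that assumes the output format shown above; raw_bindings is a hypothetical helper.

```shell
# raw_bindings: turn "raw -qa" lines into "<raw device> /dev/block/MAJOR:MINOR".
raw_bindings() {
    sed -n 's|^\(/dev/raw/raw[0-9]*\): *bound to major \([0-9]*\), minor \([0-9]*\)$|\1 /dev/block/\2:\3|p'
}

echo '/dev/raw/raw1: bound to major 7, minor 1' | raw_bindings
# prints: /dev/raw/raw1 /dev/block/7:1

# On a live system (as root): raw -qa | raw_bindings
```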



    ------------------------------
    Vladimir Kolobrodov
    ------------------------------



  • 19.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 03:56 PM
    Vladimir,

    Thank you.  Once we did the 'raw /dev/raw/raw11 /dev/block/253:11' (to match the major/minor numbers assigned to the existing device), I was able to do 'dd' and subsequently 'onspaces'.  That's a huge relief.

    ------------------------------
    Mark Collins
    ------------------------------



  • 20.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 03:49 PM

    Mark:

    I'm just looking at this in detail for the first time. RAW devices in Linux are usually located in /dev/raw/ and are normally named raw<N> for the Nth raw device. Did someone actually create the device dm-11 as a raw device, with something like:

    raw  /dev/raw/raw<N>  <major> <minor>



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 21.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 03:58 PM
    Art,

    I'm not certain exactly how dm-11 was created, but based on Vladimir's post, my sysadm ran 'raw /dev/raw/raw11 /dev/block/253:11' and I now have access to the device, both from 'dd' and from 'onspaces'.

    ------------------------------
    Mark Collins
    ------------------------------



  • 22.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 05:03 PM
    I've asked my sysadm, and the dm-11 device was created with the command 'mknod dm-11 c 253 11'.  Coming from HP-UX, mknod is the way to create raw devices there, and the command was available in Linux, so that's what he used.  I don't know how that differs from the 'raw' command, as both created character files, but only the 'raw' resulted in a file that could be used by Informix.

    Now to try creating a new instance that uses that raw device for its rootdbs, along with the other dbspaces.

    ------------------------------
    Mark Collins
    ------------------------------



  • 23.  RE: curious about extendable chunks

    Posted Thu January 28, 2021 06:42 PM
    Mark:

    The problem with using mknod is that it is a kernel utility and RAW isn't supported by the kernel in Linux at all. The raw utility invokes a special raw disk driver in Linux, that's why you need to use it instead of mknod to create the raw device files.

    Linus kept raw out of the kernel itself. This was how users kept it available.

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 24.  RE: curious about extendable chunks

    Posted Fri January 29, 2021 02:11 AM

    Hello Mark,

    The udev rules file (I named it /etc/udev/rules.d/99-ifx-raw.rules) has 2 parts.
    The first part binds the raw device (character device) to the logical volume;
    the second part creates the symlink and changes the ownership so that informix can use it.
    You need an entry for each chunk/logical volume in each part of the rules file.

    Here is what worked for me with RHEL 7.x:

    #
    # Adding raw devices for IBM Informix Server
    #
    SUBSYSTEM!="block", GOTO="ifx_raw_end1"
    KERNEL!="dm-*", GOTO="ifx_raw_end1"
    ACTION!="add|change", GOTO="ifx_raw_end1"
    
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_rootdbs_lv", RUN+="/bin/raw /dev/raw/raw1 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_plogdbs_lv", RUN+="/bin/raw /dev/raw/raw2 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_llogdbs_lv", RUN+="/bin/raw /dev/raw/raw3 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_ttmpdbs1_lv", RUN+="/bin/raw /dev/raw/raw4 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_ttmpdbs2_lv", RUN+="/bin/raw /dev/raw/raw5 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_ttmpdbs3_lv", RUN+="/bin/raw /dev/raw/raw6 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_sblobdbs_lv", RUN+="/bin/raw /dev/raw/raw7 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_datdbs_lv", RUN+="/bin/raw /dev/raw/raw8 %N"
    ENV{DM_VG_NAME}=="ifxvg", ENV{DM_LV_NAME}=="ifxdata_idxdbs_lv", RUN+="/bin/raw /dev/raw/raw9 %N"
    
    LABEL="ifx_raw_end1"
    
    
    SUBSYSTEM!="raw", GOTO="ifx_raw_end2"
    ACTION!="add|change", GOTO="ifx_raw_end2"
    
    KERNEL=="raw1", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_rootdbs_lv"
    KERNEL=="raw2", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_plogdbs_lv"
    KERNEL=="raw3", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_llogdbs_lv"
    KERNEL=="raw4", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_ttmpdbs1_lv"
    KERNEL=="raw5", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_ttmpdbs2_lv"
    KERNEL=="raw6", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_ttmpdbs3_lv"
    KERNEL=="raw7", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_sblobdbs_lv"
    KERNEL=="raw8", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_datdbs_lv"
    KERNEL=="raw9", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="ifxvg/rifxdata_idxdbs_lv"
    
    LABEL="ifx_raw_end2"



    After you have created that file the new rule can be loaded with:

    # udevadm control --reload-rules
    # udevadm trigger
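    Since both parts of the rules file repeat one line per logical volume, the file can also be generated from a short list. This is only a sketch following the pattern above: the VG and LVS values are examples, and bind_rules and owner_rules are hypothetical helpers.

```shell
# Example inputs: volume group name and "raw-number lv-name" pairs.
VG=ifxvg
LVS='1 ifxdata_rootdbs_lv
2 ifxdata_plogdbs_lv
8 ifxdata_datdbs_lv'

# Part 1: bind /dev/raw/rawN to the matching logical volume.
bind_rules() {
    while read -r n lv; do
        printf 'ENV{DM_VG_NAME}=="%s", ENV{DM_LV_NAME}=="%s", RUN+="/bin/raw /dev/raw/raw%s %%N"\n' \
            "$VG" "$lv" "$n"
    done
}

# Part 2: set ownership and create the symlink for each raw device.
owner_rules() {
    while read -r n lv; do
        printf 'KERNEL=="raw%s", OWNER:="informix", GROUP:="informix", MODE:="0660", SYMLINK+="%s/r%s"\n' \
            "$n" "$VG" "$lv"
    done
}

echo 'SUBSYSTEM!="block", GOTO="ifx_raw_end1"'
echo 'KERNEL!="dm-*", GOTO="ifx_raw_end1"'
echo 'ACTION!="add|change", GOTO="ifx_raw_end1"'
printf '%s\n' "$LVS" | bind_rules
echo 'LABEL="ifx_raw_end1"'
echo 'SUBSYSTEM!="raw", GOTO="ifx_raw_end2"'
echo 'ACTION!="add|change", GOTO="ifx_raw_end2"'
printf '%s\n' "$LVS" | owner_rules
echo 'LABEL="ifx_raw_end2"'
```

    Redirect the output to a file, review it, and only then copy it into /etc/udev/rules.d/.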


    Hope this helps.

    Cheers,
    Markus



    ------------------------------
    Markus Holzbauer
    ------------------------------



  • 25.  RE: curious about extendable chunks

    Posted Fri January 29, 2021 11:30 AM
    Markus,

    Thank you.  I have forwarded this info to our sysadm.

    ------------------------------
    Mark Collins
    ------------------------------