Informix

nested-group-icon.png

DB2

Expand all | Collapse all

Reclaim dbspace without shrinking/repacks capabilities

  • 1.  Reclaim dbspace without shrinking/repacks capabilities

    Posted 16 days ago
    Hello dear IIUG members,

    i have a database without shrinking/repacks capabilities.
    I have a couple of huge tables with lenghtly records.
    I have a couple of dbspace that i could use but with limited space (it won't be a long term solution; and there are IO issues if i put all heavily read/write tables in the same ones) 
    I use mainly linux command line or dbaccess but i have ServerStudio which is quite cool to visualize things (but i hardly use it to send admin commands to the db and i do not have Sentinel)

    But I must minimize the downtime (especially during that covid period^^)


    I can unload quite some records from the tables and delete them
    but i forgot..

    1) will the space in the extents/new extents will be available again for the new records i try to insert? (so i won't have to add a new extents)

    2) or must i be prepared to truncate the table (which takes aaaaages to: turn off the db, unload the datas that i wont reload, unload datas that i will reload, load those latter datas (maybe though external table to quick-up this step), rebuild the stats (as auto update statitics doesnt work with complex index so i had to write custom update stats :'(), put the db online in solomode, check if it is working and put it in multiple mode)

    3) or must i be prepared to fragment the table  (same issues with the unload/load/stats-index rebuild but i prefer it to the truncate honestly so i can still access all the datas)

    4) or do you have any other ideas?

    Kr/Bàv/Mvg,

    ------------------------------
    Marc H
    ------------------------------


  • 2.  RE: Reclaim dbspace without shrinking/repacks capabilities

    Posted 16 days ago
    Marc:

    Why do you think that you do not have REPACK and SHRINK functionality? Those have been in every release since v11.50, are you running an older release than that?

    Comments on your questions below:


    I can unload quite some records from the tables and delete them but i forgot..

    1) will the space in the extents/new extents will be available again for the new records i try to insert? (so i won't have to add a new extents)
    • Yes, deleted row space is immediately available for use by new rows with a small caviat. The engine uses the bitmaps to determine what pages have space for new rows. So in a table with narrow rows pages with only a small number of deleted rows in them, may not trigger the flag, so it is possible that sparse deletes will waste a small amount of space. Not a major thing.
    2) or must i be prepared to truncate the table (which takes aaaaages to: turn off the db, unload the datas that i wont reload, unload datas that i will reload, load those latter datas (maybe though external table to quick-up this step), rebuild the stats (as auto update statitics doesnt work with complex index so i had to write custom update stats :'(), put the db online in solomode, check if it is working and put it in multiple mode)
    • TRUNCATE will optionally free the unused space for reuse by other tables, but if the REUSE STORAGE option is passed then the space freed up by deleted rows after the reload will still be there for the table itself to reuse with all of the free space at the end of the table rather than scattered as it would be under #1 above, so that is an advantage for efficiency as new rows will be contiguous!
    • You don't have to write your own custom update statistics, just use my dostats utility in my utils2_ak package. It implements the full recommended UPDATE STATISTICS protocols described in the Performance Guide manual with some enhancements suggested by the engine developers. You can download the latest from my web site free: (www.askdbmgt.com/my-utilities).
    3) or must i be prepared to fragment the table  (same issues with the unload/load/stats-index rebuild but i prefer it to the truncate honestly so i can still access all the datas)
    • After deleting the rows you no longer need, you could ALTER FRAGMENT ON TABLE INIT IN <dbspace>; which could be the same dbspace it already resides in or another. That, like #2, will make all new rows contiguous but only the space needed for the copied data or specified in the table's EXTENT SIZE and NEXT SIZE attributes will be allocates. So, you might want to set the EXTENT SIZE to include the amount of free space you want to be pre-allocated before issuing the ALTER command.
    4) or do you have any other ideas?
    Idea #1:
    1. create a new table with all required indexes
    2. rename the old table
    3. create a VIEW that does a UNION between the new table and the old one and give it the name of the original table
    4. create an INSTEAD OF trigger on the VIEW to direct all new inserts to the new table
    5. slowly over time copy data you need to retain from the old table to the new one, optionally deleting those rows from the old table afterwards.
    6. once all of the needed data is in the new table run dostats against the new table then you can drop the VIEW, rename the new table to the original name, then drop the old table. Poof! Downtime should be on the order of seconds if you script  #3 & #4 at the beginning of the process and #6 at the end. (You may have to recreate any foreign keys that reference the original table to point to the new table. Also some complex FK relationships may make this idea impractical.)
    Idea #2:
    1. If you are running very recent v12.10 releases or v14.10 you can use loopback replication to sync all of the rows you need to keep into a new table over time with zero downtime. Then once everything is synced, run dostats on the new table, drop replication, rename the old and new tables, and eventually drop the old table. Again, downtime only at the end lasting only a few seconds.  (You may have to recreate any foreign keys that reference the original table to point to the new table.)
    There are a couple of other similar ways to approach this depending on what version of Informix you are running including using triggers, change data capture, smart triggers, or post commit triggers to maintain the "new" table while or after initially populating it instead of using loopback replication for versions earlier than 12.10/14.10 that don't support it.

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.






  • 3.  RE: Reclaim dbspace without shrinking/repacks capabilities

    Posted 16 days ago
    Thank you for you quick answer Art. Your help is always greatly appreciated!

    For the version, i am using 12.10.FC5W1_UE (and 12.10.FC5W1_EE; i'm not sure whate UE and EE means^^)
    Maybe i am confusing repack and compression.
    Compression helps you to get rid of emptied "new extent" but repack don't (new extents will stay allocated even if they are empty)?
    And Compression is an "Informix SQL Runtime Option" linked to the contract we signed?
    Is that correct?

    • I can unload quite some records from the tables and delete them but i forgot..
      1) will the space in the extents/new extents will be available again for the new records i try to insert? (so i won't have to add a new extents)
      • Yes, deleted row space is immediately available for use by new rows with a small caviat. The engine uses the bitmaps to determine what pages have space for new rows. So in a table with narrow rows pages with only a small number of deleted rows in them, may not trigger the flag, so it is possible that sparse deletes will waste a small amount of space. Not a major thing. But i'm gessing the « new extents » will allocated ?
    So if i repack, those spaces might be more likely to be detected by the engine and "new extents" might not allocated? (but i need to run an UPDATE STATISTICS to get the bitmap right?)
    • 2) or must i be prepared to truncate the table (which takes aaaaages to: turn off the db, unload the datas that i wont reload, unload datas that i will reload, load those latter datas (maybe though external table to quick-up this step), rebuild the stats (as auto update statitics doesnt work with complex index so i had to write custom update stats :'(), put the db online in solomode, check if it is working and put it in multiple mode)
      • TRUNCATE will optionally free the unused space for reuse by other tables, but if the REUSE STORAGE option is passed then the space freed up by deleted rows after the reload will still be there for the table itself to reuse with all of the free space at the end of the table rather than scattered as it would be under #1 above, so that is an advantage for efficiency as new rows will be contiguous!
    I have always used TRUNCATE without options, it is good to know

      • You don't have to write your own custom update statistics, just use my dostats utility in my utils2_ak package. It implements the full recommended UPDATE STATISTICS protocols described in the Performance Guide manual with some enhancements suggested by the engine developers. You can download the latest from my web site free: (www.askdbmgt.com/my-utilities).

    I was not precise, sorry. In fact i never tried using Auto Update Statistics on informix 11/12.
    It is a pretty long story... sorry^^
    When we migrated from informix 9 (on Solaris) to informix 11 (on Linux), we kept our former UPDATE STATISTICS script (scheduled with cron).
    It was working fine but one day, after an UPDATE STATISTICS HIGH on a table i guess,
    a critical program couldn't do a SELECT (with a LEFT OUTER JOIN) on some tables in the database.
    (with no SQL tracing and not having access to the source code of the program, and the program not logging its SQL error (if fact it was logged sowhere we didn't knew existed^^), it was quite hard to find what was the blocking query but finally managed to do it^^)
    Error: Internal error in routine opjoin().
    SQLState: IX000
    ErrorCode: -768
    Position 1028
    So we did a finderr to understand the error of course
    Tried to unload/delete records => the query wasn't working
    So we did oncheck, update stats, update stats and dropping bins, tried medium instead of high etc.
    Tried to drop/create table => the query was working then loaded the records, the query wasn't working anymore
    So i ended up try to run SET EXPLAIN to understand what was going wrong with the indexes...
    And it seems like the optimizer in the version 12 (compared to the one in informix 9) is a little confused with the combination of:
    -having multiple indexes
    -those indexes have nearly always multiple columns
    -those indexes sometimes shares the same columns
    -the order of the creation of the indexes (or the order of the column within?) didn't seem to matter anymore
    -having multiple "left outer join" in the select query
    So i drop/create/load the table again (running onstat -dr) but this time, i did my update statistics high only on "all the columns shared by the indexes", not simply the table => it worked :)
    (i searched how to influence the optimizer with no success. I'm not sure but i think the order of column in indexes doenst matter as much as before in the évaluation of the cost of "query plans")
    (We had IO issues because of the VM environnement... so we had to fix that first (not related to Informix, just the SCSI controler but i had to distribute the load between the disk as much as i could with alter fragment on the tables and their indexes etc.))
    Later, i compiled dostat.ec and read the 3 scripts (initial, evaluate, refresh)
    looked how the scheduled task, launch the stored proc which launch the COBOL etc.
    It is nicely done btw;)
    But to make it works, i had to make the developper of the program change the order of columns in their indexes and they wouldn't do any change without my manager's consent but my manager wasn't capable of understanding the problem so i kept using my scripts^^ (it seems managers don't need to understand a thing nowadays, just to manage cluelessly and without responsabilities^^)
    I should probably adapted the cobol to make it suits my need (update stat on specific columns) but, by then, i left the informaticadienst and now i'm just the end-user (a little scared about how badly the database is managed because atm, the IT managers seems to be more concerned about an issue with ILMT than the health of the database)

    So before i left the service, i made a cron running updatestat high only on the column of the indexes,
    and I ran a script to force the developper to have a correct usage of the database (based on "onstat -g ppf 0|sort -r -n -k +$sort_column)" and the results from systables tabtype and sysfragments fragtype. It is a bit long to explain but i can if you want^^)
    I'll download the new version when i have time but i don't have as much time as before to fine-tune the databases. Atm, i am just scared that the db isn't maintained properly and when there will be a problem, i'd like to have a solution already^^

    3) or must i be prepared to fragment the table  (same issues with the unload/load/stats-index rebuild but i prefer it to the truncate honestly so i can still access all the datas)

    • After deleting the rows you no longer need, you could ALTER FRAGMENT ON TABLE INIT IN <dbspace>; which could be the same dbspace it already resides in or another. That, like #2, will make all new rows contiguous but only the space needed for the copied data or specified in the table's EXTENT SIZE and NEXT SIZE attributes will be allocates. So, you might want to set the EXTENT SIZE to include the amount of free space you want to be pre-allocated before issuing the ALTER command.

    I did had to use the alter frament already because i have table with an allocated size of 16777215. The problem i have with the ALTER FRAGMENT is that i have to LOCK the table to do so... but i can't have downtime. If so, it has to be minimized so it means having already an idea and sql script ready to divide the table between dbspaces. We are working 7/7 24/24 with the database. 

    4) or do you have any other ideas?

    Idea #1:

    1. create a new table with all required indexes

    2. rename the old table

    3. create a VIEW that does a UNION between the new table and the old one and give it the name of the original table

    4. create an INSTEAD OF trigger on the VIEW to direct all new inserts to the new table

    5. slowly over time copy data you need to retain from the old table to the new one, optionally deleting those rows from the old table afterwards.

    6. once all of the needed data is in the new table run dostats against the new table then you can drop the VIEW, rename the new table to the original name, then drop the old table. Poof! Downtime should be on the order of seconds if you script  #3 & #4 at the beginning of the process and #6 at the end. (You may have to recreate any foreign keys that reference the original table to point to the new table. Also some complex FK relationships may make this idea impractical.)

    I like the idea (and tried something close before) but i can't change the source code unfortunatly and make the program look at the view instead of the table.
    Furthermore, there is a Java problem. If you first  make a copy the table in the good dbspace and swiftly change/switch the table names while the program is not accessing the table, the IBM JDBC prepared statement don't work anymore. The statement need to be prepared again.
    I haven't tried with i4gl but i'm pretty sure it is a problem too.

    Idea #2:

    1. If you are running very recent v12.10 releases or v14.10 you can use loopback replication to sync all of the rows you need to keep into a new table over time with zero downtime. Then once everything is synced, run dostats on the new table, drop replication, rename the old and new tables, and eventually drop the old table. Again, downtime only at the end lasting only a few seconds.  (You may have to recreate any foreign keys that reference the original table to point to the new table.)

    There are a couple of other similar ways to approach this depending on what version of Informix you are running including using triggers, change data capture, smart triggers, or post commit triggers to maintain the "new" table while or after initially populating it instead of using loopback replication for versions earlier than 12.10/14.10 that don't support it.

    The problem is a bit the same, i can alter the database but not the source nor stopping the program (restating it takes a lot of time, many tomcat servers and programs to start on different machine etc.).

    I wish there was something to lock a precise set of rows, move them to another dbspace, correct the index and free them transparently,
    so the programs only think the database is a little slow to return results but don't detect any change in the systable etc.

    Locking the whole table is annoying because
    1) you can't work on "cold datas" while the program is working on fresh ones
    2) even if you are admin, it is sometimes hard to get the lock because the program takes constantly some locks (rows, pages, index?) to work (and i'm not a fan of programs working with "dirty read" connections)

    Thank you again for the time you spent answering my questions and for your suggestions!
    (Sorry for the bad grammar and mistakes^^')

    Marc

    ------------------------------
    Marc H
    ------------------------------



  • 4.  RE: Reclaim dbspace without shrinking/repacks capabilities

    Posted 16 days ago
    Marc:

    See below:

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.







  • 5.  RE: Reclaim dbspace without shrinking/repacks capabilities

    Posted 16 days ago
    Marc:

    Oops, missed some of your follow up. Continuing my responses below:

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management

    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.