Informix

  • 1.  Odd restore failure

    Posted Thu May 11, 2023 07:55 PM

    This is a new one for me. I am trying to restore a copy of a production server to a test machine for performance testing. I have tried ontape archives, onbar archives, and finally ifxclone all with the same result. The archive completes and wants the logical logs to be restored because:

    18:45:56  BLOBspace my_blob not recovered from same archive backup as DBspace rootdbce.

    18:45:56  Logical restore cannot be skipped. Perform a logical restore.
    18:45:56  Cannot change to On-Line Mode.

    I do not need the logs restored, the server's state at the time of the archive is just fine for the purpose. Even so, I tried but no manner of logical restore will succeed either. 

    It looks like the source server was restored using onbar and a dbspace level restore, but I have no record of that ever having been done.

    Anyone out there have any ideas? PMR next if nothing simple comes back.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------


  • 2.  RE: Odd restore failure

    Posted Fri May 12, 2023 05:00 AM

    Hi Art,

    any clue from comparing rootdbs's and this my_blob's "DBspace archive status" in oncheck -pr, specifically Logical Log Unique Id + Position?

    With a true whole system backup (which ontape would do anyway, and I think ifxclone too, and which is the precondition for restore without logs) those positions should be the same for all spaces, namely that of the last archive checkpoint.
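    The comparison suggested above can be sketched as a small check. This is only an illustration: the function name, the space names, and the log id/position values are hypothetical, and the values are assumed to have been copied by hand from the "DBspace archive status" sections of oncheck -pr rather than parsed from its output.

```python
from typing import Dict, Tuple

# (logical log unique id, log position) recorded for a space's last archive.
ArchivePoint = Tuple[int, int]


def spaces_off_archive_checkpoint(
    status: Dict[str, ArchivePoint], root: str = "rootdbs"
) -> Dict[str, ArchivePoint]:
    """Return the spaces whose archive point differs from the root dbspace's."""
    reference = status[root]
    return {name: point for name, point in status.items() if point != reference}


# Hypothetical values, not taken from the server discussed in this thread.
status = {
    "rootdbs": (4711, 0x1A2018),
    "logsdbs": (4711, 0x1A2018),
    "my_blob": (4698, 0x03C018),
}

mismatched = spaces_off_archive_checkpoint(status)
print(mismatched)
```

    An empty result would be what a whole system backup is expected to show; any space listed here was archived at a different checkpoint than root.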

    Andreas



    ------------------------------
    Andreas Legner
    ------------------------------



  • 3.  RE: Odd restore failure

    Posted Fri May 12, 2023 06:55 AM

    Andreas:

    Thanks for expending brain power on this for me. Yea, onstat -g arc shows what's expected, the last onbar level 0 and level 1 archives which were both full server archives. Also, as you note, both my ontape archive and ifxclone runs were full server so that's not it.

    Here's what I finally figured out: about three years ago the client moved their server from on-prem to the cloud. When they did, they split the dbspaces (the server had over 800 databases with two dbspaces per database plus a handful of shared dbspaces like this "web" space and root and logs) into two servers on the cloud. The only thing I can think of is that they must have done an onbar base restore of the root dbspace and logs dbspaces from the on-prem server to each of the two cloud servers, then used onbar dbspace restores to move one pair of dbspaces at a time to one server or the other, probably over the course of several days (they are a service bureau - each database is a different customer's). Since the level 1 archives are taken every day and a level 0 weekly, they literally restored the "web" dbspace from a different archive than the root dbspace. How the engine knows that and why now, three years later, a restore must restore logical logs, I have no idea. I'm going to have to open a PMR because we cannot trust the archives at this point.

    Meanwhile, I was finally able to get the test system restored by using ifxclone to restore it as an RS secondary which did the logical restore automatically to sync the two servers. Then I converted the new server to "standard" and poof! Unfortunately when the new server came online as secondary the primary became hung on a checkpoint and had to be bounced. So, another reason to open a PMR. Something is not right with this server. Unfortunately the crew who built the cloud servers are long gone (they sold the company to a new crew who are my clients), so no one knows for certain what happened back then.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 4.  RE: Odd restore failure

    Posted Fri May 12, 2023 07:09 AM
    Edited by Andreas Legner Sun May 14, 2023 05:21 PM

    Sounds a little fuzzy. Anyway, not envying you (having other nice problems on my hands ;-) )



    ------------------------------
    Andreas Legner
    ------------------------------



  • 5.  RE: Odd restore failure

    Posted Sun May 14, 2023 03:41 AM
    Edited by Tomas Dalebjörk Sun May 14, 2023 03:43 AM

    Hi
    It has been ages since I last worked with Informix, but I am curious how Informix handles online changes during a full backup.

    There are two types of backup, either it can be an online backup (also called hot backup), or an offline backup (also called cold backup).
    Some database types require transaction logs to be backed up while a hot backup is being performed; if there are no changes, then some database types don't produce any logs and can therefore be restored without any logs.
    Some database types can incorporate the transaction logs into the hot backup so that they are bundled together (Db2, for example), and some can read the database data in a consistent way to produce a backup (such as pg_dump, but I treat that as an export rather than a backup).

    If I recall, both onbar and ontape support cold and warm (hot) backups, but I would guess that you have to have the transaction logs in order to be able to restore if the backup method was a warm backup.

    Regards Tomas 
    CEO Spictera Ltd
    https://www.spictera.com



    ------------------------------
    Tomas Dalebjörk
    ------------------------------



  • 6.  RE: Odd restore failure

    Posted Sun May 14, 2023 05:21 PM

    I'd say the closest you can get with Informix to "cold backup", is with what is called external backup:
     - you create a checkpoint (point of on-disk consistency) while also blocking the server (any modifying activity)
     - while blocked, you copy all on-disk storage, with whatever means available

    Any other backups, onbar or ontape, occur with the server fully open for business. The differentiation to make here is between
     - "whole system backup": a single "archive checkpoint" for the entire backup which then will contain exactly this point (in time) of consistency
     - non-whole system backup: each storage space (dbspace, blobspace, sbspace) backed up separately, with its own archive checkpoint,
       so the complete backup will consist of objects consistent to different points in time

    A whole system backup can be restored by itself, without the need to also restore any transaction logs ("logical logs"); the backup contains everything that's required:
     - all used areas of all (non-temporary) storage spaces
     - a transaction log snapshot just big enough to roll back any transactions open at the archive checkpoint
     - the before images of all storage pages that underwent modification during the backup
    The before images section can be pretty sizable, depending on the duration of and DML traffic during the backup.

    Restoring a non-whole-system backup, on the other hand, will require some logical logs to be restored as well in order to bring the entire system to a state consistent across all spaces. Each space's backup will also have its before image section (their overall size smaller than for whole system), but a log snapshot is neither needed nor contained, since logs will have to be restored anyway.

    Both types (plus the external backup/restore) support restoring as many logical (transaction) logs as there are (logical restore) after the space (physical) restore.

    ontape does whole system only while onbar offers the choice between the two.

    So to your question how online changes are handled during full backup:
     - "physical logging" of page before images gets duplicated, to physical log and backup
     - "logical logging" and backing up logical logs happens as usual while the backup is progressing

    Making sense?

    BR,
     Andreas



    ------------------------------
    Andreas Legner
    ------------------------------



  • 7.  RE: Odd restore failure

    Posted Mon May 15, 2023 09:56 AM

    Tomas:

    As Andreas pointed out, neither ontape nor onbar is used for a cold backup of an Informix server. When an Informix server is shut down normally, a checkpoint precedes the shutdown and the disk image of the server is consistent, so it is sufficient to copy the chunk files/devices to a backup device in order to make a consistent, restorable, cold copy of the server.

    As far as how ontape and onbar work, you have to understand one thing to start: Informix was the first RDBMS that could be successfully archived while actively being modified by live transactions and for which such an archive could be used to restore the server, not only to its exact state at the time the archive was begun, but could even, with the restoration and processing of the transaction log backups (known in Informix parlance as the logical logs) recover to a consistent state as of the moment the last backed up logical log was saved.

    Note that I said that "an archive could be used to restore the server ... to its exact state at the time the archive was begun". That means that if I do not need the server's state to be one later than the start of the archive that I am restoring, I do not need to restore and roll forward the logical log backups! Normally, once the archives (level 0 and, if desired and available, level 1 and level 2 incremental archives) have been restored, bringing the engine to full online mode triggers what Informix calls "fast recovery". To understand fast recovery you need to know about the physical log and logical logs and what they contain. The physical log contains an unsullied image of each page modified since the last checkpoint and is the key to how Informix can restore a consistent archive taken while the server is being modified by active transactions. Any changes that happen during an archive will be recorded in the physical log as images of the unmodified pages. Those are copied into the archive to be restored overwriting any pages modified between the start of the archive and when the reads got to the modified pages in order to return the disk image to what it was at the beginning of the archive. The logical logs contain transactions and infrastructure changes as well as notations about checkpoints and other systemic events of import.
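    The before-image mechanism described above can be illustrated with a toy model. This is a minimal sketch under loud assumptions, not how the engine is implemented: pages are plain strings, the archive reads them strictly in order, writes arrive only between page reads, and the names archive_online and restore are hypothetical. It only shows why overlaying before images on the copied pages yields the disk image as of the start of the archive.

```python
from typing import Dict, List, Tuple


def archive_online(
    disk: List[str], writes: Dict[int, List[Tuple[int, str]]]
) -> Tuple[List[str], Dict[int, str]]:
    """Copy pages in order while writes land between page reads.

    `writes` maps "number of pages already archived" to the list of
    (page_no, new_value) updates applied at that moment.
    Returns (archived pages, before images).
    """
    archived: List[str] = []
    before_images: Dict[int, str] = {}
    for next_page in range(len(disk) + 1):
        for page_no, new_value in writes.get(next_page, []):
            # Keep only the first image: the page as it was at archive start.
            before_images.setdefault(page_no, disk[page_no])
            disk[page_no] = new_value
        if next_page < len(disk):
            archived.append(disk[next_page])
    return archived, before_images


def restore(archived: List[str], before_images: Dict[int, str]) -> List[str]:
    """Lay down the archived pages, then overwrite with the before images."""
    pages = list(archived)
    for page_no, image in before_images.items():
        pages[page_no] = image
    return pages


disk = ["a0", "b0", "c0", "d0"]
snapshot = list(disk)  # state at the moment the archive begins
writes = {
    1: [(3, "d1"), (0, "a1")],  # page 3 not yet read, page 0 already read
    3: [(3, "d2")],             # page 3 changes again before it is read
}

archived, before_images = archive_online(disk, writes)
restored = restore(archived, before_images)
print(archived, before_images, restored)
```

    In this toy run the archive picks up the modified copy of page 3, and the before image puts the original back on restore, so the restored pages match the snapshot even though the live disk has moved on.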

    When an Informix server starts up (or is taken online after an archive restore), fast recovery begins. The first thing that happens is that the physical log is checked to see if it is empty. After a crash the physical log will not be empty, so those physical log pages are written back to disk to restore the disk image to what it was at the last checkpoint. Then the logical logs written after the checkpoint (if any) or, in the case of a startup after a restore, any logical logs restored, are rolled forward to reprocess any transactions that happened after the last completed checkpoint. Finally, incomplete transactions are rolled back and the server is placed online.
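    The three phases just described (physical recovery, roll forward, roll back) can be walked through in a toy model. This is a sketch only, assuming a key/value "disk", a simplified log record shape, and transactions that all begin after the checkpoint; the fast_recovery function and the record layout are hypothetical and do not reflect the engine's actual log format.

```python
from typing import Dict, List, Set, Tuple

# (operation, transaction id, key, value); key and value are unused for "commit".
LogRecord = Tuple[str, int, str, str]


def fast_recovery(
    disk: Dict[str, str],
    physical_log: Dict[str, str],
    logical_log: List[LogRecord],
) -> Dict[str, str]:
    # 1. Physical recovery: write the before images back, which returns
    #    the disk image to its state at the last checkpoint.
    state = dict(disk)
    state.update(physical_log)

    # 2. Logical recovery: roll forward everything logged after the checkpoint,
    #    remembering old values so open transactions can be undone.
    undo: List[Tuple[int, str, str]] = []
    committed: Set[int] = set()
    for op, txn, key, value in logical_log:
        if op == "update":
            undo.append((txn, key, state[key]))
            state[key] = value
        elif op == "commit":
            committed.add(txn)

    # 3. Roll back transactions that never committed, newest change first.
    for txn, key, old_value in reversed(undo):
        if txn not in committed:
            state[key] = old_value
    return state


crashed_disk = {"p1": "A-partial", "p2": "B"}  # p1 was flushed mid-transaction
physical_log = {"p1": "A"}                     # before image since the checkpoint
logical_log = [
    ("update", 1, "p1", "A1"),
    ("update", 2, "p2", "B2"),
    ("commit", 1, "", ""),
]

recovered = fast_recovery(crashed_disk, physical_log, logical_log)
print(recovered)
```

    Here transaction 1 committed, so its change survives, while transaction 2 was still open at the crash and is rolled back.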

    My problem was that the server refused to complete fast recovery because it claimed that "the web_#### dbspace was restored from a different archive than the root_dbspace" which would be very unusual.  But, as I noted in another update to this post, I was finally able to complete the restore a different way and I think I know how the "problem" with that "web" dbspace happened as well.



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------