Informix

  • 1.  backup fail - 14.10.FC4W1

    Posted Thu November 09, 2023 01:44 AM

    Hello,

    Our backup has stopped working (dev environment).

    RHEL 7.9

    Informix 14.10.FC4W1

    There was no change before the backup failed

    no restart (440 days up)

    If I run it manually:

    ontape -s -L 0 >> $HOME/ontape.log 2>&1

    this is what I get:

    ontape.log

    Thu Nov  9 08:11:01 IST 2023

    Archive failed - ISAM error:  An error has occurred during archive back up.

    Program over.

    Online.log

    08:12:50  Maximum server connections 151

    08:12:50  Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 13, Llog used 4

    08:12:51  shmat: [22]: operating system error

    08:12:51  Client could not attach server shared memory segment, use IFX_XFER_SHMBASE.

    08:12:51  Assert Warning: ISAM error:  An error has occurred during archive back up.

    08:12:51  IBM Informix Dynamic Server Version 14.10.FC4W1

    08:12:51   Who: Session(2990795, informix@att1, 24284, 0xbf0c3788)

                    Thread(3127632, ontape, 457795c8, 10)

                    File: rsarcutl.c Line: 115

    08:12:51   Action: init_archive()/ALLOC()

    08:12:51  stack trace for pid 26105 written to /IDS/informix/tmp/af.bd387862

    08:12:51   See Also: /IDS/informix/tmp/af.bd387862

    08:12:52  ISAM error:  An error has occurred during archive back up.

    08:17:52  Checkpoint Completed:  duration was 0 seconds.

    08:17:52  Thu Nov  9 - loguniq 57389, logpos 0x260018, timestamp: 0xa729f9cc Interval: 902595

    af.bd387862

    08:12:51

    08:12:51  IBM Informix Dynamic Server Version 14.10.FC4W1

    08:12:51  Assert Warning: ISAM error:  An error has occurred during archive back up.

    08:12:51   Who: Session(2990795, informix@att1, 24284, 0xbf0c3788)

                    Thread(3127632, ontape, 457795c8, 10)

                    File: rsarcutl.c Line: 115

    08:12:51   Action: init_archive()/ALLOC()

    08:12:51  SHM Globals and Master Pool/Master Block Adresses:

    08:12:51  shmcb =           0x000000004400e658

    08:12:51  rhead =           0x000000004407b800

    08:12:51  pool list =       0x000000004400e730

    08:12:51  block pool list = 0x0000000044075f38

    08:12:51  TRANSP =          0x00000000bf0c3788

    08:12:51  PARTP =           0x0000000000000000

    08:12:51  PARTNP =          0x0000000000000000

    08:12:51  OPENP =           0x00000000b5f41028

    08:12:51  FILEP =           0x00000000bb5c8f08

    08:12:51  Raw hex dump of stack located in /IDS/informix/tmp/af.bd387862.rawstk

    08:12:51  Stack for thread: 3127632 ontape

     base: 0x00000000c249f000

      len:   135168

       pc: 0x00000000014a458d

      tos: 0x00000000c24b96a0

    state: running

       vp: 10

    0x00000000014a458d (oninit) afstack

    0x00000000014a962d (oninit) afhandler

    0x00000000014a9ce2 (oninit) afwarn_interface

    0x0000000000f50cc1 (oninit) oops_error

    0x0000000000f67102 (oninit) init_archive

    0x0000000000f690c2 (oninit) isopen_arcbu

    0x00000000007fe5d4 (oninit) sqisopen_arcbu

    0x0000000000bc21f1 (oninit) tbj_open_archive

    0x0000000000b8eeae (oninit) sqmain

    0x00000000015d2e3b (oninit) spawn_thread

    0x0000000001495103 (oninit) th_init_initgls

    0x00000000014dc0bf (oninit) startup

      siginfo: <NULL>

    08:12:51   See Also: /IDS/informix/tmp/af.bd387862

    ---------------------------------

    Begin System Alarm Program Output

    ---------------------------------

    Assertion Failure Type: Warning

    Host Name:              att1

    Database Server Name:   att1

    Time of failure:        Thu Nov  9 08:12:52 IST 2023

    AF file:                /IDS/informix/tmp/af.bd387862

    Shared memory file:     None

    System Blocking:        OFF

    -------------------------------

    End System Alarm Program Output

    -------------------------------

    08:12:52  sh /IDS/informix/etc/evidence.sh 1 0 /IDS/informix/tmp/af.bd387862 2990795 0x457795c8 3127632 0xbad9e760 1025 0 0 0 0

    08:12:52

    ------------------ End of assertion failure 0 -----------------

    Any ideas?

    Thanks

    Sam



    ------------------------------
    Samuel To
    ------------------------------


  • 2.  RE: backup fail - 14.10.FC4W1

    Posted Thu November 09, 2023 03:19 AM

    Looks like you are out of shared memory, which given your uptime
    is not unlikely.

    Run "ipcs -m" to show all the shared memory segments in use.
    Then run "ipcrm -m <id>" using the value under the "shmid" column to remove
    any that are not needed.  How you tell which are needed is an exercise
    for the reader.  If you have any from users that are not logged in
    or running processes, kill those off first.

    If you can afford database downtime, restarting Informix should free
    any shared memory it is using.

    Worst case rebooting the server will clear them all out.

    RTFM:  ipcs(1), ipcrm(1), shmat(2)

    scot
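
    As a rough sketch of the first step above, the segments that no process
    is attached to can be pulled out of the "ipcs -m" listing. This assumes
    the Linux (util-linux) column layout shown in this thread, and the
    function name list_unattached is made up for illustration. A zero in the
    nattch column only makes a segment a candidate for review; it does not
    prove the segment is safe to remove, and nothing is removed here.

```shell
#!/bin/sh
# Print "shmid owner bytes" for every segment with zero attached processes.
# Reads `ipcs -m` output on stdin. Assumed column layout (util-linux):
#   key shmid owner perms bytes nattch status
list_unattached() {
    # Data rows start with a hex key; header and separator rows do not.
    awk '$1 ~ /^0x/ && $6 == 0 { print $2, $3, $5 }'
}

# Only list candidates; review them by hand before any ipcrm.
if command -v ipcs >/dev/null 2>&1; then
    ipcs -m | list_unattached
fi
```

    Each shmid it prints can then be checked individually before running
    "ipcrm -m" on it.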




  • 3.  RE: backup fail - 14.10.FC4W1

    Posted Thu November 09, 2023 03:40 AM

    Hello Scot

    I'll restart Informix.

    But I still have a question: how can I tell which ones to kill

    " If you have any from users that are not logged in
    or running processes, kill those off first. "

    when the owners are root or informix?

    Thanks

    att1 > ipcs -m

    ------ Shared Memory Segments --------
    key        shmid      owner      perms      bytes      nattch     status
    0x52564826 1835008    informix   660        22028288   28
    0x52564827 1835009    informix   660        44048384   28
    0x52564801 294914     root       660        4911104    28
    0x52564802 294915     root       660        33439744   28
    0x52564803 294916     root       660        1815015424 28
    0x52564804 294917     root       660        8388608    28
    0x52564805 294918     root       666        1413120    28
    0x52564806 294919     informix   666        1413120    28
    0x52564807 294920     informix   666        1413120    28
    ...

    long list

    ...



    ------------------------------
    Samuel To
    ------------------------------



  • 4.  RE: backup fail - 14.10.FC4W1

    Posted Thu November 09, 2023 03:52 AM

    Use:  ipcs -mp

    The -p shows the PID of the process that created the shm segment (cpid)
    and of the last process to attach to or detach from it (lpid).
    Then you can use "ps -fp PID" to see what process it was.

    The 'informix' ones were obviously created by the Informix database.
    The 'root' ones...no idea.  You have to look at the processes to see
    if you can kill them or not.  If they are root owned, you'll need to
    be root to ipcrm them.
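
    The lookup described above could be scripted roughly as follows. It
    assumes the util-linux "ipcs -m -p" column layout (shmid, owner, cpid,
    lpid); the function names creator_pids and describe_creators are
    hypothetical. A creator that has exited does not by itself mean the
    segment is unused, and PIDs can be reused on a long-running host, so
    treat the output as a hint only.

```shell
#!/bin/sh
# Extract "shmid owner cpid" from `ipcs -m -p` output on stdin.
# Assumed column layout (util-linux): shmid owner cpid lpid
creator_pids() {
    # Data rows start with a numeric shmid; header rows do not.
    awk '$1 ~ /^[0-9]+$/ { print $1, $2, $3 }'
}

# Append the creator's command line to each row, or a note if that PID
# no longer exists.
describe_creators() {
    creator_pids | while read -r shmid owner cpid; do
        if cmd=$(ps -o args= -p "$cpid" 2>/dev/null) && [ -n "$cmd" ]; then
            echo "$shmid $owner $cpid $cmd"
        else
            echo "$shmid $owner $cpid (creator gone)"
        fi
    done
}

if command -v ipcs >/dev/null 2>&1; then
    ipcs -m -p | describe_creators
fi
```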




  • 5.  RE: backup fail - 14.10.FC4W1

    Posted Thu November 09, 2023 04:08 AM

    OK

    Thanks



    ------------------------------
    Samuel To
    ------------------------------



  • 6.  RE: backup fail - 14.10.FC4W1

    IBM Champion
    Posted Thu November 09, 2023 06:35 AM

    Samuel:

    I would just reboot the entire system. That will free up any leftover shared memory, etc.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 7.  RE: backup fail - 14.10.FC4W1

    Posted Thu November 09, 2023 06:59 AM

    Fine, thanks



    ------------------------------
    Samuel To
    ------------------------------