Original Message:
Sent: Thu November 09, 2023 03:52 AM
From: Scot Jenkins
Subject: backup fail - 14.10.FC4W1
Use: ipcs -mp
The -p shows the PID of the process that created the shm segment.
Then you can use "ps -fp PID" to see what process it was.
The 'informix' ones were obviously created by the Informix database.
The 'root' ones...no idea. You have to look at the processes to see
if you can kill them or not. If they are root owned, you'll need to
be root to ipcrm them.
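
A rough sketch of the check described above, assuming the util-linux layout of "ipcs -mp" (columns: shmid, owner, cpid, lpid). The function name shm_creators is made up for illustration. It only reports whether each creator PID still exists; it does not remove anything.

```shell
# Read `ipcs -mp` output on stdin and print "shmid owner cpid state"
# for every segment, where state is "running" or "gone".
# Assumes the util-linux column order: shmid owner cpid lpid.
shm_creators() {
    # Keep only data rows: numeric shmid in column 1, numeric cpid in column 3.
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { print $1, $2, $3 }' |
    while read -r shmid owner cpid; do
        # kill -0 sends no signal, it only tests for existence; it fails with
        # EPERM for other users' processes, so fall back to ps.
        if kill -0 "$cpid" 2>/dev/null || ps -p "$cpid" >/dev/null 2>&1; then
            state=running
        else
            state=gone
        fi
        echo "$shmid $owner $cpid $state"
    done
}
```

Typical use would be "ipcs -mp | shm_creators", followed by "ps -fp PID" on the ones reported as running. Keep in mind that on a box with very long uptime a PID may have been reused, so "running" does not prove it is the same process that created the segment.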
Original Message:
Sent: Thu November 09, 2023 03:39 AM
From: Sh To
Subject: backup fail - 14.10.FC4W1
Hello Scot
I'll restart informix.
But I still have a question: how do I know which ones to kill? You wrote:
" If you have any from users that are not logged in
or running processes, kill those off first. "
How does that apply when the owners are root or informix?
Thanks
att1 > ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x52564826 1835008    informix   660        22028288   28
0x52564827 1835009    informix   660        44048384   28
0x52564801 294914     root       660        4911104    28
0x52564802 294915     root       660        33439744   28
0x52564803 294916     root       660        1815015424 28
0x52564804 294917     root       660        8388608    28
0x52564805 294918     root       666        1413120    28
0x52564806 294919     informix   666        1413120    28
0x52564807 294920     informix   666        1413120    28
...
long list
...
------------------------------
Samuel To
Original Message:
Sent: Thu November 09, 2023 03:18 AM
From: Scot Jenkins
Subject: backup fail - 14.10.FC4W1
Looks like you are out of shared memory, which given your uptime
is not unlikely.
Run: "ipcs -m" to show all the shared memory segments in use.
Then run "ipcrm -m <id>" using the value in the "shmid" column to remove
any that are not needed. How you tell which are needed is an exercise
for the reader. If you have any from users that are not logged in
or running processes, kill those off first.
If you can afford database downtime, restarting Informix should free
any shared memory it is using.
Worst case rebooting the server will clear them all out.
RTFM: ipcs(1), ipcrm(1), shmat(2)
scot
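
One possible way to shortlist candidates from the "ipcs -m" output, sketched under an assumption that is not from this thread: that a segment with nattch = 0 (nothing attached) is worth a closer look. That is only a heuristic, not proof the segment is unneeded, so the function (hypothetical name shm_unattached) prints the ipcrm commands as a dry run instead of executing them.

```shell
# Read `ipcs -m` output on stdin and print a dry-run "ipcrm -m" line for
# every segment whose nattch column is 0 (no process currently attached).
# Assumes columns: key shmid owner perms bytes nattch [status].
shm_unattached() {
    # Data rows start with a hex key; NF >= 6 guards against short lines,
    # where a missing nattch field would otherwise compare equal to 0.
    awk 'NF >= 6 && $1 ~ /^0x[0-9a-fA-F]+$/ && $6 == 0 {
        printf "ipcrm -m %s   # owner=%s bytes=%s\n", $2, $3, $5
    }'
}
```

Run it as "ipcs -m | shm_unattached" and review each line before pasting anything into a root shell. For the listing posted in this thread it would print nothing, since every segment shown has nattch 28.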
Original Message:
Sent: Thu November 09, 2023 01:44 AM
From: Sh To
Subject: backup fail - 14.10.FC4W1
Hello,
Our backup has stopped working (dev environment).
RHEL 7.9
informix 14.10.FC4W1
Nothing was changed before the backup started failing,
and there has been no restart (440 days up).
If I try it manually:
ontape -s -L 0 >> $HOME/ontape.log 2>&1
this is what I get:
ontape.log
Thu Nov 9 08:11:01 IST 2023
Archive failed - ISAM error: An error has occurred during archive back up.
Program over.
Online.log
08:12:50 Maximum server connections 151
08:12:50 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 13, Llog used 4

08:12:51 shmat: [22]: operating system error
08:12:51 Client could not attach server shared memory segment, use IFX_XFER_SHMBASE.
08:12:51 Assert Warning: ISAM error: An error has occurred during archive back up.

08:12:51 IBM Informix Dynamic Server Version 14.10.FC4W1
08:12:51 Who: Session(2990795, informix@att1, 24284, 0xbf0c3788)
Thread(3127632, ontape, 457795c8, 10)
File: rsarcutl.c Line: 115
08:12:51 Action: init_archive()/ALLOC()
08:12:51 stack trace for pid 26105 written to /IDS/informix/tmp/af.bd387862
08:12:51 See Also: /IDS/informix/tmp/af.bd387862
08:12:52 ISAM error: An error has occurred during archive back up.

08:17:52 Checkpoint Completed: duration was 0 seconds.
08:17:52 Thu Nov 9 - loguniq 57389, logpos 0x260018, timestamp: 0xa729f9cc Interval: 902595
af.bd387862
08:12:51
08:12:51 IBM Informix Dynamic Server Version 14.10.FC4W1

08:12:51 Assert Warning: ISAM error: An error has occurred during archive back up.

08:12:51 Who: Session(2990795, informix@att1, 24284, 0xbf0c3788)
Thread(3127632, ontape, 457795c8, 10)
File: rsarcutl.c Line: 115
08:12:51 Action: init_archive()/ALLOC()
08:12:51 SHM Globals and Master Pool/Master Block Adresses:

08:12:51 shmcb = 0x000000004400e658
08:12:51 rhead = 0x000000004407b800
08:12:51 pool list = 0x000000004400e730
08:12:51 block pool list = 0x0000000044075f38
08:12:51 TRANSP = 0x00000000bf0c3788
08:12:51 PARTP = 0x0000000000000000
08:12:51 PARTNP = 0x0000000000000000
08:12:51 OPENP = 0x00000000b5f41028
08:12:51 FILEP = 0x00000000bb5c8f08
08:12:51 Raw hex dump of stack located in /IDS/informix/tmp/af.bd387862.rawstk
08:12:51 Stack for thread: 3127632 ontape

base: 0x00000000c249f000
len: 135168
pc: 0x00000000014a458d
tos: 0x00000000c24b96a0
state: running
vp: 10

0x00000000014a458d (oninit) afstack
0x00000000014a962d (oninit) afhandler
0x00000000014a9ce2 (oninit) afwarn_interface
0x0000000000f50cc1 (oninit) oops_error
0x0000000000f67102 (oninit) init_archive
0x0000000000f690c2 (oninit) isopen_arcbu
0x00000000007fe5d4 (oninit) sqisopen_arcbu
0x0000000000bc21f1 (oninit) tbj_open_archive
0x0000000000b8eeae (oninit) sqmain
0x00000000015d2e3b (oninit) spawn_thread
0x0000000001495103 (oninit) th_init_initgls
0x00000000014dc0bf (oninit) startup

siginfo: <NULL>

08:12:51 See Also: /IDS/informix/tmp/af.bd387862

---------------------------------
Begin System Alarm Program Output
---------------------------------

Assertion Failure Type: Warning
Host Name: att1
Database Server Name: att1
Time of failure: Thu Nov 9 08:12:52 IST 2023
AF file: /IDS/informix/tmp/af.bd387862
Shared memory file: None
System Blocking: OFF

-------------------------------
End System Alarm Program Output
-------------------------------

08:12:52 sh /IDS/informix/etc/evidence.sh 1 0 /IDS/informix/tmp/af.bd387862 2990795 0x457795c8 3127632 0xbad9e760 1025 0 0 0 0
08:12:52
------------------ End of assertion failure 0 -----------------
Any ideas?
Thanks
Sam
------------------------------
Samuel To
------------------------------