Informix

down chunk

1. down chunk

Mark Collins

Posted Wed April 01, 2020 11:18 AM

Informix 11.50.FC6 on HP-UX 11.31 PA-RISC

We're working toward upgrading to 14.10 on Linux, but until then, we're running the environment shown above. Since 11.50 is way past EOL, we're kind of out of luck for support from IBM/HCL. If it was a production system, I might try to push them for support, but since it's not technically a "system down" situation, I doubt I'd have much luck.

The current problem involves our disaster recovery server, which is kept nearly up-to-date via Continuous Log Restore. Every 15 minutes, our primary server does an 'onmode -l' to change logical log files, backs up any logical logs used during those 15 minutes, then transfers the backups to the DR server. A job on the DR server then applies the log backups to the instance.

This has been running for months without a problem. We've occasionally brought the DR server to online mode to confirm that everything is working correctly, then restored from a level 0 and restarted the Continuous Log Restore.

Last night, we got an error which resulted in a down chunk on our DR server:

23:16:00  Resuming Logical Restore
23:16:00  Logical Log 64637 Complete, timestamp: 0x2befe189.
23:16:02  Checkpoint Completed:  duration was 0 seconds.
23:16:02  Tue Mar 31 - loguniq 64638, logpos 0xe018, timestamp: 0x2befe240 Interval: 5170003

23:16:02  Maximum server connections 0
23:16:02  Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 719, Llog used 0

23:16:03  Checkpoint Completed:  duration was 0 seconds.
23:16:03  Tue Mar 31 - loguniq 64638, logpos 0x1ab018, timestamp: 0x2beff0d7 Interval: 5170004

23:16:03  Maximum server connections 0
23:16:03  Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 329, Llog used 0

23:16:03  Suspending Logical Restore
23:31:26  Resuming Logical Restore
23:31:26  Logical Log 64638 Complete, timestamp: 0x2beff66e.
23:31:30  Rollforward of log record failed. iserrno = 0
23:31:30  Log Record: log = 64639, pos = 0x1530584, type = OLDRSAM:CHALLOC(51), trans = 5074
23:31:43  Assert Warning: Chunk 7 is being taken OFFLINE.
23:31:43  IBM Informix Dynamic Server Version 11.50.FC6WE
23:31:43   Who: Session(41, informix@drserver, 0, c0000001456c5288)
                Thread(98, xchg_2.0, c00000014568b0a8, 1)
                File: rsmirror.c Line: 1794
23:31:43   Results: Dynamic Server will block at next checkpoint
23:31:43   Action: Shutdown (onmode -k) or override (onmode -O)
23:31:43  stack trace for pid 3951 written to /ifmx_dump/my_instance/af.44a0b12
23:31:43   See Also: /ifmx_dump/my_instance/af.44a0b12
23:31:44  Chunk 7 is being taken OFFLINE.
23:31:44  Rollforward of log record failed. iserrno = 0
23:31:44  Log Record: log = 64639, pos = 0x1530584, type = OLDRSAM:CHALLOC(51), trans = 5074
23:32:20  Logical Log 64639 Complete, timestamp: 0x2bf1dc64.
23:32:39  Checkpoint blocked by down space, waiting for override or shutdown
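
For anyone running a similar Continuous Log Restore job, a check like the following could run after each apply step to flag this kind of failure early. It is only a sketch: the message patterns are copied from the online.log excerpt above and may be worded differently in other server versions, and `scan_online_log` and `SAMPLE` are hypothetical names, not part of any Informix tooling.

```python
import re

# Patterns taken from the online.log excerpt above (11.50.FC6 wording assumed).
FAILED = re.compile(r"Rollforward of log record failed")
RECORD = re.compile(
    r"Log Record: log = (?P<log>\d+), pos = (?P<pos>0x[0-9a-fA-F]+), "
    r"type = (?P<type>[^,]+), trans = (?P<trans>\d+)"
)
OFFLINE = re.compile(r"Chunk (?P<chunk>\d+) is being taken OFFLINE")


def scan_online_log(lines):
    """Summarize roll-forward failures and offline chunks found in log lines."""
    failures = 0
    records = set()   # the server repeats the same record, so de-duplicate
    chunks = set()
    for line in lines:
        if FAILED.search(line):
            failures += 1
        m = RECORD.search(line)
        if m:
            records.add(
                (int(m["log"]), int(m["pos"], 16), m["type"], int(m["trans"]))
            )
        m = OFFLINE.search(line)
        if m:
            chunks.add(int(m["chunk"]))
    return {
        "failures": failures,
        "records": sorted(records),
        "offline_chunks": sorted(chunks),
    }


# Four lines copied from the excerpt above, used as sample input.
SAMPLE = """\
23:31:26  Logical Log 64638 Complete, timestamp: 0x2beff66e.
23:31:30  Rollforward of log record failed. iserrno = 0
23:31:30  Log Record: log = 64639, pos = 0x1530584, type = OLDRSAM:CHALLOC(51), trans = 5074
23:31:43  Assert Warning: Chunk 7 is being taken OFFLINE.
""".splitlines()

summary = scan_online_log(SAMPLE)
```

In a real job, the lines would come from the DR instance's online.log, and a non-empty `offline_chunks` list would be the cue to stop applying further log backups.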

Looking at the af file, there are several HINSERT and ADDITEM entries listed, until we get to this:

logpos:64639:15303fc HINSERT  tx:5074 pn:00611ad7 fl:             112
c000000146da0060: 00000084 00000028 00000112 00000000   .......( ........
c000000146da0070: 00000000 00000000 000013d2 015326c0   ........ .....S&.
c000000146da0080: 91e8c3aa 00611ad7 00611ad7 00078817   .....a.. .a......
c000000146da0090: 00430004 00000000 00000000 80017f0a   .C...... ........
c000000146da00a0: 00017f0b 30313139 39208000 00000000   ....0119 9 ......
c000000146da00b0: 00800000 00000000 80000000 00000080   ........ ........
c000000146da00c0: 00000000 00008000 00000000 00800000   ........ ........
c000000146da00d0: 00000000 80000000 00000080 00000000   ........ ........
c000000146da00e0: 000000d7                              ....

logpos:64639:1530544 ADDITEM  tx:5074 pn:00611ad8 fl:              10
c000000147004060: 00000040 0000001c 00000010 00000000   ...@.... ........
c000000147004070: 00000000 00000000 000013d2 01532700   ........ .....S'.
c000000147004080: 91e8c3aa 00611ad8 00611ad8 00611ad7   .....a.. .a...a..
c000000147004090: 00078817 000002a9 00010004 80017f0b   ........ ........

logpos:64639:1530338 HINSERT  tx:5074 pn:00611ad7 fl:             112
c000000146db1060: 00000084 00000028 00000112 00000000   .......( ........
c000000146db1070: 00000000 00000000 000013d2 01532784   ........ .....S'.
c000000146db1080: 91e8c3ac 00611ad7 00611ad7 00078818   .....a.. .a......
c000000146db1090: 00430004 00000000 00000000 80017f0b   .C...... ........
c000000146db10a0: 00017f0c 30363339 39208000 00000000   ....0639 9 ......
c000000146db10b0: 00800000 00000000 80000000 00000080   ........ ........
c000000146db10c0: 00000000 00008000 00000000 00800000   ........ ........
c000000146db10d0: 00000000 80000000 00000080 00000000   ........ ........
c000000146db10e0: 000000d7                              ....
23:31:30  End of queued log recs
Log Record: log = 64639, pos = 0x1530584, type = OLDRSAM:CHALLOC(51), trans = 5074
c000000146f6b060: 00000034 00000033 00000090 00000000   ...4...3 ........
c000000146f6b070: 00000000 00000000 000013d2 01530544   ........ .....S.D
c000000146f6b080: 91e8c375 00000000 00220730 0000000a   ...u.... .".0....
c000000146f6b090: 00000080                              ....
23:31:43
23:31:43  IBM Informix Dynamic Server Version 11.50.FC6WE Software Serial Number AAA#B000000

23:31:43  Assert Warning: Chunk 7 is being taken OFFLINE.
23:31:43   Who: Session(41, informix@drserver, 0, c0000001456c5288)
                Thread(98, xchg_2.0, c00000014568b0a8, 1)
                File: rsmirror.c Line: 1794
23:31:43   Results: Dynamic Server will block at next checkpoint
23:31:43   Action: Shutdown (onmode -k) or override (onmode -O)
23:31:43  Raw hex dump of stack located in /ifmx_dump/my_instance/af.44a0b12.rawstk
23:31:43  Stack for thread: 98 xchg_2.0

 base: 0xc000000147673000
  len:   69632
   pc: 0x0000000000000000
  tos: 0xc000000147675380
state: running
   vp: 1

( 0)  0x4000000000fb0008   legacy_hp_afstack + 0x320  [/informix/IDS11.50.fc6/bin/oninit]
( 1)  0x4000000000faf4a4   afstack + 0x64  [/informix/IDS11.50.fc6/bin/oninit]
( 2)  0x4000000000fae410   afhandler + 0xa98  [/informix/IDS11.50.fc6/bin/oninit]
( 3)  0x4000000000fad904   afwarn_interface + 0x4c  [/informix/IDS11.50.fc6/bin/oninit]
( 4)  0x4000000000a1eac8   bring_media_down + 0x9a0  [/informix/IDS11.50.fc6/bin/oninit]
( 5)  0x4000000000b31c78   rollfwd_error + 0x2b8  [/informix/IDS11.50.fc6/bin/oninit]
( 6)  0x4000000000b7f534   rlogm_redo + 0x82c  [/informix/IDS11.50.fc6/bin/oninit]
( 7)  0x4000000000b20e48   scan_logredo + 0x998  [/informix/IDS11.50.fc6/bin/oninit]
( 8)  0x4000000000b216e4   scan_logredo + 0x1234  [/informix/IDS11.50.fc6/bin/oninit]
( 9)  0x4000000000b1f80c   next_lscan + 0x87c  [/informix/IDS11.50.fc6/bin/oninit]
(10)  0x4000000000fbb598   prod_loop1 + 0x2e8  [/informix/IDS11.50.fc6/bin/oninit]
(11)  0x4000000000fbbb30   producer_thread + 0x330  [/informix/IDS11.50.fc6/bin/oninit]
(12)  0x4000000000f7cf34   startup + 0xd4  [/informix/IDS11.50.fc6/bin/oninit]
(13)  0x4000000000f7cd1c   resume + 0x10c  [/informix/IDS11.50.fc6/bin/oninit]

 base: 0xc000000147673000
  len:   69632
   pc: 0x0000000000000000
  tos: 0xc000000147675380
state: running
   vp: 1



23:31:43   See Also: /ifmx_dump/my_instance/af.44a0b12

---------------------------------
Begin System Alarm Program Output
---------------------------------

Assertion Failure Type: Warning
Host Name:              drserver
Database Server Name:   my_instance
Time of failure:        Tue Mar 31 23:31:44 EDT 2020
AF file:                /ifmx_dump/my_instance/af.44a0b12
Shared memory file:     None
System Blocking:        OFF

I'm not sure what the OLDRSAM:CHALLOC entry is showing. Is it saying that the table (partition) added an extent?
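
As a sanity check on the record itself, the hex dump can be lined up against the values the server printed for it. The sketch below assumes, purely from comparing this dump with the "type = OLDRSAM:CHALLOC(51), trans = 5074" message, that the first 32-bit word is the record length in bytes, the second is the record type, and the seventh is the transaction id. That layout is an inference from this one dump, not something taken from Informix documentation, and the remaining words are left undecoded.

```python
# Dump lines copied from the CHALLOC record in the af file above.
CHALLOC_DUMP = """
c000000146f6b060: 00000034 00000033 00000090 00000000   ...4...3 ........
c000000146f6b070: 00000000 00000000 000013d2 01530544   ........ .....S.D
c000000146f6b080: 91e8c375 00000000 00220730 0000000a   ...u.... .".0....
c000000146f6b090: 00000080                              ....
"""

HEX_DIGITS = set("0123456789abcdef")


def dump_words(dump):
    """Extract the 32-bit hex words from an af-file style dump."""
    words = []
    for line in dump.strip().splitlines():
        # Drop the memory address before the first colon.
        payload = line.split(":", 1)[1]
        # At most four words per line; stop at the ASCII column.
        for token in payload.split()[:4]:
            if len(token) == 8 and set(token) <= HEX_DIGITS:
                words.append(int(token, 16))
            else:
                break
    return words


words = dump_words(CHALLOC_DUMP)

# Assumed header positions (inferred from this dump only).
record_length = words[0]   # 0x34
record_type = words[1]     # 0x33
transaction_id = words[6]  # 0x13d2

# Number of bytes actually present in the dump.
dumped_bytes = len(words) * 4
```

Under those assumptions the dump holds 52 bytes, which matches the length word (0x34), and the type (51) and transaction id (5074) match what the server reported. That is a weak hint that the log record arrived intact, but it says nothing about why the roll-forward failed.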

Our production instance is running with no reported problems. I've looked in the online.log for the relevant time period and there is nothing other than log complete/backup started/backup completed messages, and some checkpoint messages. Since the log backups came from there, I would expect any problems other than a failed disk to show up on that server as well, but as I said, it looks fine. Users are on the system, doing their normal work.

Our Unix sysadm has looked in syslog and dmesg, but does not see anything that looks out of place. He also ran ioscan, and no issues were found. Looking at vgdisplay shows all volumes syncd and available. He has not run chkdsk yet, as the volume group is a RAID 10 striped across several disks, so it would take a while to complete.

Any suggestions on what to look for? I can just restore from the latest Level 0 archive and restart the continuous log restore, but I'd really like to be sure that there are no underlying problems first.

------------------------------
Mark Collins
------------------------------

#Informix

2. RE: down chunk

Eric Vercelletto

Posted Wed April 01, 2020 11:26 AM

Mark, did you try running oncheck on the CDR server?

It looks like you have corruption in chunk 7, which would be the cause of the down chunk. Take a look at the .af file and see what it says.

Is this RSS that you are using to replicate?

3. RE: down chunk

Mark Collins

Posted Wed April 01, 2020 11:40 AM

Eric,

We're not using RSS or CDR in this situation. This is Continuous Log Restore, where ontape is called repeatedly to apply new logical log backups over a period of time. After each logical log is applied, the instance is left in a state that allows more logical log backups to be applied. The plan is that if we ever have an actual emergency and have to switch over to the DR server, we just run one last ontape command to bring the instance online.

I'm going to see if I can find the table tied to that partition so that I can run oncheck against it. I wish there was an oncheck option to check a whole chunk (or dbspace) at a time.

------------------------------
Mark Collins
------------------------------

4. RE: down chunk

Eric Vercelletto

Posted Wed April 01, 2020 11:57 AM

I see.

This oncheck option would be a good candidate for a Request for Enhancement (RFE). You can submit one here:
https://ibm-data-and-ai.ideas.aha.io/?project=INFX

I will vote for it!
Eric Vercelletto
Data Management Architect and Owner / Begooden IT Consulting
Board of Directors, International Informix Users group
IBM Champion 2013,2014,2015,2016,2017,2018,2019,2020

Tel: +33(0) 298 51 3210
Mob : +33(0)626 52 50 68
skype: begooden-it
Google Hangout: eric.vercelletto@begooden-it.com
Email: eric.vercelletto@begooden-it.com
www : http://www.vercelletto.com
www https://kandooerp.org

5. RE: down chunk

Mark Collins

Posted Wed April 01, 2020 12:01 PM

I like your thinking. I'll get on that shortly.

------------------------------
Mark Collins
------------------------------

6. RE: down chunk

Mark Collins

Posted Wed April 01, 2020 11:56 AM

I found the database:table combination identified by the partnum (pn) in the HINSERT log record, and ran oncheck -cd against that table. When I run it on our production instance, it does not report anything, and returns with a 0 return code. When I run it on our DR server (where the error occurred), I get "ISAM error: Primary and Mirror chunks are bad", with return code = 2. It ran so quickly that I'm not sure whether it actually looked at the table pages on the disk, or if it is basing its assessment on the fact that the chunk is down.
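
For readers following along, the partnum lookup could be sketched as below. The post does not say which query was used; going through `sysmaster:systabnames` is one common route and is an assumption here. The split of the partnum into a dbspace number (upper 12 bits) and a page number within that dbspace's tblspace tblspace (lower 20 bits) is the usual Informix convention. `split_partnum` and `lookup_sql` are hypothetical helper names.

```python
def split_partnum(partnum_hex):
    """Split a hex partnum (as shown after 'pn:' in the log record) into parts."""
    partnum = int(partnum_hex, 16)
    dbspace_number = partnum >> 20       # upper 12 bits
    tblspace_page = partnum & 0xFFFFF    # lower 20 bits
    return partnum, dbspace_number, tblspace_page


def lookup_sql(partnum):
    """Build a query that maps a decimal partnum to database, owner and table."""
    return (
        "SELECT dbsname, owner, tabname FROM sysmaster:systabnames "
        f"WHERE partnum = {int(partnum)};"
    )


# The partnum from the HINSERT records in the af file.
partnum, dbspace_number, tblspace_page = split_partnum("00611ad7")
query = lookup_sql(partnum)
```

The generated query can be run through dbaccess against the sysmaster database, and the resulting database:table pair passed to `oncheck -cd` as described above.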

------------------------------
Mark Collins
------------------------------

7. RE: down chunk

Eric Vercelletto
Posted Wed April 01, 2020 12:08 PM

That's exactly where Tech Support could help ☹

Look, I once had a customer in a more or less similar situation (wanting to migrate to 14.10, which I obviously recommend).
TS has tools to look deep into your chunks, so what I would do is talk with your IBM representative (if he even knows what the word INFORMIX means), commit in some way to migrating to 14.10, and ask for special authorization to get exceptional support.

My customer had stopped paying for support; he was able to negotiate reinstating the TS contract, and TS could then finally solve the problem.

Worth a try, but I might lose some friends here.

Eric

Eric Vercelletto
Data Management Architect and Owner / Begooden IT Consulting
Board of Directors, International Informix Users group
IBM Champion 2013,2014,2015,2016,2017,2018,2019,2020

Tel: +33(0) 298 51 3210
Mob : +33(0)626 52 50 68
skype: begooden-it
Google Hangout: eric.vercelletto@begooden-it.com
Email: eric.vercelletto@begooden-it.com
www : http://www.vercelletto.com
www https://kandooerp.org


8. RE: down chunk

Andreas Legner

IBM Champion

Posted Wed April 01, 2020 12:16 PM

Seeing that CHALLOC failure, with CHALLOC indeed being an extent of currently FREE pages being assigned/allocated to an object, typically a partition, one would have to suspect some sort of extent problem.
As this replicated over from the primary, there's a chance the same problem got introduced there, unnoticed.

To be sure your real problem isn't on the primary (where it might cause further havoc), I'd first run an 'oncheck -ce <dbspace_name>' on the dbspace containing chunk #7. Should that come back clean, you'd be good to recreate the 'secondary' from a fresh backup. Should it show an extent overlap or other error, we'd have to see from there - yet make sure you're not losing your latest backup from before the initial problem.

------------------------------
Andreas Legner
------------------------------


9. RE: down chunk

Mark Collins

Posted Wed April 01, 2020 12:56 PM

Andreas,

The oncheck -ce did not report any errors on the primary. When I run it on the DR server, I get:

> oncheck -ce

Validating extents for Space 'rootdbs' ...
ERROR: Failed to get header page for partnum 0x100001 (buffer may be locked).
       Please limit DDL/DML activity when running this command.

Validating extents for Space 'tempdbs1' ...

 Chunk Pathname                             Pagesize(k)  Size(p)  Used(p)  Free(p)
     2 /informix/links/rdb1                           2  1750000       53  1749947


Validating extents for Space 'llogsdbs' ...

Validating extents for Space 'indexdbs' ...
ERROR: Failed to get header page for partnum 0x400001 (buffer may be locked).
       Please limit DDL/DML activity when running this command.

Validating extents for Space 'tempdbs2' ...

 Chunk Pathname                             Pagesize(k)  Size(p)  Used(p)  Free(p)
     5 /informix/links/rdb2                           2  1750000       53  1749947


Validating extents for Space 'tempdbs3' ...

 Chunk Pathname                             Pagesize(k)  Size(p)  Used(p)  Free(p)
     9 /informix/links/rdb3                           2  1750000       53  1749947


Validating extents for Space 'trainingdbs' ...
ERROR: Failed to get header page for partnum 0x900001 (buffer may be locked).
       Please limit DDL/DML activity when running this command.

Validating extents for Space 'plog2dbs' ...

 Chunk Pathname                             Pagesize(k)  Size(p)  Used(p)  Free(p)
    21 /informix/links/rdb9                           2  3145728  3145728        0

I don't know whether the fact that the DR instance is still in Fast Recovery mode (due to the Continuous Log Restore) is causing any of those errors.
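One way to triage output like this is to sort the dbspaces into those that validated and those that raised an ERROR. The following is a minimal Python sketch, not an Informix tool: the function name `summarize_oncheck_ce` and the regular expressions are assumptions based only on the output format shown above, and the sample text is an excerpt of that output. It does not tell you whether the errors are caused by the instance being in recovery mode.

```python
import re

# Patterns inferred from the 'oncheck -ce' output pasted above (assumption).
SPACE_RE = re.compile(r"^Validating extents for Space '([^']+)'")
ERROR_RE = re.compile(r"^ERROR:\s*(.*)")


def summarize_oncheck_ce(text):
    """Map each dbspace name to the list of ERROR messages reported under it."""
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        space = SPACE_RE.match(line)
        if space:
            current = space.group(1)
            results[current] = []
            continue
        error = ERROR_RE.match(line)
        if error and current is not None:
            results[current].append(error.group(1))
    return results


# Excerpt of the output from this post, used as sample input.
SAMPLE = """\
Validating extents for Space 'rootdbs' ...
ERROR: Failed to get header page for partnum 0x100001 (buffer may be locked).
       Please limit DDL/DML activity when running this command.

Validating extents for Space 'tempdbs1' ...

 Chunk Pathname                             Pagesize(k)  Size(p)  Used(p)  Free(p)
     2 /informix/links/rdb1                           2  1750000       53  1749947
"""

summary = summarize_oncheck_ce(SAMPLE)
for space, errors in summary.items():
    status = "ERROR" if errors else "ok"
    print(f"{space}: {status}")
```

To use it on a real run, save the `oncheck -ce` output to a file and pass the file contents to `summarize_oncheck_ce`.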

------------------------------
Mark Collins
------------------------------


10. RE: down chunk

Mark Jalkiewicz
Posted Wed April 01, 2020 03:54 PM

Not sure if this defect is applicable to a CLR instance, but it's worth a look since you cannot obtain support:

https://www.ibm.com/support/pages/node/4915095

IC68817: MISMATCH IN MIRROR SETTING IN ONCONFIG CAN LEAD TO CHALLOC ROLLFORWARD ERRORS ON ALL TYPES OF SECONDARY SERVERS

11. RE: down chunk

Mark Collins
Posted Wed April 01, 2020 07:03 PM

Mark,

Thanks. Both servers have the same settings for mirroring - no mirroring enabled.

Still searching.

------------------------------
Mark Collins
------------------------------


12. RE: down chunk

Art Kagel

IBM Champion

Posted Wed April 01, 2020 01:02 PM

If your admins don't see any problems with the disk structures on the DR site, then I would run a set of onchecks on the primary during the next quiet time. If all is well there, take and restore a level 0 archive and put the server into log restore mode again. I suggest the following:

oncheck -cR
oncheck -ce
oncheck -cc <for each database>

That should be sufficient since the problem appears to be structural rather than a data/index problem. The following are therefore optional:

oncheck -cDI <for each database>
oncheck -cS

Of the first group, only the -cR will take much time (though if you have lots of dumb blobs, the -ce may take a while as it validates the blobspace pages).
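The check sequence above could be scripted so that the per-database commands are not typed by hand. This is only a sketch in Python: the helper `build_oncheck_plan` and the database name `mydb` are hypothetical, and the only oncheck options used are the ones listed in this post. The sketch prints the commands rather than running them.

```python
import shlex


def build_oncheck_plan(databases, include_optional=False):
    """Return the oncheck command lines from this post as argv lists."""
    # Structural checks: reserved pages, extents, then catalogs per database.
    plan = [["oncheck", "-cR"], ["oncheck", "-ce"]]
    plan += [["oncheck", "-cc", db] for db in databases]
    if include_optional:
        # Optional data/index and smart-blob checks, as listed above.
        plan += [["oncheck", "-cDI", db] for db in databases]
        plan.append(["oncheck", "-cS"])
    return plan


# 'mydb' is a placeholder database name, not one from this thread.
plan = build_oncheck_plan(["mydb"], include_optional=True)
for argv in plan:
    print(shlex.join(argv))
```

To actually execute the plan, each `argv` could be passed to `subprocess.run` as the informix user during a quiet period; that step is left out here on purpose.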

------------------------------
Art Kagel
------------------------------


13. RE: down chunk

Mark Collins

Posted Wed April 01, 2020 01:13 PM

Art,

I've run the oncheck -ce in response to Andreas' post, and it did not report any problems on the primary. I just now ran oncheck -cc for the database identified from the partnum in the HINSERT log entry. The only thing it reported was a missing sysdepend record for a view, but the message said that this can be ignored for views on tables in external databases, which this one is, so no problems there.
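For anyone following along, a partnum such as the `pn:00611ad7` in the HINSERT entry can be split into its parts with a few lines of Python. This sketch assumes the conventional partnum layout, in which the upper 12 bits hold the dbspace number and the lower 20 bits hold the page number within that dbspace's tblspace tblspace; the function name `split_partnum` is hypothetical.

```python
def split_partnum(partnum):
    """Split a 32-bit partnum into (dbspace number, tblspace page number).

    Assumes the conventional layout: upper 12 bits = dbspace number,
    lower 20 bits = page number within the tblspace tblspace.
    """
    dbspace = (partnum >> 20) & 0xFFF
    page = partnum & 0xFFFFF
    return dbspace, page


# Partnum taken from the HINSERT record in the af file.
dbspace, page = split_partnum(0x00611AD7)
print(f"dbspace number {dbspace}, tblspace page 0x{page:x}")
```

Under that assumption the table lives in dbspace number 6. The owning database and table name can then be looked up by partnum in `sysmaster:systabnames`.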

I will try to get the oncheck -cR later today. I did run oncheck -cr, and it did not find anything. I know, the -cR will be much more thorough, but I figured it wouldn't hurt to do the quicker version first.

------------------------------
Mark Collins
------------------------------


14. RE: down chunk

Mark Collins

Posted Wed April 01, 2020 07:04 PM

Art,

I got a chance to run oncheck -cR. It reported no errors:

> oncheck -cR

Validating IBM Informix Dynamic Server reserved pages

    Validating PAGE_PZERO...

    Validating PAGE_CONFIG...


    Validating PAGE_1CKPT & PAGE_2CKPT...
          Using check point page PAGE_1CKPT.

Validating physical log pages ...

Validating logical logs ...

    Validating PAGE_1DBSP & PAGE_2DBSP...
          Using DBspace page PAGE_2DBSP.

    Validating PAGE_1PCHUNK & PAGE_2PCHUNK...
          Using primary chunk page PAGE_2PCHUNK.

    Validating PAGE_1ARCH & PAGE_2ARCH...
          Using archive page PAGE_1ARCH.
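
For anyone who wants to script this kind of check rather than eyeball the output, here is a minimal sketch in Python. It only scans captured oncheck text for words that look like trouble. The keyword list and the function name `suspect_lines` are assumptions made for illustration, not a documented set of oncheck messages, so treat a clean result as a hint rather than proof.

```python
import re

# Hypothetical keyword pattern: guesses at words that might signal a problem.
# This is NOT a documented list of oncheck error messages.
SUSPECT = re.compile(r"\b(error|bad|corrupt|invalid|fail(ed|ure)?)\b", re.IGNORECASE)


def suspect_lines(output: str) -> list:
    """Return the non-blank lines of captured output that match a suspect keyword."""
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip() and SUSPECT.search(line)
    ]


# The output pasted above, captured as text.
sample = """Validating IBM Informix Dynamic Server reserved pages
    Validating PAGE_PZERO...
    Validating PAGE_CONFIG...
    Validating PAGE_1CKPT & PAGE_2CKPT...
          Using check point page PAGE_1CKPT.
Validating physical log pages ...
Validating logical logs ...
    Validating PAGE_1DBSP & PAGE_2DBSP...
          Using DBspace page PAGE_2DBSP.
    Validating PAGE_1PCHUNK & PAGE_2PCHUNK...
          Using primary chunk page PAGE_2PCHUNK.
    Validating PAGE_1ARCH & PAGE_2ARCH...
          Using archive page PAGE_1ARCH.
"""

if __name__ == "__main__":
    hits = suspect_lines(sample)
    print("suspect lines:", hits)
```

Run against the output above, nothing is flagged, which matches what I saw by eye.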

------------------------------
Mark Collins
------------------------------


15. RE: down chunk

SangGyu Jeong

Posted Wed April 01, 2020 08:31 PM

@Mark Collins
I'm not sure whether the Continuing Support Offering is still available, but if it is, you should be able to contact IBM about this issue.
https://www.ibm.com/support/pages/informix-continuing-support-offering

------------------------------
SangGyu Jeong
Software Engineer
Infrasoft
Seoul Korea, Republic of
------------------------------

