This question of mine is not relevant to the purpose of this group, but I could not resist my curiosity. I hope others share the same urge and that it may help us gain some knowledge about data replication.
I just watched a short CNN report about the widely reported disruption of the FAA's computer system (the day before, I also heard from CNN that it was an IBM system). It delivered the same content as the following news piece I got from CNN's web site:
[QUOTE from CNN]
The computer system that failed was the central database for all NOTAMs (Notice to Air Missions) nationwide. Those notices advise pilots of issues along their route and at their destination. It has a backup, which officials switched to when problems with the main system emerged, according to the source.
FAA officials told reporters early Wednesday that the issues developed in the 3 p.m. ET hour on Tuesday.
Officials ultimately found a corrupt file in the main NOTAM system, the source told CNN. A corrupt file was also found in the backup system.
[UNQUOTE]
The very last sentence above is the key point I would like to discuss here. How was the data corruption propagated to the backup system, causing this mishap? From the mention of a "backup system", I assume the FAA computer system uses some type of data replication to a DR system.
A long time ago, I heard an ISV that sold "logical data replication" on iSeries (MIMIX, or DataMirror, which is now iCluster; I do not recall which) compare its solution with a "disk HW replication" solution. One point that caught my interest was the ISV's claim that logical replication would NEVER propagate corrupted data to the target HA/DR system, because it never touched the source tables or propagated their data at all. It read the change records in the journal object and propagated from there. (With IBM i remote journaling, I would expect this to still hold.) So the target files would never be corrupted BY logical replication.
In contrast, the ISV said that disk HW replication worked by copying an entire physical disk sector (or page, or cluster, or whatever the jargon is) image in memory to the target disk sector, verbatim, bit by bit. If any glitch in the system (SAN box or server) or in application-level SW corrupted the data in the source tables, the disk HW replication microcode could NOT possibly know about it and would therefore faithfully propagate the corrupted sector without delay! I remember the ISV's tech rep even insisted that he knew of such a rare but unfortunately possible mishap. I also heard about this from a BP, but I have never personally encountered a case myself.
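To make the contrast between the two approaches concrete, here is a toy Python sketch. It is only an illustration under my own assumptions: a "disk" is modeled as a dict of sector number to bytes, a "journal" is a list of change records, and every name in it (block_replicate, logical_replicate, run_demo, the REC1/REC2 keys) is hypothetical. It is not how MIMIX, iCluster, remote journaling, or any SAN microcode is actually implemented.

```python
# Toy model only. All function names and record keys are made up
# for illustration; no real replication product works this simply.

def block_replicate(source_disk):
    """Hardware-style replication: copy every sector verbatim.

    It has no idea what the bytes mean, so it cannot tell good
    data from corrupted data.
    """
    return {sector: bytes(data) for sector, data in source_disk.items()}


def logical_replicate(journal, target_table):
    """Logical replication: replay journaled change records on the target.

    It never reads the source sectors, only the change records.
    """
    for op, key, value in journal:
        if op == "PUT":
            target_table[key] = value
        elif op == "DELETE":
            target_table.pop(key, None)
    return target_table


def run_demo():
    source_disk = {}
    journal = []

    # Normal application writes: each one lands on disk AND is journaled.
    writes = [("REC1", b"good data 1"), ("REC2", b"good data 2")]
    for sector, (key, value) in enumerate(writes):
        source_disk[sector] = value
        journal.append(("PUT", key, value))

    # A glitch below the application layer damages sector 0.
    # No journal entry is written for it.
    source_disk[0] = b"\x00\xff garbage"

    block_copy = block_replicate(source_disk)
    logical_copy = logical_replicate(journal, {})
    return source_disk, block_copy, logical_copy


if __name__ == "__main__":
    source_disk, block_copy, logical_copy = run_demo()
    print("block copy sector 0:", block_copy[0])
    print("logical copy REC1:  ", logical_copy["REC1"])
```

In this toy model the block-level copy ends up with the damaged sector, while the logical copy still holds the original record. Note the limit of the argument, though: if the bad data had arrived as an ordinary journaled change (for example, an application writing a wrong value), replaying the journal would carry it to the target as well.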
So, I'm wondering whether any of you (especially in the US) know if the affected FAA system uses disk HW replication (and therefore had the issue described above). If so, do you know whether this weak point has been, or will be, addressed in SAN microcode? (As I remember it, the SAN disk system microcode is owned by a company named FalconStor, or whatever new name it may have now, not IBM.)
I am just curious and want to understand the issue.
------------------------------
Right action is better than knowledge; but in order to do what is right, we must know what is right.
-- Charlemagne
Satid Singkorapoom
------------------------------