Originally posted by: GregioRenato
Hi everyone,
I'm asking for help with the following scenario:
I have an AIX 6.1 environment running SAP and Oracle under PowerHA with Cross-Site LVM Mirroring. It consists of 2 LPARs at different sites, on P795 hardware, with Cross-Site LVM Mirroring between 2 DS8700 storage systems. The sites are connected by a DWDM link, exactly as shown in the attached image.
Cross-Site LVM Mirroring is in place and my copies are in a perfect state (every PP on a Site A disk is mirrored to a PP on a Site B disk), and quorum is disabled on all my VGs.
When my DWDM link fails, it takes 3 seconds to migrate automatically to the alternate/redundant link. When this happens (the link to the other storage system is lost), AIX logs an error in the errpt database, and the VG detects that I have lost access to the remote-site disks and marks the LVs as stale.
Last month I had this problem: the link between the sites was lost for 3 seconds, so I lost access to the redundant storage and my systems kept running on the local storage only.
HACMP did not detect any errors, which was expected because I have Cross-Site LVM Mirroring, but I had several other problems that had a big impact on Oracle:
LVs marked as stale (expected)
AIX generated the error "PATH HAS FAILED" for the remote-site disks (expected)
AIX generated the error "I/O ERROR DETECTED BY LVM" (not expected)
Oracle could not access the filesystem and locked up
After stabilizing the environment, I opened a PMR with IBM, and I am trying to find out why I get "I/O ERROR DETECTED BY LVM" when my Cross-Site LVM Mirroring is intact.
I think this problem may be related to some disk tuning parameters, such as "hcheck_interval" or "rw_timeout": the disks wait a long time for a response from the second mirror copy, and Oracle cannot wait that long. So I am planning to tune these parameters to around 3 seconds.
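A minimal sketch of how such a change could be reviewed before it is applied: the small Python script below only prints the AIX commands and does not run them. The disk names (hdisk10, hdisk11) and the value 3 are placeholders, not taken from the real configuration, and it is an assumption that the disk driver accepts values this low; `lsattr -Rl` should report what is actually permitted.

```python
"""Print (not run) AIX commands to inspect and change disk timeout attributes."""

# Hypothetical values: replace with the real remote-site hdisks and with
# values that the disk driver actually permits.
REMOTE_DISKS = ["hdisk10", "hdisk11"]
ATTRIBUTES = {"hcheck_interval": 3, "rw_timeout": 3}  # seconds


def inspect_commands(disk, attributes):
    """Commands that show the current and the permitted values of each attribute."""
    commands = []
    for name in attributes:
        commands.append(f"lsattr -El {disk} -a {name}")  # current value
        commands.append(f"lsattr -Rl {disk} -a {name}")  # permitted values
    return commands


def change_command(disk, attributes):
    """One chdev call per disk; -P stores the change and applies it at the next reboot."""
    pairs = " ".join(f"-a {name}={value}" for name, value in attributes.items())
    return f"chdev -l {disk} {pairs} -P"


def build_plan(disks, attributes):
    """Full list of commands: inspect first, then change, disk by disk."""
    plan = []
    for disk in disks:
        plan.extend(inspect_commands(disk, attributes))
        plan.append(change_command(disk, attributes))
    return plan


if __name__ == "__main__":
    for line in build_plan(REMOTE_DISKS, ATTRIBUTES):
        print(line)
```

Printing the plan first makes it possible to compare the requested values against the `lsattr -Rl` output on one disk before touching the rest.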
Can someone help me find a solution to this problem?
Thanks,
Renato Gregio
#PowerHAforAIX #PowerHA-(Formerly-known-as-HACMP)-Technical-Forum