
zFS Online Salvage and Disabled Aggregate Recovery

By Lora Milczewski posted Wed March 25, 2020 12:10 PM

  

Before V2R3, the zFS kernel could not perform salvage verification and repair of a mounted aggregate. If an aggregate had a problem, zFS disabled the aggregate for all access, and the user had to manually unmount the file system and run an offline repair.

 

If zFS finds a problem that might lead to corruption of an aggregate, or if the aggregate is already corrupted, zFS disables access to the aggregate. After approximately one minute, zFS initiates an internal remount (NORWSHARE) or a chgowner operation (RWSHARE) to reinitialize all in-memory information about that aggregate, so that zFS starts with a clean in-memory view of the aggregate. This frequently prevents corruption from reaching disk, but sometimes the disk is in fact corrupted. If so, the corruption will likely be encountered again and zFS will disable access again. If zFS disables the same aggregate 3 times, it remains disabled.
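
The disable-and-recover cycle described here can be modeled as a small state machine. The following Python sketch only illustrates the behavior as this article describes it; it is not zFS code, and the names AggregateState, on_problem_detected, and on_recovery_attempt are hypothetical.

```python
from dataclasses import dataclass

# From the text: after the third disable, the aggregate stays disabled.
MAX_DISABLES = 3


@dataclass
class AggregateState:
    """Illustrative model of one aggregate; not an actual zFS structure."""
    name: str
    disable_count: int = 0
    disabled: bool = False

    def on_problem_detected(self) -> None:
        # zFS disables access when it detects (potential) corruption.
        self.disable_count += 1
        self.disabled = True

    def on_recovery_attempt(self) -> bool:
        """Model the internal remount or chgowner; True if access resumes."""
        if not self.disabled:
            return False  # nothing to recover
        if self.disable_count >= MAX_DISABLES:
            return False  # stays disabled until it is salvaged
        self.disabled = False
        return True


aggr = AggregateState("OMVSSPT.DEMM.ZFS")
for _ in range(3):
    aggr.on_problem_detected()
    recovered = aggr.on_recovery_attempt()
print(aggr.disable_count, aggr.disabled, recovered)
```

In this model the first two recovery attempts succeed and the third leaves the aggregate disabled, which is the situation that online salvage is meant to address.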

 

In V2R3, zFS lets the customer initiate an online salvage. With online salvage, zFS can perform salvage verification and repair of a R/W file system without unmounting it. If an aggregate has been disabled 3 times, zFS can now repair it online without a manual unmount, so the file system becomes usable again without any change to the mount tree.

 

Online salvage allows concurrent reading of the aggregate while salvage verification is taking place; user activity is stopped only if a repair is needed. Because verification also performs all the repair computations, it accounts for over 99% of the time required for a salvage operation. Thus, files and directories remain readable while verification is being performed.

 

zFS also provides a new ZFSCALL_AGGR PFSCTL command (opcode 0x40000005) that lets an application request a salvage of a zFS file system. Note that this command can be long running, so the calling task might wait a long time.

 

To invoke the command, the issuer must be logged in as a root user (UID=0) or have READ authority to the SUPERUSER.FILESYS.PFSCTL resource in the z/OS UNIXPRIV class. To check your access, issue rlist unixpriv SUPERUSER.FILESYS.PFSCTL au.
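
The authorization rule is an either/or check, which a short Python sketch can make explicit. This is an illustration only: the function name may_request_salvage is hypothetical, and the sketch assumes that any RACF access level at or above READ satisfies the requirement.

```python
# RACF access levels in increasing order of authority.
ACCESS_LEVELS = ["NONE", "EXECUTE", "READ", "UPDATE", "CONTROL", "ALTER"]


def may_request_salvage(uid: int, pfsctl_access: str = "NONE") -> bool:
    """Return True if the rule quoted above would allow the request.

    pfsctl_access is the caller's access to SUPERUSER.FILESYS.PFSCTL in
    the UNIXPRIV class (assumption: READ or any higher level qualifies).
    """
    if uid == 0:
        return True  # root users are always allowed
    level = ACCESS_LEVELS.index(pfsctl_access.upper())
    return level >= ACCESS_LEVELS.index("READ")


print(may_request_salvage(0))            # root user
print(may_request_salvage(138, "READ"))  # non-root user with READ access
print(may_request_salvage(138, "NONE"))  # non-root user without access
```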

 

zFS provides a salvage command (and a corresponding API) for salvaging a file system. The syntax of the command is shown here.

zfsadm salvage -aggregate name [{-verifyonly|-cancel}][-trace file_name][-level][-help]

 

Where:

  • aggregate name: specifies the name of the aggregate.
  • cancel: specifies that the salvage for this aggregate is to be canceled.
  • help: prints the online help for this command.
  • level: prints the level of the zfsadm command.
  • trace file_name: specifies the name of the file that will have the trace records written into it.
  • verifyonly: indicates whether only verification should be performed. If -verifyonly is not specified, then both verification and repair are performed.
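
For scripting, the syntax can be captured in a small helper that builds the argument list and rejects the one combination the syntax rules out (-verifyonly together with -cancel). This is a minimal Python sketch; the function build_salvage_argv and its parameter names are hypothetical, and only the zfsadm options shown in the syntax line are used.

```python
from typing import List, Optional


def build_salvage_argv(
    aggregate: str,
    verify_only: bool = False,
    cancel: bool = False,
    trace_file: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a zfsadm salvage invocation."""
    if not aggregate:
        raise ValueError("an aggregate name is required")
    # The syntax shows {-verifyonly|-cancel}: at most one of the two.
    if verify_only and cancel:
        raise ValueError("-verifyonly and -cancel are mutually exclusive")
    argv = ["zfsadm", "salvage", "-aggregate", aggregate]
    if verify_only:
        argv.append("-verifyonly")
    if cancel:
        argv.append("-cancel")
    if trace_file is not None:
        argv += ["-trace", trace_file]
    return argv


print(" ".join(build_salvage_argv("OMVSSPT.DEMI.ZFS", verify_only=True)))
```

On a system where zfsadm is available, the returned list could be passed to subprocess.run; keep in mind that the command can run for a long time.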

Salvage processing is driven by the zFS owner. The zfsadm salvage command does not provide detailed status information; that information is available in the system log of the zFS owner. The “zfsadm fsinfo” or “F ZFS,FSINFO” command can also be used to display minimal point-in-time information about the progress of a salvage operation.

 

The “zfsadm fsinfo” and “F ZFS,FSINFO” commands provide a new status field indicator, SL, which means salvaging. The legend in the command output also shows the new SL indicator and its meaning. The user can specify SL in the selection criteria to select aggregates that are being salvaged. Additionally, the commands show the progress of an aggregate that is being salvaged if the -owner statistics are selected.

 

An example is shown here:

File System Name: OMVSSPT.DEMM.ZFS

 

  *** owner information ***

  Owner:              Z4             Converttov5:           OFF,n/a

  Size:               35015000K      Free 8K Blocks:        3941490       

  Free 1K Fragments:  0              Log File Size:         32800K        

  Bitmap Size:        4840K          Anode Table Size:      197360K       

  File System Objects: 602003         Version:               1.5

  Overflow Pages:     0              Overflow HighWater:    0             

  Thrashing Objects:  0              Thrashing Resolution:  0             

  Token Revocations:  0              Revocation Wait Time:  0.000         

  Devno:              64953          Space Monitoring:      0,0

  Quiescing System:   Z4             Quiescing Job Name:    n/a           

  Quiescor ASID:      n/a            File System Grow:      ON,0

  Status:             RW,RS,Q,NE,NC,SL

  Audit Fid:          D7F2E4E2 F9F21A95 0000

  Backups:            0              Backup File Space:     0K

  Salvage Verify:     AN,33% started at Apr 17 02:29:15 2017 task 006AAB70

 

  File System Creation Time: Nov 3 03:02:44 2016

  Time of Ownership:        Apr 14 05:10:00 2017

  Statistics Reset Time:    Apr 14 05:10:00 2017

  Quiesce Time:             n/a

  Last Grow Time:           n/a

 

  Connected Clients:  Z1 Z2 Z3

 

 

Legend: RW=Read-write, Q=Quiesced, RS=Mounted RWSHARE, NE=Not encrypted

       NC=Not compressed, SL=Salvaging

       AN=Anode table scan (salvage step 4 of 10)

 

Where:

  • Salvage Repair – a new line that is displayed if the aggregate is undergoing a salvage operation. The label is Salvage Verify instead if only verification was requested. The field can have the following values:
  • QS – Aggregate is being quiesced.
  • CS – Core structures are being validated, such as log file and AFL.
  • BS – The bitmap structure is being verified.
  • AN, PCT – The anode table is being scanned, its structure and file anodes are being verified, and the percent complete is shown.
  • DV, PCT – The directory tree and directory contents are being scanned and verified, and the percent complete is shown.
  • AL – Anode table lists are being verified.
  • LC – Link counts are being verified.
  • BM – Bitmap contents are being verified.
  • RA – Aggregate anodes, bitmaps and directories are being repaired.
  • RT – V5 directory trees are being repaired.
  • The time of day when the long-running command was started is also shown.
  • The long-running admin task that the command is running on is also shown.
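
A script that polls fsinfo can pick these fields out of the Salvage line. The Python sketch below is one possible way to do that, written against the two sample lines in this article only; the names SalvageProgress and parse_salvage_line are hypothetical, and the step numbering assumes the codes occur in the order listed above (which agrees with the "step 3 of 10" and "step 4 of 10" legends in the samples).

```python
import re
from typing import NamedTuple, Optional

# Step codes in the order listed above (assumed to be the step order).
STEP_CODES = ["QS", "CS", "BS", "AN", "DV", "AL", "LC", "BM", "RA", "RT"]

SALVAGE_LINE = re.compile(
    r"Salvage (?P<mode>Verify|Repair):\s+"
    r"(?P<step>[A-Z]{2})(?:,(?P<pct>\d+)%)?\s+"
    r"started at (?P<started>.+?)\s+task (?P<task>[0-9A-F]+)"
)


class SalvageProgress(NamedTuple):
    mode: str               # "Verify" or "Repair"
    step: str               # two-letter step code, for example "AN"
    step_number: int        # 1-based position of the step code
    percent: Optional[int]  # only reported for some steps
    started: str            # start time exactly as printed
    task: str               # admin task address as printed


def parse_salvage_line(line: str) -> Optional[SalvageProgress]:
    """Parse one 'Salvage Verify:' or 'Salvage Repair:' line, else None."""
    m = SALVAGE_LINE.search(line)
    if m is None or m["step"] not in STEP_CODES:
        return None
    pct = int(m["pct"]) if m["pct"] is not None else None
    return SalvageProgress(
        m["mode"], m["step"], STEP_CODES.index(m["step"]) + 1,
        pct, m["started"], m["task"],
    )


sample = "  Salvage Verify:     AN,33% started at Apr 17 02:29:15 2017 task 006AAB70"
print(parse_salvage_line(sample))
```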

The following are some test cases I ran in our environment.

On the test system, zFS runs in a colony address space. The test file systems are OMVSSPT.DEMM.ZFS and OMVSSPT.DEMI.ZFS.

 

Case 1: File system not mounted

zfsadm salvage -aggregate OMVSSPT.DEMM.ZFS -verifyonly

IOEZ00712E Could not salvage aggregate OMVSSPT.DEMM.ZFS because it could not be found

 

Case 2: No salvage running, issue -cancel

138:/u/demi $ zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -cancel

IOEZ00713E Error 121 reason code EF176C04 received salvaging aggregate OMVSSPT.DEMI.ZFS.

 

139:/u/demi $ bpxmtext EF176C04

zFS Wed May 31 10:39:22 EDT 2017

Description: There is no salvage operation to cancel. The aggregate is not

currently being salvaged.

Action: If the aggregate specified is correct, there is no action to take.

Otherwise specify the correct aggregate and try again.
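
When zfsadm salvage is driven from a script, the error number and reason code can be pulled out of the IOEZ00713E message so that the reason code can be handed to bpxmtext, as was done by hand above. This Python sketch is based only on the message layout shown in this article; the names SalvageError and parse_salvage_error are hypothetical.

```python
import re
from typing import NamedTuple, Optional

IOEZ00713E = re.compile(
    r"IOEZ00713E Error (?P<errno>\d+) reason code (?P<reason>[0-9A-F]{8}) "
    r"received salvaging aggregate (?P<aggregate>\S+?)\.\s*$"
)


class SalvageError(NamedTuple):
    errno: int      # error number as printed in the message
    reason: str     # 8-digit hexadecimal reason code
    aggregate: str  # aggregate name without the trailing period


def parse_salvage_error(line: str) -> Optional[SalvageError]:
    """Return the fields of an IOEZ00713E message, or None if no match."""
    m = IOEZ00713E.search(line)
    if m is None:
        return None
    return SalvageError(int(m["errno"]), m["reason"], m["aggregate"])


line = ("IOEZ00713E Error 121 reason code EF176C04 received salvaging "
        "aggregate OMVSSPT.DEMI.ZFS.")
err = parse_salvage_error(line)
# Argument list for looking up the reason code text.
print(["bpxmtext", err.reason])
```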

 

Case 3: Successfully verified

zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -verifyonly

IOEZ00711I Aggregate OMVSSPT.DEMI.ZFS successfully verified.

 

zfsadm fsinfo -aggregate OMVSSPT.DEMI.ZFS

=>

128:/u/demi $ zfsadm fsinfo -aggregate OMVSSPT.DEMI.ZFS

File System Name: OMVSSPT.DEMI.ZFS

 

  *** owner information ***

  Owner:              Z3             Converttov5:           OFF,n/a

  Size:               64300320K      Free 8K Blocks:        7059895       

  Free 1K Fragments:  0              Log File Size:         32800K        

  Bitmap Size:        8880K          Anode Table Size:      8K            

  File System Objects: 9              Version:               1.5

  Overflow Pages:     0              Overflow HighWater:    0             

  Thrashing Objects:  0              Thrashing Resolution:  0             

  Token Revocations:  0              Revocation Wait Time:  0.000         

  Devno:              70089          Space Monitoring:      0,0

  Quiescing System:   Z3             Quiescing Job Name:    n/a           

  Quiescor ASID:      n/a            File System Grow:      ON,0

  Status:             RW,RS,Q,NE,CO,SL

  Audit Fid:          D7F2E4E2 F9F00017 0000

  Backups:            0              Backup File Space:     0K

  Salvage Verify:     BS started at Jun 26 03:03:50 2017 task 006ADE88

 

  File System Creation Time: Oct 21 02:32:33 2016

  Time of Ownership:        Jun 22 23:58:01 2017

  Statistics Reset Time:    Jun 22 15:49:12 2017

  Quiesce Time:             n/a

  Last Grow Time:           n/a

 

  Connected Clients:  Z2 Z1 Z4

 

 

Legend: RW=Read-write, Q=Quiesced, RS=Mounted RWSHARE, NE=Not encrypted

       CO=Compressed, SL=Salvaging

       BS=Bitmap struct scan (salvage step 3 of 10)

 

f omvs,pfs=zfs,fsinfo

=>

OMVSSPT.DEMI.ZFS                            Z3      RW,RS,Q,NE,CO,SL

 

Legend: RW=Read-write,RS=Mounted RWSHARE,NE=Not encrypted           

       NC=Not compressed,L=Low on space,RO=Read-only,Q=Quiesced    

       CO=Compressed,SL=Salvaging,GD=AGGRGROW disabled             

       GF=Grow failed,SE=Space errors reported,NS=Mounted NORWSHARE

 

N 4020000 Z3      2017177 03:03:50.02         00000090 IOEZ00729I Verification of aggregate OMVSSPT.DEMI.ZFS started   

M 0000000 Z3      2017177 03:03:50.04         00000290 IOEZ00705I Formatted v5 aggregate size 8037540 8K blocks, dataset

S                                                        size 628                                                        

E                                          628 00000290 8037540 8K blocks                                               

N 0000000 Z3      2017177 03:03:50.08         00000290 IOEZ00707I Log file size 4096 8K blocks, verified correct

N 0000000 Z3      2017177 03:03:53.96         00000290 IOEZ00709I Bitmap size 1109 8K blocks, verified correct             

M 0000000 Z3      2017177 03:03:53.96         00000290 IOEZ00951I Aggregate OMVSSPT.DEMI.ZFS anode table length=1(in 8K 634

E                                          634 00000290 blocks) LPI=0 not-encrypted compressed                              

 

N 0000000 Z3      2017177 03:04:11.79         00000290 IOEZ00782I Salvage has verified 1 of 1 pages in the anode table.      

M 0000000 Z3      2017177 03:04:11.80         00000290 IOEZ00782I Salvage has verified 1 of 1 directory block in the 669     

E                                          669 00000290 directory tree.                                                       

M 0000000 Z3      2017177 03:04:11.80         00000290 IOEZ00782I Salvage has verified 1 of 1 directories in the directory   

S                                                        670                                                                   

E                                          670 00000290 tree.                                                                 

M 0000000 Z3      2017177 03:04:11.80         00000290 IOEZ00782I Salvage has verified 1 of 1 pages in the partially-free 671

E                                          671 00000290 page list.                                                            

N 0000000 Z3      2017177 03:04:11.80         00000290 IOEZ00722I Primary file system size 1 8K blocks, verified correct     

M 0000000 Z3      2017177 03:04:11.86         00000290 IOEZ00739I Salvage processed 1 directory pages, 10 anodes, 3808 673   

E                                          673 00000290 indirect blocks and 1 anode table pages.                              

M 4020000 Z3      2017177 03:04:11.86         00000090 IOEZ00730I Verification of aggregate OMVSSPT.DEMI.ZFS completed, no   

S                                                        674                                                                   

E                                          674 00000090 errors found. 
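
The syslog records also show how long the verification took: IOEZ00729I marks the start and IOEZ00730I the completion. The Python sketch below extracts the timestamps of those two messages and computes the difference. It assumes the record layout seen in this excerpt (record type, a flags field, system name, date as yyyyddd, time, another field, then the message ID); the names RECORD and first_timestamps are hypothetical.

```python
import re
from datetime import datetime
from typing import Dict, Iterable

# Matches only N and M records, which carry a timestamp and a message ID.
RECORD = re.compile(
    r"^[NM]\s+\S+\s+\S+\s+(?P<date>\d{7})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{2})\s+\S+\s+(?P<msgid>IOEZ\d{5}[A-Z])"
)


def first_timestamps(syslog_lines: Iterable[str]) -> Dict[str, datetime]:
    """Map each message ID to the time of its first occurrence."""
    seen: Dict[str, datetime] = {}
    for line in syslog_lines:
        m = RECORD.match(line)
        if m and m["msgid"] not in seen:
            # %j parses the day-of-year part of the yyyyddd date.
            seen[m["msgid"]] = datetime.strptime(
                f"{m['date']} {m['time']}", "%Y%j %H:%M:%S.%f"
            )
    return seen


lines = [
    "N 4020000 Z3      2017177 03:03:50.02         00000090 IOEZ00729I Verification of aggregate OMVSSPT.DEMI.ZFS started",
    "M 4020000 Z3      2017177 03:04:11.86         00000090 IOEZ00730I Verification of aggregate OMVSSPT.DEMI.ZFS completed, no",
]
stamps = first_timestamps(lines)
elapsed = stamps["IOEZ00730I"] - stamps["IOEZ00729I"]
print(elapsed.total_seconds())
```

For the records above, the verification of this nearly empty aggregate took about 21.84 seconds.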

 

Case 4: Salvage + cancel

Session 1: issued zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -verifyonly

148:/u/demi $ zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -verifyonly

IOEZ00713E Error 120 reason code EF1769AA received salvaging aggregate OMVSSPT.DEMI.ZFS.

149:/u/demi $ bpxmtext EF1769AA

zFS Wed May 31 10:39:22 EDT 2017

Description: A long-running operation was interrupted by an unmount, a file

system shutdown command, or a cancel command. This is normal for long-running

operations.

 

Action: To continue the long-running operation, you will need to restart it.

 

 

Session 2: issued zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -cancel

 

156:/u/demi $ zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -cancel

IOEZ00854I Salvage for aggregate OMVSSPT.DEMI.ZFS successfully canceled.

 

Syslog information:

 

N 4020000 Z3      2017177 03:38:58.53         00000090 IOEZ00729I Verification of aggregate OMVSSPT.DEMI.ZFS started       

M 0000000 Z3      2017177 03:38:58.55         00000290 IOEZ00705I Formatted v5 aggregate size 8037540 8K blocks, dataset   

S                                                        size 287

E                                          287 00000290 8037540 8K blocks

N 0000000 Z3      2017177 03:38:58.55         00000290 IOEZ00707I Log file size 4096 8K blocks, verified correct

N 0000000 Z3      2017177 03:38:58.67         00000290 IOEZ00709I Bitmap size 1109 8K blocks, verified correct

M 0000000 Z3      2017177 03:38:58.67         00000290 IOEZ00951I Aggregate OMVSSPT.DEMI.ZFS anode table length=1(in 8K 290

E                                          290 00000290 blocks) LPI=0 not-encrypted compressed      

M 0000000 Z3      2017177 03:38:59.90         00000290 IOEZ00867I Salvage for aggregate OMVSSPT.DEMI.ZFS has been 296   

E                                          296 00000290 interrupted.

M 0000000 Z3      2017177 03:38:59.90         00000290 IOEZ00739I Salvage processed 0 directory pages, 9 anodes, 3808 297

E                                          297 00000290 indirect blocks and 1 anode table pages.

M 4020000 Z3      2017177 03:38:59.90         00000090 IOEZ00867I Salvage for aggregate OMVSSPT.DEMI.ZFS has been 298   

E                                          298 00000090 interrupted.      

 

Case 5: Salvage + shell unmount

Session 1: issued zfsadm salvage -aggregate OMVSSPT.DEMM.ZFS -verifyonly

160:/u/demi $ zfsadm salvage -aggregate OMVSSPT.DEMM.ZFS -verifyonly

IOEZ00713E Error 120 reason code EF1769AA received salvaging aggregate OMVSSPT.DEMM.ZFS.

161:/u/demi $ bpxmtext EF1769AA

zFS Wed May 31 10:39:22 EDT 2017

Description: A long-running operation was interrupted by an unmount, a file

system shutdown command, or a cancel command. This is normal for long-running

operations.

 

Action: To continue the long-running operation, you will need to restart it.

 

Session 2: issued

unmount -o immediate -f OMVSSPT.DEMM.ZFS

unmount -o force -f OMVSSPT.DEMM.ZFS

176:/u/demi $ unmount -o immediate -f OMVSSPT.DEMM.ZFS

FOMF0504I unmount error: 72 EF0969B5

EBUSY: The resource is busy

Description: An unmount was requested and the file system was busy with an administration command.

177:/u/demi $ unmount -o force -f OMVSSPT.DEMM.ZFS

 

Syslog information:

 

N 4020000 Z2      2017177 04:33:50.87         00000090 IOEZ00729I Verification of aggregate OMVSSPT.DEMM.ZFS started    

M 0000000 Z2      2017177 04:33:50.88         00000290 IOEZ00705I Formatted v5 aggregate size 4608000 8K blocks, dataset

S                                                        size 950

E                                          950 00000290 4608000 8K blocks

N 0000000 Z2      2017177 04:33:50.88         00000290 IOEZ00707I Log file size 4096 8K blocks, verified correct

N 0000000 Z2      2017177 04:33:55.72         00000290 IOEZ00709I Bitmap size 636 8K blocks, verified correct             

M 0000000 Z2      2017177 04:33:55.74         00000290 IOEZ00951I Aggregate OMVSSPT.DEMM.ZFS anode table length=1(in 8K 985

E                                          985 00000290 blocks) LPI=0 not-encrypted not-compressed                         

M 0000000 Z2      2017177 04:33:55.75         00000290 IOEZ00867I Salvage for aggregate OMVSSPT.DEMM.ZFS has been 986     

E                                          986 00000290 interrupted.

M 0000000 Z2      2017177 04:33:55.75         00000290 IOEZ00739I Salvage processed 0 directory pages, 3 anodes, 5 indirect

S                                                        987

E                                          987 00000290 blocks and 1 anode table pages.

M 4020000 Z2      2017177 04:33:55.75         00000090 IOEZ00867I Salvage for aggregate OMVSSPT.DEMM.ZFS has been 988     

E                                          988 00000090 interrupted.

N 0000000 Z2      2017177 04:33:55.80         00000290 IOEZ00048I Detaching aggregate OMVSSPT.DEMM.ZFS

 

Case 6: Salvage + TSO UNMOUNT

In shell: issued zfsadm salvage -aggregate OMVSSPT.DEMM.ZFS -verifyonly

zfsadm salvage -aggregate OMVSSPT.DEMM.ZFS -verifyonly

IOEZ00713E Error 120 reason code EF1769AA received salvaging aggregate OMVSSPT.DEMM.ZFS.

136:/u/demi $ bpxmtext EF1769AA

zFS Wed May 31 10:39:22 EDT 2017

Description: A long-running operation was interrupted by an unmount, a file

system shutdown command, or a cancel command. This is normal for long-running

operations.

 

Action: To continue the long-running operation, you will need to restart it.

 

In TSO:

UNMOUNT FILESYSTEM('OMVSSPT.DEMM.ZFS') IMMEDIATE

 

RETURN CODE 00000072, REASON CODE EF0969B5. THE UNMOUNT FAILED FOR FILE SYSTEM OMVSSPT.DEMM.ZFS.

 

bpxmtext EF0969B5                                                        

zFS Wed May 31 10:39:22 EDT 2017                                         

Description: An unmount was requested and the file system was busy with an

administration command.                                                  

                                                                         

Action: Either wait for the command to complete or issue unmount with the

FORCE option to interrupt the administration command.    

 

UNMOUNT FILESYSTEM('OMVSSPT.DEMM.ZFS') FORCE

 

Syslog information:

 

N 4020000 Z4      2017178 01:40:53.41         00000090 IOEZ00729I Verification of aggregate OMVSSPT.DEMM.ZFS started   

M 0000000 Z4      2017178 01:40:53.42         00000290 IOEZ00705I Formatted v5 aggregate size 7108020 8K blocks, dataset

S                                                        size 574

E                                          574 00000290 7108020 8K blocks

N 0000000 Z4      2017178 01:40:53.42         00000290 IOEZ00707I Log file size 4096 8K blocks, verified correct

N 0000000 Z4      2017178 01:40:55.33         00000290 IOEZ00709I Bitmap size 981 8K blocks, verified correct

M 0000000 Z4      2017178 01:40:55.33         00000290 IOEZ00951I Aggregate OMVSSPT.DEMM.ZFS anode table length=1(in 8K 577

E                                          577 00000290 blocks) LPI=0 not-encrypted not-compressed 

M 0000000 Z4      2017178 01:41:09.91         00000290 IOEZ00867I Salvage for aggregate OMVSSPT.DEMM.ZFS has been 604    

E                                          604 00000290 interrupted.

M 0000000 Z4      2017178 01:41:09.91         00000290 IOEZ00739I Salvage processed 0 directory pages, 4 anodes, 3221 605

E                                          605 00000290 indirect blocks and 1 anode table pages.

M 4020000 Z4      2017178 01:41:09.91         00000090 IOEZ00867I Salvage for aggregate OMVSSPT.DEMM.ZFS has been 606    

E                                          606 00000090 interrupted.

N 0000000 Z4      2017178 01:41:09.94         00000290 IOEZ00048I Detaching aggregate OMVSSPT.DEMM.ZFS 

 

Case 7: Privilege

The issuer must be logged in as a root user (UID=0) or have READ authority to

the SUPERUSER.FILESYS.PFSCTL resource in the z/OS UNIXPRIV class.

rlist unixpriv superuser.filesys.pfsctl au

 

CLASS     NAME                                         

-----     ----                                         

UNIXPRIV  SUPERUSER.FILESYS.PFSCTL                     

                                                        

LEVEL OWNER     UNIVERSAL ACCESS YOUR ACCESS WARNING

----- --------  ---------------- ----------- -------

 00   SYS1           NONE              READ   NO    

 

permit superuser.filesys.pfsctl class(unixpriv) id(demi) access(none)

setropts raclist(unixpriv) refresh

 

144:/u/demi $ zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -verifyonly

IOEZ00092E The user is not authorized to run this command.

 

id usswork

uid=0(bpxroot) gid=0(sys1)

 

su usswork

 

zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -verifyonly

 

IOEZ00711I Aggregate OMVSSPT.DEMI.ZFS successfully verified.

 

permit superuser.filesys.pfsctl class(unixpriv) id(demi) access(read)

setropts raclist(unixpriv) refresh

 

CLASS     NAME                                          

-----     ----                                          

UNIXPRIV  SUPERUSER.FILESYS.PFSCTL                      

                                                         

LEVEL OWNER     UNIVERSAL ACCESS YOUR ACCESS WARNING 

----- --------  ---------------- ----------- ------- 

 00   SYS1           NONE              READ   NO

 

150:/u/demi $ zfsadm salvage -aggregate OMVSSPT.DEMI.ZFS -trace /u/demi/log

IOEZ00711I Aggregate OMVSSPT.DEMI.ZFS successfully verified or repaired.

IOEZ00036I Printing contents of table at address 298AB000 name: Main Trace Table

IOEZ00042I Start record found, total records 662, 28632 bytes to format.

IOEZ00043I zfsadm: print of in-memory trace table has completed.

 

For more information about the zfsadm salvage command, refer to z/OS Distributed File Service zFS Administration -> Chapter 11. zFS commands -> zfsadm salvage.