ASM disk header corruption
Calling all ASM experts - we have the problem described in note 1487443.1. I have a simple question, as follows:
The affected diskgroups are shared by 4 databases, one on each of 4 servers in a cluster. These are all single-instance databases. One server in the cluster was shut down for some maintenance activity and, when it was restarted, some ASM diskgroups failed to mount, and subsequent investigation showed that we have the problem described above. The ASM instances on the other 3 servers are running correctly and have mounted the affected diskgroups.
My question is - do we have to shut down these other ASM instances (which would clearly affect the databases running on those servers) before implementing the fix described in the note? Any thoughts would be appreciated.
First, check whether the corruption is limited to the disk header by running the kfed tool:
1) kfed read <Disk path> aun=0 blkn=0
This command reads only the ASM disk header (allocation unit 0, block 0) of the disk.
2) For more information, you can also run the amdu tool:
amdu -diskstring '<your_path_to_ASM_disks>' -dump '<diskgroup>'
By default, this command generates the following files:
<diskgroup>_0001.img - an exact dump of the diskgroup contents; each file is limited to 2 GB, so there may be more than one
<diskgroup>.map - can be used to find the exact location of the ASM metadata on the disks
report.txt - includes details about the disks scanned
Check the report file to identify the corrupted sectors on the ASM disks.
If only the ASM disk header is corrupted, you can repair it as described in Doc ID 1487443.1,
after shutting down the ASM instances on all nodes.
If the corruption extends to allocation units beyond the disk header, you need to consult Oracle Support to resolve it.
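The header check from step 1 can be scripted when several disks need to be inspected. The following is only a minimal sketch: it assumes the kfed dump contains a kfbh.type line whose last token names the block type (KFBTYP_DISKHEAD for a healthy header), and the function name classify_header and the disk path in the usage comment are hypothetical placeholders, not something taken from the note.

```shell
#!/bin/sh
# Classify an ASM disk header from kfed output supplied on stdin.
# Assumption: the dump has a "kfbh.type:" line whose last field names
# the block type, e.g. KFBTYP_DISKHEAD for an intact disk header.
classify_header() {
    # Take the last field of the first kfbh.type line, if there is one.
    block_type=$(awk '/^kfbh\.type:/ { print $NF; exit }')
    case "$block_type" in
        KFBTYP_DISKHEAD) echo "ok" ;;
        "")              echo "unknown" ;;
        *)               echo "suspect ($block_type)" ;;
    esac
}

# Hypothetical usage (the disk path pattern is a placeholder):
# for disk in /dev/oracleasm/disks/*; do
#     printf '%s: ' "$disk"
#     kfed read "$disk" aun=0 blkn=0 | classify_header
# done
```

A result of "suspect" or "unknown" only flags a disk for closer inspection; it does not tell you whether the damage goes beyond the header.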
A disk can end up in the provisioned state under any of the following conditions:
+ Someone ran the createdisk command on the disk.
+ There is a checksum failure.
+ Someone deleted the disk and, after finding that it was no longer visible on the system, labelled it again using ASMLib.
So the corrective action depends on the impact.
Please raise a service request with Oracle to investigate further.
Thank you all for your help - we closed all the ASM instances and followed note 1487443.1 to correct the corruptions. All is now well and the customer is happy.