Oracle ASM ORA-15063 / ORA-15042 - TROUBLESHOOTING STEPS BEFORE OPENING a SR to Oracle Support

APPLIES TO:
Oracle Database - Enterprise Edition
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Information in this document applies to any platform.
 
 
 
PURPOSE
 
 
Self-debugging steps when a diskgroup cannot be mounted due to error ORA-15063:
 
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "%s"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "%s" is missing
 
 
TROUBLESHOOTING STEPS
SECTION A - Getting started
Start by referring to NOTE 452770.1 "TROUBLESHOOTING - ASM disk not found/visible/discovered issues".
First, identify all disks that are part of the affected diskgroup by looking at the last successful mount in alert_+ASM*.log.
 
Search for a section like the one below:
SQL> ALTER DISKGROUP <DGNAME1> MOUNT /* asm agent *//* {0:0:214} */
NOTE: cache registered group DATA number=1 incarn=0x44bef6bb
NOTE: cache began mount (not first) of group DATA number=1 incarn=0x44bef6bb
NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
NOTE: Assigning number (1,0) to disk (ORCL:DATA01P)
NOTE: Assigning number (1,1) to disk (ORCL:DATA02P)
NOTE: Assigning number (1,2) to disk (ORCL:DATA03P)
NOTE: Assigning number (1,3) to disk (ORCL:DATA04P)
NOTE: Assigning number (1,4) to disk (ORCL:DATA05P)
..
NOTE: cache opening disk 0 of grp 1: DATA01P label:DATA01P
NOTE: cache opening disk 1 of grp 1: DATA02P label:DATA02P
..
SUCCESS: DISKGROUP <DGNAME1> was mounted
 
 
NOTE: When ASMLIB is not used the path to ASM disk is specified within the mount section:
 NOTE: cache opening disk 1 of grp 1: REDO3_0001 path:/dev/mpath/3600601600ba12c00d4b784363e69e211 
 NOTE: cache opening disk 2 of grp 1: REDO3_0002 path:/dev/mpath/3600601600ba12c00d4b784363e69e212 
 ...
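
Pulling the disk list out of a long alert log by hand is error-prone, so the search above can be sketched as a small shell function. This is only a sketch: the function name last_mount_disks is hypothetical, and it assumes the alert log uses the "ALTER DISKGROUP ... MOUNT", "cache opening disk" and "SUCCESS: diskgroup ... was mounted" lines shown in the sample above.

```shell
#!/bin/sh
# last_mount_disks DGNAME ALERT_LOG   (hypothetical helper name)
# Prints the "cache opening disk" lines of the last SUCCESSFUL mount of DGNAME.
# Mount attempts that never reach the SUCCESS line are ignored.
last_mount_disks() {
    awk -v dg="$1" '
        BEGIN { dg = toupper(dg) }
        { line = toupper($0) }
        # A new mount attempt for this diskgroup starts a fresh buffer.
        index(line, "ALTER DISKGROUP " dg " MOUNT") { buf = ""; in_mount = 1; next }
        # Collect the disks opened during the current attempt.
        in_mount && /NOTE: cache opening disk/ { buf = buf $0 "\n" }
        # Keep the buffer only once the mount is reported as successful.
        in_mount && index(line, "SUCCESS: DISKGROUP " dg " WAS MOUNTED") { last = buf; in_mount = 0 }
        END { printf "%s", last }
    ' "$2"
}
```

Usage, with the log path left as a placeholder: last_mount_disks DATA <path_to_alert_+ASM_log>. The output shows either label: (ASMLIB) or path: (non-ASMLIB) for each disk, as in the samples above.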
 
 
Isolate the device(s) reported as "missing", as NOTE 452770.1 suggests.
 
Then run the following checks:
 
A1) Check whether any I/O, storage, or multipathing errors are reported in the OS logs. If so, investigate and fix them.
This step is mandatory, because ORA-15063/ORA-15042 are usually caused by underlying I/O or storage errors.
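
A first pass over the OS log for step A1 could look like the sketch below. The function name scan_os_log, the default log path /var/log/messages (a common Linux location) and the search patterns are all assumptions for illustration. They are not an exhaustive list of storage error signatures, so a clean result does not prove the storage is healthy.

```shell
#!/bin/sh
# scan_os_log [LOGFILE]   (hypothetical helper name)
# Prints lines that look like I/O or multipath errors.
# The patterns are illustrative assumptions; extend them for your platform and driver.
scan_os_log() {
    log=${1:-/var/log/messages}
    grep -iE 'i/o error|multipath.*(fail|error)|path.*(down|failed)' "$log"
}
```

Run it on every node of the cluster, since the errors may be reported on one node only.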
 
A2) Check whether the devices used by the ASM disks are properly presented and configured at OS level.
If "ORA-15075: disk(s) are not visible cluster-wide" is also reported, make sure that all devices are visible cluster-wide.
 
A3) Check whether all ASM disks have appropriate permissions (e.g. they should be owned by the grid owner).
If the ownership of any ASM disk has been changed for whatever reason, correct it.
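
The ownership check in A3 can be scripted along these lines. This is a minimal sketch: check_owner is a hypothetical helper, and the expected owner (for example grid) is whatever your installation uses.

```shell
#!/bin/sh
# check_owner EXPECTED_OWNER DEVICE...   (hypothetical helper name)
# Reports every device whose owner differs from EXPECTED_OWNER.
# Returns 0 when all devices match, 1 otherwise.
check_owner() {
    expected=$1
    shift
    rc=0
    for dev in "$@"; do
        # -L follows symlinks, so the owner of the real device node is reported.
        owner=$(ls -ldL "$dev" 2>/dev/null | awk '{print $3}')
        if [ "$owner" != "$expected" ]; then
            echo "MISMATCH $dev owner=${owner:-unknown} expected=$expected"
            rc=1
        fi
    done
    return $rc
}
```

For example: check_owner grid /dev/oracleasm/disks/*. A device that does not exist is also reported as a mismatch.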
 
A4) Check whether and how the "missing" device(s) are reported when querying v$asm_disk
-----------------------------------------------------------------------------------
If the device(s) is reported with status:
 
=> "PROVISIONED/CANDIDATE" - this means the header of the ASM disk(s) is damaged.
 
    -> investigate the I/O problems behind the corruption (see step A1). Oracle never wipes out its own metadata, and every write is checksummed before it is accepted.
 
    -> check the header status, in order to confirm the damage:   
$> kfed read <path_to_your_missing_devices>
       
        kfbh.endian:                          0 ; 0x000: 0x00
        kfbh.hard:                            0 ; 0x001: 0x00
        kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
        kfbh.datfmt:                          0 ; 0x003: 0x00
        kfbh.block.blk:                       0 ; 0x004: blk=0
        kfbh.block.obj:                       0 ; 0x008: file=0
        ....
         
    ->  try to repair the header and see if diskgroup can be mounted:                  
$> kfed repair <path_to_your_missing_devices>
 
    -> check whether additional corruption is reported by ASM (e.g. ORA-15196) or by your database, as I/O or storage problems can affect more than one block.
    If any corruption is seen, please open an SR with Oracle Support.
 
 
 NOTE:  
 1) When non-default AU size is used AUSZ=<au_size> must be specified with each KFED command.
 2) "kfed repair" works for 11g ONLY!
 
 
=> "UNKNOWN/IGNORED" - this means the ASM disk(s) is not seen at OS level.
    -> review steps A1, A2 and A3.
-----------------------------------------------------------------------------------   
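
The A4 check can be sketched in shell as a query plus a small lookup. Both function names are hypothetical. The query assumes an 11g or later ASM instance (hence "as sysasm"), an environment already set for that instance, and OS authentication. The advice strings only restate the two cases described above.

```shell
#!/bin/sh
# list_asm_disks   (hypothetical helper name)
# Prints "path header_status mount_status" for every disk ASM discovers.
# Assumption: ORACLE_HOME/ORACLE_SID point at the ASM instance, 11g or later.
list_asm_disks() {
    sqlplus -S "/ as sysasm" <<'EOF'
set pagesize 0 linesize 200 feedback off
select nvl(path, '-') || ' ' || header_status || ' ' || mount_status from v$asm_disk;
EOF
}

# advise STATUS   (hypothetical helper name)
# Maps a status keyword from step A4 to the follow-up described in the text.
advise() {
    case $1 in
        PROVISIONED|CANDIDATE) echo "header damaged: confirm with kfed read, see A1" ;;
        UNKNOWN|IGNORED)       echo "not seen at OS level: review A1, A2, A3" ;;
        *)                     echo "no A4 action for status $1" ;;
    esac
}
```

The two can be combined as: list_asm_disks | while read -r path hdr mnt; do echo "$path: $(advise "$hdr")"; done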
 
A5) If asm_diskstring is still properly set.
 
On Windows configurations, also refer to NOTE 880061.1 "ASM Is Unable To Detect SCSI Disks On Windows".
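
For the asm_diskstring check in A5 on Unix-like systems, a rough way to see whether a device path is covered by the current setting is sketched below. The helper name in_diskstring is hypothetical, and shell glob matching is used here only as an approximation of ASM disk discovery, so treat a mismatch as a hint to look closer rather than as proof. The value itself can be read in the ASM instance with "show parameter asm_diskstring".

```shell
#!/bin/sh
# in_diskstring PATH DISKSTRING   (hypothetical helper name)
# DISKSTRING is a comma-separated list of patterns without spaces around the commas.
# Returns 0 if PATH matches one of the patterns, 1 otherwise.
# Assumption: shell glob rules are close enough to ASM discovery for a first check.
in_diskstring() {
    path=$1
    old_ifs=$IFS
    IFS=,
    set -f                      # keep the patterns literal, no filesystem expansion
    for pat in $2; do
        case $path in
            $pat) IFS=$old_ifs; set +f; return 0 ;;
        esac
    done
    IFS=$old_ifs
    set +f
    return 1
}
```

For example: in_diskstring /dev/mapper/asm1 '/dev/mapper/*,ORCL:*' && echo covered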
      
SECTION B - ASMLIB is used
When ASMLIB is used, follow the above steps (section A) and also check the errors associated with ORA-15063:
 
B1) ORA-15183 Unable to initialize the ASMLIB in oracle/ORA-15183: ASMLIB initialization error [driver/agent not installed]
 
Refer to NOTE 340519.1 Cannot Start ASM Ora-15063/ORA-15183
 
B2) ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]
   
Check your ASMLIB health.
 
 => correctness of installed rpm's
 
 => correctness of symlinks - all nodes should show:
   
    # ls -l  /etc/sysconfig/oracleasm
       lrwxrwxrwx 1 root root 24 Sep 18 22:10 /etc/sysconfig/oracleasm -> oracleasm-_dev_oracleas
       
 => correctness of ASMLIB configuration (/etc/sysconfig/oracleasm) - when multipathing is used:
 
     # ORACLEASM_SCANORDER: Matching patterns to order disk scanning
     ORACLEASM_SCANORDER="dm"
     # ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
     ORACLEASM_SCANEXCLUDE="sd"
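
The two multipathing settings above can be verified on each node with a short function. This is a sketch under stated assumptions: check_oracleasm_conf is a hypothetical name, the file is assumed to use plain shell variable assignments, and the expected values "dm" and "sd" are the ones shown above for device-mapper multipath.

```shell
#!/bin/sh
# check_oracleasm_conf [CONFIG_FILE]   (hypothetical helper name)
# Returns 0 if ORACLEASM_SCANORDER lists "dm" and ORACLEASM_SCANEXCLUDE lists "sd".
check_oracleasm_conf() {
    conf=${1:-/etc/sysconfig/oracleasm}
    (
        # Read the file in a subshell so its variables do not leak into the caller.
        ORACLEASM_SCANORDER=
        ORACLEASM_SCANEXCLUDE=
        . "$conf" || exit 2
        rc=0
        case " $ORACLEASM_SCANORDER " in
            *" dm "*) ;;
            *) echo "ORACLEASM_SCANORDER does not list dm: '$ORACLEASM_SCANORDER'"; rc=1 ;;
        esac
        case " $ORACLEASM_SCANEXCLUDE " in
            *" sd "*) ;;
            *) echo "ORACLEASM_SCANEXCLUDE does not list sd: '$ORACLEASM_SCANEXCLUDE'"; rc=1 ;;
        esac
        exit $rc
    )
}
```

Sourcing the file executes it, so run this only against a configuration file you trust.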
 
B3) Check if ASMLIB disks are listed under /dev/oracleasm/disks
 
=> devices under /dev/oracleasm/disks/* must be reported as dm devices on all nodes (not as single-path sd* devices). If not, please correct that (see step B2).
   
$> ls -al /dev/oracleasm/disks
 
brw-rw---- 1 grid dba 253, 29 Feb 12 11:44 /dev/oracleasm/disks/DATA01P
brw-rw---- 1 grid dba 253, 35 Feb 12 11:44 /dev/oracleasm/disks/DATA02P
brw-rw---- 1 grid dba 253, 27 Feb 15 16:04 /dev/oracleasm/disks/DATA03P
brw-rw---- 1 grid dba 253, 24 Feb 12 11:44 /dev/oracleasm/disks/DATA04P
brw-rw---- 1 grid dba 253, 25 Feb 12 11:44 /dev/oracleasm/disks/DATA05P
 
 
=> If any of your ASMLIB disks is missing from the above output, first try to re-scan the devices as root:
 # /etc/init.d/oracleasm scandisks
 
 
=> If the ASMLIB disk(s) is still missing from /dev/oracleasm/disks, engage your sysadmin to investigate (see steps A1, A2, A3).
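
The dm-device requirement in B3 can be checked by comparing the major number in the "ls -l" output with the device-mapper major of the node. The sketch below assumes Linux, where /proc/devices lists the major under the name device-mapper (253 in the listing above, but the number is allocated dynamically). Both function names are hypothetical.

```shell
#!/bin/sh
# dm_major   (hypothetical helper name)
# Prints the major number the kernel assigned to device-mapper on this node.
dm_major() {
    awk '$2 == "device-mapper" { print $1 }' /proc/devices
}

# non_dm_disks MAJOR   (hypothetical helper name)
# Reads "ls -l" lines on stdin and prints every block device whose major differs from MAJOR.
non_dm_disks() {
    awk -v major="$1" '
        /^b/ {
            m = $5                 # field 5 is "<major>," for device nodes
            sub(/,/, "", m)
            if ((m + 0) != (major + 0)) print $NF
        }'
}
```

Usage: ls -l /dev/oracleasm/disks | non_dm_disks "$(dm_major)". Any name printed is a disk that ASMLIB bound to something other than a dm device.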
 
B4) Check if ASMLIB disk(s) has the correct ASMLIB stamp and status:
 
 $> kfed read <ASMLIB_device> |grep provstr
      kfdhdb.driver.provstr: ORCLDISK<diskname> ; 0x000: length=20
 
 $> kfed read <ASMLIB_device> | egrep 'kfbh.type|kfdhdb.dskname|kfdhdb.hdrsts'
      kfbh.type:      1 ; 0x002: KFBTYP_DISKHEAD 
      kfdhdb.dskname: DATA01P ; 0x028: length=14
      kfdhdb.hdrsts:  3 ; 0x027: KFDHDR_MEMBER     
     
=> If the output is "kfdhdb.driver.provstr: ORCLCLRD" (but kfdhdb.hdrsts = MEMBER and kfbh.type = KFBTYP_DISKHEAD), then your disk was deleted using "oracleasm deletedisk".
 
=> If kfbh.type = KFBTYP_INVALID, see step A4 and check whether "kfed repair" can fix the problem.
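
The B4 cases can be told apart mechanically by feeding the kfed output through a small classifier. This is a sketch only: classify_header is a hypothetical name, the result labels are made up for illustration, and the matching relies on the field names and values shown in the kfed output above.

```shell
#!/bin/sh
# classify_header   (hypothetical helper name)
# Reads "kfed read" output on stdin and prints one label:
#   INVALID_HEADER        kfbh.type is KFBTYP_INVALID
#   ASMLIB_STAMP_CLEARED  provstr is ORCLCLRD
#   MEMBER                hdrsts is KFDHDR_MEMBER with an intact ASMLIB stamp
#   OTHER                 anything else
classify_header() {
    hdr=$(cat)
    if printf '%s\n' "$hdr" | grep -q 'kfbh\.type:.*KFBTYP_INVALID'; then
        echo "INVALID_HEADER"
    elif printf '%s\n' "$hdr" | grep -q 'kfdhdb\.driver\.provstr:[[:space:]]*ORCLCLRD'; then
        echo "ASMLIB_STAMP_CLEARED"
    elif printf '%s\n' "$hdr" | grep -q 'kfdhdb\.hdrsts:.*KFDHDR_MEMBER'; then
        echo "MEMBER"
    else
        echo "OTHER"
    fi
}
```

Usage: kfed read <ASMLIB_device> | classify_header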
 
 
B5) Also refer to the documents below:
 
NOTE: 398622.1     ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]
NOTE: 1384504.1   Mount ASM Disk Group Fails : ORA-15186, ORA-15025, ORA-15063  
NOTE: 967461.1    "Multipath: error getting device" seen in OS log causes ASM/ASMlib to shutdown by itself
NOTE: 1526920.1   ORA-15186 ORA-15063 on node 2

SECTION C - Additional notes to review
If the error still persists after the above checks, also review the notes below, depending on your configuration/situation:
 
NOTE:  577526.1     ORA-15063 ASM Discovered An Insufficient Number Of Disks For Diskgroup using NetApp Storage
NOTE:  784776.1     ORA-15063 When Mounting a Diskgroup After Storage Cloning ( BCV / Split Mirror / SRDF / HDS / Flash Copy )
NOTE:  555918.1     ORA-15038 On Diskgroup Mount After Node Eviction
NOTE:  1484723.1   ASM Candidate Raw Device Is Not Presented As A RAC Cluster Wide Shared character Devices On Unix.
NOTE:  1534211.1   ORA-15017 and ORA-15063 errors for unused diskgroups in 11.2
NOTE:  1487443.1   Mounting Diskgroup Fails With ORA-15063 and V$ASM_DISK Shows PROVISIONED
NOTE:  742832.1     AIX:After changing Multipathing drivers from RDAC to MPIO ASM discovered an insufficient number of disks
NOTE:  1276913.1   Unable to discover or use raw devices for ASM in HP-UX Itanium in 11.2.0.2 ( ORA-15063 )

SECTION D - Information to be collected when you are going to open an SR
If you are not able to fix the problem on your own, please collect the information below and raise an SR with Oracle Support.
 
D1) alert_+ASM*.log (from all nodes if RAC)
 
D2) script#1 from NOTE 470211.1 How To Gather/Backup ASM Metadata In A Formatted Manner version 10.1, 10.2, 11.1 & 11.2?
 
D3) KFED reports
 
 
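#!/bin/sh
# Dump the disk header and the backup block (aun=1 blkn=254) of every ASM disk.
# <your_path_to_asm_disks> must expand to the device paths themselves (for example a glob).
rm -f /tmp/kfed_DH.out /tmp/kfed_BK.out
for i in <your_path_to_asm_disks>
do
    echo "$i" >> /tmp/kfed_DH.out
    kfed read "$i" >> /tmp/kfed_DH.out
    echo "$i" >> /tmp/kfed_BK.out
    kfed read "$i" aun=1 blkn=254 >> /tmp/kfed_BK.out
done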
 
Run kfed.sh as the GRID/ASM owner. Upload /tmp/kfed_DH.out and /tmp/kfed_BK.out.
! Pay attention to the AU size: if a non-default AU size is used, you must specify it (see NOTE 1485597.1 "ASM tools used by Support : KFOD, KFED, AMDU").
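
With a non-default AU size, both the ausz= argument and the block number of the backup read change. The sketch below computes the block number under two assumptions that are not stated in this note: ASM metadata blocks are 4096 bytes, and the backup of the disk header sits in the second-to-last block of AU 1. With a 1 MB AU this gives 254, which matches the blkn=254 used in the script above. The function names are hypothetical, and the command is only printed, not executed.

```shell
#!/bin/sh
# backup_blkn AUSZ_BYTES   (hypothetical helper name)
# Block number of the backup disk header inside AU 1.
# Assumptions: 4096-byte metadata blocks, backup in the second-to-last block of AU 1.
backup_blkn() {
    echo $(( $1 / 4096 - 2 ))
}

# kfed_backup_cmd DEVICE AUSZ_BYTES   (hypothetical helper name)
# Prints the kfed command for reading the backup block with an explicit AU size.
kfed_backup_cmd() {
    echo "kfed read $1 ausz=$2 aun=1 blkn=$(backup_blkn "$2")"
}
```

For a 4 MB AU size, kfed_backup_cmd <device> 4194304 prints a command with ausz=4194304 and blkn=1022. Confirm the values against NOTE 1485597.1 before relying on them.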
 
 
D4) ASMLIB information
NOTE : 869526.1 Collecting The Required Information For Support To Troubleshot ASM/ASMLIB Issues.
 
D5) List of your ASM devices
 
   $> ls -al <path_to_ASM_devices>
 
D6) OS logs (from all nodes if this is RAC configuration)
 

SECTION E - Disk is reported as MISSING after a failed disk addition
If you are facing ORA-15063 after a failed disk addition, please collect the information below and raise an SR with Oracle Support.
 
E1) alert_+ASM*.log (from all nodes if RAC)
 
E2) script#1 from NOTE 470211.1 How To Gather/Backup ASM Metadata In A Formatted Manner version 10.1, 10.2, 11.1 & 11.2?
 
E3) KFED reports
#!/bin/sh
# Dump the ASM metadata blocks listed below from every ASM disk, one output file per block type.
# <your_path_to_asm_disks> must expand to the device paths themselves (for example a glob).
rm -f /tmp/kfed_*.out
for i in <your_path_to_asm_disks>
do
    echo "$i" >> /tmp/kfed_DH.out
    kfed read "$i" >> /tmp/kfed_DH.out
    echo "$i" >> /tmp/kfed_BK.out
    kfed read "$i" aun=1 blkn=254 >> /tmp/kfed_BK.out
    echo "$i" >> /tmp/kfed_PST.out
    kfed read "$i" aun=1 blkn=2 >> /tmp/kfed_PST.out
    echo "$i" >> /tmp/kfed_FS.out
    kfed read "$i" blkn=1 >> /tmp/kfed_FS.out
    echo "$i" >> /tmp/kfed_FD.out
    kfed read "$i" aun=2 blkn=1 >> /tmp/kfed_FD.out
    echo "$i" >> /tmp/kfed_DD.out
    # More than one block might be needed if there is a large number of disks.
    # Oracle Support might ask for this later.
    kfed read "$i" aun=2 blkn=0 >> /tmp/kfed_DD.out
done
 
Run kfed.sh as the GRID/ASM owner. Upload /tmp/kfed_*.out.
! Pay attention to the AU size: if a non-default AU size is used, you must specify it (see NOTE 1485597.1 "ASM tools used by Support : KFOD, KFED, AMDU").
 
 
E4) AMDU output
 
amdu -diskstring '<ASM_DISKSTRING>' -dump '<DISKGROUP_NAME>' -noimage
amdu -diskstring '<ASM_DISKSTRING>' -print '<DISKGROUP_NAME>.F2.V0.C2' > DG.amdu
# F2.V0.C2 only extracts information for up to 16 disks. If there is a large number of disks, a larger extract is needed.