ORA-15042 AND LOOKING FOR DISKS WHICH WERE DROPPED EARLIER

PROBLEM:
--------

SQL> alter diskgroup DG_PORTLUAT mount;
alter diskgroup DG_PORTLUAT mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "35"
ORA-15042: ASM disk "0" is missing from group number "35"
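To cross-check the missing disk numbers against the disks dropped later in this note, the ORA-15042 lines can be collected per group. This is a minimal sketch in Python; the function name `missing_disks` is hypothetical and the pattern covers only the message wording shown above.

```python
import re

# Matches lines such as: ORA-15042: ASM disk "1" is missing from group number "35"
ORA_15042 = re.compile(
    r'ORA-15042: ASM disk "(\d+)" is missing from group number "(\d+)"'
)

def missing_disks(output):
    """Map group number -> sorted list of disk numbers reported missing."""
    groups = {}
    for disk, group in ORA_15042.findall(output):
        groups.setdefault(int(group), set()).add(int(disk))
    return {group: sorted(disks) for group, disks in groups.items()}

sample = '''\
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "35"
ORA-15042: ASM disk "0" is missing from group number "35"
'''
print(missing_disks(sample))  # {35: [0, 1]}
```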

DIAGNOSTIC ANALYSIS:
--------------------
From alert_ASM7.log
===================
1. The last successful mount of diskgroup DG_PORTLUAT used these disks:

/dev/rdsk/oracle/data/ln1/ora_data_0088 (disk 1)
/dev/rdsk/oracle/data/ln1/ora_data_64 (disk 0)

2. Two new disks (ora_data_0085 and ora_data_0086) were added, and the 
rebalance completed.

3. DG_PORTLUAT_0000 and DG_PORTLUAT_0001 (ora_data_0088 and ora_data_64) were 
dropped later.

4. It appears that the devices were taken offline before the rebalance had 
completed.

5. The 0088 and 64 disks still exist on the nodes.

6. The old disks show status Former; the new disks show status Member.

7. The f1b1 location is recorded on a new disk:
[grid@stgrac1 ~]$ kfed read ora_data_0086.dd | more
...
kfdhdb.dsknum:                      768 ; 0x024: 0x0300
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER  
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
kfdhdb.dskname:        DG_PORTLUAT_0003 ; 0x028: length=16
...
kfdhdb.f1b1locn:             1963851776 ; 0x0d4: 0x750e0000 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> contains the f1b1 location

8. AMDU output was collected:
amdu -diskstring '/dev/rdisk/oracle/data/ora_data*' -dump DG_PORTLUAT -noimage

DG_PORTLUAT_86.map
=======================
N0001 D0003 R00 A00003702 F00000002 I0 E00000000 U00 C00256 S0001 B0044138496 
 >>>>>>>> AU number 3702 holds the disk directory

However, kfed shows AU 3702 as an invalid block:

[grid@stgrac1 ~]$ kfed read ora_data_0086.dd aunum=3702
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
B7E96200 00000000 00000000 00000000 00000000  [................]
 Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: 
[kfbtTraverseBlock][Invalid OSM block type][][0]

9. The partner status table does not show disks 2 and 3, which were added 
initially:

kfed read ora_data_0085.dd aunum=1 blknum=2 | more
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                           18 ; 0x002: KFBTYP_PST_DTA
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                33619968 ; 0x004: blk=33619968
kfbh.block.obj:                33554560 ; 0x008: file=128
kfbh.check:                    18055808 ; 0x00c: 0x01138280
...
kfdpDtaEv0[0].status:                 5 ; 0x000: V=0 R=0 W=0
...
kfdpDtaEv0[1].status:                 5 ; 0x030: V=0 R=0 W=0
....
kfdpDtaEv0[2].status:                 7 ; 0x060: V=0 R=0 W=0
....
kfdpDtaEv0[3].status:                 7 ; 0x090: V=0 R=0 W=0
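
The kfed fields quoted in steps 7 and 9 can be pulled into a structure for comparison across disks. The sketch below is Python, with hypothetical helper names `parse_kfed` and `byteswap`. It also byte-swaps two header values. That swap is an assumption made here for illustration and is not part of the original analysis: it is suggested only by the hex column (0x0300, 0x750e0000) and is not confirmed anywhere in this note.

```python
import re

# A kfed line looks like: "kfdhdb.dsknum:      768 ; 0x024: 0x0300"
KFED_LINE = re.compile(
    r"^\s*([\w.\[\]]+):\s+(\S+)\s*;\s*(0x[0-9a-fA-F]+):\s*(.*?)\s*$"
)

def parse_kfed(text):
    """Return {field: (value, offset, decoded)}; decimal values become int."""
    fields = {}
    for line in text.splitlines():
        m = KFED_LINE.match(line)
        if m is None:
            continue  # skips "...", ">>>>" annotations and hex dump rows
        name, value, offset, decoded = m.groups()
        if value.isdigit():
            value = int(value)
        fields[name] = (value, int(offset, 16), decoded)
    return fields

def byteswap(value, width):
    """Reverse the byte order of an unsigned integer that is `width` bytes wide."""
    return int.from_bytes(value.to_bytes(width, "big"), "little")

# Lines copied from the kfed output in steps 7 and 9.
sample = """\
kfdhdb.dsknum:                      768 ; 0x024: 0x0300
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:        DG_PORTLUAT_0003 ; 0x028: length=16
kfdhdb.f1b1locn:             1963851776 ; 0x0d4: 0x750e0000
kfdpDtaEv0[2].status:                 7 ; 0x060: V=0 R=0 W=0
"""
fields = parse_kfed(sample)
# ASSUMPTION: dsknum is treated as 2 bytes wide and f1b1locn as 4 bytes wide.
print(byteswap(fields["kfdhdb.dsknum"][0], 2))    # 3
print(byteswap(fields["kfdhdb.f1b1locn"][0], 4))  # 3701
```

Under that assumption the swapped dsknum (3) agrees with the disk name DG_PORTLUAT_0003, and the swapped f1b1locn (3701) is the AU directly before AU 3702 from the map file. Whether byte order actually plays a role in the invalid read of AU 3702 is not established here.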

WORKAROUND:
-----------
Recreate the diskgroup.

RELATED BUGS:
-------------
none

REPRODUCIBILITY:
----------------
Occurs at the customer site.

TEST CASE:
----------
none

STACK TRACE:
------------
alert_ASM7.log
=================
Sun Mar 18 16:04:22 2012
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 35
SUCCESS: diskgroup DG_PORTLUAT was mounted
SUCCESS: alter diskgroup DG_PORTLUAT     mount <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
SQL> alter diskgroup DG_PORTLUAT  add disk 
'/dev/rdisk/oracle/data/ora_data_0085'
Sun Mar 18 18:08:25 2012
NOTE: Attempting voting file refresh on diskgroup DG_PMUAT
Sun Mar 18 18:08:26 2012
NOTE: Assigning number (35,2) to disk (/dev/rdisk/oracle/data/ora_data_0085)
........
SUCCESS: alter diskgroup DG_PORTLUAT  add disk 
'/dev/rdisk/oracle/data/ora_data_0085'
SQL> alter diskgroup DG_PORTLUAT  add disk 
'/dev/rdisk/oracle/data/ora_data_0086'
.........
SUCCESS: alter diskgroup DG_PORTLUAT  add disk 
'/dev/rdisk/oracle/data/ora_data_0086'
Sun Mar 18 19:42:12 2012
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 35/0x8dab06bf (DG_PORTLUAT) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
SQL> alter diskgroup DG_PORTLUAT drop disk DG_PORTLUAT_0000 rebalance power 5
...
SUCCESS: refreshed membership for 35/0x8dab06bf (DG_PORTLUAT)
Mon Mar 19 13:57:25 2012
SUCCESS: alter diskgroup DG_PORTLUAT drop disk DG_PORTLUAT_0000 rebalance 
power 5
SQL> alter diskgroup DG_PORTLUAT drop disk DG_PORTLUAT_0001 rebalance power 5
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=35
NOTE: Attempting voting file refresh on diskgroup DG_PORTLUAT
NOTE: membership refresh pending for group 35/0x8dab06bf (DG_PORTLUAT)
Mon Mar 19 13:57:31 2012
GMON querying group 35 at 911 for pid 18, osid 8423
SUCCESS: refreshed membership for 35/0x8dab06bf (DG_PORTLUAT)
SUCCESS: alter diskgroup DG_PORTLUAT drop disk DG_PORTLUAT_0001 rebalance 
power 5
.......
NOTE: process _b002_+asm7 (6420) initiating offline of disk 0.1156188455 
(DG_PORTLUAT_0000) with mask 0x7e in group 35
NOTE: process _b002_+asm7 (6420) initiating offline of disk 1.1156188454 
(DG_PORTLUAT_0001) with mask 0x7e in group 35
WARNING: Disk DG_PORTLUAT_0000 in mode 0x15 is now being taken offline
WARNING: Disk DG_PORTLUAT_0001 in mode 0x15 is now being taken offline
NOTE: initiating PST update: grp = 35, dsk = 0/0x44ea0927, mode = 0x6a, op = 
4
NOTE: initiating PST update: grp = 35, dsk = 1/0x44ea0926, mode = 0x6a, op = 
4
Mon Mar 19 14:13:49 2012
NOTE: LGWR doing non-clean dismount of group 35 (DG_PORTLUAT)
NOTE: LGWR sync ABA=2.1501 last written ABA 2.1501
GMON updating disk modes for group 35 at 963 for pid 38, osid 6420
Mon Mar 19 14:13:50 2012
kjbdomdet send to inst 1
detach from dom 35, sending detach message to inst 1
.......
WARNING: Offline for disk DG_PORTLUAT_0000 in mode 0x15 failed.
WARNING: Offline for disk DG_PORTLUAT_0001 in mode 0x15 failed.
Errors in file /u00/app/grid/diag/asm/+asm/+ASM7/trace/+ASM7_b002_6420.trc:
ORA-15130: diskgroup "" is being dismounted
ORA-15066: offlining disk "DG_PORTLUAT_0000" may result in a data loss
ORA-15066: offlining disk "DG_PORTLUAT_0000" may result in a data loss
Mon Mar 19 14:13:50 2012
NOTE: process _b001_+asm7 (6538) initiating offline of disk 0.1156188455 
(DG_PORTLUAT_0000) with mask 0x7e in group 35
NOTE: process _b001_+asm7 (6538) initiating offline of disk 1.1156188454 
(DG_PORTLUAT_0001) with mask 0x7e in group 35
WARNING: Disk DG_PORTLUAT_0000 in mode 0x15 is now being taken offline
WARNING: Disk DG_PORTLUAT_0001 in mode 0x15 is now being taken offline
....
Mon Mar 19 16:17:41 2012
SQL> alter diskgroup DG_PORTLUAT mount 
...
ERROR: diskgroup DG_PORTLUAT was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "35" 
ORA-15042: ASM disk "0" is missing from group number "35" 
ERROR: alter diskgroup DG_PORTLUAT mount
Mon Mar 19 22:22:36 2012
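
To line up the add, drop, offline and mount events, the alert log can be reduced to a timeline. This is a minimal sketch in Python, assuming the timestamp format seen in this excerpt and an English locale for day and month names. The function name `timeline` and the keyword list are choices made for illustration.

```python
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Mon Mar 19 13:57:25 2012"
KEYWORDS = ("SQL>", "SUCCESS:", "ERROR:", "WARNING:", "ORA-")

def timeline(log_text):
    """Yield (timestamp, line) for notable lines, using the last timestamp seen."""
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        try:
            current = datetime.strptime(line, TS_FORMAT)
            continue  # a timestamp line only updates the clock
        except ValueError:
            pass
        if line.startswith(KEYWORDS):
            yield current, line

sample = '''\
Mon Mar 19 13:57:25 2012
SUCCESS: alter diskgroup DG_PORTLUAT drop disk DG_PORTLUAT_0000 rebalance
power 5
SQL> alter diskgroup DG_PORTLUAT drop disk DG_PORTLUAT_0001 rebalance power 5
Mon Mar 19 16:17:41 2012
SQL> alter diskgroup DG_PORTLUAT mount
ERROR: diskgroup DG_PORTLUAT was not mounted
ORA-15042: ASM disk "1" is missing from group number "35"
'''
events = list(timeline(sample))
for stamp, text in events:
    print(stamp, text)
```

Wrapped continuation lines such as "power 5" are skipped because they do not start with one of the keywords.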

SUPPORTING INFORMATION:
-----------------------
ASM alert logs from all nodes
OS logs from all nodes
dd backups of the 4 disks
AMDU output: report.txt and the .map file
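
The check in step 8 (take the AU number from a map line, then look at that AU in the dd image) can be repeated offline against the attached files. This is a rough sketch in Python with hypothetical helper names. The 1 MiB AU size is an assumption, and no meaning is asserted for map fields other than A, which the analysis above identifies as the AU number.

```python
import os
import re
import tempfile

AU_SIZE = 1024 * 1024  # ASSUMPTION: 1 MiB allocation units; not confirmed by this note

def parse_map_line(line):
    """Split an AMDU map line into {letter: int}, e.g. 'A00003702' -> {'A': 3702}."""
    return {
        tok[0]: int(tok[1:])
        for tok in line.split()
        if re.fullmatch(r"[A-Z]\d+", tok)
    }

def au_is_zero_filled(image_path, aunum, au_size=AU_SIZE):
    """Return True if the AU at index `aunum` in a dd image holds only zero bytes."""
    with open(image_path, "rb") as f:
        f.seek(aunum * au_size)
        data = f.read(au_size)
    if len(data) < au_size:
        raise ValueError("image is too short to contain that AU")
    return data == bytes(au_size)

# Map line copied from step 8.
fields = parse_map_line(
    "N0001 D0003 R00 A00003702 F00000002 I0 E00000000 U00 C00256 S0001 B0044138496"
)
print(fields["A"])  # 3702

# Demo on a tiny synthetic image with 16-byte "AUs" instead of a real disk image.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dd")
    with open(path, "wb") as f:
        f.write(b"\xff" * 16 + b"\x00" * 16 + b"\xff" * 16)
    demo_nonzero = au_is_zero_filled(path, 0, au_size=16)
    demo_zero = au_is_zero_filled(path, 1, au_size=16)
print(demo_nonzero, demo_zero)  # False True
```

Against a real image the call would be along the lines of au_is_zero_filled("ora_data_0086.dd", fields["A"]). Note that this inspects the whole AU, whereas the kfed read in step 8 displays a single block.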