Bug 22891868 - Oracle Grid Clusterware OHASD does not restart CRSD when crsd.bin is hanging (Doc ID 22891868.8)

Bug 22891868  OHASD does not restart CRSD when crsd.bin is hanging
 This note gives a brief overview of bug 22891868.
 The content was last updated on: 28-JUN-2018
Affects:
Product (Component): Oracle Server (PCW)
Range of versions believed to be affected: Versions BELOW 12.2
Versions confirmed as being affected: 12.1.0.2 (Server Patch Set)
Platforms affected: Generic (all / most platforms affected)
Fixed:
The fix for 22891868 is first included in:
  12.2.0.1 (Base Release)
  12.1.0.2.170418 (Apr 2017) Grid Infrastructure Patch Set Update (GI PSU)
  12.1.0.2.170418 (Apr 2017) Bundle Patch for Windows Platforms
 
Interim patches may be available for earlier versions.
Symptoms: (None Specified)
Related To: Cluster Ready Services / Parallel Server Management
Description
This bug is only relevant when using Real Application Clusters (RAC).
OHASD may not restart CRSD when crsd.bin is hanging.
  
 
Rediscovery Notes
 
1. The following sequence of events may occur:
 
  Time 1. CRSD hangs
  Time 2. OHASD is terminated
  Time 3. CRSD is still hanging
  Time 4. OHASD restarts
 
2. A call stack taken on OHASD shows that a check action on ASM is running:
 
....
clsn_agent::CrsCmd::ClscrsCmdData::stat(clsagfw_aectx const*,
std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char>
>, std::basic_string<char, std::char_traits<char>, std::allocator<char> >,
std::less<std::basic_string<char, std::char_traits<char>,
std::allocator<char> > >, std::allocator<std::pair<std::basic_string<char,
std::char_traits<char>, std::allocator<char> > const, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > > > >&, CLSCRS_STATFLAG, bool)
()
#16 0x00000000006fe968 in
clsn_agent::CrsCmd::ClscrsCmdData::stat(clsagfw_aectx const*,
std::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&, std::basic_string<char, std::char_traits<char>, std::allocator<char>
>&, CLSCRS_STATFLAG, bool) ()
#17 0x00000000006f805f in clsn_agent::CrsCmd::stat(clsagfw_aectx const*,
std::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&, std::basic_string<char, std::char_traits<char>, std::allocator<char>
> const&, CLSCRS_FLAG, std::basic_string<char, std::char_traits<char>,
std::allocator<char> >&, std::basic_string<char, std::char_traits<char>,
std::allocator<char> >&, CLSCRS_STATFLAG, bool) ()
#18 0x000000000048104f in clsn_agent::AsmAgent::checkCbk(clsagfw_aectx
const*, clsn_agent::Gimh*, std::basic_string<char, std::char_traits<char>,
std::allocator<char> >&) ()
#19 0x0000000000554d16 in clsn_agent::InstAgent::checkState(clsagfw_aectx
const*) ()
#20 0x0000000000551fbb in clsn_agent::InstAgent::check(clsagfw_aectx const*)
()
#21 0x000000000045f7a1 in clsn_agent::Agent::commonCheck(clsagfw_aectx
const*) ()
#22 0x0000000000508510 in clsn_agent::check(clsagfw_aectx const*) ()
#23 0x000000000098cf80 in cls_agfw::Cmd::execute() ()
#24 0x0000000000990cfd in cls_agfw::CmdEx::executeCmd(cls::Message*) ()
#25 0x0000000000990b4f in cls_agfw::CmdEx::clsRequestHdlr(cls::Message*) ()
#26 0x00000000009fd333 in cls::ThreadModel::processQueue(sltstid*) ()
#27 0x00000000009fbe54 in cls::ThreadModel::runTM(void*) ()
#28 0x0000000000a098fb in CLS_Threading::CLSthreadMain::cppStart(void*) ()
 
3. Once the check action on ASM has completed (or been aborted), OHASD tries to start CRSD. This may not happen until 20 minutes later.
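
The stack signature in step 2 can be checked against a saved stack dump with a small shell helper. This is only a sketch: the function name and the dump file path are hypothetical, and it assumes the stack has already been captured to a text file by whatever tool is in use. It looks only for two frames quoted in the stack above, so a match is a hint that the same code path is active, not proof of the bug.

```shell
#!/bin/sh
# Sketch: report whether a saved stack dump contains the frames quoted in this note.
# matches_bug_22891868_stack is a hypothetical helper name, not an Oracle tool.

matches_bug_22891868_stack() {
    stack_file=$1
    # Both frames appear in the stack above: the ASM agent check callback
    # and the CRS "stat" command that it calls.
    grep -q 'clsn_agent::AsmAgent::checkCbk' "$stack_file" &&
        grep -q 'clsn_agent::CrsCmd::stat' "$stack_file"
}

# Example use (the path is an assumption):
#   if matches_bug_22891868_stack /tmp/ohasd_stack.txt; then
#       echo "stack shows an ASM check calling CRS stat"
#   fi
```

The helper returns success only when both frames are present, which keeps an unrelated ASM check from being reported as a match.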
 
Workaround
  Start the CRSD resource manually:
 
        crsctl start res ora.crsd -init
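
The workaround can be wrapped so that the resource state is shown before and after the start command. The following is a minimal sketch, not part of the original note. The function names and the CRSCTL variable are hypothetical, and the sketch assumes that `crsctl stat res ora.crsd -init` prints a `STATE=` line in its default output. The reported state may not reflect a hang, so the start command is issued regardless of what the first check prints.

```shell
#!/bin/sh
# Sketch: show the ora.crsd state, run the documented workaround, show the state again.
# CRSCTL is an assumption; set it to the crsctl binary in the Grid home if it is not on PATH.
CRSCTL="${CRSCTL:-crsctl}"

crsd_state() {
    # Extract the value of the STATE= line, e.g. "ONLINE" from "STATE=ONLINE on node1".
    "$CRSCTL" stat res ora.crsd -init | sed -n 's/^STATE=\([A-Z]*\).*/\1/p'
}

start_crsd() {
    echo "ora.crsd state before start: $(crsd_state)"
    # The workaround command from this note.
    "$CRSCTL" start res ora.crsd -init
    echo "ora.crsd state after start: $(crsd_state)"
}

# Example use, as a user permitted to run crsctl:
#   start_crsd
```

Keeping the start unconditional follows the note, which gives the manual start as the workaround without any precondition on the reported state.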
 
Note: This fix depends on the fix for bug 8934841.