
CRS and ASM cannot start because GNS is offline (VIP ran away)

Our RAC crashed during a storage issue. After the storage was fixed, starting RAC with "crsctl start crs" failed at the ASM start. I did not think GNS could be the cause, but it looks like it was. GNS was offline because its VIP had somehow run away on one of the RAC nodes: I could ping and ssh to the VIP, but could not run "srvctl stop/start GNS".

This is evidence that ASM relies on GNS being online: if GNS is not online, the name in remote_listener in the +ASM parameter file cannot be resolved, the parameter is rejected as invalid, and the ASM instance is terminated.

In alert_ASM1.log:
Using parameter settings in server-side spfile +OCR_VOTE/wwwracprdcrs00/asmparameterfile/registry.253.830014321
...
Sat Aug 03 00:49:10 2019
USER (ospid: 16060): terminating the instance due to error 119

The ASM alert log does not say why error 119 occurred, but ohasd_oraagent_grid.trc shows that remote_listener is causing the problem:

In ohasd_oraagent_grid.trc:

2019-08-03 00:49:10.295345 :CLSDYNAM:2912564992: [ ora.asm]{0:5:3} [start] ORA-00119: invalid specification for system parameter REMOTE_LISTENER
ORA-00132: syntax error or unresolved network name 'scan.prd01.rac.bcferries.corp'
...
2019-08-03 00:49:10.318762 :    AGFW:2916767488: {0:5:3} ora.asm 1 1 state changed from: STARTING to: OFFLINE
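
Agent traces like this one can be large, so it can help to list only the distinct ORA- codes they contain. The following is a minimal sketch, not something from the original troubleshooting session; the function name ora_errors is hypothetical, and the trace file path is passed in as an argument because it depends on the Grid Infrastructure installation.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct ORA-nnnnn code found in a file.
# $1 is the path to a trace or alert log (location depends on your install).
ora_errors() {
    # -o prints only the matching part of each line; sort -u removes repeats
    grep -o 'ORA-[0-9]\{5\}' "$1" | sort -u
}

# Example call (path is an assumption, adjust to your environment):
# ora_errors /path/to/ohasd_oraagent_grid.trc
```

Run against the excerpt above, this would list ORA-00119 and ORA-00132, which points straight at the REMOTE_LISTENER name resolution.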

sqlplus / as sysasm
SQL> show parameter remote

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
remote_listener                      string      scan.prd01.rac.bcferries.corp
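
Since ORA-00132 reports an unresolved network name, one quick check is whether the remote_listener value resolves from the node at all. This is a rough sketch under assumptions: it assumes getent is available on the node and that the system resolver is the same one the instance relies on; the function name check_resolves is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a host name resolves on this node.
# Uses the system resolver via getent (assumed to be installed).
check_resolves() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "resolves"
    else
        echo "unresolved"
    fi
}

# Example call with the remote_listener value shown above:
# check_resolves scan.prd01.rac.bcferries.corp
```

If the SCAN name comes back unresolved while GNS is offline, that is consistent with the ORA-00119/ORA-00132 pair in the agent trace.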

After we powered off the node that was running the VIP, ASM and CRS could start.

I am investigating further, because I can hardly believe this. I thought CRS/ASM should start before the GNS service; according to this link, GNS is level 4 and ASM is level 2:

https://www.hhutzler.de/blog/troubleshooting-clusterware-startup-problems/
