Our RAC crashed during a storage issue. After the storage was fixed, starting RAC with "crsctl start crs" failed at ASM startup. I did not think GNS could be the cause, but it looks like it was. GNS was offline because its VIP had somehow run away on one of the RAC nodes: I could ping and ssh to the VIP, but could not run "srvctl stop/start GNS".
This is the evidence that ASM relies on GNS being online: if GNS is not online, the name in remote_listener in the +ASM parameter file cannot be resolved, so the parameter is invalid and ASM is terminated.
In alert_ASM1.log:
Using parameter settings in server-side spfile +OCR_VOTE/wwwracprdcrs00/asmparameterfile/registry.253.830014321
...
Sat Aug 03 00:49:10 2019
USER (ospid: 16060): terminating the instance due to error 119
The ASM alert log does not say why error 119 occurred, but ohasd_oraagent_grid.trc shows that remote_listener is causing the problem:
In ohasd_oraagent_grid.trc:
2019-08-03 00:49:10.295345 :CLSDYNAM:2912564992: [ ora.asm]{0:5:3} [start] ORA-00119: invalid specification for system parameter REMOTE_LISTENER
ORA-00132: syntax error or unresolved network name 'scan.prd01.rac.bcferries.corp'
...
2019-08-03 00:49:10.318762 : AGFW:2916767488: {0:5:3} ora.asm 1 1 state changed from: STARTING to: OFFLINE
sqlplus / as sysasm
SQL> show parameter remote
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
remote_listener                      string      scan.prd01.rac.bcferries.corp
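Since ORA-00132 names the host that could not be resolved, a quick check from the affected node can confirm whether name resolution is really the problem before touching ASM. The following is only a minimal sketch, not part of any Oracle tooling: the helper names `unresolved_name` and `name_resolves` are made up for illustration, and it assumes the node's normal system resolver is the one ASM would use.

```python
import re
import socket

# Matches the quoted network name in an ORA-00132 message.
ORA_00132 = re.compile(r"ORA-00132: .*?network name '([^']+)'")


def unresolved_name(trace_text):
    """Return the network name quoted in an ORA-00132 line, or None."""
    match = ORA_00132.search(trace_text)
    return match.group(1) if match else None


def name_resolves(host, resolver=socket.getaddrinfo):
    """True if the resolver returns at least one address for host.

    The resolver argument is a hypothetical hook so the check can be
    exercised without real DNS; by default the system resolver is used.
    """
    try:
        return bool(resolver(host, None))
    except socket.gaierror:
        return False


if __name__ == "__main__":
    line = ("ORA-00132: syntax error or unresolved network name "
            "'scan.prd01.rac.bcferries.corp'")
    name = unresolved_name(line)
    state = "resolves" if name_resolves(name) else "does not resolve"
    print(name, state)
```

If the name does not resolve while GNS is down and resolves again once GNS is back, that supports the conclusion that ASM startup depends on GNS through remote_listener.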
After we powered off the node that was running the VIP, ASM and CRS could start.
I am investigating further, because I can hardly believe this. I thought CRS/ASM should start before the GNS service: according to this link, GNS is level 4 and ASM is level 2:
https://www.hhutzler.de/blog/troubleshooting-clusterware-startup-problems/