The Oracle RAC databases were being backed up regularly without using catalog, even worse, the database controlfile autobackup was turned off. That configuration compromise the RPO. I restored and recovered the database to the latest scn that the last controlfile backup knows, but not the latest scn that available in the archivelogs. Below are the steps:
The assumption is that all database files, controlfile are gone.
1. list available backup handles in netbackup after a particular date:
/usr/openv/netbackup/bin/bplist -l -S $netbackupserver -C $client -t 4 -R -s 11/29/2018 /
output:
...
-rw-rw---- oracle asmadmin 19922944 Nov 29 11:55 /ctrl_dAPIMGRS_u16tje9lj_s1062_p1_t993470131
-rw-rw---- oracle asmadmin 142082048 Nov 29 11:54 /arch_dAPIMGRS_u15tje9kd_s1061_p1_t993470093
...
From the above output, we guess the latest controlfile backup is in /ctrl_dAPIMGRS_u16tje9lj_s1062_p1_t993470131
2. restore controlfile from this handle:
shutdown immediate
start nomount
run
{
allocate channel ch10 type 'SBT_TAPE' SEND 'NB_ORA_CLIENT=$client,NB_ORA_POLICY=$policy,NB_ORA_SERV=$netbackupserver';
restore controlfile from 'ctrl_dAPIMGRS_u16tje9lj_s1062_p1_t993470131';
}
alter database mount
3. Find the scn that should be used for restore and recover:
RMAN> list backup of archivelog time between 'sysdate-2' and 'sysdate';
List of Backup Sets
===================
BS Key Size Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ -----------------------------
1058 135.50M SBT_TAPE 00:00:16 29-november -2018 11:55:09 am
BP Key: 1058 Status: AVAILABLE Compressed: NO Tag: TAG20181129T115453
Handle: arch_dAPIMGRS_u15tje9kd_s1061_p1_t993470093 Media: @aaaac
List of Archived Logs in backup set 1058
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- ----------------------------- ---------- ---------
1 1855 178307993 29-november -2018 09:00:40 am 178324870 29-november -2018 11:54:52 am
1 1856 178324870 29-november -2018 11:54:52 am 178324888 29-november -2018 11:54:53 am
2 2803 178305976 29-november -2018 08:40:47 am 178324361 29-november -2018 11:53:11 am
3 2759 178302578 29-november -2018 08:00:35 am 178324772 29-november -2018 11:53:46 am
6. recover database
run
{
}
7. open database
SQL> alter database open resetlogs;
8. verify database:
SQL> select scn_to_timestamp(178324361) from dual;
SCN_TO_TIMESTAMP(178324361)
---------------------------------------------------------------------------
29-NOV-18 11.53.08.000000000 AM
...
The key is to find out what archivelogs are available and known by the controlfile, choose scn number wisely, otherwise the restore database can have datafile header bigger than scn available in the known archivelogs, ends up the database cannot be recovered and open resetlogs.
the error will be:
The assumption is that all database files, controlfile are gone.
1. list available backup handles in netbackup after a particular date:
/usr/openv/netbackup/bin/bplist -l -S $netbackupserver -C $client -t 4 -R -s 11/29/2018 /
output:
...
-rw-rw---- oracle asmadmin 19922944 Nov 29 11:55 /ctrl_dAPIMGRS_u16tje9lj_s1062_p1_t993470131
-rw-rw---- oracle asmadmin 142082048 Nov 29 11:54 /arch_dAPIMGRS_u15tje9kd_s1061_p1_t993470093
...
From the above output, we guess the latest controlfile backup is in /ctrl_dAPIMGRS_u16tje9lj_s1062_p1_t993470131
2. restore controlfile from this handle:
shutdown immediate
start nomount
run
{
allocate channel ch10 type 'SBT_TAPE' SEND 'NB_ORA_CLIENT=$client,NB_ORA_POLICY=$policy,NB_ORA_SERV=$netbackupserver';
restore controlfile from 'ctrl_dAPIMGRS_u16tje9lj_s1062_p1_t993470131';
}
alter database mount
3. Find the scn that should be used for restore and recover:
RMAN> list backup of archivelog time between 'sysdate-2' and 'sysdate';
List of Backup Sets
===================
BS Key Size Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ -----------------------------
1058 135.50M SBT_TAPE 00:00:16 29-november -2018 11:55:09 am
BP Key: 1058 Status: AVAILABLE Compressed: NO Tag: TAG20181129T115453
Handle: arch_dAPIMGRS_u15tje9kd_s1061_p1_t993470093 Media: @aaaac
List of Archived Logs in backup set 1058
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- ----------------------------- ---------- ---------
1 1855 178307993 29-november -2018 09:00:40 am 178324870 29-november -2018 11:54:52 am
1 1856 178324870 29-november -2018 11:54:52 am 178324888 29-november -2018 11:54:53 am
2 2803 178305976 29-november -2018 08:40:47 am 178324361 29-november -2018 11:53:11 am
3 2759 178302578 29-november -2018 08:00:35 am 178324772 29-november -2018 11:53:46 am
find the max next_scn of the three RAC threads, then choose the smallest of the threee, which should be 178324772, but I accidentally chose 178324361 in this test, which is acceptable for test database, minor data lose, better than cannot recovering database, which error RMAN-06054 is shown at the end of this post.
Though there might be more later archivelogs backup shown in bplist in step 1, however, they are taken after this scn, so controlfile will not be able to use them. if the backup use catalog, that would be different story, because the catalog will record all the archivelogs handles and should be able to use them.
4. restore database using this scn:
run
{
allocate channel ch10 type 'SBT_TAPE' SEND 'NB_ORA_CLIENT=$client,NB_ORA_POLICY=$policy,NB_ORA_SERV=$netbackupserver';
...
restore database until scn 178324361 ;
}
5. verify the scn in datafile header: they should be all smaller than 178324361
SQL> select file#,FUZZY,CHECKPOINT_CHANGE# from V$DATAFILE_HEADER order by 3;
FILE# FUZ CHECKPOINT_CHANGE#
---------- --- ------------------
2 NO 178137599
5 NO 178137622
6 NO 178137658
7 NO 178323950
8 NO 178323951
3 NO 178324160
9 NO 178324164
10 NO 178324197
1 NO 178324228
4 NO 178324271
10 rows selected.
6. recover database
run
{
allocate channel ch10 type 'SBT_TAPE' SEND 'NB_ORA_CLIENT=$client,NB_ORA_POLICY=$policy,NB_ORA_SERV=$netbackupserver';
...
recover database until scn 178324361 ;}
7. open database
SQL> alter database open resetlogs;
8. verify database:
SQL> select scn_to_timestamp(178324361) from dual;
SCN_TO_TIMESTAMP(178324361)
---------------------------------------------------------------------------
29-NOV-18 11.53.08.000000000 AM
...
The key is to find out what archivelogs are available and known by the controlfile, choose scn number wisely, otherwise the restore database can have datafile header bigger than scn available in the known archivelogs, ends up the database cannot be recovered and open resetlogs.
the error will be:
unable
to find archived log
archived
log thread=1 sequence=1837
RMAN-00571:
===========================================================
RMAN-00569:
=============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571:
===========================================================
RMAN-03002:
failure of recover command at 11/29/2018 16:08:20
RMAN-06054:
media recovery requesting unknown archived log for thread 1 with sequence 1837
and starting SCN of 17xxxxxxx ( a bigger scn)
Comments
Post a Comment