Friday, May 23, 2014

Block Recovery using RMAN with Valid backup when datafile header is not corrupted

In this demo, we will see how to perform a block recovery using RMAN utility.
Block recovery feature can be used when database is a 24/7 and cannot be shutdown.  We can use block recovery either in catalog-mode or no-catalog mode.
Catalog-mode might be required if we need to restore from an old backup which is not known to RMAN repository information in the control file anymore.
Block corruption errors are not detected in oracle until we use dbv (db verify),RMAN,analyze or export command.
Important views for checking block corruption
a.       v$database_block_corruption


Steps.

        1. Create a table in users tablespace
SQL> create table blktest as select * from dba_tables;

Table created.

         2. Get the count of the records.
SQL> select count(1) from blktest;

  COUNT(1)
----------
      2777

         3. Before corrupting the block, we will have all these information (Just for information).
Identify the blocks belonging to that table.

SQL> select * from (select distinct dbms_rowid.rowid_block_number(rowid) from blktest) where rownum <10;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                 299
                                 300
                                 301
                                 302
                                 303
                                 304
                                 305
                                 306
                                 307

9 rows selected.

SQL>  select file#,blocks,name from v$datafile;

     FILE#     BLOCKS NAME
---------- ---------- ---------------------------------------------
         1      92160 /data1/noasmdb/datafile/system01.dbf
         2      72960 /data1/noasmdb/datafile/sysaux01.dbf
         3     448000 /data1/noasmdb/datafile/testbkp01.dbf
         4        800 /data1/noasmdb/datafile/users01.dbf
         5      25600 /data1/noasmdb/datafile/undotbs01.dbf

SQL> select * from v$database_block_corruption;

no rows selected

         4. Now, we will corrupt the blocks for USERS tablespace.  (Simulating block corruption)


SELECT 'dd of=' || f.file_name || ' bs=8192 conv=notrunc seek=' ||
       to_number(S.HEADER_BLOCK + 1) || ' << EOF',
       'CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT CORRUPT CORRUPT CORRUPT corrupt corrupt ',
       'EOF'
  FROM DBA_SEGMENTS S, dba_data_files f
 WHERE f.tablespace_name = 'USERS'
   and S.SEGMENT_NAME = 'BLKTEST' 

dd of=/data1/noasmdb/datafile/users01.dbf bs=8192 conv=notrunc seek=299 << EOF
CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT CORRUPT
CORRUPT CORRUPT corrupt corrupt
EOF


[oracle@vm1 backup]$ dd of=/data1/noasmdb/datafile/users01.dbf bs=8192 conv=notrunc seek=299 << EOF
> CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT corrupt CORRUPT CORRUPT
> CORRUPT CORRUPT corrupt corrupt
> EOF
0+1 records in
0+1 records out
112 bytes (112 B) copied, 0.000192 seconds, 583 kB/s

       5. Check the block is corrupted.


SQL> alter system flush buffer_cache;

System altered.



6.  To confirm, we will try to query the table now and let’see what is the result is.
select count(1) from blktest
                      *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 299)
ORA-01110: data file 4: '/data1/noasmdb/datafile/users01.dbf'


7.       Alert log give following message.

Hex dump of (file 4, block 299) in trace file /u01/app/oracle/diag/rdbms/noasmdb/noasmdb/trace/noasmdb_ora_10185.trc
Corrupt block relative dba: 0x0100012b (file 4, block 299)
Bad header found during multiblock buffer read
Data in bad block:
 type: 67 format: 7 rdba: 0x20545055
 last change scn: 0x7075.72726f63 seq: 0x74 flg: 0x20
 spare1: 0x52 spare2: 0x52 spare3: 0x5252
 consistency value in tail: 0x8ac00601
 check value in block header: 0x4f43
 block checksum disabled
Reading datafile '/data1/noasmdb/datafile/users01.dbf' for corruption at rdba: 0x0100012b (file 4, block 299)
Reread (file 4, block 299) found same corrupt data (no logical check)
Hex dump of (file 4, block 303) in trace file /u01/app/oracle/diag/rdbms/noasmdb/noasmdb/trace/noasmdb_ora_10185.trc
Corrupt block relative dba: 0x0100012f (file 4, block 303)
Completely zero block found during multiblock buffer read
Reading datafile '/data1/noasmdb/datafile/users01.dbf' for corruption at rdba: 0x0100012f (file 4, block 303)
Reread (file 4, block 303) found same corrupt data (no logical check)
Tue Feb 11 12:05:18 2014
Corrupt Block Found
         TSN = 4, TSNAME = USERS
         RFN = 4, BLK = 299, RDBA = 16777515
         OBJN = 75642, OBJD = 75642, OBJECT = BLKTEST, SUBOBJECT =
         SEGMENT OWNER = TEST1, SEGMENT TYPE = Table Segment
Corrupt Block Found
         TSN = 4, TSNAME = USERS
         RFN = 4, BLK = 303, RDBA = 16777519
         OBJN = 75642, OBJD = 75642, OBJECT = BLKTEST, SUBOBJECT =
         SEGMENT OWNER = TEST1, SEGMENT TYPE = Table Segment
Tue Feb 11 12:05:20 2014
Checker run found 1 new persistent data failures
Errors in file /u01/app/oracle/diag/rdbms/noasmdb/noasmdb/trace/noasmdb_ora_10185.trc  (incident=19475):
ORA-01578: ORACLE data block corrupted (file # 4, block # 299)
ORA-01110: data file 4: '/data1/noasmdb/datafile/users01.dbf'
Incident details in: /u01/app/oracle/diag/rdbms/noasmdb/noasmdb/incident/incdir_19475/noasmdb_ora_10185_i19475.trc
Tue Feb 11 12:05:32 2014
Dumping diagnostic data in directory=[cdmp_20140211120532], requested by (instance=1, osid=10185), summary=[incident=19475].
Tue Feb 11 12:05:33 2014
Sweep [inc][19475]: completed
Errors in file /u01/app/oracle/diag/rdbms/noasmdb/noasmdb/incident/incdir_19475/noasmdb_m000_10248_i19475_a.trc:
ORA-19583: conversation terminated due to error
ORA-01122: database file 4 failed verification check
ORA-01110: data file 4: '/data1/noasmdb/datafile/users01.dbf'
ORA-01565: error in identifying file '/data1/noasmdb/datafile/users01.dbf'
ORA-27046: file size is not a multiple of logical block size
Additional information: 1

8.       You can see, we can’t find any information in v$database_block_corruption view eventhough we have one of our datafile corrupted.

SQL> select * from v$database_block_corruption;

no rows selected

9.       Now , let us update v$database_block_corruption by below procedure.

RMAN> backup validate tablespace users;

Starting backup at 11-FEB-14
starting full resync of recovery catalog
full resync complete
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 02/11/2014 12:37:29
ORA-01122: database file 4 failed verification check
ORA-01110: data file 4: '/data1/noasmdb/datafile/users01.dbf'
ORA-01565: error in identifying file '/data1/noasmdb/datafile/users01.dbf'
ORA-27046: file size is not a multiple of logical block size
Additional information: 1

10.   Let us use dbverify utility to check the corruption.
[oracle@vm1 backup]$ dbv file=/data1/noasmdb/datafile/users01.dbf blocksize=8192

DBVERIFY: Release 11.2.0.3.0 - Production on Tue Feb 11 12:11:39 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.


DBV-00600: Fatal Error - [21] [5] [0] [0]


11.   From the above messages when querying for the table, we find the file#4 and block# 299 is corrupted.


12.   Now login to target database using RMAN
RMAN> blockrecover datafile 4 block 299;

Starting recover at 11-FEB-14
starting full resync of recovery catalog
full resync complete
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=20 device type=DISK
searching flashback logs for block images until SCN 1145158
finished flashback log search, restored 0 blocks

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00004
channel ORA_DISK_1: reading from backup piece /home/oracle/backup/FullDBNOASM_3bp0biap_1_1
channel ORA_DISK_1: piece handle=/home/oracle/backup/FullDBNOASM_3bp0biap_1_1 tag=FULL DB
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:04

starting media recovery
media recovery complete, elapsed time: 00:00:03

Finished recover at 11-FEB-14


13.   Once the above step is done, check the count of the blktest table now.
SQL> select count(1) from blktest;

  COUNT(1)
----------
      2777


Conclusion :

When a block is corrupted, there is no need taking the tablespace offline. If we have a valid backup, we can recover the block and TABLE can be recovered back to normal except when the FIRST block of the datafile with datafile header is 

No comments:

Post a Comment