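Our central system uses NBU (NetBackup) for backup management. Today a backup job failed with a status 6 error. Below is a brief account of how I tracked it down.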
As I recall, the Symantec NBU manual has an entry covering this status code.
So the next step in the diagnosis is to check the log written by the backup script on our Oracle host.
jszwdb01 [/]$ cd /oracle/nbu_scripts
jszwdb01 [/oracle/nbu_scripts]$ ls
hot_database_backup.sh hot_database_backup_arch.sh
hot_database_backup.sh.out hot_database_backup_arch.sh.out
jszwdb01 [/oracle/nbu_scripts]$ tail -n 200 hot_database_backup.sh.out
input datafile fno=00043 name=/dev/rjs_settindex12
input datafile fno=00089 name=/dev/rjs_settle46
input datafile fno=00135 name=/dev/rjs_statdata031
input datafile fno=00181 name=/dev/rjs_statdata077
input datafile fno=00221 name=/dev/rjs_statindex018
channel ch00: starting piece 1 at 13-SEP-12
channel ch01: finished piece 1 at 13-SEP-12
piece handle=bk_34031_1_793845665 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch01: backup set complete, elapsed time: 00:01:11
channel ch01: starting incremental level 1 datafile backupset
channel ch01: specifying datafile(s) in backupset
input datafile fno=00044 name=/dev/rjs_settindex13
input datafile fno=00090 name=/dev/rjs_settle47
input datafile fno=00136 name=/dev/rjs_statdata032
input datafile fno=00182 name=/dev/rjs_statdata078
input datafile fno=00222 name=/dev/rjs_statindex019
channel ch01: starting piece 1 at 13-SEP-12
channel ch00: finished piece 1 at 13-SEP-12
piece handle=bk_34032_1_793845711 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch00: backup set complete, elapsed time: 00:01:01
channel ch00: starting incremental level 1 datafile backupset
channel ch00: specifying datafile(s) in backupset
input datafile fno=00045 name=/dev/rjs_settle02
input datafile fno=00091 name=/dev/rjs_settle48
input datafile fno=00137 name=/dev/rjs_statdata033
input datafile fno=00183 name=/dev/rjs_statdata079
input datafile fno=00223 name=/dev/rjs_statindex020
piece handle=bk_34033_1_793845736 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch01: backup set complete, elapsed time: 00:01:01
input datafile fno=00046 name=/dev/rjs_settle03
input datafile fno=00092 name=/dev/rjs_settle49
input datafile fno=00138 name=/dev/rjs_statdata034
input datafile fno=00184 name=/dev/rjs_statdata080
input datafile fno=00002 name=/dev/rjs_undotbs01
piece handle=bk_34034_1_793845772 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00047 name=/dev/rjs_settle04
input datafile fno=00093 name=/dev/rjs_settle50
input datafile fno=00139 name=/dev/rjs_statdata035
input datafile fno=00185 name=/dev/rjs_statdata081
input datafile fno=00224 name=/dev/rjs_undotbs02
piece handle=bk_34035_1_793845797 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00005 name=/dev/rjs_commdata001
input datafile fno=00051 name=/dev/rjs_settle08
input datafile fno=00097 name=/dev/rjs_settle54
input datafile fno=00143 name=/dev/rjs_statdata039
input datafile fno=00001 name=/dev/rjs_system
piece handle=bk_34036_1_793845833 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch00: backup set complete, elapsed time: 00:01:11
input datafile fno=00006 name=/dev/rjs_settindex01
input datafile fno=00052 name=/dev/rjs_settle09
input datafile fno=00098 name=/dev/rjs_settle55
input datafile fno=00144 name=/dev/rjs_statdata040
input datafile fno=00003 name=/dev/rjs_sysaux
piece handle=bk_34037_1_793845869 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00007 name=/dev/rjs_settle01
input datafile fno=00053 name=/dev/rjs_settle10
input datafile fno=00099 name=/dev/rjs_settle56
input datafile fno=00145 name=/dev/rjs_statdata041
input datafile fno=00226 name=/dev/rjs_sysaux01
piece handle=bk_34038_1_793845904 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00008 name=/dev/rjs_statdata001
input datafile fno=00054 name=/dev/rjs_settle11
input datafile fno=00100 name=/dev/rjs_settle57
input datafile fno=00146 name=/dev/rjs_statdata042
input datafile fno=00227 name=/dev/rjs_sysaux02
piece handle=bk_34039_1_793845930 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00009 name=/dev/rjs_statindex001
input datafile fno=00055 name=/dev/rjs_settle12
input datafile fno=00101 name=/dev/rjs_settle58
input datafile fno=00147 name=/dev/rjs_statdata043
input datafile fno=00225 name=/dev/rjs_rman
piece handle=bk_34040_1_793845965 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00010 name=/dev/rjs_commindex01
input datafile fno=00056 name=/dev/rjs_settle13
input datafile fno=00102 name=/dev/rjs_settle59
input datafile fno=00148 name=/dev/rjs_statdata044
input datafile fno=00004 name=/dev/rjs_user
piece handle=bk_34041_1_793845991 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00048 name=/dev/rjs_settle05
input datafile fno=00094 name=/dev/rjs_settle51
input datafile fno=00140 name=/dev/rjs_statdata036
input datafile fno=00186 name=/dev/rjs_statdata082
piece handle=bk_34042_1_793846027 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00049 name=/dev/rjs_settle06
input datafile fno=00095 name=/dev/rjs_settle52
input datafile fno=00141 name=/dev/rjs_statdata037
input datafile fno=00187 name=/dev/rjs_statdata083
piece handle=bk_34043_1_793846052 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
input datafile fno=00050 name=/dev/rjs_settle07
input datafile fno=00096 name=/dev/rjs_settle53
input datafile fno=00142 name=/dev/rjs_statdata038
input datafile fno=00188 name=/dev/rjs_statdata084
piece handle=bk_34044_1_793846088 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
including current control file in backupset
piece handle=bk_34045_1_793846113 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
including current SPFILE in backupset
piece handle=bk_34046_1_793846149 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
piece handle=bk_34047_1_793846175 tag=HOT_DB_BK_LEVEL0 comment=API Version 2.0,MMS Version 5.0.0.0
channel ch01: backup set complete, elapsed time: 00:01:00
Finished backup at 13-SEP-12
sql statement: alter system archive log current
released channel: ch00
released channel: ch01
allocated channel: ch00
channel ch00: sid=5693 devtype=SBT_TAPE
channel ch00: Veritas NetBackup for Oracle - Release 7.0 (2010070805)
sent command to channel: ch00
Starting backup at 13-SEP-12
current log archived
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-03002: failure of backup command at 09/13/2012 00:50:54
RMAN-06059: expected archived log not found, lost of archived log compromises recoverability
ORA-19625: error identifying file /ora_arch/1_143570_734312913.arc
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: A file or directory in the path name does not exist.
Additional information: 3
RMAN> RMAN>
Recovery Manager complete.
Script /oracle/nbu_scripts/hot_database_backup.sh
==== ended in error on Thu Sep 13 00:50:55 GMT+08:00 2012 ====
The log ends with a stack of RMAN backup errors. The one to focus on is ORA-19625.
jszwdb01 [/oracle/nbu_scripts]$ cd /ora_arch
jszwdb01 [/ora_arch]$ ls
1_144167_734312913.arc 1_144179_734312913.arc 1_144191_734312913.arc 1_144203_734312913.arc 1_144215_734312913.arc 1_144227_734312913.arc
1_144168_734312913.arc 1_144180_734312913.arc 1_144192_734312913.arc 1_144204_734312913.arc 1_144216_734312913.arc addmrpt_1_14314_14316.txt
1_144169_734312913.arc 1_144181_734312913.arc 1_144193_734312913.arc 1_144205_734312913.arc 1_144217_734312913.arc awrrpt_1_13612_13625.html
1_144170_734312913.arc 1_144182_734312913.arc 1_144194_734312913.arc 1_144206_734312913.arc 1_144218_734312913.arc awrrpt_1_13614_13617.html
1_144171_734312913.arc 1_144183_734312913.arc 1_144195_734312913.arc 1_144207_734312913.arc 1_144219_734312913.arc awrrpt_1_13618_13621.html
1_144172_734312913.arc 1_144184_734312913.arc 1_144196_734312913.arc 1_144208_734312913.arc 1_144220_734312913.arc full_str.dump
1_144173_734312913.arc 1_144185_734312913.arc 1_144197_734312913.arc 1_144209_734312913.arc 1_144221_734312913.arc jzjs_ora_4588116.trc
1_144174_734312913.arc 1_144186_734312913.arc 1_144198_734312913.arc 1_144210_734312913.arc 1_144222_734312913.arc lost+found
1_144175_734312913.arc 1_144187_734312913.arc 1_144199_734312913.arc 1_144211_734312913.arc 1_144223_734312913.arc ora_back_bitmap
1_144176_734312913.arc 1_144188_734312913.arc 1_144200_734312913.arc 1_144212_734312913.arc 1_144224_734312913.arc
1_144177_734312913.arc 1_144189_734312913.arc 1_144201_734312913.arc 1_144213_734312913.arc 1_144225_734312913.arc
1_144178_734312913.arc 1_144190_734312913.arc 1_144202_734312913.arc 1_144214_734312913.arc 1_144226_734312913.arc
jszwdb01 [/ora_arch]$ ls
jszwdb01 [/ora_arch]$ ls 1_143570_734312913.arc
ls: 0653-341 The file 1_143570_734312913.arc does not exist.
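The manual check above can also be scripted. The following is a minimal sketch, not part of the original troubleshooting: it assumes the script output contains lines of the form `ORA-19625: error identifying file <path>`, as in the log shown earlier, and the function name `missing_archivelogs` is hypothetical.

```shell
#!/bin/sh
# Sketch: list the archived logs that RMAN reported through ORA-19625
# and that are no longer present on disk.
# The function name and its argument are illustrative, not from the original post.
missing_archivelogs() {
    log_file=$1
    # Extract the path from each "ORA-19625: error identifying file <path>" line.
    sed -n 's/^ORA-19625: error identifying file \(.*\)$/\1/p' "$log_file" |
    sort -u |
    while IFS= read -r path; do
        # Print only the paths that do not exist on disk.
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}
```

Called as `missing_archivelogs /oracle/nbu_scripts/hot_database_backup.sh.out`, it should print each reported archived log that is missing, one path per line.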
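The cause is now clear: the archived log was deleted at the operating-system level before it had ever been backed up, so RMAN failed when it tried to back it up and could not find the file. Oracle records archived log information in the controlfile. When the physical files are deleted or otherwise changed outside Oracle, those records remain in the controlfile. Manually clearing files from the archive directory does not remove them, so Oracle does not know that the files no longer exist.
The fix is to run crosscheck archivelog all; in RMAN. The crosscheck compares the controlfile records against the files actually present and brings the controlfile's archived log information back in sync.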
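The crosscheck step could be wrapped in a small command file, sketched below. Only `CROSSCHECK ARCHIVELOG ALL` comes from the text above. The `LIST EXPIRED` and `DELETE NOPROMPT EXPIRED` commands are an optional addition of mine that removes the records the crosscheck marks as EXPIRED. The function name `write_crosscheck_cmds` and the file path in the usage note are hypothetical.

```shell
#!/bin/sh
# Sketch: write an RMAN command file that resynchronises the archived log
# records in the controlfile with the files actually on disk.
# Function name and argument are illustrative, not from the original post.
write_crosscheck_cmds() {
    cmd_file=$1
    cat > "$cmd_file" <<'EOF'
# Mark archived log records whose files are missing as EXPIRED.
CROSSCHECK ARCHIVELOG ALL;
# Optional (not in the original post): review what was marked EXPIRED.
LIST EXPIRED ARCHIVELOG ALL;
# Optional (not in the original post): remove the EXPIRED records.
DELETE NOPROMPT EXPIRED ARCHIVELOG ALL;
EOF
}
```

Assuming OS authentication as the Oracle software owner and no recovery catalog, the file could be generated with `write_crosscheck_cmds /tmp/crosscheck.rman` and run with `rman target / cmdfile=/tmp/crosscheck.rman`. Adjust the connection to your environment. Since the deleted log was never backed up, review the `LIST EXPIRED` output before letting the delete run.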
Reposted from yangjunfeng's 51CTO blog. Original link: http://blog.51cto.com/yangjunfeng/989435