A DG Broker configuration problem and its analysis

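Today I ran into some problems while configuring a standby database. Configuring DG Broker normally has few details that need special attention, and the tool already saves the DBA a great deal of work.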

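Even so, odd little problems still crop up from time to time. This environment is very important: the previous standby had been written off after a hardware failure, and I wanted to get the newly built standby running as quickly as possible.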

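After adding the configuration, I confirmed that the spfile, firewall, ports, listener and so on were all in order. It looked like it should work on the first try.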

But show configuration kept reporting errors.

DGMGRL> show configuration;

Configuration - test_dg

  Protection Mode: MaxPerformance

  Databases:

    test   - Primary database

      Error: ORA-16778: redo transport error for one or more databases

    stest1 - Physical standby database

      Warning: ORA-16792: configurable property value is inconsistent with database setting

Fast-Start Failover: DISABLED

Configuration Status:

ERROR
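When there are more than a couple of databases in the configuration, it can help to pull the ORA codes out of this output mechanically. The following is a minimal sketch in Python, not part of DG Broker itself: the function name `collect_messages` and the regular expressions are my own assumptions about the text layout shown above, and the sample text is the output above with the blank lines removed.

```python
import re

# Sample text copied from the `show configuration` output above.
SAMPLE = """\
Configuration - test_dg
  Protection Mode: MaxPerformance
  Databases:
    test   - Primary database
      Error: ORA-16778: redo transport error for one or more databases
    stest1 - Physical standby database
      Warning: ORA-16792: configurable property value is inconsistent with database setting
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
"""

# Assumed layout: an indented "<name> - <role> database" line, followed by
# indented "Error:" / "Warning:" lines that belong to that database.
DB_LINE = re.compile(
    r"^\s+(\S+)\s+-\s+(Primary|Physical standby|Logical standby|Snapshot standby) database"
)
MSG_LINE = re.compile(r"^\s+(Error|Warning):\s+(ORA-\d+)")


def collect_messages(text):
    """Map each database name to the (severity, ORA code) pairs listed under it."""
    result = {}
    current = None
    for line in text.splitlines():
        db = DB_LINE.match(line)
        if db:
            current = db.group(1)
            result[current] = []
            continue
        msg = MSG_LINE.match(line)
        if msg and current is not None:
            result[current].append((msg.group(1), msg.group(2)))
    return result


if __name__ == "__main__":
    for name, messages in collect_messages(SAMPLE).items():
        print(name, messages)
```

The result only tells you which database reports which code; as the rest of this post shows, the codes themselves can point well away from the real cause.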

For a problem like this, the usual way to get more detail is to look at the verbose output.

Verbose information for the primary:

DGMGRL> show database verbose test;

Database - test

  Role:            PRIMARY

  Intended State:  TRANSPORT-ON

  Instance(s):  test

      Error: ORA-16737: the redo transport service for standby database "stest1" has an error

  Properties:

    DGConnectIdentifier             = 'test'

    ObserverConnectIdentifier       = ''

    LogXptMode                      = 'ASYNC'

    DelayMins                       = '0'

    Binding                         = 'optional'

    MaxFailure                      = '0'

    MaxConnections                  = '1'

    ReopenSecs                      = '300'

    NetTimeout                      = '30'

    RedoCompression                 = 'DISABLE'

    LogShipping                     = 'ON'

    PreferredApplyInstance          = ''

    ApplyInstanceTimeout            = '0'

    ApplyParallel                   = 'AUTO'

    StandbyFileManagement           = 'AUTO'

    ArchiveLagTarget                = '0'

    LogArchiveMaxProcesses          = '2'

    LogArchiveMinSucceedDest        = '1'

    DbFileNameConvert               = ''

    LogFileNameConvert              = ''

    FastStartFailoverTarget         = ''

    InconsistentProperties          = '(monitor)'

    InconsistentLogXptProps         = '(monitor)'

    SendQEntries                    = '(monitor)'

    LogXptStatus                    = '(monitor)'

    RecvQEntries                    = '(monitor)'

    SidName                         = 'test'

    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.127.65.111)(PORT=1535))(CONNECT_DATA=(SERVICE_NAME=test_DGMGRL)(INSTANCE_NAME=test)(SERVER=DEDICATED)))'

    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'

    AlternateLocation               = ''

    LogArchiveTrace                 = '0'

    LogArchiveFormat                = '%t_%s_%r.dbf'

    TopWaitEvents                   = '(monitor)'

Database Status:

Verbose information for the standby:

DGMGRL> show database verbose stest1;

Database - stest1

  Role:            PHYSICAL STANDBY

  Intended State:  APPLY-ON

  Transport Lag:   (unknown)

  Apply Lag:       (unknown)

  Real Time Query: OFF

      Warning: ORA-16714: the value of property ArchiveLagTarget is inconsistent with the database setting

    DGConnectIdentifier             = 'stest1'

    DbFileNameConvert               = '/U01/app/oracle/oradata/test, /U01/app/oracle/oradata/test, /data/oracle/oradata/test, /U01/app/oracle/oradata/test, /other/app/oracle/oradata/test, /U01/app/oracle/oradata/test, +DATA, /U01/app/oracle/oradata/test, +ARCH, /U01/app/oracle/oradata/test'

    LogFileNameConvert              = '/U01/app/oracle/oradata/test, /U01/app/oracle/oradata/test, /data/oracle/oradata/test, /U01/app/oracle/oradata/test, /other/app/oracle/oradata/test, /U01/app/oracle/oradata/test, +DATA, /U01/app/oracle/oradata/test, +ARCH, /U01/app/oracle/oradata/test'

    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.11.14.12)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=stest1_DGMGRL)(INSTANCE_NAME=test)(SERVER=DEDICATED)))'

WARNING

I read through this output many times and still could not find which setting was inconsistent.

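For this kind of problem, the usual advice is to check the primary's archive destinations for an inconsistent setting or a connectivity failure, or to look for a db_unique_name problem.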

Querying v$archive_dest showed that archive destination 2 did indeed have a problem.

SQL> select dest_id,error from v$archive_dest;

   DEST_ID ERROR

---------- -----------------------------------------------------------------

         1

         2 ORA-16047: DGID mismatch between destination setting and target database

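The reported cause is a DGID mismatch. So I looked at archive destination 2, which DG Broker generates automatically, and again could not find anything wrong with it.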

log_archive_dest_2                   string      service="stest1", LGWR ASYNC NO AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="stest1" net_timeout=30, valid_for=(all_logfiles,primary_role)

The standby's DG Broker log contained the following warning, but gave no indication of the cause.

11/18/2015 18:04:38

Warning: Property 'ArchiveLagTarget' has inconsistent values:METADATA='0', SPFILE='', DATABASE='0'

11/18/2015 18:05:14

11/18/2015 18:06:08
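The warning line packs three values into one string, and it is easy to misread which one disagrees. Here is a small Python sketch that splits the line apart; the helper names `parse_inconsistent` and `odd_ones_out` are hypothetical, and the pattern is only an assumption based on the single log line shown above.

```python
import re
from collections import Counter

# The warning line copied from the broker log above.
LINE = ("Warning: Property 'ArchiveLagTarget' has inconsistent values:"
        "METADATA='0', SPFILE='', DATABASE='0'")


def parse_inconsistent(line):
    """Return (property_name, {source: value}) or None if the line does not match."""
    m = re.search(r"Property '(\w+)' has inconsistent values:(.*)", line)
    if m is None:
        return None
    values = dict(re.findall(r"(\w+)='([^']*)'", m.group(2)))
    return m.group(1), values


def odd_ones_out(values):
    """Return the sources whose value differs from the most common value."""
    common, _ = Counter(values.values()).most_common(1)[0]
    return sorted(src for src, val in values.items() if val != common)


if __name__ == "__main__":
    name, values = parse_inconsistent(LINE)
    print(name, values, odd_ones_out(values))
```

For this line the outlier is the SPFILE entry, which is empty while the broker metadata and the running instance both report '0'. That explains the wording of the warning, although it does not by itself say why transport was failing.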

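The standby's alert log reported that fetching the archived logs for the gap had failed, and at that point I started to panic. This is a very important database and cannot afford any mishap; otherwise I would have to rebuild the standby all over again, which is not an experience I wanted to repeat.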

Error 12541 received logging on to the standby

Check whether the listener is up and running.

FAL[client, USER]: Error 12541 connecting to test for fetching gap sequence

Wed Nov 18 18:02:36 2015

FAL[client]: Failed to request gap sequence

 GAP - thread 1 sequence 460503-460515

 DBID 1210367666 branch 622336050

FAL[client]: All defined FAL servers have been attempted.

------------------------------------------------------------

Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

parameter is defined to a value that's sufficiently large

enough to maintain adequate log switch information to resolve

archivelog gaps.
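To judge how much redo is actually missing, the gap range in the alert log can be turned into a count. This is a minimal Python sketch; `gap_ranges` is a hypothetical helper, and the pattern is assumed from the GAP line in the excerpt above rather than from any documented log format.

```python
import re

# Excerpt copied from the standby alert log above.
ALERT = """\
FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 460503-460515
 DBID 1210367666 branch 622336050
"""

GAP = re.compile(r"GAP - thread (\d+) sequence (\d+)-(\d+)")


def gap_ranges(text):
    """Return (thread, first_seq, last_seq, count) for every GAP line found."""
    out = []
    for m in GAP.finditer(text):
        thread, first, last = (int(g) for g in m.groups())
        out.append((thread, first, last, last - first + 1))
    return out


if __name__ == "__main__":
    for thread, first, last, count in gap_ranges(ALERT):
        print(f"thread {thread}: sequences {first}-{last} ({count} logs missing)")
```

Here the gap covers 13 archived logs on thread 1, a modest amount as long as those logs are still available on the primary.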

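One thing worth pointing out: all of these operations were carried out in the morning. If you have not spotted anything suspicious yet, read on.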

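At this point I tried rebuilding the DG Broker configuration files, and noticed that the dr files on the primary and the standby were the same size but carried different timestamps.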

So I checked the time on the standby:

$ date

Wed Nov 18 18:30:49 CST 2015

The clock was not in sync at all. It has to match the primary's, so I corrected it over NTP with ntpdate.

# /usr/sbin/ntpdate 192.168.131.132

18 Nov 10:32:17 ntpdate[48502]: step time server 192.168.131.132 offset -28854.645360 sec
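The offset in that ntpdate line is easier to interpret once it is converted from seconds. A short Python sketch follows; `parse_offset` and `describe` are hypothetical helpers, and the pattern is assumed from the one line of ntpdate output shown above. A negative offset means the clock was stepped backward, that is, the local clock had been running ahead of the time server.

```python
import re

# The ntpdate output line copied from above.
NTPDATE_LINE = ("18 Nov 10:32:17 ntpdate[48502]: step time server "
                "192.168.131.132 offset -28854.645360 sec")


def parse_offset(line):
    """Extract the offset in seconds from an ntpdate result line."""
    m = re.search(r"offset (-?\d+(?:\.\d+)?) sec", line)
    if m is None:
        raise ValueError("no offset found in line")
    return float(m.group(1))


def describe(offset_seconds):
    """Render the offset as hours/minutes/seconds relative to the time server."""
    direction = "ahead of" if offset_seconds < 0 else "behind"
    hours, rem = divmod(abs(offset_seconds), 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{int(hours)}h {int(minutes)}m {seconds:.1f}s {direction} the time server"


if __name__ == "__main__":
    print(describe(parse_offset(NTPDATE_LINE)))
```

The standby was roughly eight hours ahead, which matches the 18:xx timestamps in its logs even though the work was done in the morning. An offset this close to eight hours may point to a UTC versus UTC+8 mix-up when the new server's clock was set, but that is only a guess; nothing in the logs above confirms it.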

After the time was corrected, I checked the configuration again and there were no more errors.

SUCCESS

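I took this as a small warning to myself for the morning: a very minor problem can easily cause a long delay. Environment checks need to be thorough and should never be taken lightly.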
