天天看點

OGG低版本Trail檔案6位,如何達到序列門檻值999999後如何處理?

一、問題

序列号的問題可以參考這個連結

https://www.cnblogs.com/lvcha001/p/14762355.html

Oracle  MOS 2538448.1,2120995.1 兩篇文章清晰的告訴了我們,OGG12.1 以及之前Trail檔案是6位,OGG版本大于等于12.2 Trail檔案格式轉為9位!

那麼現在的問題是,某客戶一天源端會出現進3k的Trail檔案,這種情況下序列客戶擔心後面達到6位數上限,xx999999,之後OGG如何處理呢???      

二、尋找資料

2.1已有的處理方式

https://blog.csdn.net/u014237598/article/details/100558507

Ogg投遞程序abend,報錯如下:

2017-09-27 13:35:51  ERROR   OGG-06498  The sequence number 999000 for output trail file '/ggscenter/dirdat/b8' has exceeded
 the maximum threshold (998999).  Please consult Oracle Knowledge Management Doc ID 1559048.1 for further actions. 
2017-09-27 13:35:51  ERROR   OGG-01668  PROCESS ABENDING.
————————————————
版權聲明:本文為CSDN部落客「清風果果果」的原創文章,遵循CC 4.0 BY-SA版權協定,轉載請附上原文出處連結及本聲明。
原文連結:https://blog.csdn.net/u014237598/article/details/100558507
删除原有的rmttrail檔案,新增新的rmttrail檔案:
delete rmttrail /ggscenter/dirdat/b8
delete rmttrail ./dirdat/b8
add rmttrail ./dirdat/28,extract b_p_28
可以根據删除程序之前,記錄最後讀取抽取程序的seq,rba alter修改一下
總結一下:處理的方法,可以等同于進行拆分的套路!      

2.2 MOS 1060554.1

1.如下的MOS主要是說,抽取程序如果抽取的序列名稱達到了999999,如何處理;
2.很啰嗦了說了一堆,說其實是支援序列号大于999999更大的值,但是alter 修改抽取程序式列,報錯parameter EXTSEQNO. Range is 0 to 999999序列号隻能在之間;
3.如果要處理這個問題,可以删除抽取程序,建立抽取程序,這樣讓Trail檔案的序列号又會從0開始,避免了門檻值的問題! 
!!!重點但是MOS的這篇文章很不嚴謹! 如果按照這個說法,建立的抽取程序begin now,很容易造成資料的不一緻問題!   原來的抽取程序最後的scn ->新抽取程序scn直接的差異資料丢失!
4.補充,我們可以學習 MOS 964684.1 、1267901.1 兩篇文檔,其中是将關于抽取程序拆分;      

GGSCI > ADD EXTRACT <extract>, TRANLOG, EXTSEQNO <Current Checkpoint sequence #>, EXTRBA <Current Checkpoint RBA>

GGSCI > ALTER EXTRACT <extract>, IOEXTSEQNO <Recovery Checkpoint #>, IOEXTRBA <Recovery Checkpoint RBA>

GGSCI > ADD RMTTRAIL./dirdat/<extrail trail>, METGABYTES xx, SEQNO <current write position Sequence #> , RBA <current write position RBA>, EXTRACT <extract> 

這樣新增的抽取程序與STOP 達到門檻值的OGG抽取程序抽取的時間點,達到一緻,這樣可以認為資料一緻!當然如果能初始化就不用這麼費勁了,直接begin now就完事了!

詳細抽取程序的操作及MOS可以見如下,部落格不友善發Word
https://www.modb.pro/doc/35751      
What Happens When The Maximum Number Of Trail Files (999999) Is Exceeded? (Doc ID 1060554.1)
Oracle GoldenGate - Version 4.0.0 and later
Provide additional information to the user on what happens when the maximum number of Trail files (999999) has been exceeded.
SOLUTION
Issue:
What will happen if the next output trail file number will be larger than 999999?
Consequences:
GoldenGate trail names are a composite of a two character prefix, such as 'et', and an incrementing trail number in the range of 0 to 999999.
 Incrementing the trail past 999999 has adverse consequences.

1. The next trail file number will go to 1000000.
Because four bytes are used in the Extract checkpoint file as a signed number for the output trail file number, the trail number can go as large
 as ~2 billion (2147483647). Beyond that, there will be a mis-match between the trail number in the file name and in the INFO EXTRACT output,
 because the number is signed. The number will be recycled after reaching 4294967295 (FFFFFF).
2. It has been tested that Replicat will process et1000000, after processing et999999.  But the replicat may not be able to process
 the next trail 1000001, and will hang.
3, Certain functions may not work for a trail number larger than 999999. For example:
GGSCI (dosi) 9> alter replicat <replicat_name>, extseqno 1000000
ERROR: Number out of range (1000000) for parameter EXTSEQNO. Range is 0 to 999999.

4. purgeoldextracts parameter may purge the trails before finishing processing them.
Solution and Workaround:
1. If the user has an extract with a trail file number approaching 999999, he should do the following. First, the trail sizes should
 probably be defined to be larger. Second, the user should stop the extract, saving the current time and rba being processed.
 The user should then delete the extract and existing trails once they are all processed. Next, the user should re-add the extract,
 preferably with suitable large trail sizes.
2. When the trail goes to 7 digits 1000000, the purgeoldextracts should be disabled immediately.
then stop the replicat and upstream pump (or extract if the pump is not in use), and do following manual changes:
- make backup of the checkpoint files of replicat and pump
- delete and re-add the pump rmtfile with appropriate seq#, then rename the trail files  to appropriate 6 digit number.
   example, there are 2 replicat trails rt1000000 and rt1000001.
   (1) change them to rt100000 and rt100001 respectively
   (2) add the pump rmttrail with "seqno 2, rba 0" option.
- reposition the replicat checkpoint.

3. If the new trails are delete by purgeoldextracts, the pump may be re-positioned to the local trail based on the replicat timestamp . 
 But there can be duplicate transactions within 1 second.  In addition, attention has to be paid on if the source and target servers
 are on same time zone.
 
Note:
As found in previous issues , a delete and re-add of the exttrail is not changing the write 7th digit sequence. And it doesn't show
 this info detail after the exttrail is re-added:
GGSCI> info extract <name>, detail
Remote Trail Name Seqno RBA Max MB
<trail_file_path>   0    0    100
But once EXTRACT is started, it continues to write a 7 digit sequence:
Info <extract_name>, detail
EXTRACT <extract_name>Last Started 2012-03-31 13:33 Status RUNNING
Checkpoint Lag 07:18:59 (updated 00:00:07 ago)
Log Read Checkpoint Oracle Redo Logs
2012-03-31 06:14:55 Thread 1, Seqno xxxxxx, RBA xxxxxxxx
Log Read Checkpoint Oracle Redo Logs
2012-03-31 06:14:55 Thread 2, Seqno xxxxxx, RBA xxxxxxxx

Target Extract Trails:
Remote Trail Name Seqno RBA Max MB
<trail_path> 1000001 53819 100
To get around this, we had to delete EXTTRAIL and add it back using a different trail prefix:
GGSCI> DELETE exttrail <trail_path>
GGSCI> add exttrail <trail_path>, megabytes 100, extract <extract_name>
Also refer GoldenGate Trail file Sequence Number Does Not properly Reset After 999999 (Doc ID 1453979.1)   for step by step instructions
Enhancement Request:
This issue is tracked in bug 10424514, as an enhancement request.
REFERENCES
NOTE:1453979.1 - GoldenGate 11g Trail File Sequence Number Does Not Properly Reset after 999999      

2.3 MOS 1453979.1

1.如下的MOS主要是針對投遞程序如果序列達到999999之後如何進行處理;
2.套路和2.1的大哥處理的方法一樣,删除程序,建立一個程序指定新的trail投遞過去的名稱,這樣seq就能從0開始重新計數;
3.抽取、投遞程序都有了解決方法,複制程序可以忽略,複制程序就是投遞程序投遞過去的trail檔案!
4.新版本大于等于12.2 OGG版本Trail檔案是9位數,從這個角度看Oracle也是希望避免出現這個問題!
5.有客戶從11.2 OGG 原地更新到12.3,但是OGG抽取程序的Trail檔案位數還是6,是以更新的小夥伴建議關注這個問題!!!
GoldenGate 11g Trail File Sequence Number Does Not Properly Reset after 999999 (Doc ID 1453979.1)    
Oracle GoldenGate - Version 11.1.1.0.0 to 11.2.1.0.39 [Release 11.1.1 to 11.2]
GOAL
How to reset the a GoldenGate trail sequence manually ?
SOLUTION
Steps to reset the sequence manually. This is resetting the trail file written by the pump. Instructions for resetting trails written by
 a log extract are similar and may be derived from the following information. 
----------------------------------------------
1)Stop the main extract
  stop extract 
2) Make sure that pump has read all the extract trail file generated by the main extract and is at the end of the last generated trail
 file and the lag is 0 
  SEND EXTRACT [pump_name], LOGEND 
  This should give the output as "YES"
3) Stop the pump
  stop extract 
4) Make sure that replicat has read all the extract trail file generated by the pump extract and is at the end of the last generated trail
 file and the lag is 0 
  SEND REPLICAT [replicat_name], LOGEND
  This should give the output as "YES"
5)Stop the replicat
  stop replicat 
6) make a note of the the "info pump_name ,detail" output and "info replicat_name ,detail" as backup  
7) backup all the trail files generated by the pump on the target server to a new folder and remove all the corresponding trail files on
 the target side and delete all the trail files generated by the pump on the target side ie for eg:- rm ./dirdat/el* on the target side
 (where replicat is running) 
8) delete the rmttrail file of the pump extract
  ex:- suppose the rmttrailfile name is ./dirdat/el
  use ggsci> DELETE RMTTRAIL ./dirdat/el
9) Add back the rmttrail with the sequence 0 and rba 0
  ex:- add rmttrail ./dirdat/el, megabytes 1000, seqno 0 , rba 0, extract test
10)make sure that the "info pump_name ,detail" output is showing trail file sequence as 0  
11)start the extract
12) start the pump
13)point the replicat to the the first generated trail file by pump ie 0
     alter replicat ,extseqno 0 , extrba 0
14) start the replicat 
    start replicat 
 

NOTE:Manager PURGEOLDEXTRACTS USECHECKPOINTS may incorrectly purge sequences from 1000000 (1 million) as it deem to be sequence
 0 and 1000001 to be 1, etc. on the target.

Starting in GoldenGate 12.1, we support trail sequences up to 9 digits instead of 6.