视频1 视频21 视频41 视频61 视频文章1 视频文章21 视频文章41 视频文章61 推荐1 推荐3 推荐5 推荐7 推荐9 推荐11 推荐13 推荐15 推荐17 推荐19 推荐21 推荐23 推荐25 推荐27 推荐29 推荐31 推荐33 推荐35 推荐37 推荐39 推荐41 推荐43 推荐45 推荐47 推荐49 关键词1 关键词101 关键词201 关键词301 关键词401 关键词501 关键词601 关键词701 关键词801 关键词901 关键词1001 关键词1101 关键词1201 关键词1301 关键词1401 关键词1501 关键词1601 关键词1701 关键词1801 关键词1901 视频扩展1 视频扩展6 视频扩展11 视频扩展16 文章1 文章201 文章401 文章601 文章801 文章1001 资讯1 资讯501 资讯1001 资讯1501 标签1 标签501 标签1001 关键词1 关键词501 关键词1001 关键词1501 专题2001
OGG的extract进程checkpoint时间点回到1988-01-0100:00:00故障处理
2020-11-09 14:26:44 责编:小采
文档


Extract进程(SEXTR01)状态为running,但是Lag at Chkpt却达到5个多小时,且时间一直在增长,根本就不抽取新日志,状态信息如下

1、故障现象
Extract进程(SEXTR01)状态为running,但是Lag at Chkpt却达到5个多小时,且时间一直在增长,根本就不抽取新日志,状态信息如下:

GGSCI (caXXadgdb) 21> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING

EXTRACT RUNNING DPEYWGL 00:00:00 00:00:04

EXTRACT RUNNING SEXTR01 05:14:58 00:00:03

2、故障分析
2.1 debug一下进程状态信息
GGSCI (caXXadgdb) 23> info sextr01, showch debug

EXTRACT SEXTR01 Last Started 2015-01-25 22:51 Status RUNNING

Checkpoint Lag 05:21:26 (updated 00:00:04 ago)

Log Read Checkpoint Oracle Redo Logs

2015-01-25 18:05:16 Thread 1, Seqno 5554, RBA 611373536

SCN 3126.1366562 (13426204423758)

Log Read Checkpoint Oracle Redo Logs

1988-01-01 00:00:00 Thread 2, Seqno 4016, RBA 1024

SCN 44527.110828288 (191242119617280)

Current Checkpoint Detail:

Read Checkpoint #1

……

Read Checkpoint #2

Oracle Threaded Redo Log

Startup Checkpoint (starting position in the data source):

Thread #: 2

Sequence #: 4015

RBA: 0984080

Timestamp: 2015-01-25 18:03:51.000000

SCN: 3126.136602577 (13426204369873)

Redo File: +DGROUP1/caXXdb/onlinelog/group_15.267.799673791

Recovery Checkpoint (position of oldest unprocessed transaction in the data source):

Thread #: 2

Sequence #: 4015

RBA: 0984080

Timestamp: 2015-01-25 18:03:51.000000

SCN: 3126.136602577 (13426204369873)

Redo File: +DGROUP1/caXXdb/onlinelog/group_15.267.799673791

Current Checkpoint (position of last record read in the data source):

Thread #: 2

Sequence #: 4016

RBA: 1024

Timestamp: 1988-01-01 00:00:00.000000

SCN: 44527.110828288 (191242119617280)

Redo File: +DGROUP1/caXXdb/onlinelog/group_16.266.799673793

……

从这里发现一个不可思议的问题:

Thread #: 2的current checkpoint 的timestamp竟然是1988-01-0100:00:00.000000,SCN的十进制数,竟然为191242119617280,远远大于当前数据库的SCN,多出一位数。

时间倒退,SCN号反而大涨,时间还那么有个性。这个问题挺有意思。

2.2 view extract进程的report
使用view report sextr01查看进程的报告,提示有ERROR,thread#2 4016 archive log 不能打开。(由于笔者没有将信息复制下来,所以当时的状态只能这么描述了),但是4016号archivelog实际物理文件是存在的,而且权限正常。

2.3 将进程停止下来,showch其状态
GGSCI (calladgdb) 24>stop sextr01

GGSCI (calladgdb) 25> info SEXTR01, showch

EXTRACT SEXTR01 Last Started 2015-01-26 01:00 Status STOPPED

Checkpoint Lag 06:56:55 (updated 00:00:58 ago)

Log Read Checkpoint Oracle Redo Logs

2015-01-25 18:05:16 Thread 1, Seqno 5554, RBA 611373536

SCN 3126.1366562 (13426204423758)

Log Read Checkpoint Oracle Redo Logs

1988-01-01 00:00:00 Thread 2, Seqno 4016, RBA 1024

SCN 44527.110828288 (191242119617280)

Current Checkpoint Detail:

Read Checkpoint #1

……

Read Checkpoint #2

Oracle Threaded Redo Log

Startup Checkpoint (starting position in the data source):

Thread #: 2

Sequence #: 4015

RBA: 0984080

Timestamp: 2015-01-25 18:03:51.000000

SCN: 3126.136602577 (13426204369873)

Redo File: +DGROUP1/caXXdb/onlinelog/group_15.267.799673791

Recovery Checkpoint (position of oldest unprocessed transaction in the data source):

Thread #: 2

Sequence #: 4015

RBA: 0984080

Timestamp: 2015-01-25 18:03:51.000000

SCN: 3126.136602577 (13426204369873)

Redo File: +DGROUP1/caXXdb/onlinelog/group_15.267.799673791

Current Checkpoint (position of last record read in the data source):

Thread #: 2

Sequence #: 4016

RBA: 1024

Timestamp: 1988-01-01 00:00:00.000000

SCN: 44527.110828288 (191242119617280)

Redo File: +DGROUP1/caXXdb/onlinelog/group_16.266.799673793

……

下载本文
显示全文
专题