ORA-00600: internal error code, arguments: [15709]
2020-11-09 13:04:14 Editor: 小采

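A customer's 10.2.0.4 database had one instance crash without warning, and the customer asked us to help find the cause. For a sudden crash like this, the first thing we look at is the database alert log. It shows that just before the crash the SMON process raised ORA-00600 [15709], immediately followed by the message "Fatal internal error happened while SMON was doing active transaction recovery." In other words, SMON hit an exception while recovering active transactions, and that ultimately brought the instance down. The log output is as follows: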

Fri Sep 26 10:53:35 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_297.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
Fri Sep 26 10:53:55 2014
Fatal internal error happened while SMON was doing active transaction recovery.
Fri Sep 26 10:53:55 2014
Errors in file /oracle/app/oracle/admin/wxyydb/bdump/wxyydb_smon_297.trc:
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
SMON: terminating instance due to error 474
Termination issued to instance processes. Waiting for the processes to exit
Fri Sep 26 10:54:05 2014
Instance termination failed to kill one or more processes
Instance terminated by SMON, pid = 297
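
The alert-log check described above can be scripted. The following is a minimal sketch in Python, assuming the plain-text 10.2 alert-log layout shown in the excerpt (a timestamp line followed by message lines). The function name `find_fatal_errors` and the `SAMPLE` string are illustrative only and are not part of any Oracle tool.

```python
import re

# Timestamp lines in this alert log look like "Fri Sep 26 10:53:35 2014".
TIMESTAMP = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$"
)
# Capture the first bracketed argument of an ORA-00600 line.
ORA_600 = re.compile(r"ORA-00600: internal error code, arguments: \[([^\]]*)\]")


def find_fatal_errors(alert_text):
    """Return (errors, terminations), each a list of (timestamp, detail)."""
    errors, terminations = [], []
    current_ts = None
    for raw in alert_text.splitlines():
        line = raw.strip()
        if TIMESTAMP.match(line):
            current_ts = line
            continue
        m = ORA_600.search(line)
        if m:
            errors.append((current_ts, m.group(1)))
        elif "terminating instance" in line or line.startswith("Instance terminated by"):
            terminations.append((current_ts, line))
    return errors, terminations


# Lines copied from the alert-log excerpt above.
SAMPLE = """\
Fri Sep 26 10:53:35 2014
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
Fri Sep 26 10:53:55 2014
Fatal internal error happened while SMON was doing active transaction recovery.
Fri Sep 26 10:53:55 2014
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
SMON: terminating instance due to error 474
Fri Sep 26 10:54:05 2014
Instance terminated by SMON, pid = 297
"""

errors, terminations = find_fatal_errors(SAMPLE)
for ts, arg in errors:
    print(ts, "ORA-00600 first argument:", arg)
for ts, msg in terminations:
    print(ts, msg)
```

Pairing each ORA-00600 with the nearest preceding timestamp makes it easy to see which internal error came immediately before the instance termination.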

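Next we looked at the wxyydb_smon_297.trc file. It shows that SMON kept attempting parallel transaction recovery. During that recovery it hit the ORA-00600 error, and the exception in the underlying code ultimately brought the instance down.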

*** 2014-09-26 10:10:36.236
Parallel Transaction recovery caught error 30319 
*** 2014-09-26 10:15:10.3
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:15:21.816
Parallel Transaction recovery caught error 30319 
*** 2014-09-26 10:19:51.707
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:53:35.830
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
ORA-30319: Message 30319 not found; product=RDBMS; facility=ORA
----- Call Stack Trace -----
calling call entry argument values in hex 
location type point (? means dubious value) 
-------------------- -------- -------------------- ----------------------------
ksedst()+ call ksedst1() 000000000 ? 000000001 ?
ksedmp()+2176 call ksedst() 000000000 ?
 C000000000000C9F ?
 4000000004057F40 ?
 000000000 ? 000000000 ?
 000000000 ?
ksfdmp()+48 call ksedmp() 000000003 ?
kgeriv()+336 call ksfdmp() C000000000000695 ?
 000000003 ?
 40000000095185E0 ?
 00000EC33 ? 000000000 ?
 000000000 ? 000000000 ?
 000000000 ?
kgeasi()+416 call kgeriv() 6000000000031770 ?
 6000000000032828 ?
 4000000001A504E0 ?
 000000002 ?
 9FFFFFFFFFFFA138 ?
$cold_kxfpqsrls()+1 call kgeasi() 6000000000031770 ?
168 9FFFFFFFFD3D2290 ?
 000003D5D ? 000000002 ?
 000000002 ? 0000003E7 ?
 000003D5D ?
 9FFFFFFFFD3D22A0 ?
kxfpqrsod()+1104 call $cold_kxfpqsrls() C0000004FDF7A838 ?
 C0000004FDF74430 ?
 000000004 ?
 9FFFFFFFFFFFA200 ?
 C0000000000011AB ?
 4000000003AA1250 ?
 00000EDF5 ? 000000001 ?
kxfpdelqrefs()+0 call kxfpqrsod() C0000004FDF74430 ?
 000000001 ?
 60000000000B6300 ?
 C000000000000694 ?
 4000000003DD14F0 ?
 00000EE2D ?
 60000000000C6708 ?
kxfpqsod_qc_sod()+2 call kxfpdelqrefs() 00000003E ? 000000001 ?
016 60000000000B6300 ?
 C000000000001028 ?
 40000000025DE5A0 ?
 4000000001B1A110 ?
 60000000000C2D04 ?
 60000000000C2E90 ?
kxfpqsod()+816 call kxfpqsod_qc_sod() 000000010 ? 000000001 ?
 9FFFFFFFFFFFA260 ?
 60000000000B6300 ?
 9FFFFFFFFFFFA7F0 ?
 C000000000001028 ?
 40000000025DF810 ?
 00000EE65 ?
ktprdestroy()+208 call kxfpqsod() C0000004FDF7A838 ?
 000000001 ?
 9FFFFFFFFFFFA810 ?
 60000000000B6300 ?
 9FFFFFFFFFFFAD90 ?
ktprbeg()+8272 call ktprdestroy() C000000000001026 ?
 40000000025615B0 ?
 000006E61 ? 000000000 ?
 4000000001052E40 ?
 000000000 ?
ktmmon()+10096 call ktprbeg() 9FFFFFFFFFFFBE70 ?
 9FFFFFFFFFFFADA0 ?
 60000000000B6300 ?
 40000000028B75A0 ?
 00000EF21 ?
 9FFFFFFFFFFFADD8 ?
 9FFFFFFFFFFFADE0 ?
ktmSmonMain()+ call ktmmon() 9FFFFFFFFFFFD140 ?
ksbrdp()+2816 call ktmSmonMain() C000000100E1CA60 ?
 C000000000000FA5 ?
 000007361 ?
 4000000003B5AE10 ?
 C000000000000205 ?
 400000000409DCD0 ?
opirip()+1136 call ksbrdp() 9FFFFFFFFFFFD150 ?
 60000000000B6300 ?
 9FFFFFFFFFFFDC90 ?
 4000000002863EF0 ?
 000004861 ?
 C000000000000B1D ?
 60000000000318F0 ?
$cold_opidrv()+1408 call opirip() 9FFFFFFFFFFFEA70 ?
 000000004 ?
 9FFFFFFFFFFFF090 ?
 9FFFFFFFFFFFDCA0 ?
 60000000000B6300 ?
 C000000000000DA1 ?
sou2o()+336 call $cold_opidrv() 000000032 ?
 9FFFFFFFFFFFF090 ?
 60000000000C2C78 ?
$cold_opimai_real() call sou2o() 9FFFFFFFFFFFF0B0 ?
+0 000000032 ? 000000004 ?
 9FFFFFFFFFFFF090 ?
main()+368 call $cold_opimai_real() 000000003 ? 000000000 ?
main_opd_entry()+80 call main() 000000003 ?
 9FFFFFFFFFFFF598 ?
 60000000000B6300 ?
 C000000000000004 ?
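
To see how long SMON kept retrying before the fatal error, the timestamped entries at the top of the trace can be lined up. This is a rough sketch in Python, assuming the `*** YYYY-MM-DD HH:MM:SS.fff` marker format seen in this trace. Fractional seconds are dropped on purpose, and `recovery_events` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Trace timestamp markers look like "*** 2014-09-26 10:10:36.236".
STAMP = re.compile(r"^\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def recovery_events(trace_text):
    """Return (datetime, message) for recovery errors and ORA-00600 lines."""
    events = []
    current = None
    for line in trace_text.splitlines():
        m = STAMP.match(line)
        if m:
            current = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        elif "Parallel Transaction recovery caught" in line or line.startswith("ORA-00600"):
            events.append((current, line.strip()))
    return events


# Lines copied from the head of the SMON trace above.
SAMPLE = """\
*** 2014-09-26 10:10:36.236
Parallel Transaction recovery caught error 30319
*** 2014-09-26 10:15:10.3
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:15:21.816
Parallel Transaction recovery caught error 30319
*** 2014-09-26 10:19:51.707
Parallel Transaction recovery caught exception 30319
*** 2014-09-26 10:53:35.830
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [15709], [29], [1], [], [], [], [], []
"""

events = recovery_events(SAMPLE)
span = events[-1][0] - events[0][0]
print(len(events), "events over", span)
```

In this trace the first recovery error and the ORA-00600 are roughly 43 minutes apart, which fits the picture of SMON retrying parallel recovery repeatedly before the crash.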
 

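Searching Oracle Support for ORA-00600 [15709] turned up the note "SMON may fail with ORA-00600 [15709] Errors Crashing the Instance" (Doc ID 736348.1), whose error message closely matches ours. The note lists the call stack at the point of failure: kxfpqsrls <- kxfpqrsod <- kxfpdelqrefs <- kxfpqsod_qc_sod <- kxfpqsod <- ktprdestroy <- ktprbeg <- ktmmon. The stack in our SMON trace is essentially the same, so the recovery hit bug 6954722. If that patch is already installed and a similar problem still occurs, the likely cause is another, similar bug, 9233544. Oracle really does have a lot of bugs.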
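
The stack comparison can also be done mechanically. Below is a minimal sketch in Python that pulls the function names out of the call-stack lines and checks that the frames listed in the support note appear in the same order. The helper names are hypothetical, `ktprbeg` is spelled as it appears in the SMON trace, and the frame regex only assumes the `name()+offset call ...` layout seen in this trace.

```python
import re

# Frames named in the support note, innermost first.
KNOWN_SIGNATURE = [
    "kxfpqsrls", "kxfpqrsod", "kxfpdelqrefs", "kxfpqsod_qc_sod",
    "kxfpqsod", "ktprdestroy", "ktprbeg", "ktmmon",
]

# A frame line starts with "name()" and may carry a "$cold_" prefix.
FRAME = re.compile(r"^(\$cold_)?([A-Za-z_][A-Za-z0-9_]*)\(\)")


def extract_frames(trace_lines):
    """Return function names from call-stack lines, innermost first."""
    frames = []
    for line in trace_lines:
        m = FRAME.match(line.strip())
        if m:
            frames.append(m.group(2))
    return frames


def contains_in_order(frames, signature):
    """True if every signature frame occurs in frames, in the same order."""
    it = iter(frames)
    return all(name in it for name in signature)


# A few lines copied from the trace; argument columns are shortened.
SAMPLE = [
    "kgeasi()+416 call kgeriv() 6000000000031770 ?",
    "$cold_kxfpqsrls()+1 call kgeasi() 6000000000031770 ?",
    "168 9FFFFFFFFD3D2290 ?",
    "kxfpqrsod()+1104 call $cold_kxfpqsrls() C0000004FDF7A838 ?",
    "kxfpdelqrefs()+0 call kxfpqrsod() C0000004FDF74430 ?",
    "kxfpqsod_qc_sod()+2 call kxfpdelqrefs() 00000003E ? 000000001 ?",
    "kxfpqsod()+816 call kxfpqsod_qc_sod() 000000010 ? 000000001 ?",
    "ktprdestroy()+208 call kxfpqsod() C0000004FDF7A838 ?",
    "ktprbeg()+8272 call ktprdestroy() C000000000001026 ?",
    "ktmmon()+10096 call ktprbeg() 9FFFFFFFFFFFBE70 ?",
    "ktmSmonMain()+ call ktmmon() 9FFFFFFFFFFFD140 ?",
]

frames = extract_frames(SAMPLE)
matched = contains_in_order(frames, KNOWN_SIGNATURE)
print(frames)
print("matches known signature:", matched)
```

Matching on an ordered subsequence instead of exact equality tolerates extra frames such as the error-handling routines at the top of the stack.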

Bug 6954722 affects 9.2.0.8 and 10.2.0.4, and is fixed in 10.2.0.4.2, 10.2.0.5, 11.1.0.7 and 11.2.0.1. The ways to resolve bug 6954722 are:

1. Use the following workaround

Set fast_start_parallel_rollback=false and recovery_parallelism=0

OR

2. Apply one-off <>, if available for your platform/version here.

OR

3. Upgrade to fixed release 10.2.0.5, 11.1.0.7 or 11.2.0.1.

Bug 9233544 affects 10.2.0.4, 11.1.0.7 and 11.2.0.1, and is fixed in 11.2.0.3 and 12.1. The ways to resolve bug 9233544 are:

1. Apply patchset 11.2.0.3, in which Bug: 9233544 is fixed.

OR

2. Check if one-off Patch:9233544 is available for your release and platform here.

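We checked the system's patches carefully and found that patch 6954722 was already installed, which points to bug 9233544 as the cause. The options are to upgrade to 11.2.0.3 or to install the one-off patch 9233544. Upgrading to 11.2.0.3 is too big a change, so we suggested that the customer consider installing the one-off patch instead.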
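
The elimination reasoning above can be written down as a small lookup. This is only a sketch in Python: the affected-version lists are transcribed from this article, not from an authoritative Oracle source, the patch number 6954722 is taken from the paragraph above, and it assumes a one-off patch carries the same number as the bug it fixes. `likely_bug` is a hypothetical helper name.

```python
# Affected versions as stated in this article (not an authoritative list).
AFFECTED = {
    "6954722": {"9.2.0.8", "10.2.0.4"},
    "9233544": {"10.2.0.4", "11.1.0.7", "11.2.0.1"},
}
# Check the earlier bug first, then fall through to the later one.
CHECK_ORDER = ["6954722", "9233544"]


def likely_bug(version, installed_patches):
    """Return the first candidate bug that affects `version` and is unpatched.

    Assumes a one-off patch has the same number as the bug it fixes.
    Returns None when no candidate from the article applies.
    """
    for bug in CHECK_ORDER:
        if version in AFFECTED[bug] and bug not in installed_patches:
            return bug
    return None


# The customer's case: 10.2.0.4 with patch 6954722 already installed.
result = likely_bug("10.2.0.4", {"6954722"})
print(result)
```

With no patches installed on 10.2.0.4 the same function returns 6954722, which mirrors the order in which the two bugs were ruled out here.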
