Handling ORA-600 [kgeade_is_0] Triggered by Changing a Parallel Execution Parameter
2020-11-09 13:06:28 Editor: 小采

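A customer's database had a routine maintenance outage scheduled this week, so we used the window to change the PARALLEL_EXECUTION_MESSAGE_SIZE parameter. During the restart that followed the change, we hit ORA-00600 [KGEADE_IS_0]. First, why change PARALLEL_EXECUTION_MESSAGE_SIZE at all? On a freshly installed 10g database the default is 2152 (it can also be 2048), and Oracle best practice recommends raising it to 8192. In 11g the default is already 16K, which is enough for most workloads. The parameter sets the size of the messages exchanged during parallel execution. The larger the value, the more shared pool is required, so the better performance comes at the cost of more memory. The parameter also matters for parallel recovery and standby recovery: raising it above 4096 can improve recovery speed by at least 20%.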

Here is what the failure looked like. We changed the parameter on one node and then restarted that instance straight away.

Sun Jul 13 16:57:58 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_m000_21519.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:57:59 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_mmon_21339.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:58:00 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_mmon_21339.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:58:00 CST 2014
Trace dumping is performing id=[cdmp_20140713165800]
Sun Jul 13 16:58:01 CST 2014
Trace dumping is performing id=[cdmp_20140713165801]
Sun Jul 13 16:58:07 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_m000_21519.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:58:07 CST 2014
Trace dumping is performing id=[cdmp_20140713165807]

*** 2014-07-13 16:57:58.781
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Current SQL statement for this session:
select tablespace_id, rfno, allocated_space, file_size, file_maxsize, changescn_base, changescn_wrap, flag from GV$FILESPACE_USAGE where inst_id != :inst and (changescn_wrap >= :w or (changescn_wrap = :w and changescn_base >= :b))
*** 2014-07-13 16:57:59.274
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT INSTANCE_NAME, HOST_NAME, NVL(GVI_STARTUP_TIME, SYSTIMESTAMP) - INTERVAL '1' SECOND AS SHUTDOWN_TIME FROM (SELECT RRI.INSTANCE_NAME AS INSTANCE_NAME, RRI.HOST_NAME AS HOST_NAME, FROM_TZ(RRI.STARTUP_TIME, '+00:00') AS RRI_STARTUP_TIME, DBMS_HA_ALERTS_PRVT.INSTANCE_STARTUP_TIMESTAMP_TZ(GVI.STARTUP_TIME) AS GVI_STARTUP_TIME FROM RECENT_RESOURCE_INCARNATIONS$ RRI LEFT OUTER JOIN GV$INSTANCE GVI ON GVI.INSTANCE_NAME = RRI.RESOURCE_NAME WHERE RRI.RESOURCE_TYPE = 'INSTANCE' AND :B2 = RRI.DB_UNIQUE_NAME AND :B1 = RRI.DB_DOMAIN) WHERE GVI_STARTUP_TIME IS NULL OR GVI_STARTUP_TIME > RRI_STARTUP_TIME GROUP BY INSTANCE_NAME, HOST_NAME, GVI_STARTUP_TIME
----- PL/SQL Call Stack -----
 object line object
 handle number name
0x7de705a8 301 package body SYS.DBMS_HA_ALERTS_PRVT
0x7de740 1 anonymous block
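
When the alert log is long, it can help to tally which background processes raised the error before opening individual trace files. The following is a minimal Python sketch of that idea, not an Oracle tool. It assumes the trace file names follow the `<instance>_<process>_<pid>.trc` pattern seen in the excerpt above, and the function name `count_ora600` is made up for illustration.

```python
import re
from collections import Counter

# Alert log lines copied from the excerpt above.
ALERT_LOG = """\
Sun Jul 13 16:57:58 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_m000_21519.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:57:59 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_mmon_21339.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:58:00 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_mmon_21339.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
Sun Jul 13 16:58:00 CST 2014
Trace dumping is performing id=[cdmp_20140713165800]
Sun Jul 13 16:58:07 CST 2014
Errors in file /oracle/app/oracle/admin/racdb/bdump/racdb1_m000_21519.trc:
ORA-00600: internal error code, arguments: [kgeade_is_0], [], [], [], [], [], [], []
"""

# Assumed file name layout: <instance>_<process>_<pid>.trc
FILE_RE = re.compile(r"Errors in file \S*/([^/_]+)_([^/_]+)_(\d+)\.trc:")
# Captures the first bracketed argument of an ORA-00600 line.
ORA600_RE = re.compile(r"ORA-00600: .*arguments: \[([^\]]*)\]")


def count_ora600(alert_text):
    """Count ORA-00600 entries per (process, first argument) pair."""
    counts = Counter()
    process = None  # process named by the most recent "Errors in file" line
    for line in alert_text.splitlines():
        line = line.strip()
        file_match = FILE_RE.match(line)
        if file_match:
            process = file_match.group(2)
            continue
        err_match = ORA600_RE.match(line)
        if err_match and process is not None:
            counts[(process, err_match.group(1))] += 1
            process = None  # each error line belongs to one file line
    return counts


for (process, argument), n in sorted(count_ora600(ALERT_LOG).items()):
    print(f"{process}: {n} x ORA-00600 [{argument}]")
```

On this excerpt the tally shows the error coming from both the MMON process and its m000 slave, which matches the two trace files named in the log.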

As the trace shows, every statement that failed was a query against a GV$ view. Next, let's look at the call stack at the time of the error.

ksedst()+31 call ksedst1() 000000000 ? 000000001 ?
 7FFF778810B0 ? 7FFF77881110 ?
 7FFF77881050 ? 000000000 ?
ksedmp()+610 call ksedst() 000000000 ? 000000001 ?
 7FFF778810B0 ? 7FFF77881110 ?
 7FFF77881050 ? 000000000 ?
ksfdmp()+63 call ksedmp() 000000003 ? 000000001 ?
 7FFF778810B0 ? 7FFF77881110 ?
 7FFF77881050 ? 000000000 ?
kgerinv()+161 call ksfdmp() 006AE9A20 ? 000000003 ?
 7FFF778810B0 ? 7FFF77881110 ?
 7FFF77881050 ? 000000000 ?
kgeasnmierr()+163 call kgerinv() 006AE9A20 ? 2B763E0B0040 ?
 7FFF77881110 ? 7FFF77881050 ?
 000000000 ? 000000000 ?
kgeade()+501 call kgeasnmierr() 006AE9A20 ? 2B763E0B0040 ?
 7FFF77881110 ? 7FFF77881050 ?
 000000000 ? 000000000 ?
kgerev()+58 call kgeade() 2B763E0B0040 ? 006AE9A20 ?
 2B763E0B0040 ? 000000000 ?
 000000000 ? 000000000 ?
kserec0()+186 call kgerev() 006AE9A20 ? 2B763E0B0040 ?
 000000000 ? 000000000 ?
 7FFF778821A0 ? 000000000 ?
kxfpg1sg()+2014 call kserec0() 006AE9A20 ? 000000001 ?
 000000029 ? 7FFF77881F40 ?
 000000000 ? 388B519840 ?
kxfpgsg()+2098 call kxfpg1sg() 083D278 ? 000000001 ?
 7FFF778822B0 ? 7FFF77881F40 ?
 083CC48 ? 2B7600000001 ?
kxfrAllocSlaves()+3 call kxfpgsg() 000000005 ? 000000001 ?
51 000000001 ? 000000001 ?
 3E0A254800000001 ?
 2B763E0A2548 ?
kxfrialo()+2111 call kxfrAllocSlaves() 00005322E ? 2B763E5726C0 ?
 000000001 ? 7FFF00000001 ?
 7FFF00000001 ? 000000001 ?
kxfralo()+313 call kxfrialo() 00005322E ? 2B763E5726C0 ?
 000000001 ? 07DAA7230 ?
 2B763E572768 ? 7FFF77880000 ?
qerpx_rowsrc_start( call kxfralo() 00005322E ? 2B763E5726C0 ?
)+32 000000001 ? 07DAA7230 ?
 2B763E572768 ? 000000000 ?
qerpxStart()+234 call qerpx_rowsrc_start( 7FFF77883280 ? 000000001 ?
 ) 000000001 ? 07DAA10 ?
 100000001 ? 000000000 ?
selexe()+667 call qerpxStart() 000000001 ? 000003F60 ?
 000000001 ? 07DAA10 ?
 100000001 ? 000000000 ?
opiexe()+4687 call selexe() 07DACBB38 ? 7FFF77883F60 ?
 7FFF77883F60 ? 07DACBB38 ?
 100000001 ? 000000000 ?
kpoal8()+2295 call opiexe() 000000049 ? 000000003 ?
 7FFF77884428 ? 000000003 ?
 100000001 ? 000000000 ?
opiodr()+1184 call kpoal8() 00000005E ? 000000000 ?
 7FFF77887EF8 ? 000000003 ?
 83B7000000000001 ?
 000000000 ?
kpoodrc()+38 call opiodr() 00000005E ? 000000000 ?
 7FFF77887EF8 ? 000000000 ?
 005BEBDF0 ? 000000000 ?
rpiswu2()+409 call kpoodrc() 7FFF77885440 ? 000000000 ?
 7FFF77887EF8 ? 000000000 ?
 005BEBDF0 ? 000000000 ?
kpoodr()+554 call rpiswu2() 083B7ABF0 ? 000000000 ?
 2B763E0F0CBC ? 000000002 ?
 2B763E0F0CFC ? 000000000 ?
upirtrc()+2101 call kpoodr() 2B763E342E20 ? 00000005E ?
 7FFF77887EF8 ? 000000000 ?
 2B763E0F0CFC ? 000000000 ?
kpurcsc()+125 call upirtrc() 2B763E342E20 ? 00000005E ?
 7FFF77887EF8 ? 7FFF77888060 ?
 7FFF77888FD0 ? 003C558C6 ?
kpuexecv8()+1705 call kpurcsc() 7FFF7787D0 ? 00000005E ?
 7FFF77887EF8 ? 7FFF77888060 ?
 7FFF77888FD0 ? 003C558C6 ?
kpuexec()+23 call kpuexecv8() 2B763E0FE958 ? 2B763E33F4C0 ?
 2B763E33F540 ? 000000000 ?
 000000000 ? 7FFF7788A8C4 ?
OCIStmtExecute()+41 call kpuexec() 000000001 ? 2B763E33F4C0 ?
 2B763E342DB0 ? 000000001 ?
 000000000 ? 000000000 ?
ktte_aggregate_finf call OCIStmtExecute() 000000001 ? 2B763E33F4C0 ?
o()+3133 2B763E342DB0 ? 000000001 ?
 000000000 ? 000000000 ?
ktte_monitor_tsth() call ktte_aggregate_finf 7FFF7788B780 ? 000000001 ?
+788 o() 000000009 ? 000000001 ?
 000000000 ? 000000000 ?
ktte_threshold_slav call ktte_monitor_tsth() 7FFF7788B780 ? 000000001 ?
e()+183 000000009 ? 000000001 ?
 000000000 ? 000000000 ?
kebm_slave_main()+2 call ktte_threshold_slav 07F63B200 ? 000000001 ?
21 e() 000000000 ? 000000001 ?
 000000000 ? 000000000 ?
ksvrdp()+1159 call kebm_slave_main() 07F63B200 ? 07F63B200 ?
 000000000 ? 000000001 ?
 000000000 ? 000000000 ?
opirip()+748 call ksvrdp() 07F63B200 ? 07F63B200 ?
 000000000 ? 000000001 ?
 000000000 ? 000000000 ?
opidrv()+583 call opirip() 000000032 ? 000000004 ?
 7FFF7788D298 ? 000000001 ?
 000000000 ? 000000000 ?
sou2o()+114 call opidrv() 000000032 ? 000000004 ?
 7FFF7788D298 ? 000000001 ?
 000000000 ? 000000000 ?
opimai_real()+317 call sou2o() 7FFF7788D270 ? 000000032 ?
 000000004 ? 7FFF7788D298 ?
 000000000 ? 000000000 ?
main()+116 call opimai_real() 000000003 ? 7FFF7788D300 ?
 000000004 ? 7FFF7788D298 ?
 000000000 ? 000000000 ?
__libc_start_main() call main() 000000003 ? 7FFF7788D300 ?
+244 000000004 ? 7FFF7788D298 ?
 000000000 ? 000000000 ?
_start()+41 call __libc_start_main() 00072D108 ? 000000001 ?
 7FFF7788D458 ? 000000000 ?
 000000000 ? 000000003 ?

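According to ORA-600 [kgeade_is_0] In A Real Application Cluster (RAC) Environment (Doc ID 797182.1), any trace file whose call stack resembles "kxfpg1sg kxfpgsg kxfrAllocSlaves kxfrialo kxfralo qerpx_rowsrc_start" matches bug 8592375. The fix is simple: shut down both instances, set the parameter to the same value on each, and start them again. In our case one instance was still running with the old value while the restarted instance came up with the new one, and that mismatch is what triggers the error. Another option is to apply the patch, although it appears to target standby databases: 8592375: PHSB: READABLE STANDBY REPORTED ORA-00700:[KGEADE_IS_0].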
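
The stack comparison described above can also be done mechanically. The sketch below is only a heuristic illustration in Python, not an official diagnostic. It assumes that each frame starts at the beginning of a line in the form `name(`, as in the trace shown earlier, and it treats the signature as an ordered subsequence that need not be contiguous. The names `caller_frames` and `has_signature` are hypothetical.

```python
import re

# Call-stack signature as quoted in the text from Doc ID 797182.1.
SIGNATURE = ["kxfpg1sg", "kxfpgsg", "kxfrAllocSlaves",
             "kxfrialo", "kxfralo", "qerpx_rowsrc_start"]

# First line of each relevant frame, copied from the stack shown earlier
# (argument columns omitted).
STACK = """\
kserec0()+186 call kgerev()
kxfpg1sg()+2014 call kserec0()
kxfpgsg()+2098 call kxfpg1sg()
kxfrAllocSlaves()+3 call kxfpgsg()
kxfrialo()+2111 call kxfrAllocSlaves()
kxfralo()+313 call kxfrialo()
qerpx_rowsrc_start( call kxfralo()
qerpxStart()+234 call qerpx_rowsrc_start(
"""

# A frame line begins with an identifier immediately followed by "(".
FRAME_RE = re.compile(r"^([A-Za-z_]\w*)\(")


def caller_frames(stack_text):
    """Return the function name at the start of each frame line, in order."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
    return frames


def has_signature(frames, signature=SIGNATURE):
    """True if every signature name appears in frames in the same order."""
    remaining = iter(frames)
    # "name in remaining" consumes the iterator up to the first hit,
    # so later names must appear after earlier ones.
    return all(name in remaining for name in signature)


print(has_signature(caller_frames(STACK)))
```

A match only says the stack resembles the one in the note. Whether the bug really applies still has to be confirmed against the note itself.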

Reference: ORA-600 [kgeade_is_0] In A Real Application Cluster (RAC) Environment (Doc ID 797182.1)
