Handling an Oracle failure caused by a storage outage --- 惜分飞

The storage went offline unexpectedly and the database crashed, logging a large number of ORA-00206, ORA-00202, and ORA-15081 errors along with Linux-x86_64 Error: 5: Input/output error.

Sun Jul 21 20:00:11 2024

Thread 1 advanced to log sequence 1594398 (LGWR switch)

  Current log# 5 seq# 1594398 mem# 0: +DATA/xff/onlinelog/group_5.412.906718739

Sun Jul 21 20:53:17 2024

WARNING: Write Failed. group:2 disk:0 AU:506916 offset:49152 size:16384

Sun Jul 21 20:53:17 2024

WARNING: Read Failed. group:2 disk:2 AU:506931 offset:49152 size:16384

WARNING: failed to read mirror side 1 of virtual extent 4 logical extent 0 of file 415 in group [2.34109396]

from disk ORACLE_DATA_0002  allocation unit 506931 reason error; if possible, will try another mirror side

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ckpt_42142.trc:

ORA-15080: synchronous I/O operation to a disk failed

ORA-27061: waiting for async I/Os failed

Linux-x86_64 Error: 5: Input/output error

Additional information: -1

Additional information: 16384

WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0

of file 415 in group 2 on disk 0 allocation unit 506916

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ckpt_42142.trc:

ORA-00206: error in writing (block 3, # blocks 1) of control file

ORA-00202: control file: '+DATA/xff/controlfile/current.415.906718737'

ORA-15081: failed to submit an I/O operation to a disk

ORA-15081: failed to submit an I/O operation to a disk

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ckpt_42142.trc:

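ORA-00221: error on write to control file

ORA-00206: error in writing (block 3, # blocks 1) of control file

ORA-00202: control file: '+DATA/xff/controlfile/current.415.906718737'

ORA-15081: failed to submit an I/O operation to a disk

ORA-15081: failed to submit an I/O operation to a disk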

CKPT (ospid: 42142): terminating the instance due to error 221

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_lmon_42087.trc:

ORA-00202: control file: '+DATA/xff/controlfile/current.415.906718737'

ORA-15081: failed to submit an I/O operation to a disk

ORA-27072: File I/O error

Linux-x86_64 Error: 5: Input/output error

Additional information: 4

Additional information: 1038194784

Additional information: -1

Sun Jul 21 20:53:19 2024

ORA-1092 : opitsk aborting process

Sun Jul 21 20:53:24 2024

ORA-1092 : opitsk aborting process

Sun Jul 21 20:53:24 2024

License high water mark = 59

Sun Jul 21 20:53:28 2024

Instance terminated by CKPT, pid = 42142

USER (ospid: 64660): terminating the instance

Instance terminated by USER, pid = 64660

After the storage was restored, starting the database failed with ORA-600 [2131].

Mon Jul 22 09:10:04 2024

ALTER DATABASE   MOUNT

This instance was first to mount

Mon Jul 22 09:10:04 2024

Sweep [inc][490008]: completed

Sweep [inc2][490008]: completed

NOTE: Loaded library: System

SUCCESS: diskgroup ORACLE_DATA was mounted

NOTE: dependency between database rac and diskgroup resource ora.ORACLE_DATA.dg is established

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_14301.trc  (incident=492409):

ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []

Incident details in: /users/oracle/app/db/diag/rdbms/xff/xff1/incident/incdir_492409/xff1_ora_14301_i492409.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

ORA-600 signalled during: ALTER DATABASE   MOUNT...

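The customer tried to recover by recreating the control file. Because their analysis was wrong, the recreated control file omitted three datafiles, and they then forced a resetlogs open with consistency checks disabled. The database did not open and instead raised ORA-600 [2662].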

alter database open resetlogs

RESETLOGS is being done without consistancy checks. This may result

in a corrupted database. The database should be recreated.

RESETLOGS after incomplete recovery UNTIL CHANGE 9965567206652

Clearing online redo logfile 1 +DATA/xff/onlinelog/group_1.414.906718739

Clearing online log 1 of thread 1 sequence number 0

Clearing online redo logfile 1 complete

Clearing online redo logfile 2 +DATA/xff/onlinelog/group_2.413.906718739

Clearing online log 2 of thread 1 sequence number 0

Clearing online redo logfile 2 complete

Clearing online redo logfile 5 +DATA/xff/onlinelog/group_5.412.906718739

Clearing online log 5 of thread 1 sequence number 0

Clearing online redo logfile 5 complete

Expanded controlfile section 2 from 1 to 63 records

The number of logical blocks in section 2 remains the same

Expanded controlfile section 1 from 4 to 66 records

Requested to grow by 62 records; added 32 blocks of records

Expanded controlfile section 30 from 1 to 63 records

The number of logical blocks in section 30 remains the same

Expanded controlfile section 29 from 1 to 63 records

The number of logical blocks in section 29 remains the same

Control file has been expanded to support 63 threads

Mon Jul 22 23:04:07 2024

Redo thread 2 enabled by open resetlogs or standby activation

Online log +DATA/xff/onlinelog/group_1.414.906718739: Thread 1 Group 1 was previously cleared

Online log +DATA/xff/onlinelog/group_2.413.906718739: Thread 1 Group 2 was previously cleared

Online log +DATA/xff/onlinelog/group_3.501.1175036643: Thread 2 Group 3 was previously cleared

Online log +DATA/xff/onlinelog/group_4.502.1175036645: Thread 2 Group 4 was previously cleared

Online log +DATA/xff/onlinelog/group_5.412.906718739: Thread 1 Group 5 was previously cleared

Mon Jul 22 23:04:08 2024

Setting recovery target incarnation to 2

Initializing SCN for created control file

Database SCN compatibility initialized to 3

Warning - High Database SCN: Current SCN value is 9965567206655, threshold SCN value is 0

If you have not previously reported this warning on this database,

please notify Oracle Support so that additional diagnosis can be performed.

Mon Jul 22 23:04:09 2024

Assigning activation ID 2763017873 (0xa4b04e91)

Thread 1 opened at log sequence 1

  Current log# 1 seq# 1 mem# 0: +DATA/xff/onlinelog/group_1.414.906718739

Successful open of redo thread 1

MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set

Mon Jul 22 23:04:10 2024

SMON: enabling cache recovery

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_64210.trc  (incident=624374):

ORA-00600: internal error code, arguments: [2662], [2320], [1243079939], [2320], [1243211805], [12583040], [], [], [], [], [], []

Incident details in: /users/oracle/app/db/diag/rdbms/xff/xff1/incident/incdir_624374/xff1_ora_64210_i624374.trc

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_64210.trc:

ORA-00600: internal error code, arguments: [2662], [2320], [1243079939], [2320], [1243211805], [12583040], [], [], [], [], [], []

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_ora_64210.trc:

ORA-00600: internal error code, arguments: [2662], [2320], [1243079939], [2320], [1243211805], [12583040], [], [], [], [], [], []

Error 600 happened during db open, shutting down database

USER (ospid: 64210): terminating the instance due to error 600

Instance terminated by USER, pid = 64210

ORA-1092 signalled during: alter database open resetlogs...
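
The five arguments of the ORA-600 [2662] error in this log can be read as SCNs. The sketch below assumes the commonly documented layout of this error (current SCN wrap and base, dependent SCN wrap and base, then a data block address) and the usual smallfile DBA encoding (10-bit relative file number, 22-bit block number). The helper names are illustrative and not part of any Oracle tool. It shows how far the database SCN lags behind the SCN found in the block, which is the gap any SCN adjustment has to cover.

```python
# Sketch: decode ORA-00600 [2662] arguments (argument layout is an assumption, see text).
SCN_BASE_BITS = 32    # assumed: SCN = wrap * 2**32 + base
DBA_BLOCK_BITS = 22   # assumed: smallfile DBA = 10-bit file# + 22-bit block#


def to_scn(wrap, base):
    """Combine an SCN wrap/base pair into a single decimal SCN."""
    return (wrap << SCN_BASE_BITS) + base


def split_dba(dba):
    """Split a data block address into (relative file#, block#)."""
    return dba >> DBA_BLOCK_BITS, dba & ((1 << DBA_BLOCK_BITS) - 1)


# Arguments copied from the ORA-00600 [2662] line in the alert log.
cur_wrap, cur_base, dep_wrap, dep_base, dba = (
    2320, 1243079939, 2320, 1243211805, 12583040)

current_scn = to_scn(cur_wrap, cur_base)
dependent_scn = to_scn(dep_wrap, dep_base)
scn_gap = dependent_scn - current_scn
rel_file, block = split_dba(dba)

print("current SCN  :", current_scn)
print("dependent SCN:", dependent_scn)
print("gap          :", scn_gap)
print("DBA          : file", rel_file, "block", block)
```

With these arguments the current SCN works out to 9965567206659, a few changes past the "Current SCN value is 9965567206655" line in the same log, and the dependent SCN is 131866 higher.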

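From this point the recovery becomes more difficult. Three datafiles in the ASM disk group were left out when the control file was recreated, and the database was then put through a resetlogs operation, so the resetlogs SCN of those three files no longer matches the other datafiles. The fix is to modify the resetlogs SCN of the affected files with Oracle Recovery Tools or bbed, and then recreate the control file.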

SQL> @rectl.sql

Control file created.

SQL> RECOVER DATABASE;

Media recovery complete
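
The mismatch described above comes down to a few datafile headers carrying a different resetlogs SCN from the rest. A minimal Python sketch of that comparison follows. The file numbers and SCN values are hypothetical placeholders (real values could be taken from V$DATAFILE_HEADER.RESETLOGS_CHANGE#), and the function name is made up for illustration. It is not the procedure used in this recovery.

```python
from collections import Counter


def find_resetlogs_mismatch(headers):
    """Return (majority resetlogs SCN, file numbers that disagree with it).

    headers is a list of (file_no, resetlogs_scn) pairs.
    """
    counts = Counter(scn for _, scn in headers)
    majority_scn, _ = counts.most_common(1)[0]
    odd_files = sorted(f for f, scn in headers if scn != majority_scn)
    return majority_scn, odd_files


# Hypothetical sample: files 7, 8 and 9 still carry the pre-resetlogs value.
sample_headers = [
    (1, 2000), (2, 2000), (3, 2000), (4, 2000), (5, 2000), (6, 2000),
    (7, 1000), (8, 1000), (9, 1000),
]

majority, mismatched = find_resetlogs_mismatch(sample_headers)
print("majority resetlogs SCN:", majority)
print("files to fix          :", mismatched)
```

Files reported as mismatched are the ones whose headers would need the resetlogs SCN corrected before the control file is recreated.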

Next, the earlier ORA-600 [2662] error at startup was resolved by adjusting the database SCN. The Patch_SCN tool makes this quick. The database then opened successfully.

SQL> ALTER DATABASE OPEN;

  

Database altered.

However, the alert log then showed a large number of ORA-600 [4194] and ORA-01595 errors, plus exceptions such as Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xC21D511] [PC:0x97F4EFA, kgegpa()+40].

Wed Jul 24 15:24:21 2024

alter database open

Beginning crash recovery of 1 threads

 parallel recovery started with 32 processes

Started redo scan

Completed redo scan

 read 0 KB redo, 0 data blocks need recovery

…………

Database Characterset is ZHS16GBK

No Resource Manager plan active

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_smon_40279.trc  (incident=777938):

ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

replication_dependency_tracking turned off (no async multimaster replication found)

Starting background process QMNC

Wed Jul 24 15:24:40 2024

QMNC started with pid=79, OS id=54632

Block recovery from logseq 2, block 74 to scn 9965587206835

Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0

  Mem# 0: +DATA/xff/onlinelog/redo02

LOGSTDBY: Validating controlfile with logical metadata

Wed Jul 24 15:24:40 2024

Block recovery stopped at EOT rba 2.82.16

Block recovery completed at rba 2.82.16, scn 2320.1263080114

Block recovery from logseq 2, block 74 to scn 9965587206833

Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0

  Mem# 0: +DATA/xff/onlinelog/redo02

Block recovery completed at rba 2.82.16, scn 2320.1263080114

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_smon_40279.trc:

ORA-01595: error freeing extent (4) of rollback segment (20))

ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []

LOGSTDBY: Validation complete

Wed Jul 24 15:24:41 2024

Sweep [inc][777938]: completed

Sweep [inc2][777938]: completed

Wed Jul 24 15:24:41 2024

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_q001_54657.trc  (incident=778362):

ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []

Use ADRCI or Support Workbench to package the incident.

See Note 411.1 at My Oracle Support for error and packaging details.

Starting background process SMCO

Wed Jul 24 15:24:42 2024

SMCO started with pid=83, OS id=54691

Block recovery from logseq 2, block 74 to scn 9965587206835

Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0

  Mem# 0: +DATA/xff/onlinelog/redo02

Block recovery completed at rba 2.82.16, scn 2320.1263080118

Block recovery from logseq 2, block 74 to scn 9965587206838

Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0

  Mem# 0: +DATA/xff/onlinelog/redo02

Block recovery completed at rba 2.83.16, scn 2320.1263080119

Error 600 in kwqmnpartition(), aborting txn

Errors in file /users/oracle/app/db/diag/rdbms/xff/xff1/trace/xff1_q001_54657.trc  (incident=778363):

ORA-25319: Queue table repartitioning aborted

Completed: alter database open

Block recovery from logseq 2, block 74 to scn 9965587206835

Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0

  Mem# 0: +DATA/rac/onlinelog/redo02

Block recovery completed at rba 2.82.16, scn 2320.1263080118

Block recovery from logseq 2, block 74 to scn 9965587207538

Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0

  Mem# 0: +DATA/rac/onlinelog/redo02

Block recovery completed at rba 2.1097.16, scn 2320.1263080819

Errors in file /users/oracle/app/db/diag/rdbms/rac/rac1/trace/rac1_cjq0_55657.trc  (incident=778427):

ORA-00600: internal error code, arguments: [600], [ORA-00600: internal error code, arguments:

[4194], [], [], [], [], [], [], [], [], [], [], []], [], [], [], [], [], [], [], [], [], []

Incident details in: /users/oracle/app/db/diag/rdbms/xff/xff1/incident/incdir_778427/xff1_cjq0_55657_i778427.trc

Exception [type:SIGSEGV, Address not mapped to object][ADDR:0xC21D511][PC:0x97F4EFA, kgegpa()+40][flags: 0x0, count: 1]

Exception [type:SIGSEGV, Address not mapped to object][ADDR:0xC21D511][PC:0x97F396E, kgebse()+776][flags: 0x2, count: 2]

Exception [type:SIGSEGV, Address not mapped to object][ADDR:0xC21D511][PC:0x97F396E, kgebse()+776][flags: 0x2, count: 2]
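
The ORA-01595 lines in this log name the extent and the rollback segment that SMON failed to clean up. The sketch below pulls those numbers out of alert-log text so the affected undo segments can be listed. The regular expression relies only on the error code and the two parenthesised numbers, so it should match both the Chinese and the English form of the message. The function name is hypothetical, and mapping a segment number to a name (for example through SEGMENT_ID in DBA_ROLLBACK_SEGS) is left out.

```python
import re

# ORA-01595 carries two parenthesised numbers: the extent, then the rollback segment.
ORA_01595 = re.compile(r"ORA-01595:[^()\n]*\((\d+)\)[^()\n]*\((\d+)\)")


def undo_segments_in_error(alert_text):
    """Return the sorted rollback segment numbers named in ORA-01595 lines."""
    segments = set()
    for match in ORA_01595.finditer(alert_text):
        segments.add(int(match.group(2)))  # group(1) is the extent number
    return sorted(segments)


# Sample lines in both message languages, modelled on the ORA-01595 entry in this log.
sample_log = (
    "ORA-01595: 释放区 (4) 回退段 (20) 时出错\n"
    "ORA-01595: error freeing extent (4) of rollback segment (20))\n"
    "ORA-00600: internal error code, arguments: [4194], [], [], []\n"
)

print("undo segments reported:", undo_segments_in_error(sample_log))
```

For the excerpt shown here the only segment reported is number 20.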

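The errors point to damaged undo. After the problematic undo rollback segments were dealt with, the database opened normally. A logical migration of the data was then scheduled, which completed this recovery.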
