Oracle非归档模式遇到文件损坏怎么办?

昨天夜里基地夜班的兄弟,打电话说有个报表库连不上了,赶紧起来连上VPN查看一下,看到实例宕机了,先赶紧startup起来。

1.查看报错信息

环境介绍:Redhat 6.9 Oracle 11.2.0.4   No Archive Mode

查看alert log 关键报错信息如下

Thread 1 advanced to log sequence 4231012 (LGWR switch)Current log# 2 seq# 4231012 mem# 0: /oradata/rtp/redo02.log
Thu May 08 23:22:56 2025
KCF: read, write or open error, block=0x240ab online=1file=118 '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'error=27072 txt: 'Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1'
Errors in file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc:
Errors in file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc:
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 118 (block # 147627)
ORA-01110: data file 118: '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1
DBW0 (ospid: 3300): terminating the instance due to error 63999
Thu May 08 23:22:57 2025
System state dump requested by (instance=1, osid=3300 (DBW0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_diag_3292_20250508232257.trc
Instance terminated by DBW0, pid = 3300

排查路径  查看报错的trc文件

Trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp\_dbw0\_3300.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE\_HOME = /u01/app/oracle/product/11.2.0/db\_1
System name:    Linux
Node name:      rtpdb
Release:        2.6.32-696.el6.x86\_64
Version:        #1 SMP Tue Feb 21 00:53:17 EST 2017
Machine:        x86\_64
VM name:        VMWare Version: 6
Instance name: rtp
Redo thread mounted by this instance: 1
Oracle process number: 10
Unix process pid: 3300, image: oracle\@rtpdb (DBW0)\*\*\* 2025-05-08 23:22:56.680
\*\*\* SESSION ID:(1521.1) 2025-05-08 23:22:56.680
\*\*\* CLIENT ID:() 2025-05-08 23:22:56.680
\*\*\* SERVICE NAME:(SYS\$BACKGROUND) 2025-05-08 23:22:56.680
\*\*\* MODULE NAME:() 2025-05-08 23:22:56.680
\*\*\* ACTION NAME:() 2025-05-08 23:22:56.680KCF: read, write or open error, block=0x240ab online=1
file=118 '/oradata2/rtp/RTP/datafile/o1\_mf\_tbs\_ods\_n1qx02j0\_.dbf'
error=27072 txt: 'Linux-x86\_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1'
Encountered write error
DDE rules only execution for: ORA 1110
\----- START Event Driven Actions Dump ----
\---- END Event Driven Actions Dump ----
\----- START DDE Actions Dump -----
Executing SYNC actions
\----- START DDE Action: 'DB\_STRUCTURE\_INTEGRITY\_CHECK' (Async) -----
Successfully dispatched
\----- END DDE Action: 'DB\_STRUCTURE\_INTEGRITY\_CHECK' (SUCCESS, 0 csec) -----
Executing ASYNC actions
\----- END DDE Actions Dump (total 0 csec) -----
error 63999 detected in background process
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 118 (block # 147627)
ORA-01110: data file 118: '/oradata2/rtp/RTP/datafile/o1\_mf\_tbs\_ods\_n1qx02j0\_.dbf'
ORA-27072: File I/O error
Linux-x86\_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1
kjzduptcctx: Notifying DIAG for crash event
\----- Abridged Call Stack Trace -----
ksedsts()+465<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+63<-ksuitm()+5570<-ksbrdp()+3507<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai\_real()+250<-ssthrdmain()+265<-main()+201<-\_\_libc\_start\_main()+253
\----- End of Abridged Call Stack Trace -----\*\*\* 2025-05-08 23:22:56.750
DBW0 (ospid: 3300): terminating the instance due to error 63999
ksuitm: waiting up to \[5] seconds before killing DIAG(3292)
\[oracle\@rtpdb \~]\$

2. OS层面检查IO报错问题

2.1查看/oradata2挂载点是否正常,发现有较多的io错误

[oracle@rtpdb ~]$ dmesg | grep -i error
end_request: I/O error, dev sdc, sector 716711160
Buffer I/O error on device dm-0, logical block 89588639
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588640
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588641
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588642
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588643
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588644
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588645
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588646
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588647
lost page write due to I/O error on dm-0
JBD2: Detected IO errors while flushing file data on dm-0-8
[root@rtpdb ~]# tail -f /var/log/messages
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588645
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588646
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588647
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: JBD2: Detected IO errors while flushing file data on dm-0-8
May  9 03:40:04 rtpdb rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
May  9 12:38:21 rtpdb kernel: NET: Unregistered protocol family 36
May  9 12:38:21 rtpdb kernel: NET: Registered protocol family 36

3.Rman检查报错的文件是否有坏块

RMAN> VALIDATE DATAFILE 118;Starting validate at 09-MAY-25
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=647 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00118 name=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:15
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
118  FAILED 0              85421        268800          74401038924File Name: /oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbfBlock Type Blocks Failing Blocks Processed---------- -------------- ----------------Data       0              99872           Index      0              82856           Other      49             651             validate found one or more corrupt blocks
See trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc for details
Finished validate at 09-MAY-25RMAN> list backup summary;specification does not match any backup in the repositoryRMAN> 

从/u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc中检查具体的有哪些block损坏, 一下检查到这么多corrupt block

而且还没有物理备份(非归档模式的库)?该如何处理

[oracle@rtpdb ~]$ cat /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc | grep -i "Corrupt"
Corrupt block relative dba: 0x1d82e5cf (file 118, block 189903)
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data。
Corrupt block relative dba: 0x1d82e5ff (file 118, block 189951)
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data

这里出现连续的block 189903 到 block 189918 至少有 16坏块

4.查看坏块对应的object

检查这些坏块是输入对应的哪个object,看到是一个表

SELECT tablespace_name, segment_type, owner, segment_name 
FROM dba_extents 
WHERE file_id = 118 AND (block_id BETWEEN 189903 AND 189918 OR(block_id + blocks - 1) BETWEEN 189903 AND 189918 ORblock_id < 189903 AND (block_id + blocks - 1) > 189918);
TABLESPACE_NAME                SEGMENT_TYPE       OWNER                          SEGMENT_NAME
------------------------------ ------------------ ------------------------------ ---------------------------------------------------------------------------------
TBS_ODS                        TABLE              ODS                            LOT_MATERIAL_MASTER

5.标记坏块,防止操作失败

标记这些坏块并跳过,这个不算是标准处理流程,因为是报表库,元数据都是从另外一个库拉取的,这里标记跳过,联系报表的同事重建这个表

BEGINDBMS_REPAIR.SKIP_CORRUPT_BLOCKS (schema_name   => 'ODS',object_name   => 'LOT_MATERIAL_MASTER',object_type   => DBMS_REPAIR.TABLE_OBJECT,flags         => DBMS_REPAIR.SKIP_FLAG);
END;
/
PL/SQL procedure successfully completed.

5.1 DBMS_REPAIR.SKIP_CORRUPT_BLOCKS包介绍

DBMS_REPAIR.SKIP_CORRUPT_BLOCKS  用于告诉数据库在访问特定表或索引时跳过已知的坏块(corrupt blocks),从而避免访问错误中断操作


✅ 主要作用

  • 标记指定对象中的坏块为可跳过,当应用或查询访问这些坏块时,Oracle 会跳过它们,而不是报错。

  • 适用于:

    • 表(TABLE_OBJECT

    • 索引(INDEX_OBJECT

  • 常用于数据库文件损坏、硬盘故障、备份文件不完整等情况下临时绕过问题块继续业务运行或数据导出


🧠 使用场景举例:

  • 表中某些数据块损坏,导致全表扫描失败。

  • 临时需要导出未损坏的数据,用于转移或恢复。

  • 配合 DBMS_REPAIR.CHECK_OBJECT 检测坏块后,继续运行业务逻辑。


📌 工作机制

启用后,对象上的查询或操作:

  • 遇到坏块 → Oracle 跳过不访问这些坏块

  • 这样能 最大程度保留/导出/访问完好数据

  • 不影响数据块的实际内容(不会修复坏块,仅跳过)


 常用调用格式

BEGIN DBMS_REPAIR.SKIP_CORRUPT_BLOCKS ( schema_name => 'SCOTT', object_name => 'EMP', object_type => DBMS_REPAIR.TABLE_OBJECT, flags => DBMS_REPAIR.SKIP_FLAG -- 开启跳过 ); END;

关闭跳过功能:

BEGIN DBMS_REPAIR.SKIP_CORRUPT_BLOCKS ( schema_name => 'SCOTT', object_name => 'EMP', object_type => DBMS_REPAIR.TABLE_OBJECT, flags => DBMS_REPAIR.NOSKIP_FLAG -- 关闭跳过 ); END;


⚠️ 注意事项

  1. 此操作不会修复坏块,只是忽略它们

  2. 配合 DBMS_REPAIR.CHECK_OBJECT 使用,先识别出坏块。

  3. 通常用于应急,不应长期依赖。

  4. 处理后建议尽快进行数据恢复或表重建
     

总结

 因为这个是库是非归档模式的,所以没有物理备份,这样遭遇了block corrupt确实非常麻烦,建议重要的库还是一定要启用归档并使用RMAN备份。

SQL> archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /u01/app/oracle/product/11.2.0/db_1/dbs/arch
Oldest online log sequence     4231143
Current log sequence           4231147

expdp备份部分表的脚本 供参考

[oracle@rtpdb ~]$ cat $HOME/jobs/expback.sh
#!/bin/bash
#backup table on noarchive db
#create by norton.fan 20220729
PATH=$PATH:$HOME/bin
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
ORACLE_SID=rtp
PATH=$ORACLE_HOME/bin:$PATH
export ORACLE_BASE ORACLE_HOME ORACLE_SID
export PATH
NLS_LANG=AMERICAN_AMERICA.UTF8
export NLS_LANG
#export DELTIME=`date -d "15 days ago" +%Y%m%d`
export BACKUPTIME=`date +%Y%m%d%H%M%S`
expdp ods/ods dumpfile=ods$BACKUPTIME.dmp logfile=ods$BACKUPTIME.log parfile=/home/oracle/jobs/exp.par
#echo "Delete backup cycle before 15 days"
find /oradata/backup/ -mtime +1 -name  *.dmp -exec rm -f {} ';'   
find /oradata/backup/ -mtime +7 -name  *.log -exec rm -f {} ';'
[oracle@rtpdb ~]$ 
[oracle@rtpdb ~]$ cat /home/oracle/jobs/exp.par
DIRECTORY = dmpdir
SCHEMAS = ods
INCLUDE = TABLE:"IN (select table_name from exptab)" ##将需要备份的表名放入到exptab表中

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/web/79359.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

关于一些平时操作系统或者软件的步骤转载

关于一些平时操作系统或者软件的步骤转载 关于python环境搭建 关于Ubuntu 1. 双系统之Ubuntu快速卸载 2. VMware安装Ubuntu虚拟机实现COpenCV代码在虚拟机下运行教程 3. ubuntu 下 opencv的安装以及配置&#xff08;亲测有效&#xff09; 4. Ubuntu将c编译成.so文件并测试 5…

hz2新建Keyword页面

新建一个single-keywords.php即可&#xff0c;需要筛选项再建taxonomy-knowledge-category.php 参考&#xff1a;https://www.tkwlkj.com/customize-wordpress-category-pages.html WordPress中使用了ACF创建了自定义产品分类products&#xff0c;现在想实现自定义产品分类下的…

VRRP协议-IP地址冗余配置

有两个服务器172.16.42.1和172.16.42.121&#xff0c;通过VRRP协议使两台设备共用一个虚拟地址172.16.42.100&#xff0c;当 172.16.42.1 可用时&#xff0c;它会作为主路由器使用虚拟 IP 地址&#xff1b;当它不可用时&#xff0c;172.16.42.121 会接管虚拟 IP 地址&#xff0…

21、DeepSeekMath论文笔记(GRPO)

DeepSeekMath论文笔记 0、研究背景与目标1、GRPO结构GRPO结构PPO知识点**1. PPO的网络模型结构****2. GAE&#xff08;广义优势估计&#xff09;原理****1. 优势函数的定义**2.GAE&#xff08;广义优势估计&#xff09; 2、关键技术与方法3、核心实验结果4、结论与未来方向关键…

卡尔曼滤波算法(C语言)

此处感谢华南虎和互联网的众多大佬的无偿分享。 入门常识 先简单了解以下概念&#xff1a;叠加性&#xff0c;齐次性。 用大白话讲&#xff0c;叠加性&#xff1a;多个输入对输出有影响。齐次性&#xff1a;输入放大多少倍&#xff0c;输出也跟着放大多少倍 卡尔曼滤波符合这…

SolidWork-2023 鼠標工程

地址 https://github.com/MartinxMax/SW2023-Project/tree/main/mouse 鼠標

vue 组件函数式调用实战:以身份验证弹窗为例

通常我们在 Vue 中使用组件&#xff0c;是像这样在模板中写标签&#xff1a; <MyComponent :prop"value" event"handleEvent" />而函数式调用&#xff0c;则是让我们像调用一个普通 JavaScript 函数一样来使用这个组件&#xff0c;例如&#xff1a;…

Vite Proxy配置详解:从入门到实战应用

Vite Proxy配置详解&#xff1a;从入门到实战应用 一、什么是Proxy代理&#xff1f; Proxy&#xff08;代理&#xff09;是开发中常用的解决跨域问题的方案。Vite内置了基于http-proxy的代理功能&#xff0c;可以轻松配置API请求转发。 二、基础配置 在vite.config.js中配置…

图像画质算法记录(前言)

一、背景介绍 本篇主要是对图像画质增强相关&#xff0c;进行简单整理和记录。 二、整体流程 整体效果主要受到两部分影响&#xff1a; 1、前端isp处理。 2、后端画质增强。 三、isp常规流程 可以参考&#xff1a;刘斯宁&#xff1a;Understanding ISP Pipeline 四、后端画质…

Qt 中信号与槽(signal-slot)机制支持 多种连接方式(ConnectionType)

Qt 中信号与槽&#xff08;signal-slot&#xff09;机制支持 多种连接方式&#xff08;ConnectionType&#xff09; Qt 中信号与槽&#xff08;signal-slot&#xff09;机制支持 多种连接方式&#xff08;ConnectionType&#xff09;&#xff0c;用于控制信号发出后如何调用槽…

卷积神经网络实战(4)代码详解

目录 一、导包 二、数据准备 1.数据集 2. 标准化转换(Normalize) 3.设置dataloader 三、定义模型 四、可视化计算图&#xff08;不重要&#xff09; 五、评估函数 六、Tensorboard 一、导包 import matplotlib as mpl import matplotlib.pyplot as plt %matplotlib i…

深入解析进程地址空间:从虚拟到物理的奇妙之旅

深入解析进程地址空间&#xff1a;从虚拟到物理的奇妙之旅 前言 各位小伙伴&#xff0c;还记得我们之前探讨的 fork 函数吗&#xff1f;当它返回两次时&#xff0c;父子进程中同名变量却拥有不同值的现象&#xff0c;曾让我们惊叹于进程独立性与写时拷贝的精妙设计。但你是否…

opencv处理图像(二)

接下来进入到程序线程设计部分 我们主线程负责图形渲染等操作&#xff0c;OpenGL的限制&#xff0c;opencv技术对传入图像加以处理&#xff0c;输出预期图像给主线程 QThread 我之前也是在想给opencv开一个专门的线程&#xff0c;但经过了解有几个弊端&#xff0c;第一资源浪…

学习threejs,使用Physijs物理引擎

&#x1f468;‍⚕️ 主页&#xff1a; gis分享者 &#x1f468;‍⚕️ 感谢各位大佬 点赞&#x1f44d; 收藏⭐ 留言&#x1f4dd; 加关注✅! &#x1f468;‍⚕️ 收录于专栏&#xff1a;threejs gis工程师 文章目录 一、&#x1f340;前言1.1 ☘️Physijs 物理引擎1.1.1 ☘️…

ARCGIS PRO DSK 选择坐标系控件(CoordinateSystemsControl )的调用

在WPF窗体上使用 xml&#xff1a;加入空间命名引用 xmlns:mapping"clr-namespace:ArcGIS.Desktop.Mapping.Controls;assemblyArcGIS.Desktop.Mapping" 在控件区域加入&#xff1a; <mapping:CoordinateSystemsControl x:Name"CoordinateSystemsControl&q…

LangGraph(三)——添加记忆

目录 1. 创建MemorySaver检查指针2. 构建并编译Graph3. 与聊天机器人互动4. 问一个后续问题5. 检查State参考 1. 创建MemorySaver检查指针 创建MemorySaver检查指针&#xff1a; from langgraph.checkpoint.memory import MemorySavermemory MemorySaver()这是位于内存中的检…

深入理解Mysql

BufferPool和Changebuffer是如何加快读写速度的? BufferPool 在Mysql启动的时候 Mysql会申请连续的空间来存储BufferPool 每个页16kb 当控制块不足以存储信息的时候就会向后申请一个新的页 每个控制块都对应了一个缓存页 控制块占chunk的百分之5左右 LRU链表 Changebuffer …

Python核心编程深度解析:作用域、递归与匿名函数的工程实践

引言 Python作为现代编程语言的代表&#xff0c;其作用域管理、递归算法和匿名函数机制是构建高质量代码的核心要素。本文基于Python 3.11环境&#xff0c;结合工业级开发实践&#xff0c;深入探讨变量作用域的内在逻辑、递归算法的优化策略以及匿名函数的高效应用&#xff0c…

《用MATLAB玩转游戏开发》贪吃蛇的百变玩法:从命令行到AI对战

《用MATLAB玩转游戏开发&#xff1a;从零开始打造你的数字乐园》基础篇&#xff08;2D图形交互&#xff09;-&#x1f40d; 贪吃蛇的百变玩法&#xff1a;从命令行到AI对战 &#x1f3ae; 欢迎来到这篇MATLAB贪吃蛇编程全攻略&#xff01;本文将带你从零开始&#xff0c;一步步…

Android平台FFmpeg音视频开发深度指南

一、FFmpeg在Android开发中的核心价值 FFmpeg作为业界领先的多媒体处理框架&#xff0c;在Android音视频开发中扮演着至关重要的角色。它提供了&#xff1a; 跨平台支持&#xff1a;统一的API处理各种音视频格式完整功能链&#xff1a;从解码、编码到滤镜处理的全套解决方案灵…