ASM disk reads fail with changed asm_diskstring parameter - ORA-15080: synchronous I/O operation to a disk failed

SYMPTOMS

In the alert_+ASM.log:
...
Tue Jul 09 18:52:48 2013
NOTE: ASM client <database>:<database> disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Trace records dumped in trace file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_20307.trc
...
Tue Jul 09 18:52:50 2013
WARNING: Read Failed. group:2 disk:3 AU:3 offset:0 size:4096
WARNING: cache failed reading from group=2(<DISKGROUP1>) fn=6 blk=0 count=1 from disk= 3(<DISK04>) kfkist=0x20 status=0x02 file=kfc.c line=11555
ERROR: cache failed to read group=2(<DISKGROUP1>) fn=6 blk=0 from disk(s): 3(<DISK04>)
ORA-15080: synchronous I/O operation to a disk failed  ---------------->  HERE
NOTE: cache initiating offline of disk 3 group <DISKGROUP1>  ---------------->  HERE
NOTE: process _<user>_+asm (27925) initiating offline of disk 3.3971639839 (<DISK04>) with mask 0x7e in group 2  ---------------->  HERE
WARNING: Disk 3 (<DISK04>) in group 2 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 2, dsk = 3/0xecba6a1f, mask = 0x6a, op = clear
Tue Jul 09 18:52:51 2013
GMON updating disk modes for group 2 at 13 for pid 25, osid 27925
ERROR: Disk 3 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 2)
Tue Jul 09 18:52:51 2013
NOTE: cache dismounting (not clean) group 2/0xBC3A9AFC (<DISKGROUP1>)
NOTE: messaging CKPT to quiesce pins Unix process pid: 27946, image: <image> (B000)
Tue Jul 09 18:52:52 2013
NOTE: halting all I/Os to diskgroup 2 (<DISKGROUP1>)           ---------------->  HERE
Tue Jul 09 18:52:52 2013
...
SQL> alter diskgroup <DISKGROUP1> dismount force /* ASM SERVER */                   ---------------->  HERE
WARNING: Offline of disk 3 (<DISK04>) in group 2 and mode 0x7f failed on ASM inst 1  ---------------->  HERE
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_27915.trc:
ORA-17503: ksfdopn:2 Failed to open file +<DISKGROUP1>/<database>/spfile<database>.ora
ORA-15130: diskgroup "<DISKGROUP1>" is being dismounted                              ---------------->  HERE
ORA-15066: offlining disk "<DISK04>" in group "<DISKGROUP1>" may result in a data loss  ---------------->  HERE
ORA-15080: synchronous I/O operation to a disk failed                   ---------------->  HERE
Tue Jul 09 18:52:52 2013
...
SUCCESS: diskgroup <DISKGROUP1> was dismounted
SUCCESS: alter diskgroup <DISKGROUP1> dismount force /* ASM SERVER */  ---------------->  HERE
Tue Jul 09 18:52:52 2013
NOTE: diskgroup resource ora.<DISKGROUP1>.dg is offline
ERROR: PST-initiated MANDATORY DISMOUNT of group <DISKGROUP1>
Tue Jul 09 18:52:53 2013
SQL> ALTER DISKGROUP <DISKGROUP1> MOUNT  /* asm agent *//* {0:4:26} */  ---------------->  HERE
...
WARNING: Read Failed. group:0 disk:10 AU:0 offset:0 size:4096
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_20178.trc:
ORA-27061: waiting for async I/Os failed  ---------------->  HERE
Linux-x86_64 Error: 5: Input/output error  ---------------->  HERE
Additional information: -1    ---------------->  HERE
Additional information: 4096  ---------------->  HERE
WARNING: Read Failed. group:0 disk:0 AU:0 offset:0 size:4096
NOTE: Assigning number (2,0) to disk (<ORCL:LG01>)
NOTE: Assigning number (2,1) to disk (<ORCL:MED02>)
NOTE: Assigning number (2,2) to disk (<ORCL:SM01>)
...
NOTE: Assigning number (2,3) to disk ()
...
NOTE: cache dismounted group 2/0x117A9B02 (<DISKGROUP1>)
NOTE: cache ending mount (fail) of group <DISKGROUP1> number=2 incarn=0x117a9b02
NOTE: cache deleting context for group <DISKGROUP1> 2/0x117a9b02
GMON dismounting group 2 at 19 for pid 19, osid 20178
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
ERROR: diskgroup <DISKGROUP1> was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "3" is missing from group number "2"   ---------------->  HERE
ORA-15080: synchronous I/O operation to a disk failed
ORA-15080: synchronous I/O operation to a disk failed
ERROR: ALTER DISKGROUP <DISKGROUP1> MOUNT  /* asm agent *//* {0:4:26} */  ---------------->  HERE
Tue Jul 09 18:53:00 2013
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_28346.trc:
ORA-27061: waiting for async I/Os failed   ---------------->  HERE
Linux-x86_64 Error: 5: Input/output error  ---------------->  HERE
Additional information: -1
Additional information: 4096
WARNING: Read Failed. group:0 disk:0 AU:0 offset:0 size:4096
WARNING: Read Failed. group:0 disk:10 AU:0 offset:0 size:4096
Tue Jul 09 18:53:00 2013
ASM Health Checker found 1 new failures
Tue Jul 09 18:58:04 2013
NOTE: ASMB process exiting due to lack of ASM file activity for 305 seconds
Wed Jul 10 09:14:37 2013
Shutting down instance (immediate)  ---------------->  HERE
...
Wed Jul 10 09:14:45 2013
Instance shutdown complete
Wed Jul 10 09:14:52 2013
...
Wed Jul 10 09:14:56 2013
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: Diskgroups listed in ASM_DISKGROUPS are
<ARCH>
<DATA>
<TEMP>
...
NOTE: cache opening disk 0 of grp 1: ARCH01 label:<MED01>
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (first) external redundancy group 1/0xE022BD39 (ARCH)
...
NOTE: cache opening disk 0 of grp 2: <DISK01> label:<LG01>
NOTE: F1X0 found on disk 0 au 2 fcn 0.166360
NOTE: cache opening disk 1 of grp 2: <DISK02> label:<MED02>
NOTE: cache opening disk 2 of grp 2: <DISK03> label:<SM01>
NOTE: cache opening disk 3 of grp 2: <DISK04> label:<LG02>
NOTE: cache mounting (first) external redundancy group 2/0xE042BD3A (<DISKGROUP1>)
...
NOTE: cache opening disk 0 of grp 3: <DISKP01> label:<MED03>
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (first) external redundancy group 3/0xE042BD3B (TEMP)
...
SUCCESS: diskgroup <ARCH> was mounted               --------------->  <ARCH> was successfully mounted
GMON querying group 2 at 11 for pid 13, osid 25613
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup <DISKGROUP1> was mounted                            ---------------> DATA was successfully mounted
GMON querying group 3 at 12 for pid 13, osid 25613
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 3
SUCCESS: diskgroup <TEMP> was mounted                --------------->  TEMP was successfully mounted
SUCCESS: ALTER DISKGROUP ALL MOUNT              --------------->  All diskgroups successfully mounted.
...
Wed Jul 10 16:04:41 2013
WARNING: Read Failed. group:2 disk:3 AU:3 offset:0 size:4096
WARNING: cache failed reading from group=2(<DISKGROUP1>) fn=6 blk=0 count=1 from disk= 3(<DISK04>) kfkist=0x20 status=0x02 file=kfc.c line=11555
ERROR: cache failed to read group=2(<DISKGROUP1>) fn=6 blk=0 from disk(s): 3(<DISK04>)
ORA-15080: synchronous I/O operation to a disk failed  ---------------->  HERE
NOTE: cache initiating offline of disk 3 group <DISKGROUP1>  ---------------->  HERE
NOTE: process _<user>_+asm (28115) initiating offline of disk 3.3981594077 (<DISK04>) with mask 0x7e in group 2
WARNING: Disk 3 (<DISK04>) in group 2 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 2, dsk = 3/0xed524ddd, mask = 0x6a, op = clear
Wed Jul 10 16:04:41 2013
GMON updating disk modes for group 2 at 13 for pid 22, osid 28115
ERROR: Disk 3 cannot be offlined, since diskgroup has external redundancy.
...
Wed Jul 10 16:04:41 2013
NOTE: halting all I/Os to diskgroup 2 (<DISKGROUP1>)  ---------------->  HERE
...
SQL> alter diskgroup <DISKGROUP1> dismount force /* ASM SERVER */  ---------------->  HERE
Wed Jul 10 16:04:41 2013
NOTE: ASM client <database>:<database> disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Trace records dumped in trace file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_ora_26025.trc
WARNING: Read Failed. group:2 disk:3 AU:1 offset:4096 size:4096
WARNING: Read Failed. group:2 disk:3 AU:1 offset:0 size:4096
Wed Jul 10 16:04:42 2013
ERROR: ORA-15130 in COD recovery for diskgroup 2/0xe042bd3a (<DISKGROUP1>)
NOTE: cache deleting context for group <DISKGROUP1> 2/0xe042bd3a
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_rbal_25613.trc:
ORA-15130: diskgroup "<DISKGROUP1>" is being dismounted      ---------------->  HERE
ERROR: ORA-15130 in COD recovery for diskgroup 2/0xe042bd3a (<DISKGROUP1>)
ERROR: ORA-15130 thrown in RBAL for group number 2
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM/trace/+ASM_rbal_25613.trc:
ORA-15130: diskgroup "" is being dismounted
...
SUCCESS: diskgroup <DISKGROUP1> was dismounted
SUCCESS: alter diskgroup <DISKGROUP1> dismount force /* ASM SERVER */
ERROR: PST-initiated MANDATORY DISMOUNT of group <DISKGROUP1>
...
Wed Jul 10 16:04:49 2013
SQL> ALTER DISKGROUP <DISKGROUP1> MOUNT  /* asm agent *//* {0:4:35} */
...
Wed Jul 10 16:04:55 2013
...
NOTE: cache opening disk 0 of grp 2: <DISK01> label:<LG01>
NOTE: F1X0 found on disk 0 au 2 fcn 0.166360
NOTE: cache opening disk 1 of grp 2: <DISK02> label:<MED02>
NOTE: cache opening disk 2 of grp 2: <DISK03> label:<SM01>
NOTE: cache opening disk 3 of grp 2: <DISK04> label:<LG02>
NOTE: cache mounting (first) external redundancy group 2/0x4CC2BD3D (<DISKGROUP1>)
...
Wed Jul 10 16:04:55 2013
NOTE: Instance updated compatible.asm to 10.1.0.0.0 for grp 2
SUCCESS: diskgroup <DISKGROUP1> was mounted
SUCCESS: ALTER DISKGROUP <DISKGROUP1> MOUNT  /* asm agent *//* {0:4:35} */  ---------------->  HERE
Thu Jul 11 13:36:22 2013
...
********************************************************************************
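
When triaging a long alert log like the excerpt above, it can help to tally the "Read Failed" warnings per group and disk. The following is only a minimal sketch: the function name count_read_failures and the sample list are hypothetical, and the regular expression assumes the warning format shown in this excerpt (other versions may format the line differently).

```python
import re
from collections import Counter

# Pattern based on the "WARNING: Read Failed." lines in the excerpt above.
READ_FAILED = re.compile(
    r"WARNING: Read Failed\. group:(\d+) disk:(\d+) AU:(\d+) offset:(\d+) size:(\d+)"
)

def count_read_failures(lines):
    """Count failed reads per (group number, disk number)."""
    counts = Counter()
    for line in lines:
        match = READ_FAILED.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts

# Sample lines copied from the alert log excerpt (hypothetical input list).
sample = [
    "WARNING: Read Failed. group:2 disk:3 AU:3 offset:0 size:4096",
    "ORA-15080: synchronous I/O operation to a disk failed",
    "WARNING: Read Failed. group:0 disk:10 AU:0 offset:0 size:4096",
    "WARNING: Read Failed. group:2 disk:3 AU:1 offset:4096 size:4096",
]
failures = count_read_failures(sample)
for (group, disk), n in sorted(failures.items()):
    print(f"group {group} disk {disk}: {n} failed read(s)")
```

In practice the lines would come from the alert_+ASM.log file itself, for example by iterating over the open file object.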

In the ASM HTML:

GRP NAME STATE    TYPE   TOTAL_MB FREE_MB

1   <ARCH> MOUNTED EXTERN 20000     7328
2   <DATA> MOUNTED EXTERN 302816     167872
3   <TEMP> MOUNTED EXTERN 20000     13120

GRP DISK MOUNT HEADER         MODE STATE TOTAL_MB FREE_MB NAME FAILGROUP LABEL PATH

0 0 IGNORED MEMBER ONLINE NORMAL 0 0                                                          /dev/oracleasm/disks/<LG02> ------->  HERE
0 1 IGNORED MEMBER ONLINE NORMAL 0 0                                                          /dev/oracleasm/disks/<LG01>  ------->  HERE
0 2 IGNORED MEMBER ONLINE NORMAL 0 0                                                         /dev/oracleasm/disks/<MED01>  ------->  HERE
0 3 IGNORED MEMBER ONLINE NORMAL 0 0                                                         /dev/oracleasm/disks/<MED02>  ------->  HERE
0 4 IGNORED MEMBER ONLINE NORMAL 0 0                                                         /dev/oracleasm/disks/<MED03>  ------->  HERE
0 5 IGNORED MEMBER ONLINE NORMAL 0 0                                                         /dev/oracleasm/disks/<SM01>  ------->  HERE
0 6 CLOSED PROVISIONED ONLINE NORMAL 0 0                                                         /dev/oracleasm/disks/<SM02>
0 7 CLOSED PROVISIONED ONLINE NORMAL 0 0                                                         /dev/oracleasm/disks/<SM03>
0 8 CLOSED PROVISIONED ONLINE NORMAL 0 0                                                        /dev/oracleasm/disks/<SM04>
0 15 CLOSED PROVISIONED ONLINE NORMAL 0 0                                                   <SM02 ORCL:SM02>
0 16 CLOSED PROVISIONED ONLINE NORMAL 0 0                                                     <SM03 ORCL:SM03>
0 17 CLOSED PROVISIONED ONLINE NORMAL 0 0                                                   <SM04 ORCL:SM04>
1 0 CACHED MEMBER     ONLINE NORMAL 20000 7328 <ARCH01> <ARCH01> <MED01> <ORCL:MED01>   ------->  HERE
2 0 CACHED MEMBER   ONLINE NORMAL 99984 54592 <DATA01> <DATA01> <LG01> <ORCL:LG01>  ------->  HERE
2 1 CACHED MEMBER       ONLINE NORMAL 20000 10848 <DATA02> <DATA02> <MED02> <ORCL:MED02>  ------->  HERE
2 2 CACHED MEMBER       ONLINE NORMAL 10000 5408 <DATA03> <DATA03> <SM01> <ORCL:SM01>  ------->  HERE
2 3 CACHED MEMBER     ONLINE NORMAL 172832 97024 <DATA04> <DATA04> <LG02> <ORCL:LG02>  ------->  HERE
3 0 CACHED MEMBER        ONLINE NORMAL 20000 13120 <TEMP01> <TEMP01> <MED03> <ORCL:MED03>  ------->  HERE

current asm_diskstring= /dev/oracleasm/disks/*

****************************************************************************************************************************************************

In the OS_COMMANDS.txt (kfod):
...
================================================================================
 1:      99999 Mb /dev/oracleasm/disks/<LG01>                oracle   dba
 2:     172832 Mb /dev/oracleasm/disks/<LG02>                oracle   dba
 3:      20000 Mb /dev/oracleasm/disks/<MED01>               oracle   dba
 4:      20000 Mb /dev/oracleasm/disks/<MED02>               oracle   dba
 5:      20000 Mb /dev/oracleasm/disks/<MED03>               oracle   dba
 6:      10000 Mb /dev/oracleasm/disks/<SM01>                oracle   dba
 7:      10000 Mb /dev/oracleasm/disks/<SM02>                oracle   dba
 8:      10000 Mb /dev/oracleasm/disks/<SM03>                oracle   dba
 9:       9679 Mb /dev/oracleasm/disks/<SM04>                oracle   dba
10:      99999 Mb <ORCL:LG01>                                <unknown> <unknown>
11:     172832 Mb <ORCL:LG02>                                <unknown> <unknown>
12:      20000 Mb <ORCL:MED01>                               <unknown> <unknown>
13:      20000 Mb <ORCL:MED02>                               <unknown> <unknown>
14:      20000 Mb <ORCL:MED03>                               <unknown> <unknown>
15:      10000 Mb <ORCL:SM01>                                <unknown> <unknown>
16:      10000 Mb <ORCL:SM02>                                <unknown> <unknown>
17:      10000 Mb <ORCL:SM03>                                <unknown> <unknown>
18:       9679 Mb <ORCL:SM04>                                <unknown> <unknown>
------------------------------------------------------------------------------------
 

CHANGES

The customer changed the asm_diskstring parameter.
Since that change, ASM disk reads fail and the affected diskgroup is dismounted.

CAUSE

The asm_diskstring is set to '/dev/oracleasm/disks/*', which is incorrect. This value disables the ASMLib API.

When the ASM instance starts, there is a discovery phase driven by the ASM_DISKSTRING parameter.
ASM scans for disks according to asm_diskstring, so setting asm_diskstring to 'ORCL:*' ensures that ASM uses the ASMLib API.
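
As a rough illustration of the point above, a discovery string can be checked for whether every pattern goes through the ORCL: prefix. This is only a sketch under stated assumptions: the helper name uses_asmlib_only is hypothetical, the value is treated as a comma-separated list of patterns, and an empty value is reported as False simply because the sketch does not model the platform default discovery path.

```python
def uses_asmlib_only(diskstring):
    """Return True when every discovery pattern starts with the ORCL: prefix.

    Assumption: the value is a comma-separated list of patterns, each
    optionally wrapped in quotes. An empty value returns False because
    this sketch does not model the default discovery behaviour.
    """
    patterns = [p.strip().strip("'\"") for p in diskstring.split(",")]
    patterns = [p for p in patterns if p]
    return bool(patterns) and all(p.startswith("ORCL:") for p in patterns)

recommended = uses_asmlib_only("ORCL:*")
current = uses_asmlib_only("/dev/oracleasm/disks/*")
print("ORCL:* ->", recommended)
print("/dev/oracleasm/disks/* ->", current)
```

The string to check would typically be taken from the output of "show parameter ASM_DISKSTRING" in the ASM instance.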

Important: disabling the ASMLib API may additionally impact I/O performance on any Linux OS.

For example, in one specific environment (RAC or standalone) with ASMLib installed on RHEL 6.4, I/O writes became very slow after asm_diskstring was set to '/dev/oracleasm/disks/*'. Disabling the ASMLib API, on any version of Linux, can potentially affect I/O performance, especially with several ASM disk members and/or diskgroups.


So, according to your ASM HTML output, your issue is most likely related to the current value of asm_diskstring.
----------------------------------------

Script #1 from 470211.1 can be run to collect the HTML report shown below.

In the ASM HTML output, for example:                                        
----------------------------------------                                        
GRP DISK MOUNT   HEADER      MODE   STATE  TOTAL_MB FREE_MB NAME     FAILGROUP LABEL   PATH

0   0    IGNORED MEMBER      ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<LG02>  ------->  HERE  ------->  IGNORED  --->  same ASM disk as <ORCL:LG02>
0   1    IGNORED MEMBER      ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<LG01>  ------->  HERE  ------->  IGNORED  --->  same ASM disk as <ORCL:LG01>
0   2    IGNORED MEMBER      ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<MED01>  ------->  HERE  ------->  IGNORED  --->  same ASM disk as <ORCL:MED01>
0   3    IGNORED MEMBER      ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<MED02>  ------->  HERE  ------->  IGNORED  --->  same ASM disk as <ORCL:MED02>
0   4    IGNORED MEMBER      ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<MED03>  ------->  HERE  ------->  IGNORED  --->  same ASM disk as <ORCL:MED03>
0   5    IGNORED MEMBER      ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<SM01>  ------->  HERE  ------->  IGNORED  --->  same ASM disk as <ORCL:SM01>
0   6    CLOSED  PROVISIONED ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<SM02>
0   7    CLOSED  PROVISIONED ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<SM03>
0   8    CLOSED  PROVISIONED ONLINE NORMAL 0        0                                  /dev/oracleasm/disks/<SM04>
0   15   CLOSED  PROVISIONED ONLINE NORMAL 0        0                                  <SM02 ORCL:SM02>
0   16   CLOSED  PROVISIONED ONLINE NORMAL 0        0                                  <SM03 ORCL:SM03>
0   17   CLOSED  PROVISIONED ONLINE NORMAL 0        0                                  <SM04 ORCL:SM04>
1   0    CACHED  MEMBER      ONLINE NORMAL 20000    7328    <ARCH01> <ARCH01>  <MED01> <ORCL:MED01>
2   0    CACHED  MEMBER      ONLINE NORMAL 99984    54592   <DATA01> <DATA01>  <LG01>  <ORCL:LG01>
2   1    CACHED  MEMBER      ONLINE NORMAL 20000    10848   <DATA02> <DATA02>  <MED02> <ORCL:MED02>
2   2    CACHED  MEMBER      ONLINE NORMAL 10000    5408    <DATA03> <DATA03>  <SM01>  <ORCL:SM01>
2   3    CACHED  MEMBER      ONLINE NORMAL 172832   97024   <DATA04> <DATA04>  <LG02>  <ORCL:LG02>
3   0    CACHED  MEMBER      ONLINE NORMAL 20000    13120   <TEMP01> <TEMP01>  <MED03> <ORCL:MED03>

The IGNORED mount_status on the disks above indicates that ASM found duplicate disks (one copy through asm_diskstring=/dev/oracleasm/disks/* and another through ORCL:*).
This means those disk paths point to the same physical devices that are already used by an existing diskgroup.

ASM ignores all duplicate disk paths.
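
The duplicate relationship described above can be sketched by pairing discovery paths that carry the same disk label. This is only an illustration under stated assumptions: the function name find_duplicate_paths and the sample list are hypothetical, and a path /dev/oracleasm/disks/LABEL is assumed to name the same disk as ORCL:LABEL, as the annotations in the report above indicate.

```python
import posixpath

def find_duplicate_paths(paths):
    """Group discovery paths that refer to the same disk label.

    Assumption: /dev/oracleasm/disks/<LABEL> and ORCL:<LABEL> name the
    same disk whenever <LABEL> matches.
    """
    by_label = {}
    for path in paths:
        if path.startswith("ORCL:"):
            label = path[len("ORCL:"):]
        else:
            label = posixpath.basename(path)
        by_label.setdefault(label, []).append(path)
    # Keep only labels that were discovered through more than one path.
    return {label: sorted(found) for label, found in by_label.items() if len(found) > 1}

# Hypothetical sample, modeled on the PATH column of the report above.
paths = [
    "/dev/oracleasm/disks/LG01",
    "/dev/oracleasm/disks/LG02",
    "ORCL:LG01",
    "ORCL:LG02",
    "ORCL:SM02",
]
duplicates = find_duplicate_paths(paths)
for label, found in sorted(duplicates.items()):
    print(label, "->", ", ".join(found))
```

The input list would normally be the result of "select path from v$asm_disk" in the ASM instance; any label reported here is visible to ASM through two paths.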
----------------------------------------------------------------------------------

SOLUTION

Please set your asm_diskstring='ORCL:*'.

1)  Log in to the ASM instance and run:

SQL> ALTER SYSTEM SET asm_diskstring='ORCL:*';

This init parameter can be dynamically changed.

2)  Then, check that the change was applied:
======================================================
SQL> show parameter ASM_DISKSTRING

SQL> select path from v$asm_disk;
======================================================

3)  Please note: if using RAC, restart CRS (bounce the cluster on both nodes) after modifying asm_diskstring.
