Restoring the redundancy of an Oracle ASM disk group after a transient disk path failure can be time consuming. This is especially true if the recovery process requires rebuilding an entire Oracle ASM disk group. Oracle ASM fast mirror resync significantly reduces the time to resynchronize a failed disk in such situations. When you replace the failed disk, Oracle ASM can quickly resynchronize the Oracle ASM disk extents.
To use this feature, the disk group compatibility attributes must be set to 11.1 or higher. For more information, refer to "Disk Group Compatibility".
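As a quick check before relying on fast mirror resync, the current compatibility settings can be read from V$ASM_DISKGROUP. The following is a minimal sketch that assumes you are connected to the Oracle ASM instance; it only reads the settings and changes nothing.

```sql
-- List each disk group with its Oracle ASM and database compatibility settings.
-- Both values should be 11.1 or higher for fast mirror resync.
SELECT name,
       type,                     -- redundancy: EXTERN, NORMAL, or HIGH
       compatibility,            -- Oracle ASM compatibility setting
       database_compatibility    -- database compatibility setting
  FROM V$ASM_DISKGROUP;
```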
Any problems that make a failure group temporarily unavailable are considered transient failures that can be recovered by the Oracle ASM fast mirror resync feature. For example, transient failures can be caused by disk path malfunctions, such as cable failures, host bus adapter failures, controller failures, or disk power supply interruptions.
Oracle ASM fast resync keeps track of pending changes to extents on an offline disk during an outage. The extents are resynced when the disk is brought back online.
By default, Oracle ASM drops a disk 3.6 hours after it is taken offline. You can set the DISK_REPAIR_TIME disk group attribute to delay the drop operation by specifying a time interval to repair the disk and bring it back online. The time can be specified in units of minutes (m or M) or hours (h or H). If you omit the unit, then the default unit is hours. The DISK_REPAIR_TIME disk group attribute can only be set with the ALTER DISKGROUP SQL statement and is only applicable to normal and high redundancy disk groups.
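To see which repair time is currently in effect for each disk group, the attribute can be read from V$ASM_ATTRIBUTE. This is a sketch rather than a prescribed procedure; it assumes a connection to the Oracle ASM instance and that the disk groups are mounted.

```sql
-- Show the disk_repair_time value for every mounted disk group.
SELECT dg.name AS disk_group,
       a.value AS disk_repair_time
  FROM V$ASM_DISKGROUP dg
  JOIN V$ASM_ATTRIBUTE a
    ON a.group_number = dg.group_number
 WHERE a.name = 'disk_repair_time';
```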
If the attribute is not set explicitly, then the default value (3.6h) applies to disks that have been set to OFFLINE mode without an explicit DROP AFTER clause. Disks taken offline due to I/O errors do not have a DROP AFTER clause.
The default DISK_REPAIR_TIME attribute value is an estimate that should be adequate for most environments. However, ensure that the attribute value is set to the amount of time that you think is necessary in your environment to fix any transient disk error, and during which you are able to tolerate reduced data redundancy.
The elapsed time (since the disk was set to OFFLINE mode) is incremented only when the disk group containing the offline disks is mounted. The REPAIR_TIMER column of V$ASM_DISK shows the amount of time left (in seconds) before an offline disk is dropped. After the specified time has elapsed, Oracle ASM drops the disk. You can override this attribute with the ALTER DISKGROUP OFFLINE DISK statement and the DROP AFTER clause.
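The remaining repair time described above can be monitored with a query such as the following. It is a minimal sketch that assumes a connection to the Oracle ASM instance; the filter on MODE_STATUS simply restricts the output to disks that are currently offline.

```sql
-- List offline disks and the seconds remaining before Oracle ASM drops them.
SELECT group_number,
       name,
       mode_status,     -- OFFLINE for disks awaiting repair
       repair_timer     -- seconds left before the disk is dropped
  FROM V$ASM_DISK
 WHERE mode_status = 'OFFLINE'
 ORDER BY repair_timer;
```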
If a disk is offlined by Oracle ASM because of an I/O (write) error or is explicitly offlined using the ALTER DISKGROUP ... OFFLINE statement without the DROP AFTER clause, then the value specified for the DISK_REPAIR_TIME attribute for the disk group is used.
Altering the DISK_REPAIR_TIME attribute has no effect on disks that are already offline. The new value is used for any disks that go offline after the attribute is updated. You can confirm this behavior by viewing the Oracle ASM alert log.
If an offline disk is taken offline for a second time, then the elapsed time is reset and restarted. If another time is specified with the DROP AFTER clause for this disk, the first value is overridden and the new value applies. A disk that is in OFFLINE mode cannot be dropped with an ALTER DISKGROUP DROP DISK statement; an error is returned if attempted. If the disk must be dropped before the repair time has expired (for example, because the disk cannot be repaired), it can be dropped immediately by issuing a second OFFLINE statement with a DROP AFTER clause specifying 0h or 0m.
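For instance, an immediate drop of a disk that is already offline might look like the following sketch. The disk group name data and disk name DATA_001 are the illustrative names used in the examples in this section, not required values.

```sql
-- Re-issue OFFLINE with a zero repair time so the already-offline disk
-- is dropped immediately instead of waiting for the timer to expire.
ALTER DISKGROUP data OFFLINE DISK DATA_001 DROP AFTER 0m;
```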
You can use ALTER DISKGROUP to set the DISK_REPAIR_TIME attribute to a specified hour or minute value, such as 4.5 hours or 270 minutes. For example:
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '4.5h';
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '270m';
After you repair the disk, run the SQL statement ALTER DISKGROUP ONLINE DISK. This statement brings the repaired disk back online to enable writes so that no new writes are missed. This statement also starts a procedure to copy all of the extents that are marked as stale on their redundant copies.
If a disk goes offline when the Oracle ASM instance is in rolling upgrade mode, the disk remains offline until the rolling upgrade has ended, and the timer for dropping the disk is stopped until the Oracle ASM cluster is out of rolling upgrade mode. See "Upgrading and Patching Oracle ASM".
Examples of taking disks offline and bringing them online follow.
The following example takes disk DATA_001 offline and drops it after five minutes:
ALTER DISKGROUP data OFFLINE DISK DATA_001 DROP AFTER 5m;
The next example takes the disk DATA_001 offline and drops it after the time period designated by DISK_REPAIR_TIME elapses:
ALTER DISKGROUP data OFFLINE DISK DATA_001;
This example takes all of the disks in failure group FG2 offline and drops them after the time period designated by DISK_REPAIR_TIME elapses. If you used a DROP AFTER clause, then the disks would be dropped after the specified time:
ALTER DISKGROUP data OFFLINE DISKS IN FAILGROUP FG2;
The next example brings all of the disks in failure group FG2 online:
ALTER DISKGROUP data ONLINE DISKS IN FAILGROUP FG2;
This example brings only disk DATA_001 online:
ALTER DISKGROUP data ONLINE DISK DATA_001;
This example brings all of the disks in disk group DATA online:
ALTER DISKGROUP data ONLINE ALL;
Querying the V$ASM_OPERATION view while you run ALTER DISKGROUP ONLINE statements displays the name and state of the current operation that you are performing. For example, the following SQL query shows values in the PASS column during an online operation.
SQL> SELECT GROUP_NUMBER, PASS, STATE FROM V$ASM_OPERATION;

GROUP_NUMBER PASS      STAT
------------ --------- ----
           1 RESYNC    RUN
           1 REBALANCE WAIT
           1 COMPACT   WAIT
An offline operation does not generate a display in a V$ASM_OPERATION view query.
You can set the FAILGROUP_REPAIR_TIME and CONTENT.TYPE disk group attributes. The FAILGROUP_REPAIR_TIME disk group attribute specifies a default repair time for the failure groups in the disk group. The CONTENT.TYPE disk group attribute specifies the type of data expected to be stored in a disk group. You can set these attributes with ASMCA, ASMCMD mkdg, or SQL CREATE and ALTER DISKGROUP statements. For information about disk group attributes, refer to "Managing Disk Group Attributes".
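With SQL, these two attributes can be set in the same way as DISK_REPAIR_TIME. The following sketch reuses the disk group name data from the earlier examples; the values '24h' and 'data' are assumptions chosen for illustration, so check "Managing Disk Group Attributes" for the values and compatibility settings that apply to your release.

```sql
-- Set the default repair time applied when an entire failure group goes offline.
ALTER DISKGROUP data SET ATTRIBUTE 'failgroup_repair_time' = '24h';

-- Declare the type of data the disk group is expected to hold.
ALTER DISKGROUP data SET ATTRIBUTE 'content.type' = 'data';
```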
The ASMCMD lsop command shows the resync time estimate. There are separate rows in the V$ASM_OPERATION view for different phases of rebalance: disk resync, rebalance, and data compaction.
The ASMCMD online command has a power option to specify the power for the online operation. The SQL ALTER DISKGROUP REPLACE DISK statement also has the power option.
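As an illustration of the power option on the SQL statement, a replace operation might be written as in the following sketch. The disk name DATA_001 is taken from the earlier examples, while the device path and the power value are hypothetical and must be replaced with values valid in your environment.

```sql
-- Replace disk DATA_001 with a new device and run the operation at power 3.
-- '/devices/disk_new' is a hypothetical path used only for illustration.
ALTER DISKGROUP data REPLACE DISK DATA_001 WITH '/devices/disk_new' POWER 3;
```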
The ASMCMD chdg command provides the replace option in addition to the add and drop tags. The ASMCMD mkdg command has an additional time parameter (-t) to specify the time to offline a failure group.
See Oracle Database SQL Language Reference for information about ALTER DISKGROUP and CREATE DISKGROUP.