A Troubleshooting the Oracle Grid Infrastructure Installation Process

This appendix provides troubleshooting information for installing Oracle Grid Infrastructure.

See Also:

Best Practices for Contacting Oracle Support
General Installation Issues
Interpreting CVU "Unknown" Output Messages Using Verbose Mode
Interpreting CVU Messages About Oracle Grid Infrastructure Setup
About the Oracle Clusterware Alert Log
Oracle Clusterware Install Actions Log Errors and Causes
Performing Cluster Diagnostics During Oracle Grid Infrastructure Installations
CVU Cluster Healthchecks Command Reference
Interconnect Configuration Issues
Storage Configuration Issues
Troubleshooting Windows Firewall Exceptions
Completing Failed or Incomplete Installations and Upgrades
Completing Failed or Interrupted Upgrades

A.1 Best Practices for Contacting Oracle Support

If you need to contact Oracle Support to report an issue, then Oracle recommends that you follow these guidelines when you enter your service request.

Provide a clear explanation of the problem, including exact error messages.
Provide an explanation of any steps you have taken to troubleshoot issues, and the results of these steps.
Provide exact releases (major release and patch release) of the affected software.
Provide a step-by-step procedure of what actions you carried out when you encountered the problem, so that Oracle Support can reproduce the problem.
Provide an evaluation of the effect of the issue, including affected deadlines and costs.
Provide screen shots, logs, Remote Diagnostic Agent (RDA) output, or other relevant information.

A.2 General Installation Issues

The following is a list of examples of types of errors that can occur during installation.

CLSRSC-444: Run root script on the Node with OUI session
Failure to start network or VIP resources when Microsoft Failover Cluster is installed
Nodes unavailable for selection from the OUI Node Selection screen
Node nodename is unreachable
PROT-8: Failed to import data from specified file to the cluster registry
Timed out waiting for the CRS stack to start

See Also:

For additional help in resolving error messages, see My Oracle Support. For example, the note with Doc ID 1367631.1 contains some of the most common installation issues for Oracle Grid Infrastructure and Oracle Clusterware.

CLSRSC-444: Run root script on the Node with OUI session

Cause: If this message appears listing a node that is not the one where you are running OUI, then the likely cause is that the named node shut down during or before the root script completed its run.

Action: Retry the installation from the first node. After you complete Oracle Grid Infrastructure on all or part of the set of planned cluster member nodes, start OUI and deinstall the failed Oracle Grid Infrastructure installation on the node named in the error. When you have deinstalled the failed installation on the node, add that node manually to the cluster.

See Also:
Oracle Clusterware Administration and Deployment Guide for information about how to add a node

Failure to start network or VIP resources when Microsoft Failover Cluster is installed

Cause: If Microsoft Failover Cluster (MSFC) is installed on a Windows Server 2008 cluster (even if it is not configured) and you attempt to install Oracle Grid Infrastructure, the installation fails during the 'Configuring Grid Infrastructure' phase with an indication that it was unable to start the resource ora.net1.network or the VIP resources.

When MSFC is installed, it creates a virtual network adapter and places it at the top of the binding order. This change in the binding order can only be seen in the registry; it is not visible through 'View Network Connections' under Server Manager.

Action: The only solution is not to install MSFC and Oracle Grid Infrastructure or Oracle Clusterware on the same Windows Server 2008 cluster.

Nodes unavailable for selection from the OUI Node Selection screen

Cause: Oracle Grid Infrastructure is either not installed, or the Oracle Grid Infrastructure services are not up and running.

Action: Install Oracle Grid Infrastructure, or review the status of your installation. Consider restarting the nodes, as doing so may resolve the problem.

Node nodename is unreachable

Cause: Unavailable IP host

Action: Attempt the following:

At a command prompt, run the command ipconfig /all. Compare the output of this command with the contents of the hosts file to ensure that the node IP is listed.
Run the operating system command nslookup to see if the host is reachable.
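The two checks above can be approximated in a script. The following is a minimal sketch, not an Oracle-supplied tool: the function names, the node names, and the 192.0.2.x addresses are hypothetical, and the resolver call is only a rough analogue of nslookup because it also consults the local hosts file.

```python
import socket

def parse_hosts(text):
    """Return {hostname: ip} for every entry in hosts-file text."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            entries[name.lower()] = ip
    return entries

def resolves(hostname):
    """True if the operating system resolver can resolve hostname."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False

# Hypothetical hosts-file content; addresses are from a documentation range.
sample = """
# sample hosts file
192.0.2.11  node1 node1.example.com
192.0.2.12  node2   # second node
"""
hosts = parse_hosts(sample)
print(hosts.get("node2"))
```

To use it against a real system, read the node's hosts file into a string, pass it to parse_hosts, and compare the result with the addresses reported by ipconfig /all.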

PROT-8: Failed to import data from specified file to the cluster registry

Cause: Insufficient space in an existing Oracle Cluster Registry device partition, which causes a migration failure while performing an upgrade. To confirm, look for the error "utopen:12:Not enough space in the backing store" in the log file %GRID_HOME%\log\hostname\client\ocrconfig_pid.log, where pid stands for the process ID.

Action: Identify a storage device that has 400 MB or more available space. Oracle recommends that you allocate the entire disk to Oracle ASM.

Timed out waiting for the CRS stack to start

Cause: If a configuration issue prevents the Oracle Grid Infrastructure software from installing successfully on all nodes, then you may see error messages such as "Timed out waiting for the CRS stack to start," or you may notice that Oracle Clusterware-managed resources were not created on some nodes after you exit the installer. You also may notice that resources have a status other than ONLINE.

Action: Unconfigure the Oracle Grid Infrastructure installation without removing binaries, and review log files to determine the cause of the configuration issue. After you have fixed the configuration issue, rerun the scripts used during installation to configure Oracle Clusterware.

See Also:
Section 9.4, "Unconfiguring Oracle Clusterware Without Removing the Software"

A.2.1 Other Installation Issues and Errors

For additional help in resolving error messages, see My Oracle Support.

For example, the note with Doc ID 1367631.1 contains some of the most common installation issues for Oracle Grid Infrastructure and Oracle Clusterware.

A.3 Interpreting CVU "Unknown" Output Messages Using Verbose Mode

If you run Cluster Verification Utility (CVU) using the -verbose argument, and a Cluster Verification Utility command responds with UNKNOWN for a particular node, then this is because Cluster Verification Utility cannot determine if a check passed or failed.

Possible causes for an "Unknown" response include:

The node is down
Common operating system command binaries required by Cluster Verification Utility are missing in the bin directory of the Oracle Grid Infrastructure home or Oracle home directory
The user account starting Cluster Verification Utility does not have privileges to run common operating system commands on the node
The node is missing an operating system patch, or a required package

A.4 Interpreting CVU Messages About Oracle Grid Infrastructure Setup

If the Cluster Verification Utility report indicates that your system fails to meet the requirements for Oracle Grid Infrastructure installation, then use the topics in this section to correct the problem or problems indicated in the report, and run Cluster Verification Utility again.

User Equivalence Check Failed

Cause: Failure to establish user equivalency across all nodes. This can be due to not creating the required users, the installation user not being the same on all nodes, or using a different password on the failed nodes.

Action: Cluster Verification Utility provides a list of nodes on which user equivalence failed. For each node listed as a failure node, review the Oracle Installation user configuration to ensure that the user configuration is properly completed, and that user equivalence is properly completed.

Check to ensure that:

You are using a Domain user account that has been granted explicit membership in the Administrators group on each cluster node.
The user account has the same password on each node.
The domain for the user is the same on each node.
The user account has administrative privileges on each node.
The user can connect to the registry of each node from the local node.
You might have to change the User Account Control settings on each node:
- Change the elevation prompt behavior for administrators to "Elevate without prompting". See http://technet.microsoft.com/en-us/library/cc709691.aspx
- Confirm that the Administrators group is listed under 'Manage auditing and security log'.

Node Reachability Check or Node Connectivity Check Failed

Cause: One or more nodes in the cluster cannot be reached using TCP/IP protocol, through either the public or private interconnects.

Action: Use the command ping address to check each node address. When you find an address that cannot be reached, check your list of public and private addresses to ensure that you have them correctly configured. Ensure that the public and private network interfaces have the same interface names on each node of your cluster.

Do not use the names PUBLIC and PRIVATE (all capital letters) for your public and interconnect network adapters (NICs). You can use the variations of private, Private, public, and Public for the network interface names.
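The naming rules above lend themselves to a quick scripted check. The following is a minimal sketch under stated assumptions: the function names and the sample interface names are hypothetical, and you would collect the real interface names from each node yourself.

```python
RESERVED = {"PUBLIC", "PRIVATE"}  # all-capital names that must not be used

def bad_interface_names(names):
    """Return the names that are exactly PUBLIC or PRIVATE in all capitals."""
    return [n for n in names if n in RESERVED]

def same_names_on_all_nodes(names_by_node):
    """True if every node reports the same set of interface names."""
    sets = [set(v) for v in names_by_node.values()]
    return all(s == sets[0] for s in sets)

# Hypothetical inventory of interface names per node.
inventory = {
    "node1": ["Public", "Private"],
    "node2": ["Public", "PRIVATE"],
}
for node, names in inventory.items():
    print(node, bad_interface_names(names))
print(same_names_on_all_nodes(inventory))
```

The comparison is case-sensitive on purpose, because the text permits variations such as Public and private while disallowing the all-capital forms.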

A.5 About the Oracle Clusterware Alert Log

Oracle Clusterware uses Oracle Database fault diagnosability infrastructure to manage diagnostic data and its alert log. As a result, most diagnostic data resides in the Automatic Diagnostic Repository (ADR), a collection of directories and files located under a base directory that you specify during installation.

Starting with Oracle Clusterware 12c release 1 (12.1.0.2), diagnostic data files written by Oracle Clusterware programs are known as trace files (have a .trc file extension), and appear together in the trace subdirectory of the ADR home. Besides trace files, the trace subdirectory in the Oracle Clusterware ADR home contains the simple text Oracle Clusterware alert log. The Oracle Clusterware alert log always has the name alert.log. The Oracle Clusterware alert log is also written as an XML file in the alert subdirectory of the ADR home, but the text alert log is most easily read.

The Oracle Clusterware alert log is the first place to look for serious errors. In the event of an error, it can contain path information to diagnostic logs that can provide specific information about the cause of errors.

After installation, Oracle Clusterware posts alert messages when important events occur. For example, you may see alert messages from the Cluster Ready Services daemon process (CRSD) when it starts, if it aborts, if the failover process fails, or if automatic restart of an Oracle Clusterware resource fails.

Oracle Enterprise Manager monitors the Oracle Clusterware alert log and posts an alert on the Cluster Home page if an error is detected. For example, if a voting file is not available, then a CRS-1604 error is raised, and a critical alert is posted on the Cluster Home page of Oracle Enterprise Manager. You can customize the error detection and alert settings on the Metric and Policy Settings page.

The location of the Oracle Clusterware log file is ORACLE_BASE\diag\crs\hostname\crs\trace\alert.log, where ORACLE_BASE is the directory in which Oracle Clusterware was installed and hostname is the host name of the local node.
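As an illustration of how the path is assembled, the following sketch builds the alert log location from its two variable parts. The function name, the sample Oracle base C:\app\grid, and the host name node1 are assumptions made for the example.

```python
from pathlib import PureWindowsPath

def clusterware_alert_log(oracle_base, hostname):
    """Build ORACLE_BASE\\diag\\crs\\hostname\\crs\\trace\\alert.log."""
    return PureWindowsPath(
        oracle_base, "diag", "crs", hostname, "crs", "trace", "alert.log"
    )

# Hypothetical Oracle base and host name.
print(clusterware_alert_log(r"C:\app\grid", "node1"))
```

PureWindowsPath is used so that the sketch produces Windows-style separators regardless of the platform on which it runs.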

See Also:

Oracle Clusterware Administration and Deployment Guide for information about Oracle Clusterware troubleshooting
Oracle Database Utilities for information about the Automatic Diagnostic Repository Command Interpreter (ADRCI) utility to manage Oracle Database diagnostic data
Oracle Database Administrator's Guide for information on the Automatic Diagnostic Repository (ADR)

A.6 Oracle Clusterware Install Actions Log Errors and Causes

During installation of the Oracle Grid Infrastructure software, a log file named installActions<Date_Timestamp>.log is written to the %TEMP%\OraInstall<Date_Timestamp> directory.

The following is a list of potential errors in the installActions.log:

PRIF-10: failed to initialize the cluster registry

Configuration assistant "Oracle Private Interconnect Configuration Assistant" failed
KFOD-0311: Error scanning device device_path_name
Step 1: checking status of Oracle Clusterware cluster

Step 2: configuring OCR repository

ignoring upgrade failure of ocr(-1073740972)

failed to configure Oracle Cluster Registry with CLSCFG, ret -1073740972
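When the install actions log is long, a small script can locate the messages listed above. This is a minimal sketch: the function name and the sample log lines are hypothetical, and the signatures are taken only from the messages in this section.

```python
SIGNATURES = (
    "PRIF-10",
    "KFOD-0311",
    'Configuration assistant "Oracle Private Interconnect Configuration Assistant" failed',
    "-1073740972",
)

def find_known_errors(log_text):
    """Return (line_number, line) pairs that contain a known signature."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(sig in line for sig in SIGNATURES):
            hits.append((number, line.strip()))
    return hits

# Hypothetical log excerpt, for illustration only.
sample_log = """INFO: Step 1: checking status of Oracle Clusterware cluster
INFO: Step 2: configuring OCR repository
INFO: ignoring upgrade failure of ocr(-1073740972)
INFO: PRIF-10: failed to initialize the cluster registry
INFO: unrelated message
"""
for number, line in find_known_errors(sample_log):
    print(number, line)
```

To scan a real log, read the installActions log file into a string and pass it to find_known_errors.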

Each of these error messages can be caused by one of the following issues:

A.6.1 Symbolic links for disks were not removed

When you stamp a disk with ASMTOOL, it creates symbolic links for the disks.

If these links are not removed when the disk is deleted or reconfigured, then errors can occur when attempting to access the disks.

To correct the problem, you can try stamping the disks again with ASMTOOL.

A.6.2 Discovery string used by Oracle Automatic Storage Management is incorrect

When specifying Oracle Automatic Storage Management (Oracle ASM) for storage, you have the option of changing the default discovery string used to locate the disks.

If the discovery string is set incorrectly, Oracle ASM will not be able to locate the disks.

A.6.3 You used a period in a node name during Oracle Clusterware install

Periods (.) are not permitted in node names. Instead, use a hyphen (-).

To resolve a failed installation, remove traces of the Oracle Grid Infrastructure installation, and reinstall with a supported node name.
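A planned list of node names can be checked for periods before you start the installer. The following is a small sketch; the function name and the sample node names are hypothetical.

```python
def check_node_names(names):
    """Map each node name containing a period to a hyphenated alternative."""
    return {n: n.replace(".", "-") for n in names if "." in n}

# Hypothetical node names.
print(check_node_names(["rac-node1", "rac.node2"]))
```

An empty result means that no name in the list contains a period.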

A.6.4 Ignoring upgrade failure of ocr(-1073740972)

This error indicates that the user that is performing the installation does not have Administrator privileges.

A.7 Performing Cluster Diagnostics During Oracle Grid Infrastructure Installations

If the installer does not display the Node Selection page, then use cluvfy to check the integrity of the Cluster Manager.

Use the following command syntax to check the integrity of the Cluster Manager:
```
cluvfy comp clumgr -n node_list -verbose
```
In the preceding syntax example, the variable node_list is the list of nodes in your cluster, separated by commas.
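If you generate this command from a script, the comma-separated node list can be built as follows. This is a sketch only: the function name and node names are hypothetical, and the arguments are limited to those shown in the syntax above.

```python
def clumgr_command(nodes):
    """Build the argument list for 'cluvfy comp clumgr' with a node list."""
    if not nodes:
        raise ValueError("at least one node is required")
    return ["cluvfy", "comp", "clumgr", "-n", ",".join(nodes), "-verbose"]

# Hypothetical node names.
print(" ".join(clumgr_command(["node1", "node2", "node3"])))
```

The list form can be passed to a process launcher such as subprocess.run on a system where cluvfy is on the path.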

Note:

If you encounter unexplained installation errors during or after a period when scheduled tasks are run, then your scheduled task may have deleted temporary files before the installation is finished. Oracle recommends that you complete the installation before scheduled tasks are run, or disable scheduled tasks that perform cleanup until after the installation is completed.

A.8 CVU Cluster Healthchecks Command Reference

Starting with Oracle Grid Infrastructure 11g Release 2 (11.2.0.3), you can use the CVU healthcheck command option to check your Oracle Clusterware and Oracle Database installations for compliance with mandatory requirements and best practice guidelines, and to verify that they are functioning properly.

Syntax

cluvfy comp healthcheck [-collect {cluster|database}] [-db db_unique_name] 
[-bestpractice|-mandatory] [-deviations] [-html] [-save [-savedir directory_path]]

Example

C:\> cd app\12.1.0\grid\cvu_home\bin
C:\..\bin> cluvfy comp healthcheck -collect cluster -bestpractice -deviations
 -html

Command Options

-collect {cluster|database}

Use this option to specify that you want to perform checks for Oracle Clusterware (cluster) or Oracle Database (database). If you do not use the collect option with the healthcheck command, then the cluvfy comp healthcheck command performs checks for both Oracle Clusterware and Oracle Database.
-db db_unique_name

Use this option to specify checks on the database unique name that you enter after the db option.

CVU uses JDBC to connect to the database as the user cvusys to verify various database parameters. For this reason, if you want checks to be performed for the database you specify with the -db option, then you must first create the cvusys user on that database, and grant that user the CVU-specific role, CVUSAPP. You must also grant members of the CVUSAPP role SELECT permissions on system tables.

A SQL script, Grid_home\cv\admin\cvusys.sql, is provided to facilitate the creation of this user. Use this SQL script to create the cvusys user on all the databases that you want to verify using CVU.

If you use the db option but do not provide a database unique name, then CVU discovers all the Oracle Databases on the cluster. To perform best practices checks on these databases, you must create the cvusys user on each database, and grant that user the CVUSAPP role with the SELECT privileges needed to perform the best practice checks.
[-bestpractice | -mandatory] [-deviations]

Use the bestpractice option to specify best practice checks, and the mandatory option to specify mandatory checks. Add the deviations option to specify that you want to see only the deviations from either the best practice recommendations or the mandatory requirements. You can specify either the -bestpractice or the -mandatory option, but not both. If you specify neither -bestpractice nor -mandatory, then both best practices and mandatory requirements are displayed.
-html

Use the html option to generate a detailed report in HTML format.

If you specify the html option, and a browser CVU recognizes is available on the system, then the browser is started and the report is displayed on the browser when the checks are complete.

If you do not specify the html option, then the detailed report is generated in a text file.
-save [-savedir dir_path]

Use the save or -save -savedir flags to save validation reports (cvucheckreport_timestamp.txt and cvucheckreport_timestamp.htm), where timestamp is the time and date of the validation report.

If you use the save option by itself, then the reports are saved in the path CVU_home/cv/report, where CVU_home is the location of the CVU binaries.

If you use the flags -save -savedir, and enter a path where you want the CVU reports saved, then the CVU reports are saved in the path you specify.
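The option rules described above can be captured in a small command builder. This is a sketch, not part of CVU: the function and parameter names are hypothetical, it emits only the options documented in this section, and treating an empty db value as a bare -db is an assumption of the example.

```python
def healthcheck_command(collect=None, db=None, bestpractice=False,
                        mandatory=False, deviations=False, html=False,
                        save=False, savedir=None):
    """Build the argument list for 'cluvfy comp healthcheck'."""
    if bestpractice and mandatory:
        raise ValueError("specify -bestpractice or -mandatory, not both")
    if collect not in (None, "cluster", "database"):
        raise ValueError("collect must be 'cluster' or 'database'")
    if savedir is not None and not save:
        raise ValueError("-savedir is only valid together with -save")

    cmd = ["cluvfy", "comp", "healthcheck"]
    if collect is not None:
        cmd += ["-collect", collect]
    if db is not None:
        # An empty string means a bare -db, which lets CVU discover databases.
        cmd += ["-db", db] if db else ["-db"]
    if bestpractice:
        cmd.append("-bestpractice")
    if mandatory:
        cmd.append("-mandatory")
    if deviations:
        cmd.append("-deviations")
    if html:
        cmd.append("-html")
    if save:
        cmd.append("-save")
        if savedir is not None:
            cmd += ["-savedir", savedir]
    return cmd

print(" ".join(healthcheck_command(collect="cluster", bestpractice=True,
                                   deviations=True, html=True)))
```

The printed command corresponds to the example earlier in this section. Rejecting invalid combinations before the command is run makes mistakes visible earlier than waiting for CVU to report them.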

A.9 Interconnect Configuration Issues

If the interconnect is not configured correctly, it can lead to errors or availability issues.

If you plan to use multiple network interface cards (NICs) for the interconnect, then you should use a third party solution to bond the interfaces at the operating system level. Otherwise, the failure of a single NIC will affect the availability of the cluster node.

If you install Oracle Grid Infrastructure and Oracle RAC, then they must use the same bonded NIC cards or teamed NIC cards for the interconnect. If you use bonded or teamed NIC cards, then they must be on the same subnet.

If you encounter errors, then perform the following system checks:

Verify with your network providers that they are using the correct cables (length, type) and software on their switches. In some cases, to avoid bugs that cause disconnects under loads, or to support additional features such as Jumbo Frames, you may need a firmware upgrade on interconnect switches, or you may need newer NIC driver or firmware at the operating system level. Running without such fixes can cause later instabilities to Oracle RAC databases, even though the initial installation seems to work.
Review virtual local area network (VLAN) configurations, duplex settings, and auto-negotiation in accordance with vendor and Oracle recommendations.

A.10 Storage Configuration Issues

The following is a list of issues involving storage configuration:

Recovering from Losing a Node File System or Grid Home
Oracle ASM Storage Issues
Oracle ASM Issues After Downgrading Oracle Grid Infrastructure for Standalone Server (Oracle Restart)

A.10.1 Recovering from Losing a Node File System or Grid Home

If you remove a file system by mistake, or encounter another storage configuration issue that results in losing the Oracle Local Registry or otherwise corrupting a node, you can recover the node in one of two ways:

Restore the node from an operating system level backup (preferred)
Remove the node from the cluster, and then add the node to the cluster, using Grid home/addnode/addnode.bat. Profile information for the cluster is copied to the node, and the node is restored.

If you add nodes in a GNS configuration, then that is called Grid Plug and Play (GPnP). GPnP uses profiles to configure nodes, which eliminates configuration data requirements for nodes and the need for explicit add and delete nodes steps. GPnP allows a system administrator to take a template system image and run it on a new node with no further configuration. GPnP removes many manual operations, reduces the opportunity for errors, and encourages configurations that can be changed easily. Removal of individual node configuration makes the nodes easier to replace, because nodes do not need to contain individually-managed states.

Grid Plug and Play reduces the cost of installing, configuring, and managing database nodes by making their state disposable. It allows nodes to be easily replaced with regenerated state.

You must run the addNode.bat command as an Administrator user on the node that you are restoring, to recreate OCR keys and to perform other configuration tasks. You initiate recovery of a node using the addnode command, similar to the following, where lostnode is the node that you are adding back to the cluster:
1. If you are using Grid Naming Service (GNS):
```
C:\Grid_home\addnode\bin> addNode.bat -silent "CLUSTER_NEW_NODES=lostnode"
```
2. If you are not using GNS:
```
C:\Grid_home\addnode\bin> addNode.bat -silent "CLUSTER_NEW_NODES={lostnode}" 
"CLUSTER_NEW_VIRTUAL_HOSTNAMES={lostnode-vip}"
```

Using addnode.bat enables cluster nodes to be removed and added again, so that they can be restored from the remaining nodes in the cluster.

After the addNode.bat command finishes, run the following command on the node being added to the cluster:

C:\> Grid_home\crs\config\gridconfig.bat

See Also:

Oracle Clusterware Administration and Deployment Guide for information about how to add nodes manually or with GNS

A.10.2 Oracle ASM Storage Issues

This section describes Oracle ASM storage error messages, and how to address these errors.

ASM-0001: could not open device \\?\volume... O/S-Error: (OS-5) Access is denied.

Cause: User Account Control (UAC) can require administrators to specifically approve administrative actions or applications before they are allowed to run. If you do not supply the proper credentials, the asmtool and asmtoolg utilities report these errors.

Action: There are a few ways to resolve this problem:

Click Continue in the UAC dialog box if you are logged in as an administrative user, or provide the credentials for an administrator user, then click Continue.
Create a desktop shortcut to a command window. Open the command window using the Run as Administrator option, then right-click the context menu, and launch asmtool.
Configure the UAC implementation on your Windows Server to turn off UAC or to change the elevation prompt behavior for administrator users.

Note:

For information about managing security and UAC in a business or enterprise environment, see the User Account Control paper at http://technet.microsoft.com/en-us/library/cc731416(WS.10).aspx.

O/S-Error: (OS-2) The system cannot find the file specified.

Cause: If a disk is disabled at the operating system level and enabled again, some of the Oracle ASM operations such as CREATE DISKGROUP, MOUNT DISKGROUP, ADD DISK, ONLINE DISK, or querying V$ASM_DISK fail with the error:
OS Error: (OS-2) The system cannot find the file specified.

This happens when a previously mounted disk is assigned a new volume ID by the operating system. When Oracle ASM uses the old volume ID, it fails to open the disk and signals the above error.

Action: Use ASMTOOL to restamp the disk and update the volume ID used by Oracle ASM.

Unable to mount disk group; ASM discovered an insufficient number of disks for diskgroup

Cause: You performed an Oracle Grid Infrastructure software-only installation, and want to configure a disk group for storing the OCR and voting files during the postinstallation configuration of Oracle Clusterware. You used ASMTOOL to stamp the disks, and then used ASMCMD to create a disk group using the stamped disks. You then updated a crsconfig_params file with the disk device or disk partition names that constitute the Oracle ASM disk group. During the configuration of Oracle Clusterware, errors are displayed such as ORA-15017: diskgroup "DATA" cannot be mounted.

Action: Change the crsconfig_params file to use the stamped names generated by ASMTOOL instead of the disk partition names, for example:

"\\.\ORCLDISKDATA0"

A.10.3 Oracle ASM Issues After Downgrading Oracle Grid Infrastructure for Standalone Server (Oracle Restart)

The following section explains an error that can occur when you downgrade Oracle Grid Infrastructure for standalone server (Oracle Restart), and how to address it.

CRS-2529: Unable to act on 'ora.cssd' because that would require stopping or relocating 'ora.asm'

Cause: After downgrading Oracle Grid Infrastructure for a standalone server (Oracle Restart) from 12.1.0.2 to 12.1.0.1, the ora.asm resource does not contain the Server Parameter File (SPFILE) parameter.

Action: When you downgrade Oracle Grid Infrastructure for a standalone server (Oracle Restart) from 12.1.0.2 to 12.1.0.1, you must explicitly add the Server Parameter File (SPFILE) from the ora.asm resource when adding the Oracle ASM resource for 12.1.0.1.

Follow these steps when you downgrade Oracle Restart from 12.1.0.2 to 12.1.0.1:

1. In your 12.1.0.2 Oracle Restart installed configuration, query the SPFILE parameter from the Oracle ASM resource (ora.asm) and record it:
```
srvctl config asm
```

2. Deconfigure the 12.1.0.2 release Oracle Restart:

Grid_home/crs/install/roothas.bat -deconfig -force

3. Install the 12.1.0.1 release Oracle Restart by running root.sh:
```
Grid_home/root.sh
```
4. Add the listener resource:
```
Grid_home/bin/srvctl add LISTENER
```
5. Add the Oracle ASM resource and provide the SPFILE parameter for the 12.1.0.2 Oracle Restart configuration obtained in Step 1:
```
Grid_home/bin/srvctl add asm [-spfile <spfile>]
 [-diskstring <asm_diskstring>]
```

See Also:

Oracle Database Installation Guide for your platform for information about installing and deconfiguring Oracle Restart

A.11 Troubleshooting Windows Firewall Exceptions

If you cannot establish certain connections even after granting exceptions to the executable files, then follow these steps to troubleshoot the installation:

Examine Oracle configuration files (such as *.conf files), the Oracle key in the Windows registry, and network configuration files in %ORACLE_HOME%\network\admin.
Grant an exception in the Windows Firewall to any executable listed in %ORACLE_HOME%\network\admin\listener.ora in a PROGRAM= clause.

Each of these executables must be granted an exception in the Windows Firewall because a connection can be made through the TNS listener to that executable.
Examine Oracle trace files, log files, and other sources of diagnostic information for details on failed connection attempts.

Log and trace files on the database client computer may contain useful error codes or troubleshooting information for failed connection attempts. The Windows Firewall log file on the server may contain useful information as well.
If the preceding troubleshooting steps do not resolve a specific configuration issue on Windows, then provide the output from the following command to Oracle Support for diagnosis and problem resolution:
```
netsh firewall show state verbose=enable
```
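To find the executables named in PROGRAM= clauses, as described in the steps above, you can search the listener.ora text with a regular expression. The following is a minimal sketch: the function name and the sample listener.ora content are hypothetical, and the pattern assumes that each clause is written as a parenthesized PROGRAM = value pair.

```python
import re

# Matches clauses such as (PROGRAM = extproc), ignoring case and spacing.
PROGRAM_RE = re.compile(r"\(\s*PROGRAM\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE)

def listener_programs(listener_ora_text):
    """Return the distinct executables named in PROGRAM= clauses."""
    return sorted(set(PROGRAM_RE.findall(listener_ora_text)))

# Hypothetical listener.ora excerpt, for illustration only.
sample = r"""
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = C:\app\oracle\product\12.1.0\dbhome_1)
      (PROGRAM = extproc)
    )
  )
"""
print(listener_programs(sample))
```

Each name returned is a candidate for a Windows Firewall exception; confirm the full path of the executable before you grant one.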

See Also:

Section 8.1.3, "Configure Exceptions for the Windows Firewall"
http://www.microsoft.com/downloads/details.aspx?FamilyID=a7628646-131d-4617-bf68-f0532d8db131&displaylang=en for information on Windows Firewall troubleshooting
http://support.microsoft.com/default.aspx?scid=kb;en-us;875357 for more information on Windows Firewall configuration

A.12 Completing Failed or Incomplete Installations and Upgrades

Even if the installation or upgrade fails initially, you can take steps to complete the operation.

About Failed or Incomplete Installations and Upgrades
Completing Failed or Interrupted Upgrades
Completing Failed or Interrupted Installations

A.12.1 About Failed or Incomplete Installations and Upgrades

During installations or upgrades of Oracle Grid Infrastructure, the following actions take place:

Oracle Universal Installer (OUI) accepts inputs to configure Oracle Grid Infrastructure software on your system.
OUI runs the gridconfig.bat script on each node.
OUI runs configuration assistants. The Oracle Grid Infrastructure software installation completes successfully.

If OUI exits before the gridconfig.bat script runs, or if OUI exits before the installation or upgrade session completes successfully, then the Oracle Grid Infrastructure installation is incomplete. If your installation or upgrade does not complete, then Oracle Clusterware does not work correctly. If you are performing an upgrade, then an incomplete upgrade can result in some nodes being upgraded to the latest software and other nodes not being upgraded at all. If you are performing an installation, then an incomplete installation can result in some nodes not being a part of the cluster.

Additionally, with Oracle Grid Infrastructure release 11.2.0.3 or later releases, the following messages may be seen during installation or upgrade:

ACFS-9427 Failed to unload ADVM/ACFS drivers. A system reboot is recommended

ACFS-9428 Failed to load ADVM/ACFS drivers. A system reboot is recommended

CLSRSC-400: A system reboot is required to continue installing

To resolve these errors, you must reboot the server, and then follow the steps for completing an incomplete installation or upgrade.

A.12.2 Completing Failed or Interrupted Upgrades

If OUI exits on the node from which you started the installation (the first node), or the node reboots before you confirm that the gridconfig.bat script was run on all cluster nodes, then the upgrade remains incomplete.

In an incomplete upgrade, configuration assistants still need to run, and the new Grid home still needs to be marked as active in the central Oracle inventory. You must complete the upgrade on the affected nodes manually.

Continuing Upgrade When Upgrade Fails on the First Node
Continuing Upgrade When Upgrade Fails on Nodes Other Than the First Node

A.12.2.1 Continuing Upgrade When Upgrade Fails on the First Node

When the first node cannot be upgraded, use these steps to continue the upgrade process.

If the OUI failure indicated a need to reboot by raising error message CLSRSC-400, then reboot the first node (the node on which you started the upgrade). Otherwise, manually fix or clear the error condition, as reported in the error output.
Complete the upgrade of all other nodes in the cluster.
Configure a response file, and provide passwords for the upgrade.

See Section B.5, "Postinstallation Configuration Using Response Files" for information about how to create the response file.
To complete the upgrade, log in as the Oracle Installation user for Oracle Grid Infrastructure and run the script configToolAllCommands, located in the path Grid_home\cfgtoollogs\configToolAllCommands, specifying the response file that you created.

For example, if the response file is named gridinstall.rsp:
```
[C:\] cd app\12.1.0\grid\cfgtoollogs
[C:\..\cfgtoollogs] configToolAllCommands RESPONSE_FILE=gridinstall.rsp
```

A.12.2.2 Continuing Upgrade When Upgrade Fails on Nodes Other Than the First Node

For nodes other than the first node (the node on which you started the upgrade), use these steps to continue the upgrade process.

If the OUI failure indicated a need to reboot, by raising error message CLSRSC-400, then reboot the node with the error condition. Otherwise, manually fix or clear the error condition that was reported in the error output.
On the first node, within OUI, click Retry.

This instructs OUI to retry the upgrade on the affected node.
Continue the upgrade from the OUI instance on the first node.

A.12.3 Completing Failed or Interrupted Installations

If OUI exits on the node from which you started the installation (the first node), or the node reboots before you confirm that the gridconfig.bat script was run on all nodes, then the installation remains incomplete.

In an incomplete installation, configuration assistants still need to run, and the new Grid home still needs to be marked as active in the central Oracle inventory. You must complete the installation on the affected nodes manually.

Continuing Incomplete Installations on First Nodes
Continuing Installations on Nodes Other Than the First Node

A.12.3.1 Continuing Incomplete Installations on First Nodes

To continue an incomplete installation, the first node must finish before the rest of the clustered nodes.

If the OUI failure indicated a need to reboot, by raising error message CLSRSC-400, then reboot the first node (the node where the installation was started). Otherwise, manually fix or clear the error condition that was reported in the error output.
If necessary, log in as the Oracle Installation user for Oracle Grid Infrastructure. Change directory to the Grid home on the first node and run the gridconfig.bat script on that node again.

For example:
```
[C:\] cd app\12.1.0\grid\crs\config\
[C:\..\config] gridconfig.bat
```
Complete the installation on all other nodes.
Configure a response file, and provide passwords for the installation.

See Section B.5, "Postinstallation Configuration Using Response Files" for information about how to create the response file.
To complete the installation, log in as the Oracle Installation user for Oracle Grid Infrastructure, and run the script configToolAllCommands, located in the path Grid_home\cfgtoollogs\configToolAllCommands, specifying the response file that you created.

For example, if the response file is named gridinstall.rsp:
```
[C:\] cd app\12.1.0\grid\cfgtoollogs
[C:\..\cfgtoollogs] configToolAllCommands RESPONSE_FILE=gridinstall.rsp
```

A.12.3.2 Continuing Installations on Nodes Other Than the First Node

For nodes other than the first node (the node on which you started the installation), use these steps to continue the installation process.

If the OUI failure indicated a need to reboot, by raising error message CLSRSC-400, then reboot the node with the error condition. Otherwise, manually fix or clear the error condition that was reported in the error output.
On the first node, within OUI, click Retry.
Continue the installation from the OUI instance on the first node.