10 Client Failover Best Practices for Highly Available Oracle Databases

This section describes Oracle Database 12c configuration best practices to automatically transition application connections from a failed primary database to a new primary database after a Data Guard / Active Data Guard role transition has occurred. This configuration also applies to client's connection to Real Application Clusters databases. In addition it also describes best practices for Application Continuity and Transaction Guard, new with Oracle Database 12c.

At a high level, automating client failover in a Data Guard configuration includes relocating database services to the new primary database as part of a Data Guard failover, notifying clients that a failure has occurred to break them out of TCP timeout, and redirecting clients to the new primary database.

The sections below describe how to create role-based database services for both OCI and JDBC applications in a Data Guard configuration. Subsequent sections provide detailed configuration steps for enabling OCI and JDBC clients to receive FAN notifications and reconnect to a new primary database.

Types of Failures

Unplanned failures of an Oracle Database instance fall into these general categories:

A server failure or other fault that causes the crash of an individual Oracle instance in an Oracle RAC database. To maintain availability, application clients connected to the failed instance must quickly be notified of the failure and immediately establish a new connection to the surviving instances of the Oracle RAC database.
A complete-site failure that results in both the application and database tiers being unavailable. To maintain availability users must be redirected to a secondary site that hosts a redundant application tier and a synchronized copy of the production database.
A partial-site failure where the primary database, a single-instance database, or all nodes in an Oracle RAC database become unavailable but the application tier at the primary site remains intact.

Configure Fast Connection Failover as a best practice to fully benefit from fast instance and database failover and switchover with Oracle RAC and Oracle Data Guard. Fast Connection Failover enables clients, mid-tier applications, or any program that connects directly to a database to failover quickly and seamlessly to an available database service when a database service becomes unavailable.

This chapter contains the following topics:

Automating Client Failover - JDBC, OCI, and ODP.Net
Configuring Oracle RAC Databases for Failover
Configuring the Oracle Data Guard Environment
Client Transition During Switchover Operations
Preventing Login Storms
Configuring Global Data Services

See Also:

The MAA white paper "Client Failover Best Practices for Highly Available Oracle Databases: Oracle Database 12c from the MAA Best Practices area for Oracle Database at
http://www.oracle.com/goto/maa
"Application High Availability with Services and FAN" in Oracle Database Administrator's Guide

10.1 Automating Client Failover - JDBC, OCI, and ODP.Net

You can enable OCI and JDBC, and ODP.Net application clients to receive FAN notifications and quickly reconnect to a new primary database. The configuration best practices to enable fast connection failover differ depending on the client type.

Configuring Fast Connection Failover for JDBC Clients
Configuring Application Continuity
Configuring Fast Connection Failover for OCI Clients
Configuring Automatic Failover for ODP.Net Clients

See Also:

Oracle Database Administrator's Guide for more information about Enabling Fast Connection Failover for Oracle Call Interface Clients
Oracle Call Interface Programmer's Guide for more information about Transparent Application Failover in OCI
"Client Failover Best Practices for Highly Available Oracle Databases - Oracle Database 12c" at http://www.oracle.com/technetwork/database/availability/client-failover-2280805.pdf
"Client Failover Best Practices for Data Guard 11g Release 2" at http://www.oracle.com/technetwork/database/features/availability/maa-wp-11gr2-client-failover-173305.pdf

10.1.1 Configuring Fast Connection Failover for JDBC Clients

Prerequisites:

The Universal Connection Pool (UCP) is enabled (UCP 12.1.0.2 or later)
The application uses service names to connect to the database
Oracle Notification Service (ONS) is configured and available on the node where JDBC is running
The Java Virtual Machine (JVM) in which your JDBC instance runs must have oracle.ons.oraclehome set to point to your ORACLE_HOME

Enable Fast Connection Failover (FCF) and configure the JDBC application to connect to all ONS daemons for both the primary and standby clusters using the setONSConfiguration property. The setONSConfiguration property should point to all primary and standby ONS daemons.
```
pds.setONSConfiguration("nodes=adczatdb01:6200,adczatdb02:6200,slcc17adm01:6200,slcc17adm02:6200"); 
pds.setFastConnectionFailoverEnabled(true); 
```
By default the JDBC application will randomly pick three hosts from the setONSConfiguration property and create connections to those three ONS daemons. This default must be changed so that connections are made to all ONS daemons. This is done by setting the following property when the JDBC application is invoked to the total number of ONS daemons in the configuration:
```
java -Doracle.ons.maxconnections=4 
```
The JDBC client must set the oracle.net.ns.SQLnetDef.TCP_CONNTIMEOUT_STR property. This property enables the JDBC client to quickly traverse an ADDRESS_LIST in the event of a failure. For example, if the client attempts to connect to a host that is unavailable, the connection attempt will be bounded to the time specified by the SQLnetDef.TCP_CONNTIMEOUT_STR property after which the client attempts to connect to the next host in the ADDRESS_LIST. The behavior continues for each host in the ADDRESS_LIST until a connection is made. Setting the property to a value of 3 seconds will suffice in most environments. It is important to note that the SQLnetDef.TCP_CONNTIMEOUT_STR property should be set on the data source and not on the Universal Connection Pool.
```
Properties prop = new Properties();  prop.put(oracle.net.ns.SQLnetDef.TCP_CONNTIMEOUT_STR, ""+3000); // 3000ms 
pds.setConnectionProperties(prop);
```

Set the thinForceDNSLoadBalancing property to get the correct behavior from SCAN load balancing:

// need to set oracle.jdbc.thinForceDNSLoadBalancing
prop.put("oracle.jdbc.thinForceDNSLoadBalancing","true");

Configure JDBC clients to use a connect descriptor that includes an address list that in turn includes the SCAN address for each site and connects to an existing service. Do not configure both TAF and JDBC FCF when using JDBC thick clients.

The following URL we search both primary and standby sites looking for the appropriate service with very little overhead. This URL configuration is the recommended approach:

PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource"); pds.setUser("system"); 
 pds.setPassword("oracle"); 
 String dbURL = 
 "jdbc:oracle:thin:@" + 
 "(DESCRIPTION=" + 
"(FAILOVER=on)" + 
 "(ADDRESS_LIST=" + 
 "(LOAD_BALANCE=on)" + 
 "(CONNECT_TIMEOUT=3)(RETRY_COUNT=3)" +
 "(ADDRESS=(PROTOCOL=TCP)(HOST=prmy-scan)(PORT=1521))"+     "(ADDRESS=(PROTOCOL=TCP)(HOST= stby-scan)(PORT=1521)))" + 
 "(CONNECT_DATA=(SERVICE_NAME=oltpworkload)))" 
 System.out.println("Url=" + dbURL); 
 pds.setURL(dbURL);

The following URL should be used if it is very rare for the primary to ever run on the secondary site and you wish to have connections connect as fast as possible:

PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource"); pds.setUser("system"); 
 pds.setPassword("oracle"); 
 String dbURL = 
 "jdbc:oracle:thin:@" + 
 "(DESCRIPTION_LIST=" + 
 "(LOAD_BALANCE=off)" + 
 "(FAILOVER=on)" + 
 "(DESCRIPTION=" +
 "(CONNECT_TIMEOUT=3)(RETRY_COUNT=3)" +
 "(ADDRESS_LIST=" + 
 "(LOAD_BALANCE=on)" + 
 "(ADDRESS=(PROTOCOL=TCP)(HOST=prmy-scan)(PORT=1521)))" + 
 "(CONNECT_DATA=(SERVICE_NAME=oltpworkload)))" + 
 "(DESCRIPTION=" + 
 "(ADDRESS_LIST=" + 
 "(LOAD_BALANCE=on)" + 
 "(ADDRESS=(PROTOCOL=TCP)(HOST= stby-scan)(PORT=1521)))" + 
 "(CONNECT_DATA=(SERVICE_NAME=oltpworkload))))"; 
 System.out.println("Url=" + dbURL); 
 pds.setURL(dbURL);

Note that if a switchover or failover occurs the above URL will force all connections to go through the old prmy-scan before using the stby scan where the primary currently runs.

See Also:

Oracle Database Administrator's Guide for more information about Enabling Fast Connection Failover for JDBC Clients
"Application High Availability with Services and FAN" in Oracle Database Administrator's Guide

10.1.2 Configuring Application Continuity

Application Continuity with Oracle Database 12c is used to mask outages for planned maintenance that is performed in rolling fashion across Oracle RAC instances or across a Data Guard primary and standby database. Application Continuity also masks unplanned outages of an Oracle RAC instance or a Data Guard primary database configured in Maximum Availability (zero data loss failover) with Data Guard Fast-Start Failover (automatic database failover). Use of Application Continuity for a Data Guard failover requires that both source and target databases be at Oracle 12.1.0.2 or later.

Application Continuity is available for:

Oracle JDBC Replay Driver 12c or later. This is a JDBC driver feature provided with Oracle Database 12c for Application Continuity, referred to as the "replay driver" onwards (OCI support is planned for a future release).
Oracle Universal Connection Pool, Oracle WebLogic Server 12c (12.1.2) or later, and third-party Java connection pools or standalone Java applications - using Oracle JDBC- Replay Driver 12c or later.
Standard 3rd Party Java application servers using the pooled connection interface - including IBM WebSphere and Apache Tomcat, from 12.1.0.2
Standard 3rd Party Java application servers supporting the Universal Connection Pool as the pooled data source - including IBM WebSphere, Apache Tomcat, and RedHat JBoss
Third-party Java connection pools or standalone Java applications - using Oracle JDBC- Replay Driver 12c or later and embedding their own request boundaries

Application Continuity uses Transaction Guard to reliably determine if the last transaction was committed or not. Without Transaction Guard, applications and users who attempt to retry operations following an outage can cause logical corruption by committing duplicate transactions or committing transactions out of order.

The following sections describe the options to configure Application Continuity using the Oracle JDBC 12c Replay Driver depending on your configuration:

Configuring Oracle UCP 12c
Configuring Oracle WebLogic Server 12c
Configuring Standalone Java Applications or Third-party Connection Pools
Configuring Connections for High Availability (Failover and Failback)
Configuring Services for Application Continuity
Checking Resource Allocation

See Also:

MAA white paper http://www.oracle.com/technetwork/database/database-cloud/private/application-continuity-wp-12c-1966213.pdf
MAA white paper http://www.oracle.com/technetwork/database/database-cloud/private/transaction-guard-wp-12c-1966209.pdf

10.1.2.1 Configuring Oracle UCP 12c

To configure the Oracle JDBC 12c Replay Data Source as a connection factory on UCP PoolDataSource:

setConnectionFactoryClassName("oracle.jdbc.replay.OracleDataSourceImpl");

10.1.2.2 Configuring Oracle WebLogic Server 12c

To configure the Oracle 12c JDBC Replay Data Source use the Oracle WebLogic Server Administration Console.

10.1.2.3 Configuring Standalone Java Applications or Third-party Connection Pools

To configure the Oracle JDBC 12c Replay Data Source in the property file or in the thin JDBC application

replay datasource=oracle.jdbc.replay.OracleDataSourceImpl

10.1.2.4 Configuring Connections for High Availability (Failover and Failback)

The REMOTE_LISTENER setting for the database must include the addresses in the ADDRESS_LISTs for all URL used for client connection:
- If any URL uses the SCAN Names, then REMOTE_LISTENERS must include the SCAN Name.
- If any URL uses an ADDRESS_LIST of host VIPs, then REMOTE_LISTENERS must include an ADDRESS list including all SCAN VIPs and all host VIPs

Set RETRY_COUNT, CONNECT_TIMEOUT parameters in the URL to allow new incoming connections to retry. For a complete discussion on this attributes please consult the JDBC Application Configuration Section later in this paper. For example,

"jdbc:oracle:thin:@" + 
 "(DESCRIPTION=" + 
"(FAILOVER=on)" + 
 "(ADDRESS_LIST=" + 
 "(LOAD_BALANCE=on)" + 
 "(CONNECT_TIMEOUT=3)(RETRY_COUNT=3)" +
 "(ADDRESS=(PROTOCOL=TCP)(HOST=prmy-scan)(PORT=1521))"+     "(ADDRESS=(PROTOCOL=TCP)(HOST= stby-scan)(PORT=1521)))" + 
 "(CONNECT_DATA=(SERVICE_NAME=oltpworkload)))"

10.1.2.5 Configuring Services for Application Continuity

Set the service attributes using SRVCTL / GDSCTL to use Application Continuity.
Set FAILOVER_TYPE to TRANSACTION to enable Application Continuity.
Set COMMIT_OUTCOME to TRUE to enable Transaction Guard (mandatory).
Review the following service attributes:
- REPLAY_INITIATION_TIMEOUT : Set this to the duration in seconds after which replay is not started (e.g. 180, 300, 1800 seconds - the override to cancel replay). This timer starts at beginRequest. (default 300 seconds)
- FAILOVER_RETRIES : Set this to specify the number of connection retries for each replay attempt. (default 30 retries, applied at replay driver)
- FAILOVER_DELAY : Set this to specify the delay in seconds between connection retries (default 10 seconds, applied at replay driver)
- AQ_HA_NOTIFICATIONS: Set this to TRUE to enable FAN (default TRUE)

The following is an example of configuring services for Application Continuity:

srvctl add service -db mts -service oltpworkload -role PRIMARY -notification TRUE -session_state dynamic -failovertype transaction -failovermethod basic -commit_outcome TRUE -failoverretry 30 -failoverdelay 10 -replay_init_time 900 -clbgoal SHORT -rlbgoal SERVICE_TIME -preferred mts1,mts2  -retention 3600 -verbose

10.1.2.6 Checking Resource Allocation

Ensure that the system has the necessary memory and CPU resources.
Memory: The JDBC replay driver uses more memory than the base JDBC driver because the calls are retained until the end of a database request. If the number of calls retained is small, then the memory consumption of the replay driver is comparable to the base driver. At the end of a request, the calls are released to the garbage collector. This action differs from the base driver that releases as calls are closed.
For good performance, if there is sufficient memory, allocate 4 to 8 GB (or more) of memory for the Virtual Machine (VM), for example, by setting - Xms4096m for 4 GB.
CPU: The JDBC replay driver uses some additional CPU for building proxy objects, managing queues, and for garbage collection. The server uses some additional CPU for managing the validation. CPU overhead is red.

10.1.3 Configuring Fast Connection Failover for OCI Clients

Prerequisites

An Oracle RAC environment with Oracle Clusterware set up and enabled or a single node (non-Oracle RAC) database with Oracle Restart
The application must have been linked with the threads library
The OCI environment must be created in OCI_EVENTS and OCI_THREADED mode

Enable FAN for OCI clients by initializing the environment with the OCI_EVENTS parameter, as in the following example:

OCIEnvCreate(...OCI_EVENTS...)

Link the OCI client applications with thread library libthread or libpthread.

Your application will need the ability to check if an event has occurred by using code similar to that used in the following example:

void evtcallback_fn(ha_ctx, eventhp)
...  
printf("HA Event received.\n");
  if (OCIHandleAlloc( (dvoid *)envhp, (dvoid **)&errhp, (ub4) OCI_HTYPE_ERROR, 
                      (size_t) 0, (dvoid **) 0))
    return;
  if (retcode = OCIAttrGet(eventhp, OCT_HTYPE_EVENT, (dvoid *)&srvhp, (ub4 *)0,
                           OCI_ATTR_HA_SRVFIRST, errhp))
    checkerr (errhp, (sword)retcode;
  else {
     printf("found first server handle.\n");
     /*get associated instance name */
     if (retcode = OCIAttrGet(srvhp, OCI_HTYPE_SERVER, (dvoid *)&instname,
                           (ub4 *)&sizep, OCI_ATTR_INSTNAME, errhp))
       checkerr (errhp, (sword)retcode);
     else
       printf("instance name is %s.\n", instname);

Clients and applications can register a callback that is invoked whenever a high availability event occurs, as shown in the following example:

/*Registering HA callback function */
  if (checkerr(errhp, OCIAttrSet(envhp, (ub4) OCI_HTYPE_ENV, 
                             (dvoid *)evtcallback_fn, (ub4) 0,
                             (ub4)OCI_ATTR_EVTCBK, errhp)))
  {
    printf("Failed to set register EVENT callback.\n");
    return EX_FAILURE;
  }
  if (checkerr(errhp, OCIAttrSet(envhp, (ub4) OCI_HTYPE_ENV,
                                (dvoid *)evtctx, (ub4) 0, 
                                (ub4)OCI_ATTR_EVTCTX, errhp)))
  {
    printf("Failed to set register EVENT callback context.\n");
    return EX_FAILURE;
  }
return EX_SUCCESS;

After registering an event callback and context, OCI will call the registered function once for each high availability event.

Configure an Oracle Net alias that the OCI application will use to connect to the database. The Oracle Net alias should specify both the primary and standby SCAN hostnames. For best performance while creating new connections the Oracle Net alias should have LOAD_BALANCE=OFF for the DESCRIPTION_LIST so that DESCRIPTIONs are tried in an ordered list, top to bottom. With this configuration the second DESCRIPTION is only attempted if all connection attempts to the first DESCRIPTION have failed.
```
SALES=
 (DESCRIPTION_LIST=
  (LOAD_BALANCE=off)
  (FAILOVER=on)
  (DESCRIPTION=     (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
    (ADDRESS_LIST=
      (LOAD_BALANCE=on) 
      (ADDRESS=(PROTOCOL=TCP)(HOST=prmy-scan)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=oltpworkload)))
  (DESCRIPTION=
(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
    (ADDRESS_LIST=
      (LOAD_BALANCE=on)      
      (ADDRESS=(PROTOCOL=TCP)(HOST=stby-scan)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=oltpworkload))))
```
When a new connection is made using the above Oracle Net alias the following logic is used:
1. Oracle Net contacts DNS and resolves prmy-scan to a total of three IP addresses.
2. Oracle Net randomly picks one of the three IP address and attempts to make a connection. If the connection attempt to the IP address does not respond in three seconds (TRANSPORT_CONNECT_TIMEOUT) the next IP address is attempted. All three IP addresses will be tried a total of four times (initial attempt plus RETRY_COUNT in the above example).
3. If the connection to primary site is unsuccessful, it then contacts DNS and resolves stby-scan to three addresses.
4. The same sequence is performed for the standby stby-scan as it was for the prmy-scan.

The following is some additional information about the Oracle Net parameters used in the above alias:

LOAD_BALANCE is ON by default for DESCRIPTION_LIST only. This parameter by default is OFF for an address list within a DESCRIPTION. Setting this ON for a SCAN-based address implies that new connections will be randomly assigned to one of the 3 SCAN-based IP addresses resolved by DNS.
The RETRY_COUNT parameter specifies the number of times an address list is traversed before the new connection attempt is terminated. The default value is 0. With respect to SCAN, with FAILOVER = on, setting this RETRY_COUNT parameter to a value of 2, for example, means the three SCAN IP addresses are traversed thrice (i.e. 3*3=9 connect attempts), before the connection is terminated:
- When the connection request initially comes in, the first randomly assigned IP address tries to service that request, followed by the two remaining IP addresses. (This behavior is controlled by the FAILOVER parameter.)
- The retries then kick in and the list of three IP addresses is tried two more times. RETRY_COUNT is only supported at DESCRIPTION level in connect string, but not at global (i.e. sqlnet.ora) level.

10.1.4 Configuring Automatic Failover for ODP.Net Clients

Prerequisites:

Namespace: Oracle.DataAccess.Client, Assembly: Oracle.DataAccess.dll.
Microsoft .NET Framework Version 2.0 or later.
Configure Oracle Net alias as described in Section 10.1.3, "Configuring Fast Connection Failover for OCI Clients."

Enable Fast Connection Failover for ODP.NET connection pools by subscribing to FAN high availability events. To enable Fast Connection Failover, include "HA Events=true" and "pooling=true" in the connection string, as shown in the following example where user_name is the name of the database user and password is the password for that user:
```
con.ConnectionString =
   "User Id=user_name;Password=password;Data Source=sales;" +
   "Min Pool Size=10;Connection Lifetime=120;Connection Timeout=60;" +
   "HA Events=true;Incr Pool Size=5;Decr Pool Size=2";
```
To take advantage of load balancing events with ODP.NET connection pools, set the load balancing attribute in the ConnectionString to TRUE (the default is FALSE). You can do this at connect time. This only works if you are using connection pools, or when the pooling attribute is set to TRUE which is the default.

The following example demonstrates how to configure the ConnectionString to enable load balancing
```
con.ConnectionString =
  "User Id=user_name;Password=password;Data Source=odpapp;" +
  "Min Pool Size=10;Connection Lifetime=120;Connection Timeout=60;" +
  "Load Balancing=true;Incr Pool Size=5;Decr Pool Size=2";
```
Configure the service to enable FAN and HA events as described in Section 10.2.1, "Configuring Database Services."

See Also:

"Client Failover Best Practices for Highly Available Oracle Databases - Oracle Database 12c" at http://www.oracle.com/technetwork/database/availability/client-failover-2280805.pdf
"Client Failover Best Practices for Data Guard 11g Release 2" at http://www.oracle.com/technetwork/database/features/availability/maa-wp-11gr2-client-failover-173305.pdf
Oracle Database Administrator's Guide

10.2 Configuring Oracle RAC Databases for Failover

Oracle Database 12c provides the infrastructure to make your application data highly available with Oracle Real Application Clusters (Oracle RAC) and with the Oracle Data Guard. At the database tier you must configure fast application failover.

10.2.1 Configuring Database Services

At a high level, automating client failover in an Oracle RAC configuration includes relocating database services to new or surviving instances, notifying clients that a failure has occurred to break the clients out of TCP timeout, and redirecting clients to a surviving instance (Oracle Clusterware sends FAN messages to applications; applications can respond to FAN events and take immediate action). For more information about FAN, see Section 5.1.1, "Client Configuration and Migration Concepts".

For services on an Oracle RAC database, Oracle Enterprise Manager or the SRVCTL utility are the recommended tools to manage services. A service can span one or more instances of an Oracle database and a single instance can support multiple services. The number of instances offering the service is managed by the database administrator independent of the application.

See Also:

10.2.2 Optionally Configure FAN Server Side Callouts

Server-side callouts provide a simple, yet powerful integration mechanism with the High Availability Framework that is part of Oracle Clusterware. You can use server side callouts to log trouble tickets or page Administrators to alert them of a failure. For Up events, when services and instances are started, new connections can be created so the application can immediately take advantage of the extra resources

See Also:

Oracle Real Application Clusters Administration and Deployment Guide for an Introduction to Automatic Workload Management.
For more information about client failover best practices and details on deploying FAN server side callouts, see the Technical Article, "Automatic Workload Management with Oracle Real Application Clusters 11g Release 2" on the Oracle Technology Network at

http://www.oracle.com/technetwork/database/clustering/overview/index.html

10.3 Configuring the Oracle Data Guard Environment

The following topics describe how to configure the Oracle Data Guard environment:

Configuring Database Services
Use Data Guard Broker

See Also:
The MAA white paper "Client Failover Best Practices for Highly Available Oracle Databases - Oracle Database 12c" from the MAA Best Practices area for Oracle Database at
http://www.oracle.com/goto/maa

10.3.1 Configuring Database Services

In an Oracle Data Guard configuration you should only run primary application services on the primary database and run standby application services on the standby database. You can automatically control the startup of database services on primary and standby databases by assigning a database role to each service (roles include: PRIMARY, PHYSICAL_STANDBY, LOGICAL_STANDBY, and SNAPSHOT_STANDBY).

A database service automatically starts upon database startup if the management policy for the service is AUTOMATIC and if a role assigned to that service matches the current role of the database.

See Also:

Oracle Database Administrator's Guide for information about Creating and Deleting Database Services with SRVCTL
Oracle Database Administrator's Guide for information About Automatic Startup of Database Services

10.3.2 Use Data Guard Broker

The best practice is to configure Oracle Data Guard to manage the configuration with Oracle Data Guard Broker. Oracle Data Guard Broker is responsible for sending FAN events to client applications to clean up their connections to the down database and reconnect to the new production database.

Oracle Clusterware must be installed and active on the primary and standby sites for both single instance (using Oracle Restart) and Oracle RAC databases. Oracle Data Guard broker coordinates with Oracle Clusterware to properly fail over role-based services to a new primary database after a Data Guard failover has occurred.

See Also:

Section 5.1.1, "Client Configuration and Migration Concepts" for more information about FAN.
Section 8.3.1, "Use Oracle Data Guard Broker with Oracle Data Guard"

10.4 Client Transition During Switchover Operations

In Oracle Data Guard, the term "switchover" describes a planned event where a primary and standby database switch roles, usually to minimize the downtime while performing planned maintenance. The configuration best practices to address unplanned failovers also address most of the requirements for a planned switchover, except for several additional manual steps that apply to logical standby databases (SQL Apply).

Note:

There are no additional considerations for switchovers using Oracle Active Data Guard.

The following steps describe the additional manual switchover steps for Oracle Data Guard 11g Release 2:

The primary database is converted to a standby database. This disconnects all sessions and brings the database to the mount state. Oracle Data Guard Broker shuts down any read/write services.
Client sessions receive a ORA-3113 and begin going through their retry logic (TAF for OCI and application code logic for JDBC).
The standby database is converted to a primary database and any existing sessions are disconnected. Oracle Data Guard Broker shuts down read-only services.
Read-only connections receive an ORA-3113 and begin going through their retry logic (TAF for OCI and application code logic for JDBC).
As the new primary and the new standby are opened, the respective services are started for each role and clients performing retries now see the services available and connect.

For logical standby switchover:

Ensure that the proper reconnection logic has been configured (for more information, see Section 10.1, "Automating Client Failover - JDBC, OCI, and ODP.Net" and Section 10.2, "Configuring Oracle RAC Databases for Failover"). For example, configure TAF and RETRY_COUNT for OCI applications and code retry logic for JDBC applications.
Stop the services that the primary application uses and the read-only applications enabled on the standby database.
Disconnect or shutdown the primary and read-only application sessions.
Once the switchover has completed, restart the services used by the primary application and the read-only application.
Sessions that were terminated reconnect once the service becomes available as part of the retry mechanism.
Restart the application if an application shuts down.

Note that FAN is not needed to transition clients during a switchover operation if the application performs retries. FAN is only needed to break clients out of TCP timeout, a state that should only occur during unplanned outages.

10.5 Preventing Login Storms

The process of failing over an application that has a large number of connections may create a login storm. A login storm is a sudden spike in the number of connections to a database instance, which drains CPU resources. As CPU resources are depleted, application timeouts and application response times are likely to increase.

To control login storms:

Implement the Connection Rate Limiter

The primary method of controlling login storms is to implement the Connection Rate Limiter feature of the Oracle listener. This feature limits the number of connections that can be processed in seconds. Slowing down the rate of connections ensures that CPU resources remain available and that the system remains responsive.
Configure Oracle Database for shared server operations

In addition to implementing the Connection Rate Limiter, some applications can control login storms by configuring Oracle Database for shared server operations. By using shared server, the number of processes that must be created at failover time are greatly reduced, thereby avoiding a login storm.
Adjust the maximum number of connections in the mid tier connection pool

If such a capability is available in your application mid tier, try limiting the number of connections by adjusting the maximum number of connections in the mid tier connection pool.

See Also:

Oracle Database Administrator's Guide for more information about configuring and controlling shared server operations
The "Oracle Net Listener Connection Rate Limiter" white paper for information about the Connection Rate Limiter at

http://www.oracle.com/technetwork/database/enterprise-edition/oraclenetservices-connectionratelim-133050.pdf
The "Best Practices for Optimizing Availability During Unplanned Outages Using Oracle Clusterware and Oracle Real Application Clusters" white paper for information and examples about listener connection rate throttling from the MAA Best Practices area for Oracle Database at

http://www.oracle.com/goto/maa

10.6 Configuring Global Data Services

Global Data Services enables administrators to automatically and transparently manage client workloads across replicated databases that offer common services. A database service is a named representation of one or more database instances. Services enable you to group database workloads and route a particular work request to an appropriate instance. A global service is a service provided by multiple databases synchronized through data replication.

Global Data Services provides dynamic load balancing, failover, and centralized service management for a set of replicated databases that offer common services. The set of databases can include Oracle RAC and noncluster Oracle databases interrelated through Oracle Data Guard, databases consolidated under Oracle Multitenant, Oracle GoldenGate, or any other replication technology.

The global services management framework is built around the following preexisting Oracle Database technologies:

Oracle Real Application Clusters (Oracle RAC) - Enables dynamic load balancing and workload management in a cluster
Oracle Active Data Guard - Enables high-performance farms of read-only databases
Data Guard Broker - Enables creation, management, and monitoring of Data Guard configurations that include a primary database and up to 30 standby databases
Oracle GoldenGate - Enables replication updates among multiple databases

Oracle GDS provides the following key capabilities for a set of replicated databases that are globally distributed or located within the same data center:

Region-based workload routing
Connect-time Load balancing
Provides Run-time load balancing advisory for Oracle integrated clients
Inter-database Service failover
Replication lag based workload routing for Active Data Guard
Role-based global Services for Active Data Guard
Centralized workload management framework

At a high level, configuring the Global Data Services Framework involves the following:

Installing a Global Service Manager. You must install at least one global service manager for each Global Data Services region. Global service managers can be hosted on physical or virtual environments. For high availability, Oracle recommends installing multiple (typically 3) global service managers in each region running on separate hosts.
Creating a Global Data Services Catalog. The catalog must reside in a 12c database and it is recommended that database be outside the GDS configuration. GDS catalog may be co-hosted along with catalogs of RMAN or Oracle Enterprise Manager. Oracle recommends that you use Oracle high availability features such as Oracle Real Application Clusters (Oracle RAC) and Oracle Data Guard to protect the Global Data Services catalog against outages.
Registering a Global Service Manager to the catalog.
Adding a Global Services Pool. A GDS pool is a subset of databases that provide a set of global services that belong to an administrative domain. This simplifies service management and provides for high security by allowing each pool to be administered by a different administrator.
Adding a Global Data Services Region. A region is a set of databases that share the same network proximity and network latency. A region normally corresponds to a local area network. For high availability purposes, each region in a GDS configuration should have a designated buddy region, which is a region that contains global service managers that can provide continued access to a GDS configuration if the global services managers in the local region become unavailable.
Adding databases to the GDS pool. To be part of a Global Data Services pool, a database must use a server parameter file (SPFILE). An Oracle RAC database should also have SCAN set up.
Add a service to a GDS pool. A global service name must be unique within a GDS pool and must also be unique within a GDS configuration. A global service cannot be created at a database if a local or global service with the same name already exists at that database.

See Also:

Oracle Database Global Data Services Concepts and Administration Guide
MAA white paper "Oracle Database 12c - Global Data Services" at http://www.oracle.com/technetwork/database/availability/global-data-services-12c-wp-1964780.pdf

10.6.1 Configuring the Database Client

When clients connect to the database they do so using an Oracle Net connect string that specifys a service. The connect string used when connecting to a global service is unique in the following ways:

Service name parameter must specify a global service
Multiple SCAN addresses are used to point to global service manager endpoints
The database client's region may be specified in the connection data section

Consider the following connect string:

(DESCRIPTION=
  (FAILOVER=on)
  (ADDRESS_LIST=
    (LOAD_BALANCE=ON)
    (ADDRESS=(host=sales-east1)(port=1522))
    (ADDRESS=(host=sales-east2)(port=1522))
    (ADDRESS=(host=sales-east3)(port=1522)))
  (ADDRESS_LIST=
    (LOAD_BALANCE=ON)
    (ADDRESS=(host=sales-west1)(port=1522))
    (ADDRESS=(host=sales-west2)(port=1522))
    (ADDRESS=(host=sales-west3)(port=1522)))
  (CONNECT_DATA=
   (SERVICE_NAME=sales)
   (REGION=east)))

Client-side load balancing is enabled across the global service managers within each region by setting the LOAD_BALANCE parameter to ON in the address list for each region. Connect-time failover between regions is enabled by setting the FAILOVER parameter to ON.