Oracle® Light Weight Availability Collection Tool User's Guide Release 3.3 for Oracle Solaris Part Number E20940-01
The Oracle Lightweight Availability Collection Tool is a package that must be installed on each instance of Solaris. It is bundled with Oracle Services Tools Bundle (STB) for Sun Systems. By downloading STB and running ./install_stb.sh, you can install the Oracle Lightweight Availability Collection Tool along with the other deliverables in the STB.
The Oracle Lightweight Availability Collection Tool must be installed through STB and is made available via its download link. Use the following procedure to download the latest Services Tools Bundle:
Go to the Oracle Services Tools Bundle for Sun Systems site (http://www.oracle.com/us/support/systems/premier/services-tools-bundle-sun-systems-163717.html) and click the Oracle Services Tools Bundle for Sun Systems software download site link in the Get Started Today section.
In the drop-down lists, select the appropriate Platform and Language for your download.
Review the STB License Agreement and mark the I agree check box to proceed with downloading.
Click install_stb.sh to download the installer.
To finish the installation, complete the instructions in the next section.
To install the Oracle Lightweight Availability Collection Tool using STB, complete the following steps when requested during the installation process:
Note:
The installer also asks questions that do not pertain specifically to the Oracle Lightweight Availability Collection Tool. This section does not address those questions. You must decide whether you want the other tools installed and answer accordingly.
At the beginning of the installation, the following is displayed on your screen:
-bash-3.00# ./install_stb.sh
Services Tools Bundle(STB) v6.0 Installer
Checksumming...
List of Components and Corresponding Selection
1. Install SNEEP Tool v2.9 ? (y/n) y
Already Installed Sneep Tool has Version (2.9)
Sneep Tool details can be found at <http://www.sun.com/sneep> and local system documentation reference is available at /opt/SUNWsneep/Docs
2. Install Service Tags v1.1.5,REV=2009.09.23.10.58 ? (y/n) y
Already Installed Service Tags has Version (1.1.5,REV=2009.09.23.10.58)
Service Tags details can be found at <http://wikis.sun.com/display/SunInventory/FAQ> and <http://wikis.sun.com/display/SunInventory/Discovery+and+Registration>
3. Install Explorer v6.5,REV=2010.07.02.12.51 ? (y/n) y
Explorer details can be found at <http://docs.sun.com/app/docs/coll/1554.2> and local system documentation reference is available at /opt/SUNWexplo/doc
4. Install Lightweight Availability Collection Tool v3.3 ? (y/n) y
Lightweight Availability Collection Tool details can be found at <http://docs.sun.com/app/docs/coll/1811.1>
Would you like to (I)nstall, (X)tract component selections, or (E)xit ? I(default)
Accept the default: I. The installation proceeds with the default options:
Would you like to (I)nstall, (X)tract component selections, or (E)xit ? I(default)
STB is installing all selected modules and their dependencies.
Details of this will be in /var/log/install_stb-v6.0.log
Please wait.....
Installing Oracle Sneep .....
---- Already Installed Sneep Packages has current Version (2.9)
All sneep data sources are consistent.
Installing Service Tags and Product Serial Number Package .....
---- Checking Service Tags dependency packages...
---- Service Tags dependency check passed
---- Already Installed Product Serial Number Package has current Version (1.1.4,REV=2008.04.25.10.21)
---- Already Installed Service Tags Packages has current Version (1.1.5,REV=2009.09.23.10.58)
---- Already Installed Hardware Service Tag Registration Package has current Version (1.0,REV=2009.09.23.11.02)
Installing Oracle Explorer Data Collector .....
Modifying /etc/opt/SUNWexplo/xscfinput.txt
Modifying /etc/opt/SUNWexplo/tapeinput.txt
Modifying /etc/opt/SUNWexplo/t3input.txt
Modifying /etc/opt/SUNWexplo/srscinput.txt
Modifying /etc/opt/SUNWexplo/se6920input.txt
Modifying /etc/opt/SUNWexplo/se6320input.txt
Modifying /etc/opt/SUNWexplo/se3kinput.txt
Modifying /etc/opt/SUNWexplo/scinput.txt
Modifying /etc/opt/SUNWexplo/saninput.txt
Modifying /etc/opt/SUNWexplo/ipmiinput.txt
Modifying /etc/opt/SUNWexplo/indyinput.txt
Modifying /etc/opt/SUNWexplo/ilomsnapshotinput.txt
Modifying /etc/opt/SUNWexplo/ilominput.txt
Modifying /etc/opt/SUNWexplo/b1600switchinput.txt
Modifying /etc/opt/SUNWexplo/b1600input.txt
Modifying /etc/opt/SUNWexplo/alominput.txt
Modifying /etc/opt/SUNWexplo/acinput.txt
Modifying /etc/opt/SUNWexplo/Tx000input.txt
Modifying /etc/opt/SUNWexplo/1280input.txt
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
All sneep data sources are consistent.
Installation of Oracle Explorer Data Collector <6.5,REV=2010.07.02.12.51> was successful
Installing Oracle Lightweight Availability Collection Tool .....
It may take a few minutes to complete postinstall..
It may take a few minutes to complete postinstall..
Installation of Lightweight Availability Collection Tool <3.3> was successful
STB v6.0 installation is complete...
Note:
To provide complete functionality, the Oracle Lightweight Availability Collection Tool requires Explorer release 6.0 or later. Oracle Services Tools Bundle will fail to install the tool if this minimum requirement is not met.
Verify that the Oracle Lightweight Availability Collection Tool installation completed successfully by checking the following conditions:
The output of the pkginfo -l SUNWlwact command shows "completely installed" in the STATUS field.
Immediately upon successful installation of the Oracle Lightweight Availability Collection Tool package, the tool starts the tictimed daemon. This is the daemon responsible for continuous monitoring of the availability status of the system. You can check for the existence of this daemon by executing: /usr/bin/ps -eaf | grep tictimed
/etc/inittab contains a new entry for tictimed under the ID LT.
An availability datagram is created in the default location (defined by the configurable parameter LOGDIR).
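The first three checks above can be scripted. The following is a minimal sketch, not part of the tool: the function names are hypothetical, and each function reads the output of a command named in this guide (or a file path) so that it can be tried against captured output as well as on a live system.

```shell
# Sketch of post-install checks. Function names are hypothetical;
# only the commands, the package name, and the LT ID come from this guide.

# Succeeds if `pkginfo -l SUNWlwact` output on stdin reports
# "completely installed" in its STATUS field.
pkg_completely_installed() {
    grep 'STATUS:' | grep 'completely installed' >/dev/null
}

# Succeeds if `ps -eaf` output on stdin lists a tictimed process.
# Lines belonging to the grep command itself are ignored.
tictimed_running() {
    grep 'tictimed' | grep -v 'grep' >/dev/null
}

# Succeeds if the inittab file given as $1 has an entry with the ID LT.
inittab_has_lt() {
    grep '^LT:' "$1" >/dev/null
}
```

On a Solaris host these might be used as `pkginfo -l SUNWlwact | pkg_completely_installed`, `/usr/bin/ps -eaf | tictimed_running`, and `inittab_has_lt /etc/inittab`, each followed by a check of the exit status.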
As soon as the Oracle Lightweight Availability Collection Tool package is installed, it starts the tictimed process, which monitors availability-related events. You can check this with:
# ps -eaf | grep tictime
root 4817 1 0 16:47:43 ? 0:00 /usr/sbin/tictimed
Any availability-related event is logged by the tictimed daemon to /var/log/<hostid>.lwact.xml.
A local report based on this file can be viewed in a user-friendly format with the following command:
# /usr/bin/ltreport -v
The file can also be viewed in its raw XML format using the following command:
# /usr/bin/ltreport -x
To prevent tampering with the file, each event that is logged has a checksum. If the file is manipulated, the checksum will become invalid and a message will be logged to /var/adm/messages.
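The log file name described above is built from the log directory and the host ID. A minimal sketch of that naming rule follows; the function name is hypothetical, and the sketch assumes the standard hostid command supplies the <hostid> part of the name.

```shell
# Prints the expected availability log path: <logdir>/<hostid>.lwact.xml
# $1 = log directory (defaults to /var/log, the location given above)
# $2 = host ID (defaults to the output of the hostid command; this is
#      an assumption about where the <hostid> part comes from)
lwact_log_path() {
    logdir=${1:-/var/log}
    id=${2:-$(hostid)}
    printf '%s/%s.lwact.xml\n' "$logdir" "$id"
}
```

For example, `ls -l "$(lwact_log_path)"` would show whether the file exists in the default location on the current host.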
To update to the latest version, you do not have to delete and re-install the existing Oracle Lightweight Availability Collection Tool. The Oracle Services Tools Bundle for Sun Systems installer uninstalls the existing version for you if you choose yes to the upgrade option.
Upgrading from Oracle Lightweight Availability Collection Tool 2.1.16 or later retains the old availability data as long as it is not corrupted. It is recommended that you back up the availability data file before upgrading the application.
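A backup before upgrading can be as simple as a timestamped copy. The following sketch is one way to do it; the function name and the timestamp suffix are illustrative choices, not something the tool or the installer prescribes.

```shell
# Copies the availability log to a backup directory with a timestamp suffix
# and prints the path of the copy.
# $1 = path to the log file, $2 = backup directory (created if missing)
backup_lwact_log() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d%H%M%S)"
    # -p preserves the modification time and permissions of the original
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

The copy is byte-for-byte, so the per-event checksums in the file are not affected.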
The Oracle Lightweight Availability Collection Tool has a set of configurable parameters that enable you to set default actions based on local site policies. The tool configures itself using the parameters defined in the /etc/default/lwact file. The following parameters are configurable:
LOGDIR specifies the path where availability data (hostid.lwact.xml) will be collected. By default, it is collected in /var/log. You can change the value to a different path and the tool will start logging the availability metrics into this new path after the tool is restarted. To retain the availability data collected thus far, you must ensure that the log file is manually copied into the new location; otherwise, the tool will start logging availability data in the new location afresh and the old data will be lost.
Note:
Before you restart the tool, be sure you retain the availability data already collected. To retain this data, manually copy the log file to the new location. If you do not copy the log file, the new data will be logged to the new location, but the old data will not be carried over to this new location when the tool restarts.
BACKUP specifies the path where the Oracle Lightweight Availability Collection Tool will store a backup copy of the log file. By default, this entry has the path set to /var/tmp and is commented out; therefore, no backup will be stored. If you want a backup, you can uncomment the entry and change the path to your preferred location. The backup file will be found under the path you specify.
UPDATE specifies the path where the lwact.update file can be found. By default, the path is /var/tmp. You can modify this path.
The lwact.update file is a feature provided by the Oracle Lightweight Availability Collection Tool to automatically apply predefined cause codes to an outage. You can use this feature to set the cause code for an outage on a single host or on many hosts at once.
For example, an outage might have occurred on a number of hosts within your site due to a power failure, and you might want to set a common cause code for that outage across all these hosts. Instead of manually updating the cause code for that event on each host, you can push the lwact.update file to all these hosts soon after the outage. The Oracle Lightweight Availability Collection Tool automatically picks up the cause codes listed in the lwact.update file and applies them to the outage event. After completing this update, the tool deletes the file automatically. With this feature, you no longer need to log in to each host manually to update the cause code after an outage occurs.
The structure of the lwact.update file is as follows:
# This file contains the cause codes for the outage
<L1CauseCodeIndex>, <L2CauseCodeIndex>, <L3CauseCodeIndex>
For example:
$ cat lwact.update
1,2,7
Based on the file in this example, after the outage, the tool will set the cause codes as follows: L1=1, L2=2, L3=7
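Generating the file from a script helps avoid malformed entries when the same cause codes are pushed to many hosts. The sketch below writes a file in the same form as the example; the function name is hypothetical, and the check that each index is a non-negative integer is an assumption, since this guide does not list the valid index values.

```shell
# Writes an lwact.update file containing three cause code indices.
# $1 = target directory (the path configured in UPDATE)
# $2 $3 $4 = L1, L2, and L3 cause code indices
write_lwact_update() {
    dir=$1; l1=$2; l2=$3; l3=$4
    for idx in "$l1" "$l2" "$l3"; do
        case $idx in
            # Reject empty values and anything containing a non-digit.
            ''|*[!0-9]*) echo "not a numeric index: '$idx'" >&2; return 1 ;;
        esac
    done
    printf '%s,%s,%s\n' "$l1" "$l2" "$l3" > "$dir/lwact.update"
}
```

Calling `write_lwact_update /var/tmp 1 2 7` produces the file shown in the example, in the default UPDATE location.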
The L1CC, L2CC, L3CC parameters enable you to define default cause codes for L1, L2 and L3.
By default, the Oracle Lightweight Availability Collection Tool logs halt event cause codes as:
L1=Planned, L2=Undefined, L3=Undefined
By default, it logs panic event cause codes as:
L1=Unplanned, L2=Undefined, L3=Undefined
The structure of the L1CC, L2CC, L3CC parameters is as follows:
L1CC=<L1CauseCodeString>
L2CC=<L2CauseCodeString>
L3CC=<L3CauseCodeString>
By default, there are no entries for cause codes in this file, so the L1 cause code is logged as Planned for halt events and Unplanned for panic events, and the L2 and L3 cause codes are logged as Undefined. If a cause code is explicitly set for a level, it overrides the default cause code at that level for all outage events (both halt and panic).
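The override rule can be summarized in a few lines of shell. This is only a model of the behavior described above, written for illustration; it is not the tool's own code, and the function name is hypothetical.

```shell
# Prints the cause codes that would be logged for an outage event,
# following the defaults and override rule described above.
# $1 = event type (halt or panic)
# $2 $3 $4 = optional L1CC, L2CC, and L3CC values (empty means not set)
effective_cause_codes() {
    case $1 in
        halt)  l1_default=Planned ;;
        panic) l1_default=Unplanned ;;
        *) echo "unknown event type: $1" >&2; return 1 ;;
    esac
    # An explicitly set value wins; otherwise the default applies.
    printf 'L1=%s L2=%s L3=%s\n' \
        "${2:-$l1_default}" "${3:-Undefined}" "${4:-Undefined}"
}
```

With no overrides, `effective_cause_codes halt` and `effective_cause_codes panic` reproduce the two default cases listed earlier.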
Note:
If any of the L1CC, L2CC, or L3CC values is not valid, the Oracle Lightweight Availability Collection Tool detects this, logs a corresponding message in /var/adm/messages, and sets the invalid cause code entry to Undefined.
Upon installation, the configurable parameters in the /etc/default/lwact file have the following default values:
LOGDIR=/var/log
#BACKUP=/var/tmp
UPDATE=/var/tmp
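Scripts that need these settings can read them from the file rather than hard-coding paths. The following sketch assumes the file holds one NAME=value entry per line, as in the defaults above, and treats a commented-out entry as not set; the function name is hypothetical.

```shell
# Prints the value of an active NAME=value entry in an lwact defaults file.
# Prints nothing if the entry is absent or commented out with '#'.
# $1 = path to the file, $2 = parameter name (for example LOGDIR)
lwact_param() {
    # The first sed keeps only lines that start with NAME= and strips the
    # prefix; the second keeps the last match if the name appears twice.
    sed -n "s/^$2=//p" "$1" | sed -n '$p'
}
```

Against the default file, `lwact_param /etc/default/lwact LOGDIR` prints /var/log, and the same call for BACKUP prints nothing because that entry is commented out.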
Note:
For any changes to take effect, you must restart the Oracle Lightweight Availability Collection Tool.