Oracle® Light Weight Availability Collection Tool User's Guide Release 3.3 for Oracle Solaris Part Number E20940-01
The Oracle Lightweight Availability Collection Tool consists of the following three main binary utilities: tictimed, logtime, and ltreport.
The tictimed utility is the heartbeat daemon for the Oracle Lightweight Availability Collection Tool. It updates the modification time (in UTC) of the log file once a second and updates the time event once a minute. The utility starts automatically through the /etc/rc2.d/S95lwact script, and an entry in /etc/inittab ensures that it is respawned if it is killed or crashes for an unknown reason. To track system availability, it writes system halt, panic, and boot records to a log file. If the update file (lwact.update) is present under the update directory, the tictimed utility also modifies the event to update cause codes.
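The once-a-second heartbeat can be illustrated with a minimal Python sketch. This is not the tictimed implementation; the function names touch_heartbeat and run_heartbeat are hypothetical, and the sketch only shows the general technique of advancing a file's modification time at a fixed interval.

```python
import os
import time


def touch_heartbeat(path, now=None):
    """Set the access and modification time of `path` to `now` (epoch seconds).

    Hypothetical helper: it mirrors the idea of a heartbeat that advances the
    log file's modification time, not the actual tictimed code.
    """
    if now is None:
        now = time.time()
    os.utime(path, (now, now))
    return now


def run_heartbeat(path, ticks, interval=1.0, sleep=time.sleep):
    """Touch `path` `ticks` times, pausing `interval` seconds after each touch."""
    for _ in range(ticks):
        touch_heartbeat(path)
        sleep(interval)
```

Because the modification time advances only while the daemon is running, the last recorded value approximates the moment the system stopped when no halt record was written.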
The tictimed utility captures the following five event types:
epoch - the beginning of event tracking
boot - UTC when system leaves run-level 2
halt - UTC when system exits run-level 3
panic - a boot event with no preceding halt event recorded. The last modified time of the log file is used as the panic UTC
time - the last recorded UTC for off-line reporting
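The panic rule above (a boot with no preceding halt) can be sketched in Python. The (kind, utc) tuple format and the infer_panics function are assumptions made for illustration and do not reflect the tool's actual log format. The most recent recorded timestamp stands in for the log file's last modified time.

```python
def infer_panics(events):
    """Insert a synthetic panic event before any boot not preceded by a halt.

    `events` is a chronological list of (kind, utc) tuples, where kind is one
    of "epoch", "boot", "halt", or "time". This record layout is hypothetical.
    """
    result = []
    last_state = None  # most recent non-"time" event kind
    last_utc = None    # most recent recorded timestamp of any kind
    for kind, utc in events:
        if kind == "boot" and last_state == "boot":
            # The system was up and never logged a halt: treat as a panic,
            # timestamped with the last value the heartbeat recorded.
            result.append(("panic", last_utc))
        if kind != "time":
            last_state = kind
        result.append((kind, utc))
        last_utc = utc
    return result
```

For example, a boot at 10, a time event at 70, and another boot at 200 with no halt in between yields a panic event stamped 70.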
The Oracle Lightweight Availability Collection Tool has init scripts that the system invokes during run-level changes. If you invoke these scripts manually, the tool logs one of the following informational messages to the /var/adm/messages file:
LWACT is started - Indicates that a user has used the /etc/init.d/lwact script to re-initialize the init tab
LWACT is going down - Indicates a user has stopped the Oracle Lightweight Availability Collection Tool using the /etc/init.d/lwact script. This causes the tictimed daemon to respawn and re-write the lock file under /var/spool/locks with the new tictimed pid
The logtime utility is used by the root user to update the cause code for events. The system also uses this utility to create and update boot and halt events. Only the root user can use the -M option, which modifies the cause code for halt and panic events; the -B and -H options are used by the system itself (a host process, such as init). The logtime utility can be run in interactive or non-interactive mode. In interactive mode, the user does not need to provide the cause code string on the command line; in non-interactive mode, the user must provide both the event number and the cause code string.
The ltreport utility is a command-line reporting tool (a binary executable) that reads the datagram and calculates the system availability. The output is written to stdout.
The ltreport utility calculates the following two availability figures:
Total - Total availability is a raw calculation whereby total uptime is divided by total elapsed time.
Adjusted - Adjusted availability is the sum of total uptime and total planned downtime, divided by total elapsed time. Here, any planned downtime is counted as uptime of the system.
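The two figures can be worked through with a short Python sketch. The availability function and its parameter names are hypothetical and only restate the definitions above; they are not part of ltreport. All durations are assumed to be in the same unit, such as seconds.

```python
def availability(uptime, planned_downtime, elapsed):
    """Return (total, adjusted) availability as fractions of elapsed time.

    total    = uptime / elapsed
    adjusted = (uptime + planned_downtime) / elapsed
    """
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    total = uptime / elapsed
    adjusted = (uptime + planned_downtime) / elapsed
    return total, adjusted


# 1000 units elapsed: 900 up, 60 planned down, 40 other downtime.
total, adjusted = availability(uptime=900, planned_downtime=60, elapsed=1000)
print(f"Total: {total:.2%}  Adjusted: {adjusted:.2%}")
```

In this example the total availability is 90% and the adjusted availability is 96%, because the 60 units of planned downtime are counted as uptime.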
The ltreport utility reports three downtime categories:
Planned
Unplanned
Undefined