Oracle® Auto Service Request Release 3.2 for Oracle Enterprise Linux and Solaris
This section provides a variety of troubleshooting procedures for the ASR software.
Note:
This section provides instructions for Solaris. When possible, corresponding Oracle Enterprise Linux (OEL) instructions are provided. See the appropriate OEL documentation for details on general administration commands.
An ASR Diagnostic Utility is included to help analyze installation problems. This utility packages the data it collects and stores it in a consistent, configurable location for later retrieval, delivery, and analysis. The resulting .zip file must be sent to Oracle manually.
The ASR Diagnostic Utility consists of the following files:
diag-config.properties - property file for customizing the diagnostic utility configuration
asrDiagUtil.sh - shell script for invoking the utility method
com.sun.svc.asr.util.diag.jar - Java method for collecting diagnostic data
README - a "readme" text file describing the utility
At the command prompt, run ./asrDiagUtil.sh and follow the on-screen instructions, which indicate where the diagnostic data file is being generated.
The ASR Diagnostic Utility is available as part of the ASR software bundle. After you download the latest ASR software bundle, access and run the utility:
Install SUNWswasr.<version>.<timestamp>.pkg (see "Install the ASR Package" for more information).
Verify the ASR Diagnostic Utility files are located under /opt/SUNWswasr/util/diag.
Run the utility.
Note:
Support for Oracle Enterprise Linux (OEL) begins with ASR 2.7.
The diag-config.properties file consists of a list of properties for specifying the location of the configuration and log directories. It also contains "toggle switches" for enabling and disabling the collection of a particular data set:
com.sun.svc.asr.util.diag.home.directory - The property for specifying where the diagnostic data .zip bundle will be generated. The default is the directory where the ASR Diagnostic Utility is located.
com.sun.svc.asr.util.diag.zip.file.prefix - The property for configuring the name of the diagnostic data .zip file.
com.sun.svc.asr.util.diag.zip.recursive - The property for enabling traversal into subdirectories of any configuration or log directories.
This section provides a variety of steps to check the state of the Service Tools Bundle (STB), which must be installed on most ASR systems. If issues arise during the installation and operation of ASR, STB may be part of the issue.
Open a browser window to the system you wish to check using the following URL. Be sure to include the / (slash) after agent.
http://asr_system_hostname:6481/stv1/agent/
A response similar to the following will be displayed:
<st1:response><agent><agent_urn><agent urn number></agent_urn><agent_version>1.1.4</agent_version><registry_version>1.1.4</registry_version><system_info><system>SunOS</system><host><your host name></host><release>5.10</release><architecture>sparc</architecture><platform>SUNW,Sun-Fire-V215::Generic_137111-06</platform><manufacturer>Sun Microsystems, Inc.</manufacturer><cpu_manufacturer>Sun Microsystems, Inc.</cpu_manufacturer><serial_number>0707FL2015</serial_number><hostid><host ID number></hostid></system_info></agent></st1:response>
If you do not get a response from the Service Tags agent, consult the Service Tags man pages:
man in.stlisten
man stclient
Follow the procedure below to check the Service Tags version:
Open a terminal window and log in as root to the ASR system you wish to check.
Run the following command to get the Service Tags version:
stclient -v
ASR requires Service Tags version 1.1.4 or later.
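The minimum-version requirement can also be checked in a script. The following is a minimal sketch, assuming you have already extracted the dotted version string (for example, 1.1.4) from the output of stclient -v. The function name version_at_least is hypothetical and is not part of Service Tags or ASR.

```shell
# Hypothetical helper (not part of Service Tags or ASR).
# Succeeds (exit status 0) if dotted version $1 is greater than or
# equal to dotted version $2; fails (exit status 1) otherwise.
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(need, r, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            # Missing components count as 0, so 1.2 compares equal to 1.2.0.
            a = (i <= n) ? h[i] + 0 : 0
            b = (i <= m) ? r[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}
```

For example, version_at_least "1.1.5" "1.1.4" succeeds, while version_at_least "1.1.3" "1.1.4" fails. Components are compared numerically, so 1.1.10 is treated as later than 1.1.4.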
Follow the procedure below to determine that the Service Tag discovery probe is running:
Open a terminal window and log in as root to the ASR system you wish to check.
To determine that the Service Tag discovery probe is running, run the following command:
svcs -l svc:/network/stdiscover
If the probe is running correctly, the following information is displayed:
fmri         svc:/network/stdiscover:default
name         Service Tag discovery probe
enabled      true
state        online
next_state   none
state_time   Wed Sep 03 21:07:28 2008
restarter    svc:/network/inetd:default
Follow the procedure below to determine that the Service Tags Listener is running:
Open a terminal window and log in as root to the ASR system you wish to check.
To determine if the Service Tags listener is running, run the following command:
svcs -l svc:/network/stlisten
If the listener is running correctly, the following information is displayed:
fmri         svc:/network/stlisten:default
name         Service Tag Discovery Listener
enabled      true
state        online
next_state   none
state_time   Wed Sep 03 21:07:28 2008
restarter    svc:/network/inetd:default
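When these checks are repeated on several systems, the state field can be read from the svcs -l output rather than by eye. The following is a minimal sketch, assuming the output lists one field per line with the field name first, as svcs -l normally prints it. The function names svc_state and svc_is_online are hypothetical.

```shell
# Hypothetical helper: reads "svcs -l" output on standard input and prints
# the value of the "state" field (for example: online, disabled, maintenance).
# The exact match on "state" skips the next_state and state_time lines.
svc_state() {
    awk '$1 == "state" { print $2; exit }'
}

# Hypothetical helper: succeeds only if the "svcs -l" output on standard
# input reports a state of "online".
svc_is_online() {
    [ "$(svc_state)" = "online" ]
}
```

A possible use is svcs -l svc:/network/stlisten | svc_is_online, which succeeds only when the listener reports a state of online.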
This message indicates that the activation failed during Service Tags discovery. Either Service Tags is not installed on the ASR Asset, or it is installed but not running. The issue can also be network connectivity between the ASR Manager and the ASR Asset. Complete the following checks:
Check if Service Tags is installed and running on an ASR Asset. Run:
stclient -x
If you cannot run this command, either Service Tags is not installed or not online.
Check if the Service Tags services are installed and online using the following command:
svcs | grep reg
The results should be similar to the following example:
online Aug_23 svc:/application/stosreg:default
online Aug_23 svc:/application/sthwreg:default
If you cannot find these services, it means Service Tags is not installed on the ASR asset.
If the Service Tags services are online, check if psncollector is online. Run:
svcs | grep psncollector
The results should be similar to the following example:
online Sep_09 svc:/application/psncollector:default
Make sure that there are no TCP Wrappers installed on the ASR asset to prevent any service tags discovery issues. Run the following command from the ASR Manager system:
wget http://<assetHostNameOrIPaddress>:6481/stv1/agent/
If there are TCP wrappers installed on the ASR asset, edit /etc/hosts.allow on the asset by adding:
in.stlisten: <SASM host name>
If the serial number is empty or "unknown", complete the following steps:
Input the correct serial number using the SNEEP command:
/opt/SUNWsneep/bin/sneep -s <serial number>
Note:
SNEEP is part of the Service Tools Bundle, which is a prerequisite of ASR (for more information, see "Install Service Tags").
For versions of SNEEP older than 2.6, enter the following command:
svcadm restart psncollector
Note:
If you are using SNEEP version 2.6, it is not necessary to manually restart psncollector after inputting the serial number.
You can view the serial number using the following URL:
http://<AgentipAddress>:6481/stv1/agent/
If the product name is empty or "unknown", check if the Hardware Service Tags are installed and online. Run:
svcs | grep sthwreg
The results should be similar to the following example:
online Aug_23 svc:/application/sthwreg:default
If you cannot find this service, it means Hardware Service Tags are not installed on the ASR asset.
This message indicates that the message creation failed because of bad or missing data. Most of the time, this error is the result of an incorrect or incomplete serial number or product name. To resolve this message, complete the following steps:
Verify the serial number using the SNEEP command:
sneep
If the serial number is not correct, input the correct serial number using the following SNEEP command:
/opt/SUNWsneep/bin/setcsn -c <serial number>
Note:
SNEEP is part of the Service Tools Bundle, which is a prerequisite of ASR (for more information, see "Install Service Tags").
For versions of SNEEP older than 2.6, run the following command:
svcadm restart psncollector
Note:
If you are using SNEEP version 2.6, it is not necessary to manually restart psncollector after inputting the serial number.
You can view the serial number using the following URL:
http://<AgentipAddress>:6481/stv1/agent/
Check if the Hardware Service Tags are installed and online. Run:
svcs | grep sthwreg
The results should be similar to the following example:
online Aug_23 svc:/application/sthwreg:default
If you cannot find this service, it means Hardware Service Tags are not installed on the ASR asset.
This error message indicates that the ASR Asset activation failed because the SASM IP address could not be retrieved. The final step for activating an ASR Asset includes this command:
asr activate_asset -i <host IP address>
When activation fails, the following error message displays:
Cannot retrieve the SASM IP address, please add the SASM IP address to /etc/hosts
You must edit the /etc/hosts file to update the localhost entry. For example, as root, change an entry that looks like this:
127.0.0.1 hostname123.com hostname123 localhost.localdomain localhost
to this:
127.0.0.1 localhost.localdomain localhost
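Before editing, it can help to see which names are currently attached to the loopback address. The following is a minimal sketch, assuming the problem entry looks like the "before" example above (extra host names on the 127.0.0.1 line). The function name loopback_extra_names is hypothetical. It only reads the file and does not change it.

```shell
# Hypothetical helper: prints any names mapped to 127.0.0.1 in the given
# hosts file (default: /etc/hosts) other than localhost and
# localhost.localdomain. Prints nothing if the loopback entry is clean.
loopback_extra_names() {
    awk '
        { sub(/#.*/, "") }               # strip comments before inspecting fields
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++)
                if ($i != "localhost" && $i != "localhost.localdomain")
                    print $i
        }
    ' "${1:-/etc/hosts}"
}
```

Run against the "before" example, the function prints hostname123.com and hostname123, one per line. Run against the corrected entry, it prints nothing.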
Service tag processes (stlisten and stdiscover) must be online in order to activate assets successfully.
Check to determine if the stdiscover or stlisten services are disabled. Run the following command:
svcs stlisten stdiscover
If the services have been disabled, the output would look like this:
STATE          STIME    FMRI
disabled       12:20:14 svc:/network/stdiscover:default
disabled       12:20:14 svc:/network/stlisten:default
To enable the stdiscover and stlisten services, run the following command:
svcadm enable stlisten stdiscover
Verify the services are online:
svcs stlisten stdiscover
Once the services have been enabled, the output would look like this:
STATE          STIME    FMRI
online         12:20:14 svc:/network/stdiscover:default
online         12:20:14 svc:/network/stlisten:default
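The verification step can be scripted by filtering the svcs listing for anything that is not online. The following is a minimal sketch, assuming the three-column STATE, STIME, FMRI layout with one service per line. The function name services_not_online is hypothetical.

```shell
# Hypothetical helper: reads "svcs" output (STATE STIME FMRI columns) on
# standard input and prints the FMRI of every listed service whose state
# is anything other than "online". The header line is skipped.
services_not_online() {
    awk 'NF >= 3 && $1 != "STATE" && $1 != "online" { print $3 }'
}
```

For example, svcs stlisten stdiscover | services_not_online prints nothing when both services are online, and prints the FMRI of each service that still needs attention otherwise.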
The SMA service needs to be online in order to support Solaris FMA enrichment data properly. Prior to configuring FMA, complete the following steps:
To check that the state of the SMA service is online, run:
svcs sma
If SMA is online, the state should indicate online, as in the following example:
STATE          STIME    FMRI
online         15:40:31 svc:/application/management/sma:default
If SMA is not online, run the following command to enable it:
svcadm enable sma
Repeat these steps to confirm SMA is online.
For diagnostic purposes, it may be necessary to check the state of various application bundles installed on the ASR Manager system using the following procedure.
Open a terminal window and log in as root to the ASR Manager.
Enter the following command:
asr diag
Review the results of this command below along with the settings you should see:
id   State     Bundle
263  ACTIVE    com.sun.svc.asr.sw_1.0.0
               Fragments=264, 265
264  RESOLVED  com.sun.svc.asr.sw-frag_1.0.0
               Master=263
265  RESOLVED  com.sun.svc.asr.sw-rulesdefinitions_1.0.0
               Master=263
266  ACTIVE    com.sun.svc.ServiceActivation_1.0.0
The state of each bundle should be as follows:
com.sun.svc.asr.sw bundle should be ACTIVE
com.sun.svc.asr.sw-frag should be RESOLVED
com.sun.svc.asr.sw-rulesdefinitions should be RESOLVED
com.sun.svc.ServiceActivation should be ACTIVE
If any of these states are incorrect, enter the following commands:
asr stop
asr start
Repeat steps 1 to 3.
To ensure everything is working properly, run the following commands:
asr test_connection
asr send_test
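The bundle-state comparison in this procedure can be automated. The following is a minimal sketch, assuming that asr diag prints each bundle on its own line with the id, state, and bundle name as the first three fields, and that the bundle name ends in a version suffix such as _1.0.0. The function name check_asr_bundles is hypothetical and is not part of ASR.

```shell
# Hypothetical helper: reads "asr diag" output on standard input and prints
# one line for each of the four ASR bundles whose state differs from the
# expected state, or which is not listed at all. Prints nothing when all
# four bundles are in the expected state.
check_asr_bundles() {
    awk '
        BEGIN {
            want["com.sun.svc.asr.sw"] = "ACTIVE"
            want["com.sun.svc.asr.sw-frag"] = "RESOLVED"
            want["com.sun.svc.asr.sw-rulesdefinitions"] = "RESOLVED"
            want["com.sun.svc.ServiceActivation"] = "ACTIVE"
        }
        # Bundle lines start with a numeric id; continuation lines do not.
        $1 ~ /^[0-9]+$/ && NF >= 3 {
            name = $3
            sub(/_[0-9.]+$/, "", name)    # drop the version suffix, e.g. _1.0.0
            if (name in want) {
                seen[name] = 1
                if ($2 != want[name])
                    print name ": " $2 " (expected " want[name] ")"
            }
        }
        END {
            for (name in want)
                if (!(name in seen))
                    print name ": not listed"
        }
    '
}
```

A possible use is asr diag | check_asr_bundles. Any output suggests running asr stop and asr start as described in the procedure, then checking again.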
When you are troubleshooting ASR, you can change the level of information displayed in the logs, and increase or decrease the number of logs that are saved before being overwritten. The logs are written to the sw-asr.log files. Log files are located on the ASR Manager system at /var/opt/SUNWsasm/log.
There are four levels of logs:
Fine: Displays the highest level of information. It contains fine, informational, warnings and severe messages.
Info: Displays not only informational data, but also both warnings and severe messages. This is the default setting.
Warning: Displays warnings and severe messages.
Severe: Displays the least amount of information; severe messages only.
The default number of logs collected and saved is 5. Once that number is reached, ASR begins overwriting the oldest file. You have the option to change the number of logs collected and saved. If you are gathering as much information as possible in a short time, you might want to limit the number of logs saved to accommodate the larger files.
Follow the procedure below to set logging levels:
Open a terminal window and log in as root on the ASR Manager system.
To view the current level of information being gathered, run:
asr get_loglevel
To change the logging level, run:
asr set_loglevel level
The choices for level are: Fine, Info, Warning, or Severe.
Follow the procedure below to set log file counts:
Open a terminal window and log in as root on the ASR Manager system.
To view the current number of logs being saved, enter the following command:
asr get_logfilecount
To change the number of logs being saved, enter the following command:
asr set_logfilecount <number>
Before installing ASR Manager on a blade system, make sure the status of the svc:/milestone/multi-user-server service is online.
To check the status of this service, run:
svcs svc:/milestone/multi-user-server
If the state indicates maintenance, run:
svcadm clear svc:/milestone/multi-user-server
svcadm enable svc:/milestone/multi-user-server
If the state indicates disabled, run:
svcadm enable svc:/milestone/multi-user-server
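The state-to-action mapping above can be written as a small case statement. The following is a minimal sketch. The function name milestone_fix_commands is hypothetical, and it only prints the svcadm commands for a given state so that they can be reviewed before being run as root.

```shell
# Hypothetical helper: given the state reported by
# "svcs svc:/milestone/multi-user-server", prints the svcadm commands that
# the procedure above calls for. It prints the commands; it does not run them.
milestone_fix_commands() {
    fmri="svc:/milestone/multi-user-server"
    case "$1" in
        online)
            ;;                              # already online, nothing to do
        maintenance)
            echo "svcadm clear $fmri"
            echo "svcadm enable $fmri"
            ;;
        disabled)
            echo "svcadm enable $fmri"
            ;;
        *)
            # States not covered by the procedure are reported, not guessed at.
            echo "unhandled state: $1" >&2
            return 1
            ;;
    esac
}
```

For example, milestone_fix_commands maintenance prints the clear command followed by the enable command, and milestone_fix_commands online prints nothing.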
If the ASR Manager is installed on a local zone, it is not possible to activate the ASR Manager as an ASR asset. If this is attempted, an error will be received: Asset cannot be activated due to unknown product name or serial number. This is a known issue expected to be corrected in a future version of ASR.
This section provides a variety of error conditions and resolution steps.
Error Message | Resolution |
---|---|
WARNING: Unable to retrieve fault details. For additional information and some insights into how to correct, please see the ASR Installation and Operations Guide - located at www.oracle.com/asr. See the ASR General Troubleshooting Section. | |
WARNING: This trap is rejected because the asset is disabled | Enable the ASR Asset using one of the following commands: or |
WARNING: this trap is rejected because SASM ASR Plugin is not activated | Enable the ASR Manager using one of the following commands: or |
WARNING: this trap is rejected because the asset is not found | Enable the ASR Asset using one of the following commands: or |
SEVERE: Cannot attach snmp trap to snmp service! | This indicates that there could be another process using port 162. Kill that process and then run: |
Failure to Register Errors | The sasm.log has more detailed information and a Java stacktrace on what failed during registration. When a failure error is encountered, additional details can be found in: |
No Such Host Exception | This error indicates that the host running ASR and the SASM ASR Plugin cannot resolve the name transport.sun.com. This usually means that you need to add an entry of transport.sun.com/198.232.168.156 in the /etc/hosts file (or the name_service file you are using). |
Not Authorized. The Sun Online Account provided could not be verified by the transport server | This error indicates that the communication between transport server and Oracle is down or busy. This can also indicate that the queue set-up is wrong or that the user does not have permissions to the queue. |
Socket Exception: Malformed reply from SOCKS server | This error indicates one of the following: |
If you get this error message when running an ASR command on the ASR Manager system, it indicates that only one command can go into the SASM admin port at a time. Each command can hold the connection for a maximum of 60 seconds before the SASM console kills the connection. Try executing the command again after 60 seconds. If you still get the same message, do the following:
Check if SASM is running:
ps -ef | grep SUNWsasm
Results:
root 16817 1 0 16:09:49 ? 4:24 java -cp /var/opt/SUNWsasm/lib/com.sun.svc.container.ManagementTier.jar:/var/opt
If SASM is running, kill the process using the following command:
kill -9 <Process_ID>
Restart SASM using the following command:
svcadm restart sasm
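The process ID needed for the kill step is the second column of the ps -ef output. The following is a minimal sketch for extracting it, assuming the standard ps -ef column layout shown in the results above. The function name sasm_pids is hypothetical.

```shell
# Hypothetical helper: reads "ps -ef" output on standard input and prints
# the PID (second column) of each process whose command line mentions
# SUNWsasm, skipping the grep process itself if it appears in the listing.
sasm_pids() {
    awk '/SUNWsasm/ && !/grep/ { print $2 }'
}
```

A possible use is ps -ef | sasm_pids, which for the sample result above prints 16817. Review the printed IDs before passing any of them to kill -9.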
ASR uses fault rules to filter the telemetry data sent from ASR Assets. This filtering is done to remove telemetry that contains no real fault data and general telemetry “noise.” The filtering process also ensures that telemetry that contains faults is reported. These fault rules can change as ASR improves its filtering and as new platforms and telemetry sources are supported by ASR. ASR installs a cron job on the ASR Manager system to periodically check Oracle's auto-update server for any new rules updates. When there are new rules, the ASR Manager automatically downloads and installs the latest rules bundle. If the cron job is not set to download the fault rules automatically, an email is sent to:
The email address of the Sun Online Account (SOA) associated with the ASR installation.
The contact assigned to the asset in My Oracle Support.
A distribution list assigned to the asset in My Oracle Support (optional)
For more information on fault rules, refer to:
Note:
If the asr heartbeat is disabled in crontab, you will not be notified by email when your ASR fault rules are out of date with the most current release. To be sure your fault rules are current, you can run the asr update_rules command from the ASR Manager system.
In cases where an ASR Manager experiences a critical failure, you can set up a new ASR Manager and reconfigure ASR Assets to report to the new host. The following steps describe a sample scenario:
An ASR Manager is set up (e.g., hostname: ASRHOST01, IP address: 10.10.10.1) and configured on the network. This ASR host is registered and activated to itself.
All ASR assets are configured to report failures to the ASR Manager host (ASRHOST01), and all ASR assets are activated on the host.
A critical failure occurs in the cabinet of ASRHOST01 (for example: a fire destroys the system and its data). The assets need to be attached to a different ASR Manager host (e.g., hostname: ASRHOST02).
A new ASR Manager is set up (e.g., hostname: ASRHOST02, IP address: 10.10.10.2) and configured on the network. The new ASR host is registered and activated to itself.
All ASR assets are now re-configured to report failures to the new ASR Manager host ASRHOST02, and the trap destination is changed to report failures to ASRHOST02.
All ASR assets are now activated on ASRHOST02.
Note:
To reduce the additional work involved in moving the ASR Manager to a different location (e.g., from ASRHOST01 to ASRHOST02), you can create an ASR backup on another host or on the existing host. Creating a backup is crucial when recovering from a crash (see "ASR Backup and Restore" for details on creating an ASR backup).