Oracle EM auto discovery fails if the host is under blackout
During an Exadata project I had to tidy up the Enterprise Manager targets. I decided to discover all targets, promote the ones that were missing, and delete the old or stale ones. I hit a strange error and decided to share it in case someone else runs into it. I’m running Oracle Enterprise Manager Cloud Control 12c Release 2.
When you try to run auto discovery from the console for a host you almost immediately get the following error:
Run Discovery Now failed on host oraexa201.host.net: oracle.sysman.core.disc.common.AutoDiscoveryException: Unable to run on discovery on demand.RunCollection: exception occurred: oracle.sysman.gcagent.task.TaskPreExecuteCheckException: non-existent, broken, or not fully loaded target
Reviewing the agent log shows the following exception:
tail $ORACLE_HOME/agent/sysman/log/gcagent.log
2013-07-16 11:26:06,229 [34:2C47351F] INFO - >>> Reporting exception: oracle.sysman.emSDK.agent.client.exception.NoSuchMetricException: the DiscoverNow metric does not exist for host target oraexa201.host.net (request id 1) <<<
oracle.sysman.emSDK.agent.client.exception.NoSuchMetricException: the DiscoverNow metric does not exist for host target oraexa201.host.net
I got a different error message during my second (test) run:
2013-08-02 15:20:47,155 [33:B3DBCC59] INFO - >>> Reporting response: RunCollectionResponse ([DiscoverTargets : host.oraexa201.host.net oracle.sysman.emSDK.agent.client.exception.RunCollectionItemException: Metric evaluation failed : RunCollection: exception occurred: oracle.sysman.gcagent.task.TaskPreExecuteCheckException: non-existent, broken, or not fully loaded target @ yyyy-MM-dd HH:mm:ss,SSS]) (request id 1) <<<
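The two exception signatures quoted above can be searched for in the agent log directly. This is a minimal sketch, assuming a POSIX shell with grep; the function name find_discovery_errors is hypothetical, and the log path is whatever your agent installation uses.

```shell
#!/bin/sh
# Hypothetical helper: print the agent log lines that carry either of the two
# exception signatures seen when discovery fails on a blacked-out host.
# Returns non-zero when neither signature is present in the file.
#
# Example call (path taken from the tail command above):
#   find_discovery_errors "$ORACLE_HOME/agent/sysman/log/gcagent.log"
find_discovery_errors() {
    logfile=$1
    grep -E 'NoSuchMetricException: the DiscoverNow metric does not exist|TaskPreExecuteCheckException: non-existent, broken, or not fully loaded target' "$logfile"
}
```

A match only says that discovery failed with one of these messages; it does not prove that a blackout is the cause, so the blackout status still needs to be checked.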
Although emctl status agent shows that the last successful heartbeat and upload are up to date, you still cannot discover targets on the host.
This is caused by the fact that the host is under BLACKOUT! There are two ways to end it:
1. Through the console end the blackout for that host:
Go to Setup -> Manage Cloud Control -> Agents, find the agent that has the problem, and click on it. You can clearly see that the status of the agent is “Under Blackout”. Select the Agent drop-down menu -> Control, and then End Blackout.
2. Using emcli: first log in, list the blackouts, and then stop the blackout:
[oracle@em ~]$ emcli login -username=sysman
Enter password
Login successful
[oracle@em ~]$ emcli get_blackouts
Name           Created By  Status   Status ID  Next Start           Duration  Reason                      Frequency  Repeat  Start Time           End Time             Previous End  TZ Region      TZ Offset
test_blackout  SYSMAN      Started  4          2013-08-02 15:06:43  01:00     Hardware Patch/Maintenance  once       none    2013-08-02 15:06:43  2013-08-02 16:06:43  none          Europe/London  +00:00
List the targets that are under blackout, and then stop the blackout:
[oracle@em ~]$ emcli get_blackout_targets -name="test_blackout"
Target Name                        Target Type      Status       Status ID
has_oraexa201.host.net             has              In Blackout  1
oraexa201.host.net                 host             In Blackout  1
TESTDB_TESTDB1                     oracle_database  In Blackout  1
oraexa201.host.net:3872            oracle_emd       In Blackout  1
Ora11g_gridinfrahome1_1_oraexa201  oracle_home      In Blackout  1
OraDb11g_home1_2_oraexa201         oracle_home      In Blackout  1
agent12c1_3_oraexa201              oracle_home      In Blackout  1
sbin12c1_4_oraexa201               oracle_home      In Blackout  1
LISTENER_oraexa201.host.net        oracle_listener  In Blackout  1
+ASM_oraexa2-cluster               osm_cluster      In Blackout  1
+ASM4_oraexa201.host.net           osm_instance     In Blackout  1
[oracle@em ~]$ emcli stop_blackout -name="test_blackout"
Blackout "test_blackout" stopped successfully
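Before stopping a blackout it can be useful to confirm that it really covers the host you are trying to discover. The check against the target listing above can be sketched as follows. This assumes a POSIX shell with awk and assumes the listing is whitespace-separated with the target name and target type in the first two columns, as in the output shown; the function name host_in_blackout is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `emcli get_blackout_targets` output on stdin and
# succeed (exit 0) only if the named target of type "host" is "In Blackout".
#
# Example call, using the commands from the session above:
#   emcli get_blackout_targets -name="test_blackout" \
#       | host_in_blackout oraexa201.host.net \
#       && emcli stop_blackout -name="test_blackout"
host_in_blackout() {
    awk -v name="$1" '
        # Column 1 is the target name, column 2 the target type,
        # columns 3 and 4 together form the status "In Blackout".
        $1 == name && $2 == "host" && $3 == "In" && $4 == "Blackout" { found = 1 }
        END { exit !found }
    '
}
```

Matching on the target type as well as the name avoids false positives from related targets such as has_oraexa201.host.net or oraexa201.host.net:3872.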
And now when the discovery is run again:
Run Discovery Now – Completed Successfully
I could not reproduce the error right after setting the blackout; it appeared only after I restarted the EM agent.
22 Oct 2013 Update:
After the update to 12c (described here), a meaningful error is now raised when you try to discover targets during an agent blackout:
Run Discovery Now failed on host oraexa201.host.net: oracle.sysman.core.disc.common.AutoDiscoveryException: Unable to run on discovery on demand.the target is currently blacked out