VMware Cloud Community
LeighSat
Contributor

Operations Manager Alerts/Notifications

Hi All,

Can anyone advise whether I can configure Operations Manager to alert me if a
virtual server has crashed? I need it to send me an email if a particular
server has crashed.

I know Operations Manager can send emails when the Critical, Immediate, and
Warning levels are met in the following areas: Administrative Alerts, Health,
Risk, and Efficiency. However, I would only like to be notified by email if a
server has crashed, and not get bombarded with all the other alerts that also
come through.

If anyone could assist with my question, it would be much appreciated.

Many Thanks


Accepted Solutions
LeighSat
Contributor


Hi All,

I have found a way to get an alert for a server that has crashed or is in a non-responsive state. It is a roundabout way to get alerts on unresponsive servers, but it works.

How to:

1. Within vSphere, turn on VMware HA (High Availability) on the cluster that contains the servers you want alerts for.

2. Under the HA settings, configure VM Monitoring with the parameters you require.

3. Within vSphere, click the Alarms tab at the cluster level or on the VM you would like to monitor.

4. Right-click and select New Alarm, then fill in each tab:

   General tab: Enter a name -> Monitor: Virtual Machine -> Monitor for specific events occurring on this object

   Triggers tab: Click the Add button -> Under Event, change the drop-down to "HA enabled VM reset with screenshot"

   Actions tab: Click the Add button -> Send a notification email -> Enter the email address where you would like the alert to go

5. An email will be sent advising that the server has been restarted. You then know there was a problem on the server and can check its event logs.

The above works; I have tested it by simulating a non-responsive server: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100730...

The alarm checks for when VMware HA resets the server, and when this occurs an email is sent. VMware HA polls the locally installed VMware Tools and local disk I/O on the server. If these become unresponsive, because the server has blue-screened or for whatever other reason, HA resets the server.
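
As a rough illustration of that logic, the sketch below models the decision in Python. It is a toy model under stated assumptions, not VMware's implementation: the `VmStatus` fields, the `should_reset` and `notify_on_reset` functions, and the 30-second and 120-second thresholds are hypothetical names and values picked for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VmStatus:
    """Hypothetical snapshot of what a monitor knows about one VM."""
    seconds_since_heartbeat: float  # time since the last VMware Tools heartbeat
    seconds_since_disk_io: float    # time since disk I/O was last observed


def should_reset(status: VmStatus,
                 heartbeat_timeout: float = 30.0,
                 io_window: float = 120.0) -> bool:
    """Return True when the VM looks hung: no heartbeat AND no recent disk I/O.

    Both thresholds are assumed values for illustration; in practice they
    come from the VM Monitoring settings configured in step 2.
    """
    heartbeat_lost = status.seconds_since_heartbeat > heartbeat_timeout
    io_idle = status.seconds_since_disk_io > io_window
    return heartbeat_lost and io_idle


def notify_on_reset(vm_name: str, status: VmStatus) -> Optional[str]:
    """Return the alert text that would be emailed, or None if the VM looks healthy."""
    if should_reset(status):
        return f"{vm_name} was reset by HA after becoming unresponsive"
    return None
```

In this model a VM that is busy with disk I/O but has stopped sending heartbeats is not reset, which is why a reset email is a reasonable signal that the guest really did hang.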

Many Thanks for your posts.

9 Replies
mlebied
Enthusiast

While there might be a way to configure vCOps to perform this function (alerts?), a better tool would be an operating system element manager such as Hyperic, System Center, OpenView, or the like.

LeighSat
Contributor

We already have a tool like the ones you mentioned, called WhatsUpGold, but we are trying to reduce our licensing costs. Since we get Operations Manager for free with our current VM licensing model, we would prefer to use that instead.

gradinka
VMware Employee

LeighSat,

How do you determine that a server has crashed? In other words, what is your definition for alerting on such a situation?

Is it loss of connectivity for xx minutes, monitoring of some process, or something else?

Alexander_Dimi1
Hot Shot

Hi Leighsat,

Take a look at this discussion: http://communities.vmware.com/thread/449319

It explains how you can monitor services inside VMs. You can use that technique to monitor the VMware vCenter service and detect whether it is running.

The same is true for other services (MS SQL Server, etc.).

Let me know if this works for your case.

Alex D.

mlebied
Enthusiast

That's a great question. We have experimented with multiple methods of detecting a host crash/reboot. We have used a few of the following, with varying results:

1. Check the output of the uptime command to determine whether the server has recently restarted, i.e. 1 min < uptime < 5 mins.

2. Check for the specific Windows event log ID associated with server startup (I don't have the exact Event ID, but it's easy to find with a search).

3. Loss of ICMP ping for a specified number of intervals. This will detect loss of communications, but it can be an unreliable mechanism, as network issues can produce a false alert.

4. Monitoring agent startup.
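
Checks 1 and 3 above could be sketched roughly as follows. This is a minimal illustration in Python, not taken from any of the tools mentioned in this thread; the function names, the 1-to-5-minute window, and the threshold of three missed intervals are all assumptions chosen for the example.

```python
from typing import Iterable


def recently_rebooted(uptime_seconds: float,
                      min_seconds: float = 60.0,
                      max_seconds: float = 300.0) -> bool:
    """Return True if uptime falls inside the 'just restarted' window.

    The default window (1 to 5 minutes) mirrors check 1; both bounds are
    assumptions and should be tuned to the polling interval.
    """
    return min_seconds < uptime_seconds < max_seconds


def ping_loss_alert(results: Iterable[bool], max_missed: int = 3) -> bool:
    """Return True once `max_missed` consecutive ping results have failed.

    `results` holds one boolean per polling interval (True = reply received).
    Requiring several consecutive misses reduces, but does not eliminate,
    false alerts caused by brief network problems.
    """
    missed = 0
    for ok in results:
        missed = 0 if ok else missed + 1
        if missed >= max_missed:
            return True
    return False
```

How the uptime value and ping results are collected is left out here, since that depends on the monitoring tool in use.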

Alexander_Dimi1
Hot Shot

I think network connectivity is a must for any server crash detection technique. Also, a server without a network connection is usually equivalent to a crashed server ("If a tree falls in a forest and no one is around to hear it, does it make a sound?"). Either way, an alert in that case doesn't sound like a false positive.

Or do you have other use cases in mind?

Alexander_Dimi1
Hot Shot

It's good that this can be solved with only one product (vCenter Server) rather than by integrating vCenter / Hyperic / VCOps or some other combination of products.

The simpler, the better.

Alex D.
