Are the Windows agents set up to collect only Windows Event Logs and no other logs? Also, is anything else pointed at the Log Insight syslog server, or are other devices perhaps sending syslog to the ESX hosts/vCenter?
Edit: I just noticed that our "average active OSIs" also jumped by 21 in the last 24 hours, with no changes on our side!
Interesting find. I'm in the same boat as well. This discrepancy causes me to be "out of compliance" with the terms of my license. The Administration > License page shows an "Average Active OSI" count that is more than double the devices I am capturing logs for. I am not using the Windows agent feature. Just VMhosts, vCenter and vCOPs.
It might be because you have devices that are sending logs via both IP and FQDN. To check which devices are sending logs, go to IA and run a non-time-series query for the unique count of hostname, grouped by hostname. Then hover over each bar in the chart to see which devices LI sees.
This seems to be a flaw in the license tracking mechanism, as the additional host counts are coming from app name "vcenter-server", 'source' <vCenterName>, 'hostname' <ESX Host>. However, the additional ESX hosts listed are not configured to send syslog to Log Insight.
I have some clusters that I don't manage with Log Insight yet, but they are part of the vCenter. When the vCenter sends its log data to Log Insight and an event includes the hostname of an ESXi host (even one not managed by Log Insight), Log Insight counts that ESXi host as an OSI.
Ah, makes sense! It is because ESXi hosts are being counted more than once, and possibly non-configured ESXi hosts are being counted as well. The issue has to be with vCenter Server event, task, and alarm collection. The hostname field of those events is the ESXi host and not the vCenter Server. Thanks for reporting!
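For anyone who wants to see how much of their count comes from this effect, here is a minimal Python sketch. It assumes the events have already been exported from Log Insight into dictionaries with `appname`, `source`, and `hostname` keys; that export format, the function name `split_hostnames`, and the sample data are all hypothetical and not part of any Log Insight API. It separates hostnames that send their own syslog from hostnames that only ever appear in events relayed by vCenter.

```python
from collections import defaultdict


def split_hostnames(events):
    """Return (direct, relayed_only) sets of hostnames.

    A hostname is 'direct' if at least one event carrying it has an appname
    other than "vcenter-server"; otherwise it was only seen via vCenter.
    """
    appnames_by_host = defaultdict(set)
    for event in events:
        appnames_by_host[event["hostname"]].add(event["appname"])

    direct = {
        host for host, apps in appnames_by_host.items()
        if apps != {"vcenter-server"}
    }
    relayed_only = set(appnames_by_host) - direct
    return direct, relayed_only


# Hypothetical sample data, shaped like the fields mentioned in this thread.
events = [
    {"appname": "hostd", "source": "esx01.example.org", "hostname": "esx01.example.org"},
    {"appname": "vcenter-server", "source": "vc01.example.org", "hostname": "esx01.example.org"},
    {"appname": "vcenter-server", "source": "vc01.example.org", "hostname": "esx02.example.org"},
]

direct, relayed_only = split_hostnames(events)
print("sending their own syslog:", sorted(direct))
print("only seen via vCenter:  ", sorted(relayed_only))
```

Hostnames in the second set are candidates for the extra OSIs described above, since they were never configured to log to Log Insight themselves.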
Well said, that's it exactly. BTW, thanks for all the hard work. The Log Insight team rocks!
For what it's worth, the license page does not enforce compliance so the good news is Log Insight will continue to function even if out of compliance. I will update this thread when more information is available.
Hey Andreas, does the information above answer your question? If so, can you mark this question as answered?
One further piece of information: if you want to see active OSIs in the UI, I would advise voting for this feature (if you have not already): List of hosts submitting logs in Administration view - The VMware Log Insight Community
I ran the query for unique hostnames and discovered that the two hosts are listed twice with different FQDNs (e.g. host01.example.org and host01.internal.example.org).
The hosts have two VMkernel/Management interfaces and IP addresses each, an external one and an internal one on a host-only switch.
They are registered with their external IP addresses and FQDNs to vCenter and have that external FQDN (host01.example.org) configured as "Host identification" in their DNS settings, but they log to LI through the internal interface with the internal IP address that resolves to a different FQDN (host01.internal.example.org).
This is probably a very uncommon network setup ...
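In case someone wants to check for the same situation, here is a small Python sketch that takes the list of hostnames returned by the unique-hostname query and reports machines that show up under more than one FQDN. It assumes the first DNS label identifies the machine, which holds for my naming scheme but may not for yours; the function name `find_duplicate_hosts` and the sample list are made up for illustration.

```python
import ipaddress
from collections import defaultdict


def _is_ip(name):
    """Return True if the string is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def find_duplicate_hosts(hostnames):
    """Group names by their first DNS label and keep groups with >1 name.

    Literal IP addresses are skipped, because splitting them on "." would
    group unrelated machines by their first octet.
    """
    groups = defaultdict(set)
    for name in hostnames:
        if _is_ip(name):
            continue
        short = name.split(".", 1)[0].lower()
        groups[short].add(name.lower())
    return {short: sorted(names) for short, names in groups.items() if len(names) > 1}


# Hypothetical list, as it might be copied from the query result.
names = ["host01.example.org", "host01.internal.example.org", "host02.example.org"]
print(find_duplicate_hosts(names))
```

Each key in the result is a machine that is probably being counted as more than one OSI.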
For the two hostnames, check an event from each hostname and hover over the source field for the event. Is the source the same for both hostnames? My guess is that it is not (one is likely vCenter).
The source name is always the same: It is always the internal DNS name of the host (host01.internal.example.org), probably because LI determines it by doing a DNS reverse lookup on the sending IP address.
The hostname though is the external name (host01.example.org) in 99% of the entries. But in very few cases LI has problems parsing the syslog message, cannot determine the hostname from the message and falls back to using the source name for the hostname instead. And this is why the internal DNS name appears as an additional hostname and OSI.
The problematic events always look like this in LI:
Section for VMware ESX, host01.example.org hostd-probe: id=3474965, version=5.5.0, build=1881737, option=Release
Normally all messages start with a time stamp, but these don't, and this is probably causing the failure in parsing and hostname detection.
You can find these messages in /var/log/vpxa.log on the sending host, and there they look like this:
Section for VMware ESX, pid=34741, version=5.5.0, build=1881737, option=Release
Here the timestamp is also missing, and overall they look very different from all the other messages in vpxa.log. I think if you look at your own hosts' logs then you will notice the same.
So the root cause of the issue (at least in my case) is an unexpected syslog event coming from vpxa.log.
Update: Messages starting with "Section for VMware ESX" are not only in vpxa.log, but also in hostd.log, rhttpproxy.log and hostd-probe.log. I found a message containing "pid=3474965" in hostd-probe.log, so this is the source of the evil.
Ah yes, if the hostname is not in the location specified by the syslog RFC, then LI will not be able to extract it. LI ensures that every event received over syslog contains both a source and a hostname, so if the hostname cannot be determined, the source field is used instead. This would explain the behavior you are seeing.
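The fallback can be pictured with a minimal Python sketch. This is not Log Insight's actual parser, only an illustration of the behavior described in this thread: the header regex is a simplified assumption that accepts an optional priority, an RFC 3164 style or ISO 8601 style timestamp, and then a hostname. The function name `extract_hostname` and the first sample message are hypothetical; the second sample is the vpxa.log line quoted earlier.

```python
import re

# Simplified syslog header: optional <PRI>, a timestamp, then the hostname.
HEADER = re.compile(
    r"^(?:<\d{1,3}>)?"                                  # optional priority
    r"(?:[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}"       # e.g. "Aug  1 12:00:00"
    r"|\d{4}-\d{2}-\d{2}T\S+)"                          # e.g. "2014-08-01T12:00:00.123Z"
    r" (?P<hostname>\S+) "
)


def extract_hostname(raw_message, source):
    """Return the hostname from the syslog header, or the source as fallback."""
    match = HEADER.match(raw_message)
    return match.group("hostname") if match else source


source = "host01.internal.example.org"  # reverse-DNS name of the sending IP

normal = "<166>2014-08-01T12:00:00.123Z host01.example.org Hostd: started"
odd = "Section for VMware ESX, pid=34741, version=5.5.0, build=1881737, option=Release"

print(extract_hostname(normal, source))
print(extract_hostname(odd, source))
```

The first message carries a timestamp, so the hostname comes from the header. The second has no timestamp, the header does not match, and the source name ends up in the hostname field, which is how the internal FQDN becomes an additional OSI.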