VMware Cloud Community
AnonAdmin
Enthusiast

ESXi 7.0 SSH session logs filling VCSA database

I recently discovered that ESXi 7.0 changed event logging to include SSH login and logout events. vCenter captures these events and stores them on the /storage/seat partition of the vCenter appliance. Our environment is busy enough that SSH connections to the ESXi hosts generate enough events to fill up the vCenter database (see attached image), and I can see a huge number of esx.audit.ssh.session.opened and esx.audit.ssh.session.closed events in the vCenter database event tables.

Of course this brought our vCenter appliance down, so I followed the instructions from https://kb.vmware.com/s/article/2119809 to reduce the disk space usage of the /storage/seat partition. I also increased the disk space for the vCenter appliance /storage/seat partition per https://kb.vmware.com/s/article/2145603 .

Will there be an option to filter out specific events, such as SSH events, in future releases of ESXi? In the meantime, can I create some type of cron job to regularly purge these specific event types from the database?

13 Replies
gibou13
Contributor

Hi,

Did you find a solution to your problem? I have the same issue.

Regards

AnonAdmin
Enthusiast

Unfortunately not - I continue to periodically monitor the partition sizes on each of my vCenter appliances and follow KB 2119809 to reduce the space usage of the /storage/seat partition when needed. I'm concerned that this issue is low on the priority list and may even have been engineered on purpose, since VMware competes with Nutanix.
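
For anyone who wants to script that periodic check, here is a minimal sketch. It assumes Python 3 is available wherever the check runs; the 80% threshold and the function names are illustrative choices, not anything taken from the KB.

```python
import shutil


def seat_usage_percent(path="/storage/seat"):
    """Return the used space of the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def needs_cleanup(percent_used, threshold=80.0):
    """True once usage reaches the alert threshold (80% is an arbitrary assumption)."""
    return percent_used >= threshold
```

Run from cron, a wrapper could call `needs_cleanup(seat_usage_percent())` and send a notification, leaving the actual cleanup to the KB 2119809 procedure.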

sjesse
Leadership

Not really sure what that would have to do with Nutanix? Sounds like this would affect any vCenter, depending on the number of SSH sessions. It's more likely they don't expect a large number of SSH sessions, since most things can be done with other tools.

gibou13
Contributor

Yes, I have the same problem. Nutanix hosts generate these SSH connections, but I have not yet found a way to disable auditing of SSH connections.

AnonAdmin
Enthusiast

@sjesse That's a fair point - I may be reading into it too much. 

Ajay1988
Expert

This issue was reported a few months back by a customer whose Nutanix Controller VMs kept logging in to the hosts and rapidly creating the sessions below.

esx.audit.ssh.session.closed and esx.audit.ssh.session.opened.

SSH to the VCSA, cd to /storage/seat/vpostgres, run du -shc * and share the output.

Can you connect to VCDB (/opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres) and run the query below:

SELECT COUNT(EVENT_ID) AS NUMEVENTS, EVENT_TYPE, USERNAME FROM VPXV_EVENT_ALL GROUP BY EVENT_TYPE, USERNAME ORDER BY NUMEVENTS DESC LIMIT 5;

Note: This query can take some time.
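
If you would rather run this from a script than from an interactive psql prompt, the following is a rough sketch only. The psql path, database name and user are the ones quoted above; the `-A`, `-t`, `-F` and `-c` switches are standard psql options; the function names are made up for illustration, and it assumes Python 3 is available on the appliance.

```python
import subprocess

PSQL = "/opt/vmware/vpostgres/current/bin/psql"  # path as quoted in this post

TOP_EVENTS_SQL = (
    "SELECT COUNT(EVENT_ID) AS NUMEVENTS, EVENT_TYPE, USERNAME "
    "FROM VPXV_EVENT_ALL GROUP BY EVENT_TYPE, USERNAME "
    "ORDER BY NUMEVENTS DESC LIMIT {limit};"
)


def build_psql_command(limit=5):
    """Build the psql argv: unaligned (-A), tuples only (-t), '|' separated (-F)."""
    sql = TOP_EVENTS_SQL.format(limit=int(limit))
    return [PSQL, "-d", "VCDB", "-U", "postgres", "-A", "-t", "-F", "|", "-c", sql]


def run_top_events(limit=5):
    """Run the query and return a list of (count, event_type, username) tuples."""
    result = subprocess.run(
        build_psql_command(limit), capture_output=True, text=True, check=True
    )
    rows = []
    for line in result.stdout.splitlines():
        if line.strip():
            count, event_type, username = line.split("|", 2)
            rows.append((int(count), event_type, username))
    return rows
```

The query is read-only, so this only reports the top event types; it does not change anything in VCDB.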

I am quite certain Nutanix is making these connections and filling up VCDB faster. Last I remember, VMware Engineering was asking for Nutanix's involvement to explain why so many connections are made.

If you think your queries have been answered
Mark this response as "Correct" or "Helpful".

Regards,
AJ
gibou13
Contributor

I truncated the vpx_event* tables this morning, and this is the query result so far:

97241 esx.audit.ssh.session.opened
97188 esx.audit.ssh.session.closed
13259 vim.event.UserLogoutSessionEvent root
13259 vim.event.UserLoginSessionEvent root
2082 com.vmware.vc.EventBurstStartedEvent
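
To put those numbers in proportion, here is a small sketch that parses the rows pasted above and works out how much of the top-5 total comes from SSH session events. The whitespace-separated row format and the function names are assumptions for illustration; the counts are copied from the output above.

```python
PASTED = """
97241 esx.audit.ssh.session.opened
97188 esx.audit.ssh.session.closed
13259 vim.event.UserLogoutSessionEvent root
13259 vim.event.UserLoginSessionEvent root
2082 com.vmware.vc.EventBurstStartedEvent
"""


def parse_rows(text):
    """Parse 'count event_type [username]' lines into tuples."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        username = parts[2] if len(parts) > 2 else ""
        rows.append((int(parts[0]), parts[1], username))
    return rows


def ssh_share(rows):
    """Fraction of the listed events that are esx.audit.ssh.session.* events."""
    total = sum(count for count, _, _ in rows)
    ssh = sum(
        count for count, event_type, _ in rows
        if event_type.startswith("esx.audit.ssh.session.")
    )
    return ssh / total if total else 0.0


rows = parse_rows(PASTED)
print(f"{ssh_share(rows):.1%} of the listed events are SSH session events")
```

Because the query uses LIMIT 5, this is the share among the five most frequent event types only, not among all events in the database.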

Ajay1988
Expert

Almost 1 lakh (100,000) in a day is too high. I suppose Nutanix needs to explain why they are doing so many logins and logouts.

Regards,
AJ
AllBlack
Expert

This is happening because VMware changed its logging behaviour in vSphere 7.
In theory this could affect any other platform, but I guess that is less likely. Nutanix uses SSH excessively for communication between the CVM and the hypervisor. I am not sure why VMware decided to start logging this, and I do not know whether you can disable logging of these events.
The workaround is to increase the SEAT partition and/or reduce retention.
Also, set up vCenter alerts to monitor health.

I have attached a Nutanix KB that explains it in more detail.

Please consider marking my answer as "helpful" or "correct"
Ajay1988
Expert

The recommendation is always to keep the SSH service stopped on the ESXi hosts and only start it for ad-hoc tasks.
It is good that VMware started logging this information. This was long overdue.

I would say reduce retention. Increasing the SEAT partition would increase the storage requirement for a future upgrade, but you can still do it if space is not a concern.
https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vcenter.install.doc/GUID-FB268055-5D36-4624...

Regards,
AJ
DavidGriswoldeB
Enthusiast

You can't disable the SSH service on a Nutanix-backed ESXi cluster. SSH is required for the CVM to communicate with ESXi. I am not saying it is a good or bad decision; it is just a fact.

gatornut2
Contributor

Increasing any of the drives of the VCSA forces a move to a larger deployment model in a future upgrade, and if you enlarge them enough it goes to the maximum size. So it affects more than just space: it affects CPU and memory too.

I don't recommend enlarging the drives unless you have no other choice; we have seen negative consequences of doing so.

mmi9567
Contributor

Nutanix Cluster Check (NCC) runs its health checks about every minute or so and uses SSH to run its host health collection routines. The more Nutanix hosts vCenter manages, the more login/logout events it will capture. IMO we should request a feature that adds per-event-type granularity to SEAT retention in vCenter, so we can capture just what we need.
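
A back-of-the-envelope sketch of how that scales with host count. It assumes one SSH session per host per check, and one opened plus one closed event per session; both are assumptions for illustration, and the real number of sessions per health check may be higher.

```python
MINUTES_PER_DAY = 24 * 60


def ssh_events_per_day(hosts, check_interval_minutes=1.0, sessions_per_check=1):
    """Estimate daily SSH audit events (opened + closed) captured by vCenter."""
    checks_per_day = MINUTES_PER_DAY / check_interval_minutes
    events_per_session = 2  # one session.opened plus one session.closed
    return round(hosts * checks_per_day * sessions_per_check * events_per_session)
```

Under those assumptions, a single host checked every minute yields 2,880 events per day (1,440 opened and 1,440 closed), and the total grows linearly with the number of hosts.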
