
Rajeev's Blog

46 posts

To get out of this situation, do the following:

      • Restart the host, turn off Secure Boot in the UEFI firmware, and boot with Secure Boot disabled.
      • Once booted, log in to the host, remove the offending VIB (its name is shown on the PSOD screen), and shut down.
      • Re-enable Secure Boot and restart the host. The system should now boot normally.

When using the VMXNET3 driver on ESXi 4.x, 5.x, 6.x, you see significant packet loss during periods of very high traffic bursts

Cause

This issue occurs when packets are dropped during high traffic bursts. This can occur due to a lack of receive and transmit buffer space, or when receive traffic is speed-constrained, for example by a traffic filter.

 

To resolve this issue, ensure that there is no traffic filtering occurring (for example, with a mail filter). After eliminating this possibility, slowly increase the number of buffers in the guest operating system.

To reduce burst traffic drops by adjusting the buffer settings in Windows:

  1. Click Start > Control Panel > Device Manager.
  2. Right-click vmxnet3 and click Properties.
  3. Click the Advanced tab.
  4. Click Small Rx Buffers and increase the value. The maximum value is 8192.
  5. Click Rx Ring #1 Size and increase the value. The maximum value is 4096.

Note:-

  • These changes take effect on the fly, so no reboot is required. However, any application sensitive to TCP session disruption may fail and need to be restarted. This applies to RDP, so it is better to do this work from a console window.
  • This issue is seen in the Windows guest operating system with a VMXNET3 vNIC. It can occur with versions besides 2008 R2.
  • It is important to increase the value of Small Rx Buffers and Rx Ring #1 gradually to avoid drastically increasing the memory overhead on the host and possibly causing performance issues if resources are close to capacity.
  • If this issue occurs on only 2-3 virtual machines, set the value of Small Rx Buffers and Rx Ring #1 to the maximum value. Monitor virtual machine performance to see if this resolves the issue.
  • The Small Rx Buffers and Rx Ring #1 variables affect non-jumbo frame traffic only on the adapter.
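
The advice to raise the values gradually can be sketched as a simple stepping schedule. This is a minimal Python sketch, not a VMware tool: the function names and the doubling step are assumptions chosen for illustration, and only the two maximums (8192 and 4096) come from the steps above.

```python
# Maximums quoted in the steps above.
MAX_SMALL_RX_BUFFERS = 8192
MAX_RX_RING1_SIZE = 4096


def next_buffer_value(current, maximum):
    """Return the next value to try: double the current one, capped at maximum.

    Doubling is an assumed step size, used here for illustration only.
    """
    if current <= 0:
        raise ValueError("current must be positive")
    return min(current * 2, maximum)


def schedule(start, maximum):
    """List the successive values to try, from start up to the maximum."""
    values = [start]
    while values[-1] < maximum:
        values.append(next_buffer_value(values[-1], maximum))
    return values
```

With a hypothetical starting value of 1024, `schedule(1024, MAX_RX_RING1_SIZE)` yields 1024, 2048, 4096. Apply one step at a time and monitor the virtual machine and host before moving to the next.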

Symptoms

 

After upgrading the operating system where VMware components are installed, you experience these symptoms:

 

  • You are unable to start the VMware VirtualCenter Server, VMware VirtualCenter Management Webservices, or VMware vSphere Web Client service
  • Starting a service fails with the error:

    Error 1075: The dependency service does not exist or has been marked for deletion

  • The operating system was upgraded to Microsoft Windows Server 2012 R2 from Windows Server 2008 R2

Cause

 

This issue occurs due to the change of services and service names in Microsoft Windows Server 2012 R2.

 

In Microsoft Windows Server 2008 R2, the VMware Virtual Center Server and VMware vSphere Web Client services are dependent on the Protected Storage service. In Microsoft Windows Server 2012 R2, the Protected Storage service is renamed to the Security Accounts Manager service. During the operating system upgrade, the VMware service dependencies are not updated with the new service names. This results in the VMware Virtual Center Server service not being able to start as the Protected Storage service no longer exists.
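
The root cause described above amounts to a stale name in a service's dependency list. The following minimal Python sketch only illustrates that rename on a plain list of names. It is not the procedure from the KB and it does not touch the registry or any service configuration. The internal service names `ProtectedStorage` and `SamSs` are assumptions for the Protected Storage and Security Accounts Manager services and should be verified on the system in question.

```python
# Assumed internal service names (not stated in the post).
OLD_DEPENDENCY = "ProtectedStorage"  # Protected Storage (Windows Server 2008 R2)
NEW_DEPENDENCY = "SamSs"             # Security Accounts Manager (Windows Server 2012 R2)


def update_dependencies(dependencies):
    """Return a copy of the list with the old dependency replaced by the new one.

    Names are compared case-insensitively; duplicates are dropped and the
    original order is preserved.
    """
    updated = []
    for name in dependencies:
        replacement = NEW_DEPENDENCY if name.lower() == OLD_DEPENDENCY.lower() else name
        if replacement.lower() not in (existing.lower() for existing in updated):
            updated.append(replacement)
    return updated
```

A dependency list that still names the old service is what produces Error 1075: the service control manager cannot find a service that no longer exists.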

 

I followed this KB (2112741)

 

VMware Knowledge Base

We took the vSAN cluster offline and reinstalled all hosts, but forgot to detach the drives before re-imaging the servers.

After the re-image we could not see any NVMe SSDs in ESXi 6.7. I found that the driver was missing from the VIB list.

I installed the driver on all hosts and rebooted.

Now I can see all the drives, but their partitions had to be erased before creating the vSAN cluster.

 

Download this driver and install it from the command line:

 

(VMware ESXi 6.7 intel-nvme-vmd 1.4.0.1016 NVMe Driver)

Error :-

error vpxd[7F9037595800] [Originator@6876 sub=Main] Init failed. VdbError: Error[VdbODBCError] (-1) "ODBC error: (23503) - ERROR: insert or update on table "vpx_entity" violates foreign key constraint "fk_vpx_ent_ref_vpx_ent_type";

--> Error while executing the query" is returned when executing SQL statement "INSERT INTO VPX_ENTITY (ID, NAME, TYPE_ID, PARENT_ID) VALUES (?, ?, ?, ?)"

--> Backtrace:

 

 

This is a known issue affecting upgrades and migration paths from vCenter Server 6.0 Update 3g deployed with embedded Postgres DB to vCenter Server 6.5 and 6.7.

 

This issue is resolved in vSphere 6.5 Update 2d and 6.7 Update 1

 

https://kb.vmware.com/s/article/57738

This occurs when upgrading VCSA 6.0 to VCSA 6.5 if the vPostgres database has been customized.

 

Connect to VCDB:

 

VMware Knowledge Base

 

Run the command:

 

VCDB=# \dv

 

Check which views are listed, then drop the blocking view with CASCADE.

 

Test of manually dropping the VPXV_VMS view in VCDB:

VCDB=#

VCDB=# DROP VIEW IF EXISTS VPXV_VMS;

ERROR:  cannot drop view vpxv_vms because other objects depend on it

DETAIL:  view "DCS_BV_VIEW3" depends on view vpxv_vms

HINT:  Use DROP ... CASCADE to drop the dependent objects too.

VCDB=# DROP VIEW IF EXISTS VPXV_VMS CASCADE;

NOTICE:  drop cascades to view "DCS_BV_VIEW3"

DROP VIEW

VCDB=# DROP VIEW VPXV_VMS;

ERROR:  view "vpxv_vms" does not exist

VCDB=# \q

root@VC [ /var/log/vmware/vpxd ]#

 

 

 

 

The following error appears while starting the vCenter services:

vmware-vpxd: Waiting for vpxd to start listening for requests on 8089

Waiting for vpxd to initialize: ..........................................................Fri Aug 17 15:00:05 EDT 2018 Captured live core: /var/core/live_core.vpxd.2804.08-17-2018-15-00-05

[INFO] writing vpxd process dump retry:2 Time(Y-M-D H:M:S):2018-08-17 19:00:03

.Fri Aug 17 15:00:16 EDT 2018 Captured live core: /var/core/live_core.vpxd.2804.08-17-2018-15-00-16

[INFO] writing vpxd process dump retry:1 Time(Y-M-D H:M:S):2018-08-17 19:00:15

.failed

failed

vmware-vpxd: vpxd failed to initialize in time.

vpxd is already starting up. Aborting the request.

 

Stderr =

2018-08-17T19:00:26.608Z {

"resolution": null,

"detail": [

{

"args": [

"Command: ['/sbin/service', u'vmware-vpxd', 'start']\nStderr: "

],

"id": "install.ciscommon.command.errinvoke",

"localized": "An error occurred while invoking external command : 'Command: ['/sbin/service', u'vmware-vpxd', 'start']\nStderr: '",

"translatable": "An error occurred while invoking external command : '%(0)s'"

}

    ],

    "componentKey": null,

"problemId": null

}

ERROR:root:Unable to start service vmware-vpxd, Exception: {

"resolution": null,

"detail": [

{

"args": [

"vmware-vpxd"

],

"id": "install.ciscommon.service.failstart",

"localized": "An error occurred while starting service 'vmware-vpxd'",

  "translatable": "An error occurred while starting service '%(0)

 

Solution:- Check connectivity between the VC/PSC and the domain controller.

 

cpu2:32999)0x4390c119b660:[0x4180163128c3]VTDQISync@vmkernel#nover+0xf7 stack: 0x1

cpu2:32999)0x4390c119b6a0:[0x4180163137b2]VTDIRWriteIRTE@vmkernel#nover+0x8e stack: 0x2e

cpu2:32999)0x4390c119b6d0:[0x418016313895]VTDIRSteerVector@vmkernel#nover+0x61 stack: 0x43004d129f10

cpu2:32999)0x4390c119b700:[0x4180162e96c9]IOAPICSteerVector@vmkernel#nover+0x59 stack: 0x1c00

cpu2:32999)0x4390c119b740:[0x418016057514]IntrCookie_SetDestination@vmkernel#nover+0x174 stack: 0x4

 

VMware Knowledge Base

In the vpxd.log file, you see entries similar to:

 

2012-04-02T13:07:49.438+02:00 [02248 info 'Default' opID=66183d64] [VpxLRO] -- BEGIN task-internal-252 -- -- vim.SessionManager.acquireSessionTicket -- 52fa8682-47e0-2566-fb05-6192cb2c22f9(5298e245-ffb6-f7f8-e8a0-dedfbe369255)
2012-04-02T13:07:49.579+02:00 [02068 info 'Default'] [VpxLRO] -- BEGIN task-internal-253 -- host-94 -- VpxdInvtHostSyncHostLRO.Synchronize --
2012-04-02T13:07:49.579+02:00 [02068 warning 'Default'] [VpxdInvtHostSyncHostLRO] Connection not alive for host host-94
2012-04-02T13:07:49.579+02:00 [02068 warning 'Default'] [VpxdInvtHost::FixNotRespondingHost] Returning false since host is already fixed!
2012-04-02T13:07:49.579+02:00 [02068 warning 'Default'] [VpxdInvtHostSyncHostLRO] Failed to fix not responding host host-94
2012-04-02T13:07:49.579+02:00 [02068 warning 'Default'] [VpxdInvtHostSyncHostLRO] Connection not alive for host host-94
2012-04-02T13:07:49.579+02:00 [02068 error 'Default'] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-94, marking host as notResponding
2012-04-02T13:07:49.579+02:00 [02068 warning 'Default'] [VpxdMoHost] host connection state changed to [NO_RESPONSE] for host-94
2012-04-02T13:07:49.610+02:00 [02248 info 'Default' opID=66183d64] [VpxLRO] -- FINISH task-internal-252 -- -- vim.SessionManager.acquireSessionTicket -- 52fa8682-47e0-2566-fb05-6192cb2c22f9(5298e245-ffb6-f7f8-e8a0-dedfbe369255)
2012-04-02T13:07:49.719+02:00 [02068 info 'Default'] [VpxdMoHost::SetComputeCompatibilityDirty] Marked host-94 as dirty.

 

This issue may occur if heartbeat packets are not received from the host before the one-minute timeout period expires. These heartbeat packets are UDP packets sent over port 902.

 

This issue may also occur when the Windows firewall is enabled and the ports are not configured.
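
To see which hosts were affected and when, the vpxd.log entries shown earlier can be scanned for the state change to NO_RESPONSE. This is a minimal Python sketch that assumes the line layout of the excerpt above. The function name and the regular expression are illustrative and not part of any VMware tooling.

```python
import re

# Matches lines such as:
# <timestamp> [... 'Default'] [VpxdMoHost] host connection state changed to [NO_RESPONSE] for host-94
STATE_CHANGE = re.compile(
    r"^(?P<timestamp>\S+) .*host connection state changed to "
    r"\[(?P<state>[A-Z_]+)\] for (?P<host>host-\d+)"
)


def hosts_marked_not_responding(log_lines):
    """Return (timestamp, host) pairs for each change to NO_RESPONSE."""
    hits = []
    for line in log_lines:
        match = STATE_CHANGE.search(line)
        if match and match.group("state") == "NO_RESPONSE":
            hits.append((match.group("timestamp"), match.group("host")))
    return hits
```

Comparing these timestamps against firewall or network changes can help confirm whether dropped UDP 902 heartbeats are the cause.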

 

Resolution

To resolve this issue, check the Windows Firewall on the vCenter Server machine. If ports are not configured, disable the Windows Firewall.

 

If ports are configured, verify that network traffic is allowed to pass from the ESXi/ESX host to the vCenter Server system and that the firewall is not blocking UDP port 902.

 

To perform a basic verification from the guest operating system perspective:
  1. Click Start > Run, type wf.msc, and click OK. The Windows Firewall with Advanced Security Management console appears.
  2. In the left pane, click Inbound Rules.
  3. Right-click the VMware vCenter Server -host heartbeat rule and click Properties.
  4. In the Properties dialog, click the Advanced tab.
  5. Under Profiles, ensure that the Domain option is selected.

VMware Knowledge Base

 

 

In the /var/log/hostd.log file, you see entries similar to:

Failed to get physical location of SCSI disk: Failed to get location information for naa.600c0ff00025d308b29de55501000000lsu-hpsa-plugin Unknown error

 

  • The /var/log/vpxa.log file contains errors similar to:

YYYY-MM-DDT<time> warning vpxa[7DD7FB70] [Originator@6876 sub=hostdcnx] [VpxaHalCnxHostagent] Could not resolve version for authenticating to host agent
YYYY-MM-DDT<time> verbose vpxa[FFD40AC0] [Originator@6876 sub=hostdcnx] [VpxaHalCnxHostagent] Creating temporary connect spec: localhost:443
YYYY-MM-DDT<time> verbose vpxa[FFD40AC0] [Originator@6876 sub=vpxXml] [VpxXml] Error fetching /sdk/vimServiceVersions.xml: 503 (Service Unavailable)
YYYY-MM-DDT<time> warning vpxa[FFD40AC0] [Originator@6876 sub=Default] Closing Response processing in unexpected state: 3

This issue occurs when the upgrade installs the new esx-base but keeps the higher version of the lsu plugins.

This issue is resolved in VMware ESXi 6.0 Update 3.

VMware Knowledge Base

This issue occurred during an upgrade from 6.0 to 6.5 because the root password had expired.

Solution:- Reset the root password of the PSC/VC.

For more information, see these KB articles:

 

VMware Knowledge Base

 

VMware Knowledge Base

"/storage/db/vmware-vmdir/data.mdb', '[Errno 28] No space left on device"

 

/var/log/firstboot/vmafd-firstboot.py_XXXX_stderr.log contains the following error message:

 

Error: [('/storage/db/cis-export-folder/vmafd/data/vmdir/data.mdb', '/storage/db/vmware-vmdir/data.mdb', '[Errno 28] No space left on device')]

 

Multiple issues can occur if a Platform Services Controller has more than 100,000 tombstone entries. Below this threshold, the symptoms in this article are likely unrelated.

 

To determine the number of tombstone entries on a Platform Services Controller Appliance, run this command:

 

/opt/likewise/bin/ldapsearch -H ldap://PSC_FQDN -x -D "cn=administrator,cn=users,dc=vsphere,dc=local" -w 'password' -b "cn=Deleted Objects,dc=vsphere,dc=local" -s sub "" -e 1.2.840.113556.1.4.417 dn| perl -p00e 's/\r?\n //g' | grep '^dn' | wc -l
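
If the ldapsearch output has been saved to a file instead of piped, the counting half of the pipeline (perl, grep, wc) can be reproduced offline. This is a minimal Python sketch under that assumption. The function names are hypothetical, and the 100,000 threshold is the one quoted above.

```python
import re

# Threshold quoted in the post.
TOMBSTONE_THRESHOLD = 100000


def count_dn_entries(ldif_text):
    """Count entries in LDIF text, mirroring the perl | grep | wc steps."""
    # Unfold LDIF continuation lines: a line break followed by a single
    # space continues the previous line (the perl substitution does the same).
    unfolded = re.sub(r"\r?\n ", "", ldif_text)
    # Count lines that start with "dn", as grep '^dn' | wc -l does.
    return sum(1 for line in unfolded.splitlines() if line.startswith("dn"))


def exceeds_threshold(ldif_text, threshold=TOMBSTONE_THRESHOLD):
    """Return True when the entry count is above the threshold."""
    return count_dn_entries(ldif_text) > threshold
```

A count at or below the threshold suggests the symptoms have another cause, as noted above.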

 

For more information, see KB 52387.

 

VMware Knowledge Base

The ESXi host shows event warnings matching the event below:

   Agent can't send heartbeats: No route to host.

 

There may be no other symptoms relating to this issue.

 

Workaround:- Enable the Failback option in the port group configuration (vSwitch settings).

 

 

Unable to vMotion a virtual machine from one host to another. vMotion activity fails with the following error:-

Error code: “The source detected that the destination failed to resume.”
Heap dvfilter may only grow by 33091584 bytes (105325400/138416984), which is not enough for allocation of 105119744 bytes
vMotion migration [-1407975167:1527835473000584] failed to get DVFilter state from the source host <xxx.xxx.xxx.xxx>
vMotion migration [-1407975167:1527835473000584] failed to asynchronously receive and apply state from the remote host: Out of memory.
Failed waiting for data. Error 195887124. Out of memory

 

As a workaround, configure a larger heap size on a suitable target host (one that can be rebooted after making the changes).

- To increase the heap size, run the following command on the target host:
esxcfg-module -s DVFILTER_HEAP_MAX_SIZE=276834000 dvfilter
- This requires a reboot of the ESXi host to take effect.
- Once the target host is back up, vMotion the affected VM to the target host again and check whether it succeeds.
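
To judge whether a given heap maximum is large enough, the numbers in the error message can be worked through. This is a minimal Python sketch that reads the pair in parentheses as current/maximum heap size in bytes. That reading is an assumption: it is consistent with 138416984 - 105325400 = 33091584 but is not confirmed by the post. The function name is hypothetical.

```python
import re

HEAP_ERROR = re.compile(
    r"Heap (?P<name>\S+) may only grow by (?P<room>\d+) bytes "
    r"\((?P<current>\d+)/(?P<maximum>\d+)\), which is not enough for "
    r"allocation of (?P<request>\d+) bytes"
)


def minimum_heap_max(message):
    """Return the smallest heap maximum (bytes) that would fit the failed allocation.

    Assumes the parenthesised pair is current size / maximum size.
    """
    match = HEAP_ERROR.search(message)
    if match is None:
        raise ValueError("not a heap growth error message")
    current = int(match.group("current"))
    request = int(match.group("request"))
    # The heap must hold what is already allocated plus the new request.
    return current + request
```

For the message above this gives 210445144 bytes, so the 276834000 used in the workaround leaves some headroom.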

 

This is a known issue that the NSX team has been working on for a while. According to VMware, the default heap size is increased in ESXi 6.7 to resolve this issue.

In vSphere 6.0, the inventory is sometimes visible through the thick client while the Web Client appears empty.

This is caused by the heap size of the Web Client.

Increase the heap size:

In Windows, locate the file C:\Program Files\VMware\vCenter Server\visl-integration\usr\sbin\cloudvm-ram-size.bat and run:

 

 

cloudvm-ram-size.bat -C XXX vspherewebclientsvc (where XXX is the desired heap size in MB).

 

 

https://kb.vmware.com/s/article/2150757