
Rajeev's Blog

33 posts

vMotion of a virtual machine from one host to another fails with the following error:

The source detected that the destination failed to resume.
Heap dvfilter may only grow by 33091584 bytes (105325400/138416984), which is not enough for allocation of 105119744 bytes
vMotion migration [-1407975167:1527835473000584] failed to get DVFilter state from the source host <xxx.xxx.xxx.xxx>
vMotion migration [-1407975167:1527835473000584] failed to asynchronously receive and apply state from the remote host: Out of memory.
Failed waiting for data. Error 195887124. Out of memory
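
The numbers in the heap message can be checked by hand. The following is a minimal sketch, assuming the pair in parentheses is current heap usage over the configured maximum (an interpretation, though it is consistent with 138416984 - 105325400 = 33091584). The function name is hypothetical.

```shell
# Hypothetical helper: given the "Heap dvfilter may only grow by ..." line,
# print the smallest heap maximum that would have satisfied the allocation.
# Assumes the numbers appear in this order: growth room, used, max, requested.
dvfilter_min_heap() {
    set -- $(printf '%s\n' "$1" | grep -oE '[0-9]+')
    used=$2
    requested=$4
    echo $(( used + requested ))
}

msg='Heap dvfilter may only grow by 33091584 bytes (105325400/138416984), which is not enough for allocation of 105119744 bytes'
dvfilter_min_heap "$msg"
```

With these numbers the result is 210445144 bytes, so a maximum of 276834000 (roughly double the 138416984 reported in the message) leaves some headroom.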

 

As a workaround, configure a larger dvfilter heap size on a suitable target host (one that can be rebooted after the change).

- To increase the heap size, run the following command on the target host:
esxcfg-module -s DVFILTER_HEAP_MAX_SIZE=276834000 dvfilter
- Reboot the ESXi host for the change to take effect.
- Once the target host is back up, vMotion the affected VM to it again and check whether it succeeds.
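
Before rebooting, it can be worth confirming that the option was actually stored. This is a small sketch, assuming the -g (get options) flag of esxcfg-module reports the dvfilter options in the same way as for other modules. It changes host configuration, so treat it as an illustration rather than a tested procedure.

```shell
# Set the larger heap maximum (value taken from the workaround above).
esxcfg-module -s DVFILTER_HEAP_MAX_SIZE=276834000 dvfilter

# Print the option string currently stored for the dvfilter module;
# the new DVFILTER_HEAP_MAX_SIZE value should appear in it.
esxcfg-module -g dvfilter
```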

 

This is a known issue that the NSX team has been working on for a while. According to VMware, the default heap size is increased in ESXi 6.7 to resolve it.

In vSphere 6.0, the inventory is sometimes visible through the thick client while the Web Client shows an empty inventory.

This is caused by the heap size of the Web Client service.

Increase the heap size:

On Windows, locate the file C:\Program Files\VMware\vCenter Server\visl-integration\usr\sbin\cloudvm-ram-size.bat and run:

cloudvm-ram-size.bat -C XXX vspherewebclientsvc (where XXX is the desired heap size in MB).

https://kb.vmware.com/s/article/2150757

RajeevVCP4 Hot Shot

Virtual machine hung

Posted by RajeevVCP4 Mar 26, 2018

To power off a hung virtual machine, try one of these methods.

Method-1:- Find the process ID and kill it

ps | grep vmx

kill -9 <pid> (for example, kill -9 300731)

Method-2:- Get the VM ID and power the VM off

vim-cmd vmsvc/getallvms

vim-cmd vmsvc/power.off <vmid>

Method-3:- By world ID

esxcli vm process list (note the World ID)

esxcli vm process kill -t soft -w <world-id>
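
Method 3 can be scripted by looking up the world ID by VM name. This is a sketch that assumes `esxcli vm process list` prints each VM as an unindented name line followed by indented `World ID: <n>` lines. The function name and the sample VM names are hypothetical.

```shell
# Hypothetical helper: read `esxcli vm process list` output on stdin and
# print the World ID of the VM whose name line matches $1 exactly.
world_id_for_vm() {
    awk -v name="$1" '
        /^[^ \t]/   { vm = $0 }                      # unindented line starts a VM block
        /World ID:/ { if (vm == name) print $3 }     # third field is the numeric ID
    '
}

# Sample text in the assumed layout (not captured from a real host).
sample='web01
   World ID: 1001
   Process ID: 0
db01
   World ID: 1002
   Process ID: 0'

printf '%s\n' "$sample" | world_id_for_vm db01
```

On a host, the function would be fed live output instead, for example `esxcli vm process list | world_id_for_vm <vm-name>`, and the result passed to `esxcli vm process kill -t soft -w`.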

From the OBFL logs:

 

5:2018 Mar 17 13:13:12 MST:3.1(21d):selparser:1688: selparser.c:706: # 2D 02 00 00 01 02 00 00 D7 76 AD 5A 33 00 04 07 00 00 00 00 6F A5 92 11 # 22d | 03/17/2018 13:13:11 | BIOS | Processor #0x00 | Configuration Error |  | Asserted

5:2018 Mar 17 22:25:39 MST:3.1(21d):avct_server:1589: callback_http: new client connected, wsi 0x40e03778

5:2018 Mar 17 22:25:39 MST:3.1(21d):avct_server:13538: Client supports encrypted/unencrypted kbd/mouse.

 

From the SEL:

 

22d | 03/17/2018 13:13:11 | BIOS | Processor #0x00 | Configuration Error |  | Asserted
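
SEL records like the one above are pipe-delimited, so asserted events are easy to pull out of a longer dump. This is a minimal sketch, assuming every record follows the seven-field layout shown here. The function name is hypothetical.

```shell
# Hypothetical helper: for each SEL record on stdin whose last field is
# "Asserted", print "timestamp sensor: event".
sel_asserted() {
    awk -F' *[|] *' '$NF == "Asserted" { print $2 " " $4 ": " $5 }'
}

echo '22d | 03/17/2018 13:13:11 | BIOS | Processor #0x00 | Configuration Error |  | Asserted' | sel_asserted
```

For the record above this prints the timestamp, the Processor #0x00 sensor, and the Configuration Error event on one line.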

 

This is Cisco bug CSCuz55148 (VCE KB 000004971).

Solution:- Proactively plan to replace both CPUs.


MCA error detected via CMCI

Posted by RajeevVCP4 Mar 23, 2018

Condition:- Cisco UCS B200 M3

 

 

cpu13:5344825)MCE: 1020: cpu13: MCA error detected via CMCI (Gbl status=0x0): Restart IP: invalid, Error IP: invalid, MCE in progress: no.

cpu13:5344825)MCE: 190: cpu13: bank7: status=0x8c00004000010091: (VAL=1, OVFLW=0, UC=0, EN=0, PCC=0, S=0, AR=0), Addr:0x1421652080 (valid), Misc:0x42389a00 (valid)

2018-03-16T12:36:38.906Z cpu13:5344825)MCE: 199: cpu13: bank7: MCA recoverable error (CE): "Memory Controller Read Error on Channel 1."
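
The status word in the bank7 line can be decoded by hand. The sketch below assumes the standard Intel IA32_MCi_STATUS layout (bit 63 VAL, 62 OVER, 61 UC, 60 EN, 59 MISCV, 58 ADDRV, 57 PCC) and the memory controller error code format 000F 0000 1MMM CCCC in the low 16 bits. The function name is hypothetical, and the hex digits are handled as text to avoid signed 64-bit arithmetic in the shell.

```shell
# Hypothetical helper: decode the flag bits and the memory controller
# error code from an MCi_STATUS value given as a hex string.
decode_mci_status() {
    hex=${1#0x}                                          # drop the 0x prefix
    hex=$(printf '%16s' "$hex" | tr ' ' '0')             # left-pad to 16 hex digits
    top=$(( 0x$(printf '%s' "$hex" | cut -c1-2) ))       # bits 63..56
    code=$(( 0x$(printf '%s' "$hex" | cut -c13-16) ))    # bits 15..0 (MCA error code)
    echo "VAL=$(( (top >> 7) & 1 )) OVER=$(( (top >> 6) & 1 )) UC=$(( (top >> 5) & 1 )) EN=$(( (top >> 4) & 1 )) MISCV=$(( (top >> 3) & 1 )) ADDRV=$(( (top >> 2) & 1 )) PCC=$(( (top >> 1) & 1 ))"
    # Memory controller errors match 000F 0000 1MMM CCCC (F is the filter bit).
    if [ $(( code & 0xEF80 )) -eq $(( 0x0080 )) ]; then
        echo "memory controller error: type=$(( (code >> 4) & 7 )) channel=$(( code & 15 ))"
    fi
}

decode_mci_status 0x8c00004000010091
```

For the value in the log this reports VAL=1 with UC=0 and PCC=0 (a corrected error), both address and misc registers valid, and type 1 on channel 1. Type 1 is a memory read error in the Intel encoding, which agrees with the "Memory Controller Read Error on Channel 1" text that ESXi printed.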

 

CIMC | Processor IERR #0x99 | Predictive Failure asserted | Asserted

5:2018 Mar 16 23:13:53 GMT:3.1(21d):selparser:1724: selparser.c:706: # DA 01 00 00 01 02 00 00 B1 4F AC 5A 20 00 04 24 95 00 00 00 7F 05 FF FF # 1da | 03/16/2018 23:13:53 | CIMC | Platform alert LED_BLADE_STATUS #0x95 | LED color is amber | Asserted

 

 

CPU 1 : MCA_ERR_SRC_LOG : 0xc0000000

CPU 2 : READ MCA_ERR_SRC_LOG Register : FAILED : RetVal = 0 : CC = 0x81

 

 

Solution:- Replace the faulty system board along with the TPM.

Scenario:- With N3K and N9K switches, the teaming policy in use is mostly IP hash, but after a NIC/IP change the policy reverts to originating port ID.

Change it back with these commands.

To set the NIC teaming policy on a virtual switch on ESXi 5.x:

  • To list the current NIC teaming policy of a vSwitch, use the command:

    # esxcli network vswitch standard policy failover get -v vSwitch0
  • To set the NIC teaming policy of a vSwitch, use this command:

    # esxcli network vswitch standard policy failover set -l policy -v vSwitchX

For example, to set the NIC teaming policy of a vSwitch to IP hash:

    # esxcli network vswitch standard policy failover set -l iphash -v vSwitch0

Note: Available policy options:
  • explicit = Use explicit failover order
  • portid = Route based on originating port ID (this is the default setting)
  • mac = Route based on source MAC hash
  • iphash = Route based on IP hash (use this only with an EtherChannel/port channel)

To set the NIC teaming policy on a port group:

  1. To list the current NIC teaming policy of a port group, run this command:

    esxcli network vswitch standard portgroup policy failover get -p "Management Network"
  2. To set the NIC teaming policy of a port group, run this command:

    esxcli network vswitch standard portgroup policy failover set -p "Management Network" -l <policy>

Here <policy> is iphash.
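
The vSwitch and port group commands above can be wrapped so that both levels always get the same policy. This is a sketch only: the function name and the ESXCLI override variable are hypothetical, and the dry run at the end prints the commands instead of running them.

```shell
ESXCLI=${ESXCLI:-esxcli}   # set ESXCLI=echo for a dry run

# Hypothetical helper: apply one teaming policy to a vSwitch and a port group.
# Arguments: vSwitch name, port group name, policy.
set_teaming_policy() {
    vswitch=$1; portgroup=$2; policy=$3
    case $policy in
        explicit|portid|mac|iphash) ;;                      # options listed above
        *) echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    $ESXCLI network vswitch standard policy failover set -l "$policy" -v "$vswitch" &&
    $ESXCLI network vswitch standard portgroup policy failover set -p "$portgroup" -l "$policy"
}

# Dry run: print the two commands that would be executed.
ESXCLI=echo set_teaming_policy vSwitch0 "Management Network" iphash
```

Rejecting unknown policy names before calling esxcli keeps a typo from leaving the vSwitch and the port group on different policies.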

 

VMware Knowledge Base

In vCenter Server 6.0 U3b, from the vpxd logs:

 

 

--> Panic: Win32 exception: Access Violation (0xc0000005)

--> Read (0) at address 0000000000000058

--> rip: 00007ffde6a18cc0 rsp: 0000000018ede918 rbp: 00000000113f4a40

 

This issue occurs due to Microsoft SQL databases not supporting shared connections.

 

Solution:- Upgrade vCenter Server/PSC to 6.0 U3c/d.

For more information, refer to:

 

VMware Knowledge Base

I hit an issue when the Windows authentication box was checked: the same user could log on through the Web Client, but logon from the vSphere Client (vc-client) failed.

When I typed the user name and password manually, I was able to log on. I went through the logs and found a useful KB, which also applies to 6.0.

 

 

2017-11-24T12:55:37.853Z error vpxd[7F3237CF8700] [Originator@6876 sub=GSSAPI opID=F65D9D27-00000004-8a] gss_accept_sec_context failed: (0x00070000, 0x00000000)

2017-11-24T12:55:37.853Z error vpxd[7F3237CF8700] [Originator@6876 sub=GSSAPI opID=F65D9D27-00000004-8a] Supported mechanisms: ({ 1 2 840 113554 1 2 2 }, { 1 3 5 1 5 2 }, { 1 2 840 48018 1 2 2 }, { 1 3 6 1 5 2 5 }, { 1 3 6 1 5 5 2 }, { 1 3 6 1 4 1 311 2 2 10 }, { 1 2 840 113554 1 2 10 })

2017-11-24T12:55:37.861Z info vpxd[7F3237CF8700] [Originator@6876 sub=vpxLro opID=F65D9D27-00000004-8a] [VpxLRO] -- FINISH task-internal-1992

2017-11-24T12:55:37.861Z info vpxd[7F3237CF8700] [Originator@6876 sub=Default opID=F65D9D27-00000004-8a] [VpxLRO] -- ERROR task-internal-1992 -- SessionManager -- vim.SessionManager.loginBySSPI: vim.fault.InvalidLogin:

--> Result:

--> (vim.fault.InvalidLogin) {

-->    faultCause = (vmodl.MethodFault) null,

-->    msg = ""

--> }

--> Args:

-->

--> Arg base64Token:

 

 

Logging in to VMware vCenter Server Appliance 5.1 or 5.5 using the Use Windows session credentials option fails with the…

When we run the commands

service-control --stop --all

service-control --start --all

we get an error that the cm or proxy service did not start, which is why vpxd has an issue.

Try this:

Restart the proxy service, then restart vpxd. The issue is caused by the Component Manager service not starting.

KB: - https://kb.vmware.com/s/article/2147891

Sometimes the Web Client starts but login fails with error 503. This mostly occurs in 6.0 U2 and is resolved in U3.

  1. Temporary solution:- Increase the MaxPermMB size.

https://kb.vmware.com/s/article/2148445

  2. Permanent:- Upgrade your infrastructure to 6.0 U3.

https://kb.vmware.com/s/article/2148355

This issue mostly occurs after upgrading a PSC (appliance): the PSC can no longer be opened in GUI mode.

This is a certificate issue. There are two solutions: step 1 alone, or steps 2 to 6.

1. Take a snapshot, regenerate the certificate, then restart the Web Client service.

2. Decommission the affected PSC. If you have two PSCs, first check which one is being pointed to, and do not touch that one.

3. Once it is decommissioned, delete it from disk and remove it from AD/DNS.

4. Recreate the DNS record.

5. Deploy a new PSC and join it to the domain.

6. Try <IP>/psc; you should be able to open it in the GUI.

 

 

Using the cmsso command to unregister vCenter Server from Single Sign-On (2106736) | VMware KB


Host not responding

Posted by RajeevVCP4 Sep 8, 2017

This applies when a host is not responding and the F2 key does not work, but the host is still pingable.

The cause is mostly on the storage side: a SCSI reservation has locked the host's lock file and cannot be released.

The details depend on the storage environment. On a VNX, when we checked, we found:

 

09/08/17 10:22:06 FCDMTL 9 (FE1/SC)  ebd1b60 Target command error: mp context 0x300ff, SCSI status = 0x18 - returned by UL

                       A 09/08/17 10:22:06 TDD f32c7730 GRO:Register/Reserve Reservation RsvKey Mismatch Error.

                       A 09/08/17 10:22:06 FCDMTL 9 (FE1/SC)  ef6f680 Target command error: mp context 0x100f6, SCSI status = 0x18 - returned by UL

                       A 09/08/17 10:22:06 TDD f32c7730 GRO:Register/Reserve Reservation RsvKey Mismatch Error.

 

Here the SCSI reservation problem was caused by a mismatch between the HLU and the ALU.

This needs to be reconfigured on the storage side, followed by a reboot of all hosts in the cluster.

 

After upgrading the PSC from U1 to U2, the PSC was disjoined from the domain and users got this error:

 

com.vmware.identity.idm.server.provider.vmwdirectory.VMwareDirectoryProvider]

[2017-08-18T13:23:14.779-05:00 bba76607-42b4-4a15-a3c2-7542f427d12c WARN ] [ActiveDirectoryProvider] There may be a domain join status change since native AD is configured. ActiveDirectoryProvider can function properly only when machine is properly joined

 

Trying to join the domain again failed with (Idm client exception: Error trying to join AD, error code [31]).

This was because SMB2 was disabled on the PSC (appliance).

Run this command:

 

/opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]'

 

If the Smb2Enabled value is zero, run these commands:

  1. /opt/likewise/bin/lwregshell set_value '[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]' Smb2Enabled 1
  2. /opt/likewise/bin/lwsm restart lwio
  3. service-control --stop --all
  4. service-control --start --all
  5. Restart the Web Client on the vCenter Server.
  6. Log on to the PSC and try to join it to the domain.
  7. If you have a secondary PSC, it will take some time for domain users to show up.

Scenario:- Windows-based vCenter Server with external PSCs (appliances)

When clicking any component while logged on as administrator@vsphere.local

 

Error:- Do not have permission to look this object

 

Workaround:-

1. Shut down the vCenter Server services.
2. Shut down the PSC services.
3. Start the PSC services and verify that they are all running.
4. Start the vCenter Server services and verify that they are running.

When a user tries to browse storage (datastore/device) through the Web Client, an error is returned.

I worked with VMware. Support opened an internal PR with the engineering team and provided this response:

Per development, they have root-caused the issue. The fix will be included in vCenter 6.0 patch 6. The release of this fix is early Q4 this year.

 

 

The fix will also be in the next releases of 5.5 and 6.5 as well.