VMware Cloud Community
StefanS_hyperic
Contributor

Virtual machines are assigned an invalid IP address after a reboot, following the latest update to ESXi 7.0.1

Hi there,
After the latest update to ESXi 7.0.1, the clients no longer receive an IP address from the DHCP server and end up with a 169.254.x.x address instead.
We then activated the backup DHCP server, and everything seemed to be OK again.
However, when the remaining hosts in the cluster were updated and the VMs were moved to the other host with vMotion, the problem started all over again.
Activating the master DHCP server helped again; the backup DHCP server was deactivated at the same time, so we simply swapped between the two.
We don't see this problem on the second cluster. The only difference is that the DHCP servers there run Windows Server 2019, while the ones in the affected cluster run Windows Server 2012 R2.

Has anyone seen such a problem before?
Thanks for any help.

9 Replies
nachogonzalez
Commander

Hi Stefan, hope you are doing fine

Whenever you see a 169.254.x.x IP address, it is an APIPA (Automatic Private IP Addressing) address, meaning the client didn't reach the DHCP server.
This can be caused by a lot of different things, so let me ask:
Are you sure the VMs are on the same VLAN as the DHCP server?
Are you sure the NICs are connected at power on?

If all of that is true, I'd like to ask another question
Before upgrading your ESXi hosts to 7.0.1, did you check the compatibility matrix to make sure the drivers are compatible with your hardware?
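
As a quick way to confirm which clients have fallen back to APIPA, here is a minimal Python sketch. It only checks whether an address lies in 169.254.0.0/16. The function names and the inventory dictionary are made up for illustration; you would need to fill the inventory from whatever source you have (vCenter, monitoring, etc.).

```python
import ipaddress

# The IPv4 link-local range that APIPA addresses are drawn from.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if the address lies in 169.254.0.0/16."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


def find_apipa_clients(inventory: dict[str, str]) -> list[str]:
    """Return the names of all machines whose current IP is an APIPA address.

    `inventory` is a hypothetical mapping of machine name -> current IPv4 address.
    """
    return sorted(name for name, ip in inventory.items() if is_apipa(ip))
```

For example, find_apipa_clients({"vm-a": "169.254.3.4", "vm-b": "10.0.0.5"}) would list only "vm-a" as a client that did not get a lease.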

StefanS_hyperic
Contributor

Hi @nachogonzalez 
The hardware is compatible with 7.x. We didn't see this problem in version 7.0, only after upgrading to 7.0.1.

>>Are you sure VMs are on the same VLAN as the DHCP server?
The DHCP servers and clients are in different VLANs; this setup has worked for more than 10 years without any problems.

>>Are you sure NICs are connected at power on?
Yes

>>Before upgrading your ESXi hosts to 7.0.1, did you check the compatibility matrix to make sure drivers are compatible with your hardware?
Also yes

As already mentioned, we have two identical clusters with identical hardware. The only difference we currently see in relation to this problem is the OS version of the DHCP servers.

nachogonzalez
Commander

Are the Windows 2019 and Windows 2012 R2 DHCP servers virtual?
If so, which virtual NIC types do they have?

StefanS_hyperic
Contributor

@nachogonzalez 
Yes, all DHCP servers (Windows DCs) are virtual.

Windows 2012 R2 = E1000E
Windows 2019 = E1000E

All DHCP servers have static IP addresses.
All servers with a static IP address, including the DHCP servers, could still be reached via the network.
Only the clients using DHCP had problems getting an IP address. Either the DHCP client itself has a problem or there is a problem in the VMware network stack.
Interestingly, the problem can be solved by alternately activating either the backup DHCP server or the master DHCP server, depending on which one currently has the problem.

nachogonzalez
Commander

Please try changing the VMs' virtual NICs to VMXNET3 and updating VMware Tools to the latest version.
There are known issues with Windows VMs and E1000 interfaces.
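
To find out which VMs still use an E1000 or E1000E adapter, one option is to look at the ethernetN.virtualDev entries in each VM's .vmx file. The following is only a rough Python sketch under that assumption: the function names are made up, it works on the text of a .vmx file you have already read, and a NIC without a virtualDev line is simply not reported.

```python
import re

# Matches lines such as: ethernet0.virtualDev = "e1000e"
NIC_LINE = re.compile(
    r'^\s*(ethernet\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$', re.IGNORECASE
)


def nic_types(vmx_text: str) -> dict[str, str]:
    """Map each adapter (ethernet0, ethernet1, ...) to its virtualDev value."""
    adapters = {}
    for line in vmx_text.splitlines():
        match = NIC_LINE.match(line)
        if match:
            adapters[match.group(1).lower()] = match.group(2).lower()
    return adapters


def needs_vmxnet3(vmx_text: str) -> list[str]:
    """Return the adapters that are still of type e1000 or e1000e."""
    legacy = {"e1000", "e1000e"}
    return sorted(nic for nic, dev in nic_types(vmx_text).items() if dev in legacy)
```

Run against the .vmx text of a DHCP server VM, needs_vmxnet3() returns the adapters that would be candidates for the change suggested above.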

StefanS_hyperic
Contributor

OK,
but how does that explain this behavior?

A short update:
Restarting the DHCP clients or the affected DHCP server does not help either; the only thing that helps is activating a "different" DHCP server.
All operating systems using DHCP are affected, both *nix and Windows.

nachogonzalez
Commander

Honestly, I don't think this has to do with VMware itself.

Why don't you do this test:

Place one DHCP client on the same VLAN as one of the Windows DHCP servers. That way you will be able to see whether the problem is with the DHCP servers or with VMware.

StefanS_hyperic
Contributor

Small correction: there is a Windows 10 client in the same subnet / VLAN as the DHCP server, and it too lost its IP address once the lease expired or was no longer available.
It must have something to do with the update.
Before the update everything was OK; during the update and afterwards we have this problem.
Nothing shows up on the DHCP servers: no network errors, nothing. It looks as if they never receive a DHCP request.
It is quite a surprise when all of a sudden 80% of the servers are missing because they can no longer renew their IP addresses and thus cannot be reached.
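
One way to check the impression that the servers never receive a DHCP request would be to send a DHCPDISCOVER by hand from an affected VM and see whether any OFFER comes back. The Python sketch below is only a rough illustration under several assumptions: the function names are made up, the MAC address is whatever you pass in, binding UDP port 68 normally needs root/administrator rights and fails if a DHCP client already holds the port, and the broadcast leaves via the default interface.

```python
import socket
import struct
import time

# Marks the start of the DHCP options field (RFC 2131).
MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_discover(mac: str, xid: int) -> bytes:
    """Build a minimal broadcast DHCPDISCOVER for the given MAC address."""
    chaddr = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(chaddr) != 6:
        raise ValueError("expected a 6-byte Ethernet MAC address")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,          # op=BOOTREQUEST, htype=Ethernet, hlen=6, hops=0
        xid, 0, 0x8000,      # transaction id, secs, broadcast flag set
        bytes(4), bytes(4), bytes(4), bytes(4),  # ciaddr, yiaddr, siaddr, giaddr
        chaddr, bytes(64), bytes(128),           # chaddr (padded), sname, file
    )
    # Option 53 (message type) = 1 (DISCOVER), followed by the end option 255.
    return header + MAGIC_COOKIE + bytes([53, 1, 1, 255])


def wait_for_offer(mac: str, xid: int, timeout: float = 5.0):
    """Broadcast a DISCOVER; return (responder_ip, offered_ip) or None on timeout."""
    deadline = time.monotonic() + timeout
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.bind(("", 68))
        sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            sock.settimeout(remaining)
            try:
                data, (responder_ip, _port) = sock.recvfrom(4096)
            except socket.timeout:
                return None
            # Accept only BOOTREPLY packets that carry our transaction id.
            if len(data) >= 240 and data[0] == 2 and data[4:8] == struct.pack("!I", xid):
                return responder_ip, socket.inet_ntoa(data[16:20])
```

If wait_for_offer() returns None on an affected VM while a packet capture on the DHCP server shows no incoming DISCOVER, the request is being lost somewhere in between; if the server does see it, the problem is more likely on the server side. The responder address may be a relay rather than the DHCP server itself.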

nachogonzalez
Commander

Hey, hope you are doing fine.
Sorry for the late response.

Coming back to the E1000E NICs:

https://kb.vmware.com/s/article/2109922

This KB will shed some light on the issue.

Additionally, have you reviewed the Event Viewer on the Windows servers?
What does it say?
