BUG: sometimes UDP packets aren't passed from guest to host machine over vmware network interface
Summary
Since VMware 5.5, DNS resolution in the guest VM randomly stops working. Under version
5.5 I recall this issue occurring every few months; when I upgraded to v6.0
the frequency of the problem increased to every few weeks. Since the recent upgrade
to v6.5.2-build156735, the issue happens multiple times per day.
Using tcpdump, the problem has been tracked down to UDP packets not being
passed from the guest machine to the host machine, while TCP and
ICMP packets pass without issue. This causes DNS lookups to fail, whereas ping
(ICMP) directly to an external IP address works, as does telnet (TCP).
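
The UDP-versus-TCP comparison can be repeated from inside a guest without tcpdump. The following is a minimal sketch, not a tool used for this report: the function names, the timeout and the choice of TCP target are assumptions. It builds the same kind of 24-byte A query for ann.lu that appears as "(24)" in the captures, sends it over UDP, and separately attempts a plain TCP connect.

```python
import socket
import struct


def build_dns_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1)
    return header + qname + struct.pack("!HH", 1, 1)


def udp_dns_ok(server: str, name: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers a UDP DNS query within the timeout."""
    query = build_dns_query(name, 0x1234)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return False
    # A matching reply echoes the 16-bit query ID in its first two bytes
    return reply[:2] == query[:2]


def tcp_connect_ok(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run in a guest while the fault is present, udp_dns_ok("192.168.1.1", "ann.lu") would be expected to return False, while tcp_connect_ok() against any reachable external host and open port (which one to use is left to the reader) would still return True.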
Details
VMware Workstation 6.5.2-build156735 is running on host operating system
CentOS 5.3 (x86_64), having recently been upgraded from v6.0.x. The guest VM
instances use a combination of various Linux distributions, Windows XP, and
OpenBSD. This bug affects all guest operating systems: DNS lookups fail in
every one of them.
As an immediate but temporary fix, all guest VM instances are
suspended and VMware Workstation is quit. Then the vmware service is restarted on the host
machine using "/etc/init.d/vmware restart". This has the effect of restarting
the VMware virtual network interface.
Once the vmware service has been restarted, the VMware Workstation application is
started again and all suspended guest VMs are resumed. After the resume, DNS resolution
in the guest VMs works again.
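
The suspend, restart and resume cycle described above could be scripted. The sketch below is an assumption-laden illustration, not something used in this report: it assumes the vmrun utility that ships with Workstation is on the PATH, that the Workstation GUI has already been closed by hand, and that the script runs with enough privileges to restart the service. The function names and any .vmx paths passed in are hypothetical.

```python
import subprocess


def restart_plan(vmx_paths):
    """Return the commands, in order, for the suspend/restart/resume cycle."""
    # Suspend every guest first (assumes vmrun is available for Workstation)
    plan = [["vmrun", "-T", "ws", "suspend", path] for path in vmx_paths]
    # Restart the host service, as done by hand in the report
    plan.append(["/etc/init.d/vmware", "restart"])
    # Starting a suspended guest resumes it
    plan += [["vmrun", "-T", "ws", "start", path] for path in vmx_paths]
    return plan


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_plan(restart_plan([...])) with the default dry_run=True only prints the commands, which makes it easy to check the order before anything is actually suspended.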
Note that while DNS resolution is failing in the guest, it continues to work without
issue on the host machine.
System Architecture
A Linksys router with IP address 192.168.1.1 acts as the "real" internet gateway.
The host server is plugged into the Linksys router and has IP address 192.168.1.104.
The host server has a vmnet8 interface with IP address 172.16.237.1.
The guest VM instance has IP address 172.16.237.132.
On the guest VM instance, /etc/resolv.conf sets the nameserver to 192.168.1.1
for DNS lookups.
NAT networking is used for the guest VMs.
Testing
Running tcpdump on both host and guest instances, both when DNS resolution fails and
when it works, clearly shows that when DNS resolution fails
the UDP packets are not being transmitted over the VMware network from the guest VM
to the host machine.
tcpdump is started on both guest and host instances using the command "tcpdump udp".
Then, in the guest, the command "host ann.lu" is executed to perform a DNS lookup.
The following output demonstrates the issue when DNS resolution is NOT working.
- GUEST MACHINE
$ tcpdump udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
22:43:20.912071 IP 172.16.237.132.44731 > 192.168.1.1.domain: 32076+ A? ann.lu. (24)
22:43:25.911861 IP 172.16.237.132.44731 > 192.168.1.1.domain: 32076+ A? ann.lu. (24)
22:43:37.966990 IP 172.16.237.132.44736 > 192.168.1.1.domain: 21156+ A? ann.lu. (24)
22:43:42.967224 IP 172.16.237.132.44736 > 192.168.1.1.domain: 21156+ A? ann.lu. (24)
22:44:21.773888 IP 172.16.237.132.44738 > 192.168.1.1.domain: 33711+ A? ann.lu. (24)
22:44:26.773992 IP 172.16.237.132.44738 > 192.168.1.1.domain: 33711+ A? ann.lu. (24)
- HOST MACHINE
$ tcpdump udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
** NOTHING CAPTURED **
The following output demonstrates when DNS resolution IS working.
- GUEST MACHINE
$ tcpdump udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
22:46:27.267874 IP 172.16.237.132.44750 > 192.168.1.1.domain: 3108+ A? ann.lu. (24)
22:46:27.401527 IP 192.168.1.1.domain > 172.16.237.132.44750: 3108 1/0/0 A apache2-argon.thorin.dreamhost.com (40)
22:46:27.401809 IP 172.16.237.132.44751 > 192.168.1.1.domain: 32521+ PTR? 251.218.113.208.in-addr.arpa. (46)
22:46:27.403831 IP 172.16.237.132.44752 > 192.168.1.1.domain: 43254+ AAAA? ann.lu. (24)
22:46:27.540549 IP 192.168.1.1.domain > 172.16.237.132.44752: 43254 0/1/0 (88)
22:46:27.540892 IP 172.16.237.132.44753 > 192.168.1.1.domain: 3675+ MX? ann.lu. (24)
22:46:27.670054 IP 192.168.1.1.domain > 172.16.237.132.44753: 3675 2/0/1 MXdomain
22:46:27.683845 IP 192.168.1.1.domain > 172.16.237.132.44751: 32521 1/0/0 (94)
- HOST MACHINE
$ tcpdump udp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
22:46:27.267992 IP 192.168.1.104.36424 > 192.168.1.1.domain: 3108+ A? ann.lu. (24)
22:46:27.401318 IP 192.168.1.1.domain > 192.168.1.104.36424: 3108 1/0/0 A apache2-argon.thorin.dreamhost.com (40)
22:46:27.401488 IP 192.168.1.104.54345 > 192.168.1.1.domain: 5305+ PTR? 251.218.113.208.in-addr.arpa. (46)
22:46:27.401909 IP 192.168.1.104.36304 > 192.168.1.1.domain: 32521+ PTR? 251.218.113.208.in-addr.arpa. (46)
22:46:27.403943 IP 192.168.1.104.55388 > 192.168.1.1.domain: 43254+ AAAA? ann.lu. (24)
22:46:27.540421 IP 192.168.1.1.domain > 192.168.1.104.55388: 43254 0/1/0 (88)
22:46:27.541043 IP 192.168.1.104.49960 > 192.168.1.1.domain: 3675+ MX? ann.lu. (24)
22:46:27.669933 IP 192.168.1.1.domain > 192.168.1.104.49960: 3675 2/0/1 MXdomain
22:46:27.682846 IP 192.168.1.1.domain > 192.168.1.104.54345: 5305 1/0/0 (94)
22:46:27.683736 IP 192.168.1.1.domain > 192.168.1.104.36304: 32521 1/0/0 (94)
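
Longer captures can be checked mechanically instead of by eye. The following is a rough sketch that assumes the default "tcpdump udp" text format shown above; the regular expressions and the function name are illustrative, not part of tcpdump. It pairs each DNS query with its reply by client endpoint and query ID, and reports the queries that never received an answer.

```python
import re

# Query line:  "<time> IP <client>.<port> > <server>.domain: <id>+ ..."
QUERY_RE = re.compile(r"^(\S+) IP (\S+) > (\S+)\.domain: (\d+)\+ ")
# Reply line:  "<time> IP <server>.domain > <client>.<port>: <id> ..."
REPLY_RE = re.compile(r"^(\S+) IP (\S+)\.domain > (\S+): (\d+) ")


def unanswered_queries(lines):
    """Map (client endpoint, query id) to the time of the first unanswered query."""
    pending = {}
    for line in lines:
        query = QUERY_RE.match(line)
        if query:
            # Retransmissions reuse the same endpoint and ID; keep the first time
            pending.setdefault((query.group(2), query.group(4)), query.group(1))
            continue
        reply = REPLY_RE.match(line)
        if reply:
            # A reply to the same endpoint with the same ID settles the query
            pending.pop((reply.group(3), reply.group(4)), None)
    return pending
```

Fed the guest capture from the failing case, this returns one entry per query ID (32076, 21156 and 33711), each retransmitted once and never answered. Fed the guest capture from the working case, it returns an empty dictionary.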
Conclusion
The tcpdump output shows that when DNS resolution fails in the guest
instances, it is because VMware networking fails to pass UDP packets from
the guest to the host machine.
As mentioned above, TCP and ICMP packets pass normally while UDP packets fail.
Tags: workstation, 6.5.2, networking, bug, udp, packets