Hi,
I'm running a Xubuntu 16.04 guest in a VMware Fusion 8.1 installation on OS X 10.11.5. The VM uses a NAT interface.
When I issue a DNS request for a name that is not fully qualified, the response takes roughly 10 seconds.
~$ time host foobar
;; connection timed out; no servers could be reached
real 0m10.026s
user 0m0.020s
sys 0m0.000s
So I took a look at what is going on on the network. The queries leaving my Mac and the responses coming back to it look fine.
But when I look at the vmnet8 interface, I see malformed responses from the VMware DNS.
Here is my outgoing request:
Frame 34: 66 bytes on wire (528 bits), 66 bytes captured (528 bits) on interface 0
Ethernet II, Src: Vmware_6c:df:ea (00:0c:29:6c:df:ea), Dst: Vmware_eb:ae:81 (00:50:56:eb:ae:81)
Internet Protocol Version 4, Src: 172.16.251.200, Dst: 172.16.251.2
User Datagram Protocol, Src Port: 33964 (33964), Dst Port: 53 (53)
Domain Name System (query)
Transaction ID: 0xe702
Flags: 0x0100 Standard query
0... .... .... .... = Response: Message is a query
.000 0... .... .... = Opcode: Standard query (0)
.... ..0. .... .... = Truncated: Message is not truncated
.... ...1 .... .... = Recursion desired: Do query recursively
.... .... .0.. .... = Z: reserved (0)
.... .... ...0 .... = Non-authenticated data: Unacceptable
Questions: 1
Answer RRs: 0
Authority RRs: 0
Additional RRs: 0
Queries
foobar: type A, class IN
Name: foobar
[Name Length: 6]
[Label Count: 1]
Type: A (Host Address) (1)
Class: IN (0x0001)
And this is the response I'm getting from the DNS.
Frame 36: 94 bytes on wire (752 bits), 94 bytes captured (752 bits) on interface 0
Ethernet II, Src: Vmware_eb:ae:81 (00:50:56:eb:ae:81), Dst: Vmware_6c:df:ea (00:0c:29:6c:df:ea)
Internet Protocol Version 4, Src: 172.16.251.2, Dst: 172.16.251.200
User Datagram Protocol, Src Port: 53 (53), Dst Port: 33964 (33964)
Domain Name System (query)
Transaction ID: 0x4500
Flags: 0x0034 Standard query
0... .... .... .... = Response: Message is a query
.000 0... .... .... = Opcode: Standard query (0)
.... ..0. .... .... = Truncated: Message is not truncated
.... ...0 .... .... = Recursion desired: Don't do query recursively
.... .... .0.. .... = Z: reserved (0)
.... .... ..1. .... = AD bit: Set
.... .... ...1 .... = Non-authenticated data: Acceptable
Questions: 1624
Answer RRs: 16384
Authority RRs: 16401
Additional RRs: 58740
[Malformed Packet: DNS]
[Expert Info (Error/Malformed): Malformed Packet (Exception occurred)]
[Malformed Packet (Exception occurred)]
[Severity level: Error]
[Group: Malformed]
So the transaction ID clearly does not match the one in the query (0x4500 instead of 0xe702), all other fields look quite broken, and the payload is too short.
As a result, Linux does not recognize the response and waits until a timeout occurs. That's why it takes 10 seconds.
I can reproduce this with other linux guests as well.
Is this a known problem? Are there any known fixes?
Cheers,
Michael
PS: Here are the binary dumps of the two packets:
Query:
0000 00 50 56 eb ae 81 00 0c 29 6c df ea 08 00 45 00 .PV.....)l....E.
0010 00 34 06 58 40 00 40 11 e5 74 ac 10 fb c8 ac 10 .4.X@.@..t......
0020 fb 02 84 ac 00 35 00 20 f9 a2 e7 02 01 00 00 01 .....5. ........
0030 00 00 00 00 00 00 06 66 6f 6f 62 61 72 00 00 01 .......foobar...
0040 00 01 ..
Response:
0000 00 0c 29 6c df ea 00 50 56 eb ae 81 08 00 45 00 ..)l...PV.....E.
0010 00 50 ff 10 00 00 80 11 ec 9f ac 10 fb 02 ac 10 .P..............
0020 fb c8 00 35 84 ac 00 3c 7a c6 45 00 00 34 06 58 ...5...<z.E..4.X
0030 40 00 40 11 e5 74 ac 10 fb c8 ac 10 fb 02 84 ac @.@..t..........
0040 00 35 00 20 f9 a2 e7 02 01 00 00 01 00 00 00 00 .5. ............
0050 00 00 06 66 6f 6f 62 61 72 00 00 01 00 01 ...foobar.....
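For anyone who wants to double-check the Wireshark decode, here is a minimal Python sketch that unpacks the 12-byte DNS header straight from the response dump above. The offsets assume a 14-byte Ethernet header, a 20-byte IPv4 header without options, and an 8-byte UDP header, which fits this capture; the helper name `parse_dns_header` is made up for illustration. The last line also compares the response's UDP payload with the IP datagram of the query, because in this capture the two byte sequences appear to be identical.

```python
import struct

# Frames copied from the two hex dumps above.
QUERY_HEX = """
00 50 56 eb ae 81 00 0c 29 6c df ea 08 00 45 00
00 34 06 58 40 00 40 11 e5 74 ac 10 fb c8 ac 10
fb 02 84 ac 00 35 00 20 f9 a2 e7 02 01 00 00 01
00 00 00 00 00 00 06 66 6f 6f 62 61 72 00 00 01
00 01
"""
RESPONSE_HEX = """
00 0c 29 6c df ea 00 50 56 eb ae 81 08 00 45 00
00 50 ff 10 00 00 80 11 ec 9f ac 10 fb 02 ac 10
fb c8 00 35 84 ac 00 3c 7a c6 45 00 00 34 06 58
40 00 40 11 e5 74 ac 10 fb c8 ac 10 fb 02 84 ac
00 35 00 20 f9 a2 e7 02 01 00 00 01 00 00 00 00
00 00 06 66 6f 6f 62 61 72 00 00 01 00 01
"""

ETH_LEN = 14                      # Ethernet II header
DNS_OFFSET = ETH_LEN + 20 + 8     # + IPv4 header (no options) + UDP header


def to_bytes(dump: str) -> bytes:
    """Turn a whitespace-separated hex dump into bytes."""
    return bytes.fromhex("".join(dump.split()))


def parse_dns_header(frame: bytes) -> dict:
    """Unpack the six 16-bit fields of the DNS header (hypothetical helper)."""
    ident, flags, qd, an, ns, ar = struct.unpack(
        "!6H", frame[DNS_OFFSET:DNS_OFFSET + 12]
    )
    return {"id": ident, "flags": flags, "questions": qd,
            "answers": an, "authority": ns, "additional": ar}


query = to_bytes(QUERY_HEX)
response = to_bytes(RESPONSE_HEX)

print(parse_dns_header(query))
print(parse_dns_header(response))

# Does the "DNS payload" of the response equal the query's IP datagram?
print(response[DNS_OFFSET:] == query[ETH_LEN:])
```

The header values printed for the response should line up with the decode above (ID 0x4500, flags 0x0034, 1624 questions, and so on), so Wireshark is reading the bytes correctly; it is the bytes themselves that are wrong.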
If it's the issue I'm thinking of, we fixed the bug in 8.1.1.
If you are using 8.1.1 and the issue persists I'd love to know about it so we can get our engineers to take a look.
Hi,
I'm using 'Version 8.1.1 (3771013)'.
In case this is of interest: my installation was initially installed as 7.x on Yosemite and was upgraded all the way to 8.1.1 as updates became available. The problem happens with a freshly installed VM as well as with a VM that was created with an earlier version.
Hm, interesting... We should look into that.
fubvmware
Our networking engineer is looking into it.
- ~$ time host foobar
- ;; connection timed out; no servers could be reached
- real 0m10.026s
- user 0m0.020s
- sys 0m0.000s
So what's the result of
~$ time host foobar
on your Mac host?
I tried it on my VM, and could get the answer:
test.localdomain is an alias for host.example.com.
host.example.com has address 10.x.x.x.
host.example.com is an alias for host.example.com.
host.example.com has address 10.x.x.x.
real 0m0.524s
user 0m0.016s
sys 0m0.004s
On my Mac I get the correct response
time host foo [17:45:53]
Host foo not found: 2(SERVFAIL)
host foo 0.00s user 0.01s system 13% cpu 0.113 total
What your example shows is that you are resolving a host that actually exists ("test.localdomain is an alias for host.example.com." shows that). Yes, that works.
The problem is that the response is broken when you use an unqualified domain name.
When I run "host foobar", two queries are actually sent to the DNS server: "foobar.localdomain" and "foobar". The response for "foobar.localdomain" comes instantly and is correct, saying that the host is unknown. The response for "foobar" also comes instantly but is malformed (as shown in my post above). host discards this answer and waits for a correct one that never comes, hence the timeout.
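To illustrate why the malformed answer is dropped, here is a minimal sketch of the kind of sanity check a stub resolver typically applies to an incoming datagram. This is not the actual code of `host` or glibc; the function name `accepts` is hypothetical, and the byte strings are the DNS payloads from the dumps in the first post.

```python
import struct

# DNS payload of the query from the first post (headers stripped).
QUERY = bytes.fromhex(
    "e70201000001000000000000"  # header: ID 0xe702, flags 0x0100, 1 question
    "06666f6f62617200"          # QNAME: foobar
    "00010001"                  # QTYPE A, QCLASS IN
)

# First 12 bytes of what arrived as the "response" in the capture.
BROKEN_REPLY = bytes.fromhex("4500003406584000" "4011e574")

# A hand-built, well-formed NXDOMAIN reply (flags 0x8183) for comparison.
GOOD_REPLY = struct.pack("!6H", 0xE702, 0x8183, 1, 0, 0, 0) + QUERY[12:]


def accepts(query: bytes, reply: bytes) -> bool:
    """Return True if the reply plausibly answers the query.

    Only two checks are sketched here: the transaction ID must match
    and the QR bit (0x8000) must mark the message as a response.
    """
    if len(reply) < 12:
        return False
    (query_id,) = struct.unpack("!H", query[:2])
    reply_id, reply_flags = struct.unpack("!2H", reply[:4])
    return reply_id == query_id and bool(reply_flags & 0x8000)


print(accepts(QUERY, BROKEN_REPLY))  # False: ID is 0x4500, QR bit is clear
print(accepts(QUERY, GOOD_REPLY))    # True
```

A client that ignores a datagram failing these checks has nothing left to do but keep listening until its own timeout expires, which matches the delay seen here.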
The communication between VMware and the outer network is working correctly (I can see that in Wireshark), so the malformed response must be generated by the VMware internal DNS.
I just tested with a Windows 8 VM and got the same behaviour (using ping, as host is not available). `ping foo.foo` gives an instant error, while `ping foo` takes multiple seconds before giving an error when the request times out.
Hi heinemml
- Frame 36: 94 bytes on wire (752 bits), 94 bytes captured (752 bits) on interface 0
- Ethernet II, Src: Vmware_eb:ae:81 (00:50:56:eb:ae:81), Dst: Vmware_6c:df:ea (00:0c:29:6c:df:ea)
- Internet Protocol Version 4, Src: 172.16.251.2, Dst: 172.16.251.200
- User Datagram Protocol, Src Port: 53 (53), Dst Port: 33964 (33964)
- Domain Name System (query)
- Transaction ID: 0x4500
- Flags: 0x0034 Standard query
- 0... .... .... .... = Response: Message is a query
- .000 0... .... .... = Opcode: Standard query (0)
- .... ..0. .... .... = Truncated: Message is not truncated
- .... ...0 .... .... = Recursion desired: Don't do query recursively
- .... .... .0.. .... = Z: reserved (0)
- .... .... ..1. .... = AD bit: Set
- .... .... ...1 .... = Non-authenticated data: Acceptable
- Questions: 1624
- Answer RRs: 16384
- Authority RRs: 16401
- Additional RRs: 58740
- [Malformed Packet: DNS]
- [Expert Info (Error/Malformed): Malformed Packet (Exception occurred)]
- [Malformed Packet (Exception occurred)]
- [Severity level: Error]
- [Group: Malformed]
This should be the packet the DNS server sends to the VM, but why does it show '0... .... .... .... = Response: Message is a query'? It should be a response.
Also, I still could not reproduce your problem in my environment.
I installed a Xubuntu VM; this is the result from the VM:
vmware@ubuntu:~/Desktop$ time host foo
Host foo not found: 3(NXDOMAIN)
real 0m0.466s
user 0m0.020s
sys 0m0.000s
and I didn't find a malformed packet like that in my sniffer file.
What does your /etc/resolv.conf look like?
Hi,
thanks for your time trying to reproduce this.
I took another look into the problem. It appears on all MacBooks at our office when using our internal network, which has its own Windows Server based DNS server.
I then tried different DNS servers. I changed the DNS in the OS X preferences, not in the VM. When I use the one from our provider, I get an instant response inside the VM.
So I compared the different responses. I found a pattern.
I always query with the command "host foobar"
When I use the DNS of our provider or the Google one (8.8.8.8), the server response code is:
"No such name"
Flags: 0x8183 Standard query response, No such name
1... .... .... .... = Response: Message is a response
.000 0... .... .... = Opcode: Standard query (0)
.... .0.. .... .... = Authoritative: Server is not an authority for domain
.... ..0. .... .... = Truncated: Message is not truncated
.... ...1 .... .... = Recursion desired: Do query recursively
.... .... 1... .... = Recursion available: Server can do recursive queries
.... .... .0.. .... = Z: reserved (0)
.... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated by the server
.... .... ...0 .... = Non-authenticated data: Unacceptable
.... .... .... 0011 = Reply code: No such name (3)
which is then happily passed down to the VM.
Result when I use our internal DNS Server:
"Server failure"
Flags: 0x8182 Standard query response, Server failure
1... .... .... .... = Response: Message is a response
.000 0... .... .... = Opcode: Standard query (0)
.... .0.. .... .... = Authoritative: Server is not an authority for domain
.... ..0. .... .... = Truncated: Message is not truncated
.... ...1 .... .... = Recursion desired: Do query recursively
.... .... 1... .... = Recursion available: Server can do recursive queries
.... .... .0.. .... = Z: reserved (0)
.... .... ..0. .... = Answer authenticated: Answer/authority portion was not authenticated by the server
.... .... ...0 .... = Non-authenticated data: Unacceptable
.... .... .... 0010 = Reply code: Server failure (2)
And this is where the problem starts. The VMware DNS then retries multiple times before sending a malformed response to the VM.
Okay, so how can you reproduce this? I found another case where this happens.
If you use a DNS server that refuses to serve you, you get similar behaviour.
So try using the DNS of T-Online (217.5.100.185), which will refuse to serve you as it is for internal customers only. You should see similar behaviour.
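If it helps with reproducing, here is a small Python sketch that sends a single A query over UDP and reports the reply code, so the same name can be tested against different servers without depending on `host`. The function names `build_query` and `query_rcode` and the 5-second timeout are my own choices for illustration. With transaction ID 0xe702 and the name "foobar", `build_query` produces the same 24 bytes as the query in the dump from the first post.

```python
import socket
import struct


def build_query(txid: int, name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query: RD set, one question, class IN."""
    header = struct.pack("!6H", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!2H", qtype, 1)


def query_rcode(server: str, name: str, timeout: float = 5.0):
    """Send one query to server:53.

    Returns the RCODE of the first reply whose transaction ID matches and
    whose QR bit is set, or None if no such reply arrives before a receive
    times out. Datagrams that fail the check are skipped.
    """
    txid = 0xE702
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # applies to each recvfrom call
        sock.sendto(build_query(txid, name), (server, 53))
        try:
            while True:
                data, _ = sock.recvfrom(512)
                if len(data) < 12:
                    continue
                reply_id, flags = struct.unpack("!2H", data[:4])
                if reply_id == txid and flags & 0x8000:
                    return flags & 0x000F
        except socket.timeout:
            return None
```

Inside the guest, `query_rcode("172.16.251.2", "foobar")` would query the VMware NAT DNS address seen in the capture; a result of 3 means "No such name", while None means that no usable reply arrived before the timeout.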
Summary:
If the DNS gives a "No such name" response, it is passed down to the VM and everything is fine. If the DNS gives a "Server failure" or "Refused" response, VMware retries multiple times and then sends a malformed response to the VM.
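For reference, the flags words quoted in this thread can be decoded with a few lines of Python. This is only a sketch of the bit layout from the DNS header definition; the function name `decode_flags` and the short RCODE table are my own, and the table covers only the first six codes.

```python
# Reply codes 0-5; anything else is shown as a plain number.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def decode_flags(flags: int) -> dict:
    """Split the 16-bit DNS flags word into its individual fields."""
    rcode = flags & 0xF
    return {
        "qr": (flags >> 15) & 1,        # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 1,        # authoritative answer
        "tc": (flags >> 9) & 1,         # truncated
        "rd": (flags >> 8) & 1,         # recursion desired
        "ra": (flags >> 7) & 1,         # recursion available
        "ad": (flags >> 5) & 1,         # authenticated data
        "cd": (flags >> 4) & 1,         # checking disabled
        "rcode": RCODES.get(rcode, str(rcode)),
    }


# Provider/Google reply, internal server reply, and the malformed packet.
for value in (0x8183, 0x8182, 0x0034):
    print(hex(value), decode_flags(value))
```

The first two values differ only in the reply code (NXDOMAIN versus SERVFAIL), while 0x0034 from the malformed packet does not even have the response bit set.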
using a NAT Interface
NAT does not appear on the network.
Do you want Bridged, which allows the VM to be seen/addressable by the internet?
ChipMcK sorry, I don't understand what you are referring to. Could you elaborate?