We have an "HCI" device that allows external iSCSI connections, but its only requirement is that they also use jumbo frames; otherwise I wouldn't bother. Everything between the ESXi host and the storage is enabled for jumbo frames. The weird thing is that if I try to test it with this command
vmkping -I vmk3 -s 8972 10.14.2.16 -d
the host disconnects from vCenter and I can't reach the host IP address; all I can do is restart the host to get access again. I'm going to go through the logs, but I don't understand why this would even occur when I'm only trying to use one vmkernel adapter. The other adapters are still at MTU 1500, and my understanding is they can stay that way?
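For context on the 8972 in that command: the -s value is the ICMP payload size, so it has to leave room for the headers inside the 9000-byte MTU, and -d sets the don't-fragment bit so an oversized packet fails instead of being split. A small sketch of the arithmetic, assuming IPv4 with no IP options (the function name max_ping_payload is just illustrative):

```shell
#!/bin/sh
# Largest ICMP payload that fits in a single unfragmented packet for a given MTU.
# Assumption: IPv4 without options, i.e. 20-byte IP header + 8-byte ICMP header.
max_ping_payload() {
    mtu="$1"
    echo $((mtu - 20 - 8))
}

max_ping_payload 9000   # prints 8972 (jumbo frames)
max_ping_payload 1500   # prints 1472 (standard frames)
```

So 8972 is the right size for a 9000 MTU path, and 1472 is the equivalent test for the adapters left at 1500.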
Are you sure that you configured jumbo frames on the physical network, the vSwitch, and the vmkernel adapter? Here's a link to a troubleshooting page: http://rickardnobel.se/troubleshoot-jumbo-frames-with-vmkping/ Maybe it helps...
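If it helps, the three host-side layers can be listed from the ESXi shell in one go. This is only a sketch: the wrapper function show_mtu_layers is a made-up name, and the guard is there so the snippet does nothing harmful when run somewhere esxcli isn't available. The physical switch ports still have to be checked on the switch itself.

```shell
#!/bin/sh
# show_mtu_layers (hypothetical wrapper): print the MTU-bearing configuration
# of each host-side layer so the three values can be compared.
show_mtu_layers() {
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found; run this in the ESXi shell"
        return 1
    fi
    esxcli network nic list                # physical uplinks (MTU column)
    esxcli network vswitch standard list   # standard vSwitches (MTU field)
    esxcli network ip interface list       # vmkernel adapters (MTU field)
}
```

After defining it, run show_mtu_layers on the host and compare the MTU reported for the uplink, the vSwitch, and the vmkernel port that carry the iSCSI traffic.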
That's the command I'm running to test the connection; it's almost exactly the same. I tested for fragmentation before I changed the vmkernel port to 9000 and it showed the correct response. Once I changed the vmkernel port to 9000, the issues started.
So before changing the vmkernel port you could see fragmentation occur, and after changing the vmkernel port your host disconnects? Is anything visible in the vmkernel.log file? And is the MTU supported on your physical NIC?
From everything I've seen so far, 9000 should work. These are old HS23 blades, but we checked with Lenovo and they say jumbo frames are enabled in the BladeCenters and blades by default. I did see the fragmentation before; this problem only occurred after setting the vmkernel adapter to 9000. I'm trying to get the logs, but we offload them to iSCSI datastores, which get cut off once the blade goes nuts. I think we have enough space on the blade's hard drive to redirect the logs locally, so I'm going to try that. A lot of output comes up if I watch the blade console directly, but the shell looks like it crashes as well: I can't type, yet tail -f still outputs stuff, so I'm not sure what's going on.
vmk3
Name: vmk3
MAC Address: 00:50:56:66:e9:bc
Enabled: true
Portset: vSwitch0
Portgroup: Vlan 3103
Netstack Instance: defaultTcpipStack
VDS Name: N/A
VDS UUID: N/A
VDS Port: N/A
VDS Connection: -1
Opaque Network ID: N/A
Opaque Network Type: N/A
External ID: N/A
MTU: 9000
TSO MSS: 65535
Port ID: 33554436
vSwitch0
Name: vSwitch0
Class: etherswitch
Num Ports: 5632
Used Ports: 4
Configured Ports: 128
MTU: 9000
CDP Status: listen
Beacon Enabled: false
Beacon Interval: 1
Beacon Threshold: 3
Beacon Required By:
Uplinks: vmnic0
Portgroups: Vlan 3103
Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  -------------------------------------------------
vmnic0  0000:16:00.0  elxnet  Up            Up           8000   Full    34:40:b5:e2:14:f8  9000  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic1  0000:16:00.1  elxnet  Up            Up           8000   Full    34:40:b5:e2:14:fc  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic2  0000:16:00.2  elxnet  Up            Up           1000   Full    34:40:b5:e2:14:f9  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic3  0000:16:00.3  elxnet  Up            Up           1000   Full    34:40:b5:e2:14:fd  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic4  0000:16:00.4  elxnet  Up            Down         0      Half    34:40:b5:e2:14:fa  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic5  0000:16:00.5  elxnet  Up            Down         0      Half    34:40:b5:e2:14:fe  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic6  0000:16:00.6  elxnet  Up            Up           1000   Full    34:40:b5:e2:14:fb  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic7  0000:16:00.7  elxnet  Up            Up           1000   Full    34:40:b5:e2:14:ff  1500  Emulex Corporation Emulex OneConnect OCe11100 NIC
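A quick way to see which uplinks actually carry a jumbo MTU is to pull just the name and MTU columns out of that NIC listing. A minimal sketch, assuming the listing keeps the column layout shown above (MTU as the ninth whitespace-separated field, which looks like `esxcli network nic list` output); the helper name nic_mtus is purely illustrative:

```shell
#!/bin/sh
# nic_mtus (hypothetical helper): read a NIC listing on stdin and print
# "<name> <mtu>" for every vmnic row. Assumes MTU is the 9th field,
# as in the listing above; header and separator rows are skipped.
nic_mtus() {
    awk '/^vmnic/ { print $1, $9 }'
}

# Sample rows copied from the listing above. On a host you would pipe the
# real listing into nic_mtus instead of using this here-document.
nic_mtus <<'EOF'
vmnic0 0000:16:00.0 elxnet Up Up 8000 Full 34:40:b5:e2:14:f8 9000 Emulex Corporation Emulex OneConnect OCe11100 NIC
vmnic1 0000:16:00.1 elxnet Up Up 8000 Full 34:40:b5:e2:14:fc 1500 Emulex Corporation Emulex OneConnect OCe11100 NIC
EOF
```

In this listing only vmnic0, the single uplink of vSwitch0, is at 9000; the other uplinks remain at 1500.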
Couldn't get the logs off because there isn't enough space locally to make another datastore. I did watch the vmkernel.log on the blade directly and saw this, which is a firmware bug. I see there's a newer version, which I'll have to try.
Sometimes an older version might do the trick. I saw several other articles on this error.