After a firmware update on our Meraki network switches, 3 out of 4 ESXi hosts suddenly lost connectivity to an iSCSI target.
(The target is a Dell R730 server running Ubuntu 22.04.3 LTS whose sole purpose is to provide iSCSI storage for a VMware cluster. The ESXi hosts are 7.03.)
When I rescan the iSCSI storage adapter on an ESXi host that can no longer mount the datastore, the host appears to recognize the LUN: it adds an entry under "static targets", presumably based on scanning the dynamic ones:
(The device / target / LUN is highlighted in green.)
Yet I don't see it presented as a "device" (one that I could mount as a datastore, or create a datastore on) under "devices":
For comparison, here is one of the ESXis that can see the device:
It shows as "degraded" (probably because of lack of NIC redundancy - where would I look to confirm?) - yet it does show up, and I can seemingly create a datastore on that target.
I also spun up an older standalone ESXi 6.7 and it can also see the device. My Windows desktop - ditto.
How would I troubleshoot this issue on the ESXi hosts that can't seem to recognize the iSCSI target as a valid device?
Thanks!
P.S. (Edit) '/var/log/vobd.log' has a number of entries like these, pointing to a network configuration issue:
2023-08-04T23:31:20.246Z: [iscsiCorrelator] 624003451802us: [vob.iscsi.target.connect.error] vmhba64 @ vmk1 failed to login to iqn.1988-11.com.dell:01.array.bc305bf24e32 because of a network connection failure.
2023-08-04T23:31:20.246Z: [iscsiCorrelator] 624000365087us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.1988-11.com.dell:01.array.bc305bf24e32 on vmhba64 @ vmk1 failed. The iSCSI initiator could not establish a network connection to the target.
2023-08-07T16:25:21.187Z: [iscsiCorrelator] 857645568069us: [vob.iscsi.discovery.connect.error] discovery failure on vmhba64 to r730b-00.datastores.infra.<masked>.com because of a network connection failure.
2023-08-07T16:25:21.187Z: [iscsiCorrelator] 857641306178us: [esx.problem.storage.iscsi.discovery.connect.error] iSCSI discovery to r730b-00.datastores.infra.<masked>.com on vmhba64 failed. The iSCSI Initiator could not establish a network connection to the discovery address.
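To see at a glance which adapter, VMkernel port, and target these failures involve, the entries can be tallied. Below is a minimal Python sketch that assumes the log lines look exactly like the `vob.iscsi...` entries quoted above. The function name `summarize_iscsi_errors` and the two regular expressions are made up for illustration, and other vobd.log message formats are not handled.

```python
import re
from collections import Counter

# Patterns written against the two "vob.iscsi..." event shapes quoted above.
# They are assumptions based on those samples, not a documented log format.
LOGIN_RE = re.compile(
    r"\[vob\.iscsi\.target\.connect\.error\] (?P<hba>\S+) @ (?P<vmk>\S+) "
    r"failed to login to (?P<target>\S+) because"
)
DISCOVERY_RE = re.compile(
    r"\[vob\.iscsi\.discovery\.connect\.error\] discovery failure on "
    r"(?P<hba>\S+) to (?P<target>\S+) because"
)


def summarize_iscsi_errors(lines):
    """Count login and discovery failures per (kind, hba, vmk, target)."""
    counts = Counter()
    for line in lines:
        match = LOGIN_RE.search(line)
        if match:
            key = ("login", match.group("hba"), match.group("vmk"),
                   match.group("target"))
            counts[key] += 1
            continue
        match = DISCOVERY_RE.search(line)
        if match:
            # Discovery events in the samples do not name a vmk interface.
            key = ("discovery", match.group("hba"), None,
                   match.group("target"))
            counts[key] += 1
    return counts
```

With a copy of the log at hand, `summarize_iscsi_errors(open("vobd.log"))` returns a `Counter`. If every login failure is tied to the same vmk interface, that interface's network configuration is the first thing to check.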
What command could I run on the affected ESXi hosts to confirm lack of necessary connectivity to the target?
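One quick way to separate a plain network problem from an iSCSI-level one is to test whether a TCP connection to the target's iSCSI port can be opened at all. The sketch below is illustrative only: `can_reach` is a made-up helper, port 3260 is the standard iSCSI port and is assumed to be what the target listens on, and passing `source_ip` only binds the local address (it may not force traffic out of a particular VMkernel interface). On the ESXi side, `vmkping -I vmk1 <target>` checks ICMP reachability from a specific VMkernel port and complements a TCP test like this one.

```python
import socket

ISCSI_PORT = 3260  # standard iSCSI port; an assumption about this target


def can_reach(host, port=ISCSI_PORT, timeout=3.0, source_ip=None):
    """Try a TCP connect; return (True, None) or (False, reason)."""
    # Optionally bind the local end to a given address (e.g. the vmk IP).
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True, None
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        return False, str(exc)
```

A successful connect only shows that the port is reachable; it says nothing about whether the iSCSI login itself would succeed. A refusal or timeout from the affected hosts, alongside success from a working one, would point at the network path (VLAN, MTU, or port binding) rather than at the target.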
Resolved, all four ESXis can now see the iSCSI target in question.
A misconfigured iSCSI port binding was likely one of the culprits, as @a_p_ suggested.
The other was likely some stale configuration: several rounds of rebooting, rescanning, and detaching stale devices seem to have helped.
Thank you all very very much for the help! (Been at it for quite some time, feels good to get some progress.)
The culprit wasn't network port binding after all - it seems to have been the ESXi hosts holding on to stale paths and stale entries in Dynamic and Static Discovery.
Here are some tests I did:
To sum up, if an iSCSI datastore disappeared on an ESXi host and isn't showing up no matter how many times you reboot or rescan, try this:
This worked for me and I hope this works for someone else in a similar situation.
