marauli's Posts

Five hours into this rabbit hole (figuring out how to download, deploy, and configure this thing), it looks like it requires a peer vCenter instance? ... which is something we don't have: all sites and ESXis are managed with a single central vCenter instance. I.e., vSphere Replication is not for us?
P.S. The "vSphere Replication" article you linked to lists this as one of the use cases: "Data protection locally, within a single site". Does this use case still require a "peer" vCenter instance? If not:
- Does it do what I need it to do? (Replicate VMs between two hosts in a remote site.)
- Is there a decent "explain this to me like I am a 10-year-old" article telling me how to configure it and what the prerequisites are?
- Is it overkill for this use case? (Deploying and configuring this appliance at each of the 15+ remote sites, where support resources are limited, i.e. we can't afford to spend hours configuring and maintaining it.)

Looks like, with respect to scripting, fairly similar questions are already covered here: Clone a VM with a script ... where a PowerCLI script could check for an existing clone, delete it if needed, and start a new clone. "Scheduled Tasks" was a promising option until it became clear that a new clone would fail if there's an existing clone with the same name, and that the existing clone couldn't be renamed or deleted via "Scheduled Tasks". The two remaining questions are:
- What are the best practices for hosting such a script? (Where to run it from - some random Windows machine? How to securely store the needed credentials? How to maintain it - e.g. track changes?) A rough sketch of the credentials part is below.
- Does vSphere DR/HA offer a good alternative? (I.e. can this task be automated there?)

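For the credentials part, something like PowerCLI's built-in credential store is what I have in mind - a minimal sketch, assuming the script runs under a dedicated Windows service account (the server and user names are placeholders):

# One-time setup, run as the Windows account that will own the scheduled task:
# store the vCenter credentials in PowerCLI's credential store (on Windows the
# password is, as far as I understand, encrypted with DPAPI and tied to this account).
New-VICredentialStoreItem -Host 'vcenter.example.com' -User 'svc-clone' -Password '********'

# In the scheduled script itself, no plaintext password is needed;
# Connect-VIServer finds the stored item for this host automatically.
Connect-VIServer -Server 'vcenter.example.com'

Keeping the script itself in a git repo would cover the change-tracking part.
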
What VM replication or cloning mechanism is most suitable for remote sites, each running two ESXi hosts with local storage, with the goal of protecting the site from a single host failure?
Existing license: vSphere Standard for Retail and Branch Offices (no DR or HA included).
- 15+ remote sites, each with two ESXi hosts.
- Each ESXi host runs a unique set of VMs.
- Each host has local storage only. No shared storage is used.
- The objective is to protect sites from single-host failures by regularly (once a day) replicating all VMs across hosts within each site. If one of the hosts goes down or loses VMs (e.g. if its internal storage gets corrupted), the other host would be able to bring those VM clones (replicas) up.
- All sites and ESXis are managed with a single central vCenter instance.
What I imagine (being fairly new to VMware) is something like this:
- Clone all VMs from one host to the other (and the other way around), replacing the previous clones, every night at 3am. (The "replacing previous clones" part is fairly important, as the hosts have limited storage and are unlikely to accommodate more than one clone of each VM from the other side.)
- Should one of the hosts fail, powering up the clones can (and should) be done manually, as some reconfiguration will be needed. I.e. there is no requirement, for the time being, for the cloned VMs to be powered up automatically should the original ones fail.
- The mechanism has the added benefit of snapshot-like short-term data backup: should data loss occur within one of the VMs (e.g. due to human error), that data can be restored from the clone of the VM made the previous night. (Snapshots are generally better for this purpose, of course - I am just describing one of the considerations.)
- If this cloning can be done differentially, i.e. by updating only the changed data in the previously made clone - that would be awesome.
Can this be done exactly as described above using any of the vSphere built-in services like DR or HA? (If not exactly as described, what's the next best option?) If not, can something like this be scripted fairly easily, especially the "replace existing clones during cloning" part? Thanks!

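P.S. To make the "replace existing clones" part concrete, here is the kind of PowerCLI logic I imagine running nightly - an untested sketch; the host names, datastore names, and "-replica" suffix are all placeholders:

# Nightly clone-and-replace between the two hosts of one site (sketch).
Connect-VIServer -Server 'vcenter.example.com'

$pairs = @(
    @{ Source = 'site01-esxi-a'; Target = 'site01-esxi-b'; Datastore = 'site01-b-local' },
    @{ Source = 'site01-esxi-b'; Target = 'site01-esxi-a'; Datastore = 'site01-a-local' }
)

foreach ($p in $pairs) {
    # Clone every VM on the source host, skipping the replicas themselves.
    $vms = Get-VMHost $p.Source | Get-VM | Where-Object { $_.Name -notlike '*-replica' }
    foreach ($vm in $vms) {
        $cloneName = "$($vm.Name)-replica"

        # Delete last night's clone first, so the host's limited storage
        # never has to hold two replicas of the same VM.
        $old = Get-VM -Name $cloneName -ErrorAction SilentlyContinue
        if ($old) { Remove-VM -VM $old -DeletePermanently -Confirm:$false }

        # Full clone onto the peer host's local datastore, left powered off.
        New-VM -Name $cloneName -VM $vm -VMHost (Get-VMHost $p.Target) `
               -Datastore (Get-Datastore $p.Datastore) | Out-Null
    }
}

One caveat with delete-then-clone: there's a window with no replica at all while the new clone is being made. Cloning to a temporary name first and deleting/renaming only after success would close that window, at the cost of briefly needing space for two clones.
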
The culprit wasn't network port binding - it seems to have something to do with ESXis holding on to stale paths and items in Dynamic and Static Discovery. Here are some tests I did:

1. Enabled (deliberately misconfigured) network port binding - on unrelated vmks not connected to the iSCSI targets in question, and on vmks not in the same local subnet - i.e. a deliberate misconfiguration, to see if it would cause issues with an iSCSI target and datastore attached to a different vmk:
- The datastore on the iSCSI target in question was still accessible and browsable.
- Rescanned the adapter - no change.
- Rebooted the ESXi host - no change.
- Disconnected the datastore in question by removing the items in Static and Dynamic Discovery and rescanning the adapter - the datastore was no longer accessible.
- Added the needed items back to Dynamic Discovery and rescanned the adapter - the device and the datastore showed up.
(I think this demonstrates that the misconfigured network port binding had no effect on this issue.)

2. CHAP misconfiguration resulting in lost access to the datastore, restored only by removing and re-adding the target - i.e. by removing the relevant items in Dynamic and Static Discovery, rescanning, and re-adding them:
- Configured CHAP (incoming) authentication on the target (no changes on the ESXi hosts - not yet) - the datastore became inaccessible.
- Tried to configure one of the ESXi hosts to use CHAP authentication. No luck - likely something I was doing wrong. Or, possibly, it's the same "stale info" issue, and I needed to remove and re-add the target before the ESXis could access the datastore with CHAP configured.
- Cleared the CHAP configs on both sides, restarted the iSCSI service on the target, rescanned the adapter - no change.
- Rebooted the target, rebooted the ESXi host, verified the iSCSI configuration, rescanned several times - no change. The datastore was still inaccessible.
- Removed the relevant items in Dynamic and Static Discovery and rescanned the adapter - thus effectively removing the datastore.
- Re-added them on one (not all) of the ESXi hosts - the datastore magically showed up. Surprisingly, it also showed up and became accessible on all the other ESXi hosts, without any changes there - without even rescanning or refreshing anything. (Weird, isn't it?)

I think this was the original issue I experienced. The datastore was available and could be connected all along; I just needed to not merely rescan the adapter, but perform the above steps of effectively removing the iSCSI target, rescanning, re-adding it, and hoping the device and the datastore would show up.

To sum up, if an iSCSI datastore disappeared on an ESXi host and isn't showing up no matter how many times you reboot or rescan, try this:
1. Remove the target and the datastore by:
- removing the relevant items in the Static and Dynamic Discovery tabs of the "iSCSI Software Adapter" configuration on the ESXi host;
- rescanning the adapter and confirming that the relevant device(s) and the datastore are gone.
2. Re-add the target by adding the relevant items back to Static and Dynamic Discovery and rescanning the adapter.

This worked for me, and I hope it works for someone else in a similar situation. (A PowerCLI sketch of these remove/re-add steps is below.)

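For anyone who would rather script the above than click through the UI, here's roughly what it would look like in PowerCLI - a sketch; the host name, target address, and port are placeholders for your environment:

# "Remove the target, rescan, re-add, rescan" in PowerCLI.
Connect-VIServer -Server 'vcenter.example.com'

$vmhost = Get-VMHost 'site01-esxi-a'
$hba    = Get-VMHostHba -VMHost $vmhost -Type IScsi
$addr   = '10.0.0.50'   # the iSCSI target's IP

# 1. Remove the stale dynamic (Send Targets) and static entries for this target...
Get-IScsiHbaTarget -IScsiHba $hba | Where-Object { $_.Address -eq $addr } |
    Remove-IScsiHbaTarget -Confirm:$false

# ...and rescan so the device and datastore disappear.
Get-VMHostStorage -VMHost $vmhost -RescanAllHba -RescanVmfs | Out-Null

# 2. Re-add the dynamic discovery entry (static entries get rediscovered)...
New-IScsiHbaTarget -IScsiHba $hba -Address $addr -Port 3260 -Type Send | Out-Null

# ...and rescan again; the device and datastore should come back.
Get-VMHostStorage -VMHost $vmhost -RescanAllHba -RescanVmfs | Out-Null
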
Resolved - all four ESXis can now see the iSCSI target in question. A misconfigured iSCSI port binding was likely one of the culprits, as @a_p_ suggested. The other was likely some stale configuration, where rebooting, rescanning, detaching stale devices, then rebooting and rescanning several more times seems to have helped. Thank you all very, very much for the help! (Been at it for quite some time; feels good to make some progress.)

"did you ever perform a Rescan operation from your ESXi Hosts (or Cluster wide) to verify if the target (and its LUNs) are still visible after the Rescan?"

At least a few dozen times - I rescan any time I try a change to the network or storage configuration. I have a feeling something on the iSCSI target (r730b) might be misconfigured, given that it disappeared even from ESXis where it was present before, while my test iSCSI target (on the same Ubuntu version) seems to be working OK and is visible on multiple ESXis. Will have to dive into the Ubuntu networking and iSCSI (tgt) configuration.

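(For what it's worth, the rescans I keep running amount to this PowerCLI one-liner - the cluster name is a placeholder:)

# Rescan all HBAs and VMFS volumes on every host in the cluster at once.
Get-Cluster 'Site01' | Get-VMHost | Get-VMHostStorage -RescanAllHba -RescanVmfs
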
"yet it doesn't appear to affect anything."

I was wrong on that one. Put one of the affected hosts in maintenance mode, removed the binding, rescanned the adapter - and while the ESXi still doesn't see the iSCSI target in question, it does now see a test one I created just yesterday on an Ubuntu VM. (In green are the paths to the new test iSCSI target; in red, the paths to the iSCSI target the hosts can no longer see.) Another new development: the 4th ESXi, which used to see the iSCSI target in question, can no longer see it. (The only change was that the server running that iSCSI storage - Ubuntu Linux on a Dell r730 - was updated via 'apt-get upgrade' and rebooted today.) With all this, is the culprit possibly in the Ubuntu Linux iSCSI service (tgt) configuration?

Agreed, that was my bad - I enabled it before reading up on it (though after this issue occurred) - yet it doesn't appear to affect anything. The first three hosts have the same configuration; the fourth one doesn't (as it's not connected to the direct-attached ME4024 iSCSI target, unlike the first three). As soon as I can put one of them in maintenance mode, I'll remove the binding and see if that does anything - but I am not expecting it to. (Putting a host in maintenance mode is tricky, as our cluster is running low on memory - so I am trying to tread gently.)

"Are you able to ping the iSCSI target(s) IP addresses using vmkping -I vmk0 -d -s 1450 <ip address iSCSI target>?"

[root@***-ESXi-01:~] vmkping -I vmk0 -d -s 1450 <IP>
PING <IP> (<IP>): 1450 data bytes
1458 bytes from <IP>: icmp_seq=0 ttl=64 time=0.224 ms
1458 bytes from <IP>: icmp_seq=1 ttl=64 time=0.277 ms
1458 bytes from <IP>: icmp_seq=2 ttl=64 time=0.278 ms
--- <IP> ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.224/0.260/0.278 ms

"How do the NIC stats look using esxcli network nic stats get -n vmnicX?"

NIC statistics for vmnic0
  Packets received: 1696266074
  Packets sent: 2119143975
  Bytes received: 645191082460
  Bytes sent: 2243595138013
  Receive packets dropped: 0
  Transmit packets dropped: 0
  Multicast packets received: 108342319
  Broadcast packets received: 30483370
  Multicast packets sent: 1932358
  Broadcast packets sent: 1573782
  Total receive errors: 0
  Receive length errors: 0
  Receive over errors: 0
  Receive CRC errors: 0
  Receive frame errors: 0
  Receive FIFO errors: 0
  Receive missed errors: 0
  Total transmit errors: 0
  Transmit aborted errors: 0
  Transmit carrier errors: 0
  Transmit FIFO errors: 0
  Transmit heartbeat errors: 0
  Transmit window errors: 0

NIC statistics for vmnic1
  Packets received: 471521582
  Packets sent: 212317143
  Bytes received: 314177704382
  Bytes sent: 80328464786
  Receive packets dropped: 0
  Transmit packets dropped: 0
  Multicast packets received: 110083599
  Broadcast packets received: 32038626
  Multicast packets sent: 257459
  Broadcast packets sent: 19461
  (all "error" values - 0)

NIC statistics for vmnic2
  Packets received: 140071964
  Packets sent: 0
  Bytes received: 13097333291
  Bytes sent: 0
  Receive packets dropped: 0
  Transmit packets dropped: 0
  Multicast packets received: 107577113
  Broadcast packets received: 32255292
  Multicast packets sent: 0
  Broadcast packets sent: 0
  (all "error" values - 0)

NIC statistics for vmnic3
  Packets received: 140073820
  Packets sent: 0
  Bytes received: 13097515909
  Bytes sent: 0
  Receive packets dropped: 0
  Transmit packets dropped: 0
  Multicast packets received: 107578538
  Broadcast packets received: 32255717
  Multicast packets sent: 0
  Broadcast packets sent: 0
  (all "error" values - 0)

NIC statistics for vmnic4
  Packets received: 12515272569
  Packets sent: 2463924551
  Bytes received: 17668531459730
  Bytes sent: 2695041739606
  Receive packets dropped: 0
  Transmit packets dropped: 0
  Multicast packets received: 0
  Broadcast packets received: 2
  Multicast packets sent: 0
  Broadcast packets sent: 1699
  (all "error" values - 0)

NIC statistics for vmnic5
  Packets received: 140778
  Packets sent: 266174
  Bytes received: 16919128
  Bytes sent: 24241914
  Receive packets dropped: 0
  Transmit packets dropped: 0
  Multicast packets received: 0
  Broadcast packets received: 2
  Multicast packets sent: 0
  Broadcast packets sent: 1695
  (all "error" values - 0)

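P.S. In case it helps anyone collect these stats across all hosts in one go, a quick sketch via PowerCLI's esxcli wrapper - the cluster name is a placeholder:

foreach ($vmhost in Get-Cluster 'Site01' | Get-VMHost) {
    $esxcli = Get-EsxCli -VMHost $vmhost -V2
    foreach ($nic in Get-VMHostNetworkAdapter -VMHost $vmhost -Physical) {
        "--- $($vmhost.Name) / $($nic.Name) ---"
        # Same data as 'esxcli network nic stats get -n vmnicX' run on the host.
        $esxcli.network.nic.stats.get.Invoke(@{ nicname = $nic.Name })
    }
}
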
Sorry about that - it's all very confusing to me too. vmk1 and vmk2 should not handle connections to the iSCSI target in question (which is located on a switched network). Only vmk0 can handle those, as that's the only adapter with a path to it. (vmk1 and vmk2 are adapters for direct-attached connections only, i.e. for targets not on a switched network.) If that still doesn't clear it up - let me know.

Could this be relevant? All iSCSI connection failures occur on vmk1 and vmk2 (which have no network connection to the target), and there isn't anything for vmk0 - which is the vmkernel NIC that has the network path to the iSCSI target in question.

2023-08-07T22:49:25.324Z: [iscsiCorrelator] 569241542us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.1988-11.com.dell:01.array.bc305bf24e32 on vmhba64 @ vmk2 failed. The iSCSI initiator could not establish a network connection to the target.
2023-08-07T22:49:25.325Z: [iscsiCorrelator] 569233690us: [vob.iscsi.target.connect.error] vmhba64 @ vmk2 failed to login to iqn.1988-11.com.dell:01.array.bc305bf24e32 because of a network connection failure.
2023-08-07T22:49:25.326Z: [iscsiCorrelator] 569242709us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.1988-11.com.dell:01.array.bc305bf24e32 on vmhba64 @ vmk2 failed. The iSCSI initiator could not establish a network connection to the target.
2023-08-07T22:49:25.326Z: [iscsiCorrelator] 569234520us: [vob.iscsi.target.connect.error] vmhba64 @ vmk1 failed to login to iqn.1988-11.com.dell:01.array.bc305bf24e32 because of a network connection failure.
2023-08-07T22:49:25.326Z: [iscsiCorrelator] 569243520us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.1988-11.com.dell:01.array.bc305bf24e32 on vmhba64 @ vmk1 failed. The iSCSI initiator could not establish a network connection to the target.
2023-08-07T22:49:25.327Z: [iscsiCorrelator] 569235334us: [vob.iscsi.target.connect.error] vmhba64 @ vmk1 failed to login to iqn.1988-11.com.dell:01.array.bc305bf24e32 because of a network connection failure.
2023-08-07T22:49:25.327Z: [iscsiCorrelator] 569244253us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.1988-11.com.dell:01.array.bc305bf24e32 on vmhba64 @ vmk1 failed. The iSCSI initiator could not establish a network connection to the target.

vmk1 and vmk2 are kernel NICs dedicated to direct-attached iSCSI connections. They do not have a path to the iSCSI target in question, which is on a switched network. I guess I am puzzled why the ESXi doesn't attempt to connect to the target using vmk0, and what I can do to force it to.

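P.S. If the bound vmkernel NICs are the problem, my understanding is that listing and removing the bindings would look roughly like this via PowerCLI's esxcli wrapper (an untested sketch; the vmhba/vmk names are from the logs above, the host name is a placeholder). With no bindings left, the initiator should fall back to choosing the outgoing vmkernel interface via the host's routing table - i.e. vmk0 here:

$vmhost = Get-VMHost 'site01-esxi-a'
$esxcli = Get-EsxCli -VMHost $vmhost -V2

# Equivalent of 'esxcli iscsi networkportal list' on the host:
# shows which vmk NICs are bound to the software iSCSI adapter.
$esxcli.iscsi.networkportal.list.Invoke()

# Unbind vmk1 and vmk2 so logins stop being attempted through them.
foreach ($vmk in 'vmk1', 'vmk2') {
    $esxcli.iscsi.networkportal.remove.Invoke(@{ adapter = 'vmhba64'; nic = $vmk })
}

# Rescan afterwards so the path/device changes take effect.
Get-VMHostStorage -VMHost $vmhost -RescanAllHba | Out-Null
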
Sorry it's taking me so long to respond! (Was fighting other fires.)

i. Could you confirm that it was only the switches' firmware that was updated, and that no other parts of the infrastructure or configuration were changed (i.e. that switches were not swapped out, and that switch config, routing, cabling, and any config on the ESXi or iSCSI target were not changed in any way whatsoever)?

To my knowledge, just the firmware - although I can't be 100% sure. The network infra is handled by someone else, and I have limited access to it. On the VMware side I do have full access, and I do not see any changes made to the ESXis or to the iSCSI system. (One possibly relevant bit of info: the affected ESXis (3 out of 4) pre-date me joining the team, i.e. they were originally configured by someone else. The last one, which is unaffected by the change, was added to the cluster and configured by yours truly. I pored over the network config pages on all ESXis trying to zero in on what could be different between the affected and unaffected ESXis - can't find anything. Did the same in Meraki - ditto.)

ii. How many switches are involved? If more than one, which ESXi and iSCSI target is connected to which switch?

Around 4-5: there are four 10Gb ports on each switch, each connected system uses 2 of them, and between 4 ESXis and 1 iSCSI target, ten 10Gb ports are used across a number of switches.

iii. Is there only one iSCSI target for all of the four ESXi servers?

Two for the first three ESXis, one for the last one. The 2nd target is a Dell/EMC ME4024 flash array direct-attached (via direct 10Gb links, no switches involved) to the first three ESXis.

iv. Are all of the switches at the same firmware version?
v. Are all the switches the same make/model/version?
vi. Have you reviewed the switches' firmware update notes to determine what was changed?

Yes; yes; and I see no notes.

vii. Are you using VLANs?

We do use VLANs, and the ports on the switches are configured the same way across all ESXis and the iSCSI target - at least while we're troubleshooting the issue:

Type: Trunk
Native VLAN: <masked>
Allowed VLANs: all
Access policy: Open

viii. You mentioned that you spun up an 'older standalone ESXi 6.7' and a 'Windows Desktop', both of which could "also see the device". Would I be correct to assume 'see the device' means that they could see the iSCSI target in question and mount the storage?

Correct.

ix. How many network connections do each of the ESXi servers have?
x. How many network connections does the iSCSI target have?

The first three (affected) ESXis: six total:
- two 10Gb NICs for general traffic, vMotion, and switched iSCSI;
- two 1Gb legacy NICs: still connected but no longer active (no port groups, vSwitches, or VMkernel adapters are attached to them);
- two 10Gb NICs for direct-attached iSCSI (the ME4024 mentioned above).
The 4th (unaffected) ESXi: two 10Gb NICs for general traffic, vMotion, and switched iSCSI (it's not connected to the ME4024).
iSCSI target: two 10Gb NICs; only one is active now - the 2nd one is connected to a switch port that was disabled by our network admin because, for some reason, Meraki raised IP conflict alarms on the two ports for the target despite no apparent conflicts (the IPs are different).

"- I would suggest moving one of the ESXi's that cannot connect to the iSCSI target to one of the switch network ports that you know is working (either the ESXi that works after the switches' firmware upgrade, the ESXi 6.7, or the Win Desktop (assuming 'viii' to be correct))."

"- As an experiment, you could exclude the switches from the equation altogether and make an appropriate direct connection ... or alternatively via another type/make of switch."

Thank you! I'll check with the network admin on both options.

@marauli wrote: "The question then is, how do I bypass those "vmware.esximage.Errors.MetadataFormatError: File is not a zip file" errors and patch the server?"

Turns out, I needed to reboot the host not once but twice to get past 'esxupdate error codes: -1' and all the related errors in the log. The original question was a bad one, as the article and the script are not applicable to ESXi 6.7, it seems.

My bad: the article suggesting to run the script only applies to ESXi 7, and this host is on 6.7. The question then is, how do I bypass those "vmware.esximage.Errors.MetadataFormatError: File is not a zip file" errors and patch the server?

On ESXi 6.7.0, build 19898906, trying to run 'cleanup_patch_store.sh' as suggested in "VMware vSphere Lifecycle Manager fails to download patches after vCenter Server upgrade from 5.5.x up through to 7.0.x (80227)" results in:

./cleanup_patch_store.sh
Traceback (most recent call last):
  File "./worker.py", line 10, in <module>
    from vumUtils import parseConfigFile
ImportError: No module named 'vumUtils'

What do I need to do to fix this?

---

Background: I can't update the server using vCenter because of:

The host returns esxupdate error codes: -1. Check the Lifecycle Manager log files and esxupdate log files for more details.

... and:

2023-08-25T16:09:20Z esxupdate: 2100557: Metadata.pyc: INFO: Reading metadata zip /tmp/tmp6hx5xtx4
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: An esxupdate error exception was caught:
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: Traceback (most recent call last):
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/build/mts/release/bora-19898906/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Metadata.py", line 64, in ReadMetadataZip
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/build/mts/release/bora-19898906/bora/build/esx/release/vmvisor/sys-boot/lib64/python3.5/zipfile.py", line 1026, in __init__
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/build/mts/release/bora-19898906/bora/build/esx/release/vmvisor/sys-boot/lib64/python3.5/zipfile.py", line 1093, in _RealGetContents
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: zipfile.BadZipFile: File is not a zip file
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR:
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: During handling of the above exception, another exception occurred:
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR:
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: Traceback (most recent call last):
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/usr/sbin/esxupdate", line 239, in main
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: cmd.Run()
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/build/mts/release/bora-19898906/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esx5update/Cmdline.py", line 105, in Run
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/build/mts/release/bora-19898906/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Transaction.py", line 88, in DownloadMetadatas
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: File "/build/mts/release/bora-19898906/bora/build/esx/release/vmvisor/esxupdate/lib64/python3.5/site-packages/vmware/esximage/Metadata.py", line 68, in ReadMetadataZip
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: ERROR: vmware.esximage.Errors.MetadataFormatError: File is not a zip file
2023-08-25T16:09:20Z esxupdate: 2100557: esxupdate: DEBUG: <<<

... which led me to the above-mentioned article and its instructions to download and run a cleanup script. I got through this error:

-sh: ./cleanup_patch_store.sh: Permission denied

... by running:

chmod +x cleanup_patch_store.sh

I even got through this error, thanks to e.g. the "run bash script" discussion (thank you!):

-sh: ./cleanup_patch_store.sh: not found

... by updating the first line in the script from

#!/bin/bash

to:

#!/bin/sh

... and now I am getting the "ImportError: No module named 'vumUtils'" error. (Do I really need to install 'pip' or something like that, and install that Python package?) Appreciate any help!

From what I can tell, no jumbo frames. MTU size is the default 1500.

Pinging and port connectivity - no issues, e.g.:

[***@*****ESXi01:~] nc -z <iSCSI tgt IP> 3260
Connection to <iSCSI tgt IP> 3260 port [tcp/*] succeeded!

Thank you - some of the steps in those articles may be exactly what I was looking for - I'll try them.

Check network connectivity: vmkping -I <SW iSCSI vmkernel> <Target_IP>
Check SW iSCSI port: nc -z <Target_IP> 3260

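In case SSH to the hosts is locked down, I believe the same two checks can also be run remotely - a sketch via PowerCLI's esxcli wrapper, with the host name and target IP as placeholders:

# MTU-sized, don't-fragment ping from a specific vmkernel interface
# (the equivalent of 'vmkping -I vmk0 -d -s 1450 <target>').
$esxcli = Get-EsxCli -VMHost (Get-VMHost 'site01-esxi-a') -V2
$esxcli.network.diag.ping.Invoke(@{
    host      = '10.0.0.50'   # iSCSI target IP
    interface = 'vmk0'        # vmkernel NIC that should carry iSCSI
    size      = 1450          # payload size
    df        = $true         # don't fragment, to catch MTU mismatches
    count     = 3
})

# Quick check that the target is listening on 3260 at all (runs from the
# admin workstation, so it only proves the target side, not the vmk path).
Test-NetConnection -ComputerName '10.0.0.50' -Port 3260
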
Thanks @trobertson! The switches in question are MS225-48FP (the ESXis and iSCSI devices connect to the 10Gb ports on them), and I don't believe trying a different switch is an option. Perhaps what I am looking for is something like `telnet <iSCSI target's IP> 3260` from an affected ESXi host. If this gets dropped or times out - whereas the behavior on an unaffected ESXi is different - that would confirm the issue is likely with the switches (or, less likely, with the ESXi network configuration or the iSCSI target's firewall). I.e. basic troubleshooting steps from someone more knowledgeable about ESXis.

Our Meraki network switches had their firmware updated, and all of a sudden 3 out of 4 ESXis lost connectivity to an iSCSI target. (The target is a Dell r730 server running Ubuntu 22.04.3 LTS whose sole purpose is being iSCSI storage for a VMware cluster. The ESXi hosts are on 7.0.3.)

When scanning the iSCSI storage adapter on an ESXi host that can no longer mount the datastore, the host appears to recognize the LUN: it adds an item to "static targets" - based, I am assuming, on scanning the dynamic ones. (The device / target / LUN is highlighted in green.) Yet I don't see it presented as a "device" (which I could mount as a datastore, or where I could create one) under "devices".

For comparison, here is one of the ESXis that can see the device: it shows as "degraded" (probably because of a lack of NIC redundancy - where would I look to confirm?) - yet it does show up, and I can seemingly create a datastore on that target. I also spun up an older standalone ESXi 6.7, and it can also see the device. My Windows desktop - ditto.

How would I troubleshoot this issue on the ESXi hosts that can't seem to recognize the iSCSI target as a valid device? Thanks!

P.S. (Edit) '/var/log/vobd.log' has a number of these, pointing to a network configuration issue:

2023-08-04T23:31:20.246Z: [iscsiCorrelator] 624003451802us: [vob.iscsi.target.connect.error] vmhba64 @ vmk1 failed to login to iqn.1988-11.com.dell:01.array.bc305bf24e32 because of a network connection failure.
2023-08-04T23:31:20.246Z: [iscsiCorrelator] 624000365087us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.1988-11.com.dell:01.array.bc305bf24e32 on vmhba64 @ vmk1 failed. The iSCSI initiator could not establish a network connection to the target.
2023-08-07T16:25:21.187Z: [iscsiCorrelator] 857645568069us: [vob.iscsi.discovery.connect.error] discovery failure on vmhba64 to r730b-00.datastores.infra.<masked>.com because of a network connection failure.
2023-08-07T16:25:21.187Z: [iscsiCorrelator] 857641306178us: [esx.problem.storage.iscsi.discovery.connect.error] iSCSI discovery to r730b-00.datastores.infra.<masked>.com on vmhba64 failed. The iSCSI Initiator could not establish a network connection to the discovery address.

What command could I run on the affected ESXi hosts to confirm the lack of necessary connectivity to the target?

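To compare the hosts side by side, something like this PowerCLI sketch might help (the IQN is from the log above; the cluster name is a placeholder, and it assumes one software iSCSI adapter per host). It should show, per host, whether the discovery entries exist and whether a disk device actually appeared behind the adapter:

$iqn = 'iqn.1988-11.com.dell:01.array.bc305bf24e32'

foreach ($vmhost in Get-Cluster 'Site01' | Get-VMHost) {
    # Software iSCSI adapter (vmhba64 in the logs above).
    $hba = Get-VMHostHba -VMHost $vmhost -Type IScsi

    # Discovery entries: the dynamic (Send) target plus any static entry for this IQN.
    $targets = Get-IScsiHbaTarget -IScsiHba $hba |
        Where-Object { $_.Type -eq 'Send' -or $_.IScsiName -eq $iqn }

    # Disk devices that actually showed up behind that adapter after a rescan.
    $luns = Get-ScsiLun -VmHost $vmhost -LunType disk |
        Where-Object { $_.RuntimeName -like "$($hba.Device)*" }

    '{0}: {1} discovery entries, {2} disk device(s) on {3}' -f
        $vmhost.Name, @($targets).Count, @($luns).Count, $hba.Device
}
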