I have an issue in my testing environment. I have 3 ESXi 4.1 Update 2 hosts, each with 2 single-port fibre cards installed, and 1 ESX 4.0 host with 1 single-port fibre card installed, connected to the SAN in the manner outlined in the graphic.
The 3 ESXi 4.1 hosts are connected in the same manner as the 4.0 host, but the extra HBA is connected to the other SAN switch. I hope this makes sense; please let me know if it doesn’t!
When adding a LUN as a VMFS Datastore to any of the ESXi 4.1 hosts, it adds without problems regardless of which controller the LUN is using as its preferred path. The SAN, by the way, is a DS3500 with dual controllers, A and B.
There are two HOST groups created on the SAN, 1 for the 3 ESXi 4.1 hosts and 1 with the single ESX4.0 host.
My problem is adding a VMFS Datastore to the ESX 4.0 host. When the LUN is using controller B as its path, the Datastore creation fails at the point where it is creating the store, i.e. the host can see the LUN and it is available for selection when adding storage to the host, but the Add Storage wizard is slow and eventually fails at the creating VMFS partition step.
If however the LUN path is via controller A the add storage wizard is snappy and the Datastore is created successfully.
I’m pretty sure my zoning is correct because the HOST can actually see the LUN, it just errors out when adding as a Datastore.
Is my problem the fact that the ESX 4.0 host only has the one HBA? I’m not so sure about this because I have another environment where this is not an issue.
Cheers
David
Can you check /var/log/vmkernel on the ESX 4.0 host?
Hi John
I have copied the message log after attempting to add a datastore using a LUN on controller B. There are a lot of errors.
Cheers
David
Dec 12 09:09:08 TEMP kernel: [313066.903613] Vendor: IBM Model: 1746 FAStT Rev: 1070
Dec 12 09:09:08 TEMP kernel: [313066.910841] Type: Direct-Access ANSI SCSI revision: 05
Dec 12 09:09:08 TEMP kernel: [313066.946492] SCSI device sdg: 10485760 512-byte hdwr sectors (5369 MB)
Dec 12 09:09:08 TEMP kernel: [313066.954043] sdg: Write Protect is off
Dec 12 09:09:08 TEMP kernel: [313066.962047] SCSI device sdg: drive cache: write back w/ FUA
Dec 12 09:09:08 TEMP kernel: [313066.969731] SCSI device sdg: 10485760 512-byte hdwr sectors (5369 MB)
Dec 12 09:09:08 TEMP kernel: [313066.977325] sdg: Write Protect is off
Dec 12 09:09:08 TEMP kernel: [313066.985083] SCSI device sdg: drive cache: write back w/ FUA
Dec 12 09:09:08 TEMP kernel: [313066.992273] sdg:end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:08 TEMP kernel: [313067.503176] printk: 11 messages suppressed.
Dec 12 09:09:08 TEMP kernel: [313067.510341] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:09 TEMP kernel: [313068.019626] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:09 TEMP kernel: [313068.026825] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:09 TEMP kernel: [313068.536162] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:09 TEMP kernel: [313068.543383] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:10 TEMP kernel: [313069.052752] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:10 TEMP kernel: [313069.059984] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:10 TEMP kernel: [313069.569496] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:10 TEMP kernel: [313069.576682] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:11 TEMP kernel: [313070.085851] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:11 TEMP kernel: [313070.093015] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:11 TEMP kernel: [313070.100284] unable to read partition table
Dec 12 09:09:11 TEMP kernel: [313070.107706] sd 3:0:8:0: Attached scsi disk sdg
Dec 12 09:09:11 TEMP kernel: [313070.114958] sd 3:0:8:0: Attached scsi generic sg9 type 0
Dec 12 09:09:12 TEMP kernel: [313071.152288] end_request: I/O error, dev sdg, sector 10485632
Dec 12 09:09:12 TEMP kernel: [313071.159476] Buffer I/O error on device sdg, logical block 1310704
Dec 12 09:09:12 TEMP kernel: [313071.668848] end_request: I/O error, dev sdg, sector 10485632
Dec 12 09:09:12 TEMP kernel: [313071.676014] Buffer I/O error on device sdg, logical block 1310704
Dec 12 09:09:13 TEMP kernel: [313072.185378] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:13 TEMP kernel: [313072.192598] Buffer I/O error on device sdg, logical block 1310719
Dec 12 09:09:13 TEMP kernel: [313072.701984] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:13 TEMP kernel: [313072.709206] Buffer I/O error on device sdg, logical block 1310719
Dec 12 09:09:14 TEMP kernel: [313073.218512] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:14 TEMP kernel: [313073.225714] Buffer I/O error on device sdg, logical block 1310719
Dec 12 09:09:14 TEMP kernel: [313073.735050] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:15 TEMP kernel: [313074.251620] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:15 TEMP sfcb[4082]: storelib Physical Device Device ID : 0x9
Dec 12 09:09:15 TEMP last message repeated 20 times
Dec 12 09:09:15 TEMP kernel: [313074.768183] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:16 TEMP kernel: [313075.284809] end_request: I/O error, dev sdg, sector 10485696
Dec 12 09:09:16 TEMP kernel: [313075.801274] end_request: I/O error, dev sdg, sector 10485744
Dec 12 09:09:17 TEMP kernel: [313076.317832] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:17 TEMP kernel: [313076.834411] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:18 TEMP kernel: [313077.350938] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:19 TEMP kernel: [313077.867524] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:19 TEMP kernel: [313077.874752] printk: 8 messages suppressed.
Dec 12 09:09:19 TEMP kernel: [313077.881921] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:19 TEMP kernel: [313078.400820] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:20 TEMP kernel: [313078.917292] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:20 TEMP kernel: [313079.433835] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:21 TEMP kernel: [313079.950409] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:21 TEMP kernel: [313080.466961] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:22 TEMP kernel: [313080.983517] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:22 TEMP kernel: [313081.500061] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:23 TEMP kernel: [313082.016644] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:23 TEMP kernel: [313082.533170] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:23 TEMP kernel: [313082.540378] printk: 8 messages suppressed.
Dec 12 09:09:23 TEMP kernel: [313082.547556] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:24 TEMP kernel: [313083.066397] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:24 TEMP kernel: [313083.582944] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:25 TEMP kernel: [313084.099584] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:25 TEMP kernel: [313084.616080] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:26 TEMP kernel: [313085.132638] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:26 TEMP kernel: [313085.649177] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:27 TEMP kernel: [313086.165750] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:27 TEMP kernel: [313086.682332] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:28 TEMP kernel: [313087.194647] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:28 TEMP kernel: [313087.715422] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:28 TEMP kernel: [313087.722586] printk: 9 messages suppressed.
Dec 12 09:09:28 TEMP kernel: [313087.729790] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:29 TEMP kernel: [313088.248635] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:29 TEMP kernel: [313088.765189] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:30 TEMP kernel: [313089.281739] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:30 TEMP kernel: [313089.798443] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:31 TEMP kernel: [313090.314866] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:32 TEMP kernel: [313090.831416] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:32 TEMP kernel: [313091.347979] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:33 TEMP kernel: [313091.981157] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:33 TEMP kernel: [313092.497799] end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:33 TEMP kernel: [313092.504990] printk: 11 messages suppressed.
Dec 12 09:09:33 TEMP kernel: [313092.512178] Buffer I/O error on device sdg, logical block 0
Dec 12 09:09:59 TEMP sfcb[4082]: storelib Physical Device Device ID : 0x9
Dec 12 09:10:45 TEMP last message repeated 42 times
End of log reached.
I can't make much sense of the log messages; they all just point to /dev/sdg.
Is that the same volume you are trying to add?
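One way to get more out of a log like the one above is to tally the I/O errors by sector and compare them with the device size the kernel reported. The following is only a minimal sketch: the sample lines are copied from the posted log, while the function names (`summarise`, `describe`) are made up for illustration. In the posted log the failing sectors are sector 0 and sectors within the last 128 sectors of the 10485760-sector device, which would be consistent with the host being unable to read the LUN at all while probing for a partition table, rather than with one bad spot on the disk.

```python
import re
from collections import Counter

# A few lines copied from the posted log; paste the full log here in practice.
SAMPLE_LOG = """\
Dec 12 09:09:08 TEMP kernel: [313066.946492] SCSI device sdg: 10485760 512-byte hdwr sectors (5369 MB)
Dec 12 09:09:08 TEMP kernel: [313066.992273] sdg:end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:12 TEMP kernel: [313071.152288] end_request: I/O error, dev sdg, sector 10485632
Dec 12 09:09:13 TEMP kernel: [313072.185378] end_request: I/O error, dev sdg, sector 10485752
Dec 12 09:09:13 TEMP kernel: [313072.701984] end_request: I/O error, dev sdg, sector 10485752
"""

SIZE_RE = re.compile(r"SCSI device (\w+): (\d+) 512-byte hdwr sectors")
ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarise(log_text):
    """Return ({device: total_sectors}, Counter{(device, sector): error_count})."""
    sizes = {}
    errors = Counter()
    for line in log_text.splitlines():
        size_match = SIZE_RE.search(line)
        if size_match:
            sizes[size_match.group(1)] = int(size_match.group(2))
            continue
        error_match = ERROR_RE.search(line)
        if error_match:
            errors[(error_match.group(1), int(error_match.group(2)))] += 1
    return sizes, errors


def describe(sizes, errors):
    """Return rows of (device, sector, sectors_from_end, error_count), sorted."""
    rows = []
    for (dev, sector), count in sorted(errors.items()):
        total = sizes.get(dev)
        from_end = None if total is None else total - sector
        rows.append((dev, sector, from_end, count))
    return rows


if __name__ == "__main__":
    sizes, errors = summarise(SAMPLE_LOG)
    for dev, sector, from_end, count in describe(sizes, errors):
        print(f"{dev}: sector {sector}, {from_end} sectors from end, {count} error(s)")
```

If every failing sector sits at the very start or very end of the device, the problem is more likely to be with the path to the LUN than with the data on it.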
Thanks for looking over the log John.
That log is me trying to add just the one volume, although other attempts at adding different volumes failed in the same manner.
Not sure if I mentioned it in my original post, but the volume that failed to add to the ESX 4.0 host with the one HBA card added just fine to the ESXi 4.1 host that had dual HBA cards, one connected to each switch. I'm at a loss, and tempted to install another HBA into the ESX 4.0 host just to see if that fixes it.
Cheers
David
Assuming the SAN is cabled as per your diagram, i.e. the switch the ESX 4.0 host is connected to is also connected to both controller A and controller B, it would suggest there is an issue with the zoning.
This is the typical symptom when the LUN is seen only via the controller NOT owning the LUN on IBM DS3000 and DS4000 Storage Systems.
Can you post the zoning configuration from the switch? Assuming it is an IBM B-series switch, can you post the output from switchshow and zoneshow and let me know which ports the DS3500 is connected to and which port the ESX 4.0 host is connected to?
All the help is much appreciated. switchshow and zoneshow as requested.
Once again, thank you.
Cheers
David
IBM_2498_24E:admin> switchshow
switchName: IBM_2498_24E
switchType: 71.2
switchState: Online
switchMode: Native
switchRole: Principal
switchDomain: 1
switchId: fffc01
switchWwn: 10:00:00:05:33:4e:21:33
zoning: ON (TEMP_Citrix_2_SAN)
switchBeacon: OFF
Index Port Address Media Speed State Proto
==============================================
0 0 010000 id N8 Online FC F-Port 10:00:00:05:1e:fb:d2:92
1 1 010100 id N8 Online FC F-Port 10:00:00:05:1e:fb:ce:f1
2 2 010200 id N8 Online FC F-Port 10:00:00:05:1e:fb:d2:96
3 3 010300 id N8 Online FC F-Port 20:54:00:80:e5:24:1e:12
4 4 010400 id N8 Online FC F-Port 20:55:00:80:e5:24:1e:12
5 5 010500 id N8 Online FC F-Port 10:00:00:05:1e:fb:ce:ee
6 6 010600 -- N8 No_Module FC
7 7 010700 -- N8 No_Module FC
Defined configuration:
cfg: TEMPESXi_Citrix_2_SAN
TEMPESX01_HBA03; TEMPESXi1_hba3; TEMPESXi2_hba3; TEMPESXi3_hba3
zone: TEMPESX01_HBA03
TEMPESX01_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
zone: TEMPESXi1_hba3
TEMPESXi1_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
zone: TEMPESXi2_hba3
TEMPESXi2_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
zone: TEMPESXi3_hba3
TEMPESXi3_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
alias: TEMPESX01_vmhba3
10:00:00:05:1e:fb:ce:ee
alias: TEMPESXi1_vmhba3
10:00:00:05:1e:fb:d2:92
alias: TEMPESXi2_vmhba3
10:00:00:05:1e:fb:ce:f1
alias: TEMPESXi3_vmhba3
10:00:00:05:1e:fb:d2:96
alias: TEMPSANA_hba5
20:54:00:80:e5:24:1e:12
alias: TEMPSANB_hba5
20:55:00:80:e5:24:1e:12
Effective configuration:
cfg: TEMPESXi_Citrix_2_SAN
zone: TEMPESX01_HBA03
10:00:00:05:1e:fb:ce:ee
20:54:00:80:e5:24:1e:12
20:55:00:80:e5:24:1e:12
zone: TEMPESXi1_hba3
10:00:00:05:1e:fb:d2:92
20:54:00:80:e5:24:1e:12
20:55:00:80:e5:24:1e:12
zone: TEMPESXi2_hba3
10:00:00:05:1e:fb:ce:f1
20:54:00:80:e5:24:1e:12
20:55:00:80:e5:24:1e:12
zone: TEMPESXi3_hba3
10:00:00:05:1e:fb:d2:96
20:54:00:80:e5:24:1e:12
20:55:00:80:e5:24:1e:12
Hmmm, the zoning looks good.
When you map a LUN on controller A to the ESX 4.0 server do you see two paths to it or only 1?
If you post the Storage Subsystem Profile from the DS3500 I will take a look at that for you also to see if I can spot the problem.
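For anyone wanting to repeat the zoning check mechanically, here is a rough sketch that encodes the switchshow and zoneshow output posted above and confirms that each effective zone holds exactly one host WWPN plus both DS3500 ports, and that every zoned WWPN is logged in on this switch. The WWPNs are copied from the output; the mapping of the two `20:5x` ports to controllers A and B is an assumption based on the alias names TEMPSANA_hba5 and TEMPSANB_hba5, and `check_zones` is a made-up helper name.

```python
# WWPNs logged in on each online F-Port, copied from the switchshow output.
SWITCH_PORTS = {
    0: "10:00:00:05:1e:fb:d2:92",
    1: "10:00:00:05:1e:fb:ce:f1",
    2: "10:00:00:05:1e:fb:d2:96",
    3: "20:54:00:80:e5:24:1e:12",
    4: "20:55:00:80:e5:24:1e:12",
    5: "10:00:00:05:1e:fb:ce:ee",
}

# Effective configuration, copied from the zoneshow output.
EFFECTIVE_ZONES = {
    "TEMPESX01_HBA03": ["10:00:00:05:1e:fb:ce:ee",
                        "20:54:00:80:e5:24:1e:12", "20:55:00:80:e5:24:1e:12"],
    "TEMPESXi1_hba3": ["10:00:00:05:1e:fb:d2:92",
                       "20:54:00:80:e5:24:1e:12", "20:55:00:80:e5:24:1e:12"],
    "TEMPESXi2_hba3": ["10:00:00:05:1e:fb:ce:f1",
                       "20:54:00:80:e5:24:1e:12", "20:55:00:80:e5:24:1e:12"],
    "TEMPESXi3_hba3": ["10:00:00:05:1e:fb:d2:96",
                       "20:54:00:80:e5:24:1e:12", "20:55:00:80:e5:24:1e:12"],
}

# Assumption: controller letters inferred from the alias names only.
CONTROLLER_PORTS = {
    "A": "20:54:00:80:e5:24:1e:12",
    "B": "20:55:00:80:e5:24:1e:12",
}


def check_zones(zones, logged_in, controllers):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    controller_wwpns = set(controllers.values())
    for zone, members in zones.items():
        # Every zone should include a port on each controller.
        for name, wwpn in controllers.items():
            if wwpn not in members:
                problems.append(f"{zone}: controller {name} port {wwpn} is not zoned")
        # Every zoned WWPN should actually be logged in on this switch.
        for wwpn in members:
            if wwpn not in logged_in:
                problems.append(f"{zone}: {wwpn} is not logged in on this switch")
        # Single-initiator zoning: exactly one non-controller member per zone.
        initiators = [w for w in members if w not in controller_wwpns]
        if len(initiators) != 1:
            problems.append(f"{zone}: expected 1 host WWPN, found {len(initiators)}")
    return problems


if __name__ == "__main__":
    issues = check_zones(EFFECTIVE_ZONES, set(SWITCH_PORTS.values()), CONTROLLER_PORTS)
    print(issues if issues else "No zoning problems found")
```

A clean result here only says the fabric side is consistent; it says nothing about host mappings on the DS3500 or the path policy on the ESX host.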
I can't see anything wrong.
Do you have a spare SFP to put in port 6 or 7 of the fibre switch the ESX 4.0 host is connected to? If so, cable it to one of the spare FC host ports on Controller B of the DS3500, e.g. port 6, and then change the zoning so that the ESX 4.0 host can see this port on the DS3500. Does that let you use Logical Drives owned by controller B?
If that does not work and you can put the ESX 4.0 host into maintenance mode, could you try cabling it to the other fibre channel switch, doing the zoning on that switch, and seeing if that gets it working?
Thanking you for your assistance. I will give your suggestions a go and report back when possible.
Cheers
David
Just an update for those that are interested.
It turns out the path selection policy was set to Fixed. I changed it to Most Recently Used and all is now working.
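For readers wondering why the policy would matter, below is a deliberately simplified toy model, not VMware's actual path selection code. It assumes two things that the thread suggests but does not prove: that each logical drive is served by one owning controller at a time, and that I/O sent to the non-owning controller fails. The path names and the helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str          # hypothetical label, not a real ESX runtime name
    controller: str    # "A" or "B": the array controller this path ends on
    link_up: bool = True


def select_fixed(paths, preferred_name):
    """Toy Fixed policy: use the preferred path whenever its link is up,
    without regard to which controller owns the LUN."""
    for path in paths:
        if path.name == preferred_name and path.link_up:
            return path
    return next((p for p in paths if p.link_up), None)


def select_mru(paths, lun_owner, last_used=None):
    """Toy MRU policy: keep the last used path while it still reaches the
    owning controller; otherwise move to a path that does."""
    if last_used is not None and last_used.link_up and last_used.controller == lun_owner:
        return last_used
    return next((p for p in paths if p.link_up and p.controller == lun_owner), None)


def io_succeeds(path, lun_owner):
    """Assumption: only the owning controller services I/O for the LUN."""
    return path is not None and path.controller == lun_owner


if __name__ == "__main__":
    paths = [Path("path_to_A", "A"), Path("path_to_B", "B")]
    for owner in ("A", "B"):
        fixed = select_fixed(paths, preferred_name="path_to_A")
        mru = select_mru(paths, lun_owner=owner)
        print(f"LUN owned by {owner}: Fixed ok={io_succeeds(fixed, owner)}, "
              f"MRU ok={io_succeeds(mru, owner)}")
```

In this model, a Fixed policy whose preferred path ends on controller A works for LUNs owned by A and fails for LUNs owned by B, while MRU works for both, which mirrors the symptom described above. Whether that is exactly what happened on the ESX 4.0 host is a guess.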