doggatas
Enthusiast

Adding VMFS datastore

I have an issue in my testing environment, where I have three ESXi 4.1 Update 2 hosts, each with two single-port fibre cards installed, and one ESX 4.0 host with one single-port fibre card installed, connected to the SAN in the manner outlined in the graphic.

The three ESXi 4.1 hosts are connected in the same manner as the 4.0 host, but their extra HBA is connected to the other SAN switch. I hope this makes sense; please let me know if it doesn’t!

When I add a LUN as a VMFS datastore to any of the ESXi 4.1 hosts, it adds without problems regardless of which controller the LUN uses as its preferred path. The SAN, by the way, is a DS3500 with dual controllers, A and B.

There are two HOST groups created on the SAN, 1 for the 3 ESXi 4.1 hosts and 1 with the single ESX4.0 host.

My problem is adding a VMFS datastore to the ESX 4.0 host. When the LUN uses controller B as its path, datastore creation fails. The host can see the LUN, and it is available for selection in the Add Storage wizard, but the wizard is slow and eventually fails at the creating VMFS partition step.

If however the LUN path is via controller A the add storage wizard is snappy and the Datastore is created successfully.

I’m pretty sure my zoning is correct because the HOST can actually see the LUN, it just errors out when adding as a Datastore.

Is the problem that the ESX 4.0 host has only one HBA? I’m not so sure about this, because I have another environment where this is not an issue.

Cheers

David

11 Replies
john23
Commander

Can you check /var/log/vmkernel on the ESX 4.0 host?

Thanks -A Read my blogs: www.openwriteup.com
doggatas
Enthusiast

Hi John

I have copied the message log after attempting to add a datastore using a LUN on controller B. There are a lot of errors.

Cheers

David

Dec 12 09:09:08 TEMP kernel: [313066.903613]   Vendor: IBM       Model: 1746      FAStT   Rev: 1070

Dec 12 09:09:08 TEMP kernel: [313066.910841]   Type:   Direct-Access                      ANSI SCSI revision: 05

Dec 12 09:09:08 TEMP kernel: [313066.946492] SCSI device sdg: 10485760 512-byte hdwr sectors (5369 MB)

Dec 12 09:09:08 TEMP kernel: [313066.954043] sdg: Write Protect is off

Dec 12 09:09:08 TEMP kernel: [313066.962047] SCSI device sdg: drive cache: write back w/ FUA

Dec 12 09:09:08 TEMP kernel: [313066.969731] SCSI device sdg: 10485760 512-byte hdwr sectors (5369 MB)

Dec 12 09:09:08 TEMP kernel: [313066.977325] sdg: Write Protect is off

Dec 12 09:09:08 TEMP kernel: [313066.985083] SCSI device sdg: drive cache: write back w/ FUA

Dec 12 09:09:08 TEMP kernel: [313066.992273]  sdg:end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:08 TEMP kernel: [313067.503176] printk: 11 messages suppressed.

Dec 12 09:09:08 TEMP kernel: [313067.510341] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:09 TEMP kernel: [313068.019626] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:09 TEMP kernel: [313068.026825] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:09 TEMP kernel: [313068.536162] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:09 TEMP kernel: [313068.543383] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:10 TEMP kernel: [313069.052752] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:10 TEMP kernel: [313069.059984] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:10 TEMP kernel: [313069.569496] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:10 TEMP kernel: [313069.576682] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:11 TEMP kernel: [313070.085851] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:11 TEMP kernel: [313070.093015] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:11 TEMP kernel: [313070.100284]  unable to read partition table

Dec 12 09:09:11 TEMP kernel: [313070.107706] sd 3:0:8:0: Attached scsi disk sdg

Dec 12 09:09:11 TEMP kernel: [313070.114958] sd 3:0:8:0: Attached scsi generic sg9 type 0

Dec 12 09:09:12 TEMP kernel: [313071.152288] end_request: I/O error, dev sdg, sector 10485632

Dec 12 09:09:12 TEMP kernel: [313071.159476] Buffer I/O error on device sdg, logical block 1310704

Dec 12 09:09:12 TEMP kernel: [313071.668848] end_request: I/O error, dev sdg, sector 10485632

Dec 12 09:09:12 TEMP kernel: [313071.676014] Buffer I/O error on device sdg, logical block 1310704

Dec 12 09:09:13 TEMP kernel: [313072.185378] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:13 TEMP kernel: [313072.192598] Buffer I/O error on device sdg, logical block 1310719

Dec 12 09:09:13 TEMP kernel: [313072.701984] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:13 TEMP kernel: [313072.709206] Buffer I/O error on device sdg, logical block 1310719

Dec 12 09:09:14 TEMP kernel: [313073.218512] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:14 TEMP kernel: [313073.225714] Buffer I/O error on device sdg, logical block 1310719

Dec 12 09:09:14 TEMP kernel: [313073.735050] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:15 TEMP kernel: [313074.251620] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:15 TEMP sfcb[4082]: storelib Physical Device Device ID           : 0x9

Dec 12 09:09:15 TEMP last message repeated 20 times

Dec 12 09:09:15 TEMP kernel: [313074.768183] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:16 TEMP kernel: [313075.284809] end_request: I/O error, dev sdg, sector 10485696

Dec 12 09:09:16 TEMP kernel: [313075.801274] end_request: I/O error, dev sdg, sector 10485744

Dec 12 09:09:17 TEMP kernel: [313076.317832] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:17 TEMP kernel: [313076.834411] end_request: I/O error, dev sdg, sector 10485752

Dec 12 09:09:18 TEMP kernel: [313077.350938] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:19 TEMP kernel: [313077.867524] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:19 TEMP kernel: [313077.874752] printk: 8 messages suppressed.

Dec 12 09:09:19 TEMP kernel: [313077.881921] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:19 TEMP kernel: [313078.400820] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:20 TEMP kernel: [313078.917292] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:20 TEMP kernel: [313079.433835] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:21 TEMP kernel: [313079.950409] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:21 TEMP kernel: [313080.466961] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:22 TEMP kernel: [313080.983517] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:22 TEMP kernel: [313081.500061] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:23 TEMP kernel: [313082.016644] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:23 TEMP kernel: [313082.533170] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:23 TEMP kernel: [313082.540378] printk: 8 messages suppressed.

Dec 12 09:09:23 TEMP kernel: [313082.547556] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:24 TEMP kernel: [313083.066397] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:24 TEMP kernel: [313083.582944] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:25 TEMP kernel: [313084.099584] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:25 TEMP kernel: [313084.616080] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:26 TEMP kernel: [313085.132638] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:26 TEMP kernel: [313085.649177] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:27 TEMP kernel: [313086.165750] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:27 TEMP kernel: [313086.682332] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:28 TEMP kernel: [313087.194647] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:28 TEMP kernel: [313087.715422] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:28 TEMP kernel: [313087.722586] printk: 9 messages suppressed.

Dec 12 09:09:28 TEMP kernel: [313087.729790] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:29 TEMP kernel: [313088.248635] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:29 TEMP kernel: [313088.765189] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:30 TEMP kernel: [313089.281739] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:30 TEMP kernel: [313089.798443] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:31 TEMP kernel: [313090.314866] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:32 TEMP kernel: [313090.831416] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:32 TEMP kernel: [313091.347979] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:33 TEMP kernel: [313091.981157] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:33 TEMP kernel: [313092.497799] end_request: I/O error, dev sdg, sector 0

Dec 12 09:09:33 TEMP kernel: [313092.504990] printk: 11 messages suppressed.

Dec 12 09:09:33 TEMP kernel: [313092.512178] Buffer I/O error on device sdg, logical block 0

Dec 12 09:09:59 TEMP sfcb[4082]: storelib Physical Device Device ID           : 0x9

Dec 12 09:10:45 TEMP last message repeated 42 times

End of log reached.
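
A rough way to summarise a log like the one above is to count the I/O errors per device and sector. The following is only a sketch under stated assumptions: the function name `summarize_io_errors` and the regular expression are hypothetical, written against the line format shown above, and the sample input is three lines copied from the log.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... end_request: I/O error, dev sdg, sector 10485752"
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize_io_errors(log_text):
    """Count I/O errors per (device, sector) pair in a kernel log excerpt."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            device, sector = match.group(1), int(match.group(2))
            counts[(device, sector)] += 1
    return counts


# Three lines copied from the log above, used as sample input.
sample = """\
Dec 12 09:09:08 TEMP kernel: [313066.992273]  sdg:end_request: I/O error, dev sdg, sector 0
Dec 12 09:09:12 TEMP kernel: [313071.152288] end_request: I/O error, dev sdg, sector 10485632
Dec 12 09:09:13 TEMP kernel: [313072.185378] end_request: I/O error, dev sdg, sector 10485752
"""

for (device, sector), count in sorted(summarize_io_errors(sample).items()):
    print(device, sector, count)
```

Run over the full log, a tally like this would show every failed request landing on sdg, at sector 0 and at sectors close to the end of the 10485760-sector device.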

john23
Commander

I am not able to relate the log messages to a specific cause; they only refer to /dev/sdg.

Is it the same volume that you are trying to add?

doggatas
Enthusiast

Thanks for looking over the log John.

That log is me trying to add just the one volume, although other attempts at adding different volumes had failed in the same manner.

Not sure if I mentioned it in my original post, but the volume that failed to add to the ESX 4.0 host with the one HBA card added just fine to the ESXi 4.1 host that had dual HBA cards, one connected to each switch. I'm at a loss, and tempted to install another HBA into the ESX 4.0 host just to see if that fixes it.

Cheers

David

cjscol
Expert

Assuming the SAN is cabled as per your diagram, i.e. the switch the ESX 4.0 host is connected to is also connected to both controller A and controller B, it would suggest there is an issue with the zoning.

These are the typical symptoms when the LUN is seen only via the controller NOT owning the LUN on IBM DS3000 and DS4000 Storage Systems.

Can you post the zoning configuration from the switch? Assuming it is an IBM B-series switch, can you post the output from switchshow and zoneshow, and let me know which ports the DS3500 is connected to and which port the ESX 4.0 host is connected to?

Calvin Scoltock VCP 2.5, 3.5, 4, 5 & 6 VCAP5-DCD VCAP5-DCA http://pelicanohintsandtips.wordpress.com/blog LinkedIn: https://www.linkedin.com/in/cscoltock
doggatas
Enthusiast

All the help is much appreciated. switchshow and zoneshow as requested.

  • ESX4.0 host connected on Port 5
  • SAN Controller A connected to port 3
  • SAN Controller B connected to port 4

Once again, thank you.

Cheers

David

IBM_2498_24E:admin> switchshow
switchName:     IBM_2498_24E
switchType:     71.2
switchState:    Online
switchMode:     Native
switchRole:     Principal
switchDomain:   1
switchId:       fffc01
switchWwn:      10:00:00:05:33:4e:21:33
zoning:         ON (TEMP_Citrix_2_SAN)
switchBeacon:   OFF

Index Port Address Media Speed State     Proto
==============================================
  0   0   010000   id    N8   Online      FC  F-Port  10:00:00:05:1e:fb:d2:92
  1   1   010100   id    N8   Online      FC  F-Port  10:00:00:05:1e:fb:ce:f1
  2   2   010200   id    N8   Online      FC  F-Port  10:00:00:05:1e:fb:d2:96
  3   3   010300   id    N8   Online      FC  F-Port  20:54:00:80:e5:24:1e:12
  4   4   010400   id    N8   Online      FC  F-Port  20:55:00:80:e5:24:1e:12
  5   5   010500   id    N8   Online      FC  F-Port  10:00:00:05:1e:fb:ce:ee
  6   6   010600   --    N8   No_Module   FC
  7   7   010700   --    N8   No_Module   FC


Defined configuration:
cfg:   TEMPESXi_Citrix_2_SAN
                TEMPESX01_HBA03; TEMPESXi1_hba3; TEMPESXi2_hba3; TEMPESXi3_hba3
zone:  TEMPESX01_HBA03
                TEMPESX01_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
zone:  TEMPESXi1_hba3
                TEMPESXi1_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
zone:  TEMPESXi2_hba3
                TEMPESXi2_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
zone:  TEMPESXi3_hba3
                TEMPESXi3_vmhba3; TEMPSANA_hba5; TEMPSANB_hba5
alias: TEMPESX01_vmhba3
                10:00:00:05:1e:fb:ce:ee
alias: TEMPESXi1_vmhba3
                10:00:00:05:1e:fb:d2:92
alias: TEMPESXi2_vmhba3
                10:00:00:05:1e:fb:ce:f1
alias: TEMPESXi3_vmhba3
                10:00:00:05:1e:fb:d2:96
alias: TEMPSANA_hba5
                20:54:00:80:e5:24:1e:12
alias: TEMPSANB_hba5
                20:55:00:80:e5:24:1e:12

Effective configuration:
cfg:   TEMPESXi_Citrix_2_SAN
zone:  TEMPESX01_HBA03
                10:00:00:05:1e:fb:ce:ee
                20:54:00:80:e5:24:1e:12
                20:55:00:80:e5:24:1e:12
zone:  TEMPESXi1_hba3
                10:00:00:05:1e:fb:d2:92
                20:54:00:80:e5:24:1e:12
                20:55:00:80:e5:24:1e:12
zone:  TEMPESXi2_hba3
                10:00:00:05:1e:fb:ce:f1
                20:54:00:80:e5:24:1e:12
                20:55:00:80:e5:24:1e:12
zone:  TEMPESXi3_hba3
                10:00:00:05:1e:fb:d2:96
                20:54:00:80:e5:24:1e:12
                20:55:00:80:e5:24:1e:12
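
The effective configuration above can also be checked mechanically. The following is a minimal sketch, assuming the zoneshow layout shown in this post (a `zone:` line followed by one WWPN per line); the helper names `parse_effective_zones` and `zones_with_all` are hypothetical. It reports which zones contain the ESX 4.0 host's WWPN together with both DS3500 controller ports.

```python
def parse_effective_zones(zoneshow_text):
    """Parse the 'Effective configuration' part of zoneshow output into
    a dict mapping zone name -> set of member WWPNs (lower-cased)."""
    zones = {}
    current = None
    in_effective = False
    for raw in zoneshow_text.splitlines():
        line = raw.strip()
        if line.startswith("Effective configuration"):
            in_effective = True
            continue
        if not in_effective or not line:
            continue
        if line.startswith("zone:"):
            current = line.split(":", 1)[1].strip()
            zones[current] = set()
        elif line.startswith("cfg:"):
            current = None
        elif current is not None:
            zones[current].add(line.lower())
    return zones


def zones_with_all(zones, initiator, targets):
    """Return names of zones containing the initiator and every target WWPN."""
    wanted = {initiator.lower()} | {t.lower() for t in targets}
    return [name for name, members in zones.items() if wanted <= members]


# Excerpt copied from the effective configuration above.
sample = """\
Effective configuration:
cfg:   TEMPESXi_Citrix_2_SAN
zone:  TEMPESX01_HBA03
                10:00:00:05:1e:fb:ce:ee
                20:54:00:80:e5:24:1e:12
                20:55:00:80:e5:24:1e:12
"""

esx40_wwpn = "10:00:00:05:1e:fb:ce:ee"       # switch port 5
controller_ports = ["20:54:00:80:e5:24:1e:12",  # controller A, port 3
                    "20:55:00:80:e5:24:1e:12"]  # controller B, port 4

print(zones_with_all(parse_effective_zones(sample), esx40_wwpn, controller_ports))
```

A non-empty result means the host's HBA is zoned with both controller ports, at least on this switch.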

cjscol
Expert

Hmmm, the zoning looks good.

When you map a LUN on controller A to the ESX 4.0 server, do you see two paths to it or only one?

If you post the Storage Subsystem Profile from the DS3500 I will take a look at that for you also to see if I can spot the problem.

doggatas
Enthusiast

ESX is telling me there are 2 paths.

I have attached the storage profile as a txt file, as it is far too much text to paste here.

Thanks

David

cjscol
Expert

I can't see anything wrong.

Do you have a spare SFP to put in port 6 or 7 of the fibre switch the ESX 4.0 host is connected to? If so, cable it to one of the spare FC host ports on Controller B of the DS3500, e.g. port 6, and then change the zoning so that the ESX 4.0 host can see this port on the DS3500. Does that let you use Logical Drives owned by controller B?

If that does not work and you can put the ESX 4.0 host into maintenance mode, could you try cabling it to the other fibre channel switch, doing the zoning on that switch, and seeing if that gets it working?

doggatas
Enthusiast

Thank you for your assistance. I will give your suggestions a go and report back when possible.

Cheers

David

doggatas
Enthusiast

Just an update for those that are interested.

It turns out the path selection policy on the ESX 4.0 host was set to Fixed. I changed it to Most Recently Used (MRU) and all is now working.
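
For anyone auditing hosts for the same misconfiguration, the check can be sketched as follows. This is illustrative only. It assumes the DS3500 behaves as an active/passive array from the host's point of view (each LUN is owned by one controller at a time), and it encodes VMware's general guidance that MRU is the usual policy for active/passive arrays and Fixed for active/active ones. The names `RECOMMENDED_POLICY` and `check_path_policy` are hypothetical.

```python
# Usual path selection policy per array type, following VMware's general
# SAN guidance. The keys and names here are illustrative assumptions.
RECOMMENDED_POLICY = {
    "active-active": "Fixed",
    "active-passive": "MRU",
}


def check_path_policy(array_type, configured_policy):
    """Return (ok, message) for a host's path policy on a given array type."""
    recommended = RECOMMENDED_POLICY.get(array_type)
    if recommended is None:
        return False, f"unknown array type: {array_type}"
    if configured_policy == recommended:
        return True, f"{configured_policy} is the usual choice for {array_type} arrays"
    return False, (f"{configured_policy} configured; {recommended} is the usual "
                   f"choice for {array_type} arrays")


# The ESX 4.0 host before and after the change described above.
print(check_path_policy("active-passive", "Fixed"))
print(check_path_policy("active-passive", "MRU"))
```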
