I'm currently trying to share out disks via iSCSI from Solaris. I googled around on this problem and came across several discussions around June about a flaw in the implementation that seems to match what I'm seeing.
I had originally created only a single volume, and it worked fine. After I created several others (using shareiscsi=on) I scanned again and saw that all the targets were found, but they all appeared to be clones of the original target, and it wouldn't let me add them as VMFS volumes.
I tried creating a new target with multiple LUNs via iscsitadm, but it only showed up as having a single LUN - again, the same clone.
I have another target on the same server that is being accessed fine by a Win2008 box; it can see all the different targets for what they are.
From what I understood, the problem was related to the GUIDs being the same, but I can see that the GUIDs are different on all the targets.
According to all the posts I read, this problem was resolved in OpenSolaris a while ago, but maybe it's still in the standard Solaris 10 core?
Help would be appreciated. Thanks a lot!
I think you will like OpenSolaris better than Solaris and it's supported by Sun with a contract.
It works and it will always have all the newest enhancements months before Solaris does.
izfs# uname -a
SunOS izfs 5.10 Generic_137138-09 i86pc i386 i86pc
If there's anything else I can get you that would help, let me know. Thanks!
edit: Oh, U6 is on the CD.
Ok,
You're using Update 6, which includes the Network Address Authority (NAA) functions, so it will work.
Do not use shareiscsi=on
Just use the following to create a target.
iscsitadm create target -u 0 -b /dev/zvol/rdsk/rp1/iscsi/lun0 ss1-zrp1
iscsitadm create target -u 1 -b /dev/zvol/rdsk/rp1/iscsi/lun1 ss1-zrp1
iscsitadm create target -u 2 -b /dev/zvol/rdsk/rp1/iscsi/lun2 ss1-zrp1
iscsitadm create target -u 3 -b /dev/zvol/rdsk/rp1/iscsi/lun3 ss1-zrp1
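If a zvol was previously shared with shareiscsi, switch that property off before defining the target with iscsitadm, so the automatic per-zvol target goes away. A minimal sketch, with a hypothetical dataset name:

```shell
# Hypothetical dataset name; disabling the property removes the
# automatically created per-zvol iSCSI target.
zfs set shareiscsi=off rp1/iscsi/lun0
```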
What normally causes this issue is the signaturing of the volume. When VMware sees the iqn and you signature the LUN (i.e. add the volume), it initializes the iscsitgt NAA function and creates a GUID for the LUN in the iscsitgt service manifest. If the non-NAA iscsitgt was used, the volume header would be signatured the same across multiple LUNs, which VMware treats as the same LUN over multiple targets.
Are you able to remove the previously created backing stores (zfs destroy) and reset the service manifest (clear it and import the defaults)?
Also the ESXi server will store the previous target bindings and it can be confusing if these are not cleaned up.
Have a look at my blog for more details
FYI I'm using OpenSolaris instead of Solaris 10 U6
Regards,
Mike
I'd prefer not to destroy two of the zvols - one of them never has to be touched by the VMware hosts, though. Is it possible to keep the data on those volumes, or do I need to figure out a way to copy the data to new volumes?
How would I go about clearing/importing the service manifest?
Thanks!
If you need the data, just create a new set of zvols for this work. Be sure to thin-provision them.
Leave the old ones there; you can swap the backing store into a new LUN definition once you have it working. Simply destroy one of the new zvols and rename the old one to its name while the iscsitgt service is offline.
svcadm disable iscsitgt
Use
svccfg export -a iscsitgt > /export/home/iscsitgt-manifest-backup.xml
before clearing the config and then
svccfg delete iscsitgt
svccfg import /var/svc/manifest/system/iscsi_target.xml
To clean up ESXi's bindings you need to SSH to it; if you don't have that enabled, just Google it - there are tons of how-tos.
then disable the iSCSI software adapter and
rm /var/lib/iscsi/vmkbindings
then reboot it and re-enable the iSCSI software adapter.
Thanks for the instructions.
Before I go off and do all that, though... is there a way I can tell that this will for sure work?
I ask because I did try to create a separate target with multiple LUNs before and it didn't work, and I don't get how this issue would affect a NEW target, since it shouldn't be signatured yet.
Also, your post implies that the problem is caused by having run a previous ("bad") version of iscsitgtd that doesn't apply the right signatures. I have only ever run U6.
Is it also caused by shareiscsi=on, then? Or could there be another cause?
edit: I just destroyed all the additional volumes, disabled shareiscsi, and then created a new volume and target:
izfs# zfs create -sV 250g iscsi/sandbox
izfs# iscsitadm create target -b /dev/zvol/rdsk/iscsi/sandbox vmdatastores
izfs# iscsitadm list target -v
Target: vmdatastores
iSCSI Name: iqn.1986-03.com.sun:02:08173827-ef4b-cab2-90f6-9339dd0a9027.vmdatastores
Connections: 0
ACL list:
TPGT list:
LUN information:
LUN: 0
GUID: 0
VID: SUN
PID: SOLARIS
Type: disk
Size: 250G
Backing store: /dev/zvol/rdsk/iscsi/sandbox
Status: online
so the GUID starts out as 0, then I refresh HBA on the ESXi host and ... the target disappears?
I recreated it, it gets signatured (new GUID), and then shortly afterward it disappears. What?
edit 2: Yeah, it seems like as soon as anything connects to it, it disappears shortly afterward - did the same thing with a Win2k8 box connecting to it.
Clear it again but do the following in this order.
Make sure you clear the w2k8 initiator connection. It will mess things up if it's trying to connect.
1. Remove the target ip from the ESXi config and reboot the ESX server.
2. Clear the iscsitgt manifest as follows
svccfg delete iscsitgt
svccfg import /var/svc/manifest/system/iscsi_target.xml
svcadm refresh iscsitgt
svcadm restart iscsitgt
iscsitadm list target
It should have none listed now. If you see any, then you will also need to clear the file-based persistent DB files:
rm -R /etc/iscsi/*
svcadm restart iscsitgt
iscsitadm create target -u 0 -b /dev/zvol/rdsk/iscsi/sandbox vmdatastores
3. Add the target back to the initiator and then rescan just the vmhba32 adapter
Do you get the target iqn listed in the VIC GUI?
When you say it disappears, where exactly? The ESXi side?
Do you have more than 1 IP address on the Solaris box?
Can you ssh to the ESXi host and list the vmkbindings file contents?
I'm experiencing the same issue as Patrick.
The iscsi target is a sun x4500, running Solaris 10 (SunOS 5.10 Generic_138889-02). ESXi is running on a dell 1950. The thumper has a couple other multi-terabyte iscsi shares on it being accessed by some win2k8 machines. I can turn off the iscsi access for them temporarily but have to save the data on them.
The win2k8 iscsi shares were originally created with
zfs create -V 1000GB -o shareiscsi=on zpool/sharename
but that didn't work with ESXi, so I used Mike's suggestion of creating the LUNs with (there are LUNs 0-4, but this shows LUN 0)
zfs create -s -b 64K -V 750G zpool/vmshare/lun0
and the target with
iscsitadm create target -u 0 -b /dev/zvol/rdsk/zpool/vmshare/lun0 vmshare
I disabled the initiator in ESXi (storage adapters -> vmhba32 properties -> configure) and rebooted it.
Re-enabled the initiator and did a scan... and nothing appears.
iscsitadm list target on the thumper shows two additional connections to all the targets available (the two win2k8 targets and the ESXi target)
ssh to ESXi machine, vmkiscsi-tool -T vmhba32 shows six targets, two connections each to the two win2k8 targets and the ESXi target
The windows iscsi initiators see all the targets just fine, etc. Is the next step to reset the iscsitgt service manifest on the x4500? Can I reset the service manifest without losing data on the two other win2k8 targets?
cheers!
Hi,
You will need to create some initiator definitions to control access to your targets.
Like the following
iscsitadm create initiator --iqn iqn.2000-04.com.qlogic:qla4050c.esx1 esx1-initiator1
iscsitadm create initiator --iqn iqn.2000-04.com.qlogic:qla4050c.esx1 esx1-initiator2
iscsitadm modify target --acl esx1-initiator1 ss1-target1
iscsitadm modify target --acl esx1-initiator2 ss1-target1
This will only allow the initiators named esx1-initiator1 & 2 to connect to the target alias named ss1-target1.
This is important to prevent corruption and unintentional access.
When I first was testing Solaris I also used an MS-based initiator and found that if it was connected to, or had even discovered, the LUN, it would prevent VMware's iSCSI software initiator from accessing the LUN correctly.
Oh. Almost missed your more important question on the data side.
Clearing the manifest does not destroy data. You need to be sure you have the manifest backed up. But in this case I don't think you need to do that part.
Just define the ACLs and see where that takes you.
I suspect that since your patch level is very current that it will be the answer to some of the issues in this thread.
Okay, I created initiator definitions for the two win2k8 machines and the ESXi machine, and configured ACLs to limit each target to its respective machine.
iscsitadm list target now shows one connection from each of the 2k8 machines (correct) but two connections from the ESXi machine.
disabled iscsi software initiator via the ESXi management app, and moved * from /var/lib/iscsi before rebooting the ESXi machine.
The ESXi machine shows two LUNs, both with LUN ID 0 (there are 5 LUNs, IDs 0-4).
If I go to add storage, I only see one LUN, LUN 0.
Sooo... I'm wondering why ESXi is making two connections. I do have two of the interfaces on the x4500 aggregated together, but I don't really see how that would be relevant (or even noticeable) to the ESXi initiator.
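If the aggregated interfaces are presenting two portal addresses, one way to pin the target to a single portal is a target portal group (TPGT). This is only a sketch; the portal IP and target alias here are hypothetical, so substitute your own:

```shell
# Hypothetical portal IP and target alias: restrict target "vmshare"
# to a single portal so initiators see only one path.
iscsitadm create tpgt 1
iscsitadm modify tpgt -i 192.168.1.10 1   # portal address for group 1
iscsitadm modify target -p 1 vmshare      # target advertises only TPGT 1
```

You can verify the binding afterward with `iscsitadm list tpgt -v` and `iscsitadm list target -v`.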
svccfg export -a iscsitgt results in a syntax error, but svccfg export iscsitgt works. Is that enough of a backup?
when I do
svccfg delete iscsitgt
svccfg import /var/svc/manifest/system/iscsi_target.xml
svcadm refresh iscsitgt
svcadm restart iscsitgt
iscsitadm list target
I assume I will no longer see the win2k8 targets (nor will the win2k8 machines?). What's the appropriate procedure to restore access to these targets? Sorry if this is out of scope.
Do you have more than 1 IP at the target or the initiator?
There are three ways to restore access to the pre-existing targets should you lose or delete the manifest.
The one you already know is to import the backup manifest, which would overwrite the existing one.
The second is to manually edit the active manifest. You would create a target using the same backing store as was defined before. (that info is in the backup file as plain text)
Then you use the svccfg utility to replace the automatically generated target iqn GUID with the correct value, as follows (BTW some of this is in my blog):
svccfg -s iscsitgt listprop | grep target
target_ss1-zrp1/iscsi-name astring iqn.1986-03.com.sun:02:eb9c3683-9b2d-ccf4-8ae0-85c7432f3ef6.ss1-zrp1
Using the returned property value, issue the following:
svccfg -s iscsitgt setprop target_ss1-zrp1/iscsi-name=iqn.1986-03.com.sun:02:thepreviousguid.alias_name
svcadm refresh iscsitgt
svcadm restart iscsitgt
And finally you can edit the backup file to have the correct elements, not a lot of fun but it's a method.
Oh, I missed one element on the export: it needs the >
svccfg export -a iscsitgt > /iscsitgt-backup.xml
Okay, I took some servers down and performed all these steps. I added the first one fine, but then when I add a second LUN it doesn't show up in the VIC GUI.
The vmkbindings file:
Format:
bus target iSCSI
id id TargetName
#
0 0 iqn.2006-01.com.openfiler:tsn.cfbe11328124 0
0 1 iqn.2006-01.com.openfiler:tsn.fa8b6b316e2e 0
0 2 iqn.1986-03.com.sun:02:41bb772a-3249-4589-879f-bd3f6818f70f.vmdatastores 2
I have the following from iscsitadm list target -v:
izfs# iscsitadm list target -v
Target: vmdatastores
iSCSI Name: iqn.1986-03.com.sun:02:41bb772a-3249-4589-879f-bd3f6818f70f.vmdatastores
Connections: 1
Initiator:
iSCSI Name: iqn.1998-01.com.vmware:mercury-04430842
Alias: mercury
ACL list:
TPGT list:
LUN information:
LUN: 0
GUID: 0100001635ab7ab900002a0049752de5
VID: SUN
PID: SOLARIS
Type: disk
Size: 250G
Backing store: /dev/zvol/rdsk/iscsi/sandbox
Status: online
LUN: 1
GUID: 0100001635ab7ab900002a0049752e6c
VID: SUN
PID: SOLARIS
Type: disk
Size: 900G
Backing store: /dev/zvol/rdsk/iscsi/nboaa
Status: online
So it's connecting, and I made sure that all the backing stores were new volumes - but it doesn't work; it doesn't see the second LUN.
Anything else I can try?
Thanks for the help so far. Seems like everyone uses OpenSolaris instead of the release version ... and maybe I should too, if it would solve this issue.
I went ahead and stuffed OS into a VM. After I got past the PEBKAC problem of selecting 32-bit in the VM creation, I configured an iSCSI target with 4 LUNs and pointed ESXi at it. All four showed up perfectly.
So, now I have to figure out how to migrate my Solaris 10 box to OpenSolaris...
You were also quite right about liking OS better than Solaris - it feels a lot more like what I'm used to from a *nix standpoint.
Thanks much!
Yes... related to this, right now my Solaris box is running a 5GB rpool on slice 0 and a 995GB slice that's part of the iscsi pool, on a 1TB disk mirror.
I'm doing work in a VM right now to test migration, but there's a way to do this that doesn't destroy my entire layout, I hope?
My testing seems to show that 4GB is the bare minimum for OS. The real problem is installing to an existing zfs pool without blowing the whole disk away.
I might be able to repartition one disk, then import the (degraded) pool from the other and replace the previous slice with the new partition, then repartition the other disk and mirror both sides again... it's a huge headache, though.
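That disk-at-a-time approach might be sketched like the following, assuming a two-disk mirror with hypothetical device names. Treat it as an outline rather than a tested procedure, since the pool runs unprotected during each step:

```shell
# Hypothetical devices: c1t0d0 and c1t1d0 are the mirrored 1TB disks,
# with slice 1 holding the iscsi pool. The pool has no redundancy
# between the detach and the completed resilver.
zpool detach iscsi c1t1d0s1           # break the mirror, freeing one disk
# ...repartition c1t1d0 and install OpenSolaris on its new root slice...
zpool attach iscsi c1t0d0s1 c1t1d0s1  # re-mirror onto the repartitioned disk
zpool status iscsi                    # wait for the resilver to finish
# then repeat the same steps for the other disk
```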