We are looking to implement ESXi 4.1 on UCS, booting from our CX4-120 array. Not having much experience with either SAN or ESXi, I have a few questions below that I am hoping someone can assist with.
Any advice would be greatly appreciated.
Yes, every LUN that will be used to install ESXi on (and boot from) will need its own storage group inside Navisphere.
Think of these LUNs as the physical hard drives of a server where you would normally install the OS. No machine other than the target ESXi installation should have access to them.
All of your ESXi hosts will need to be contained in a storage group for shared storage.
Thank you for taking the time to respond.
Can a Host be a member of multiple Storage Groups? I thought a Host could only belong to one Storage Group. Therefore I believed I would have to add the "shared storage" to each of the ESXi Storage Groups. I did not think that I could add the ESXi Hosts to multiple Storage Groups.
Do you have any recommendations about the size of the boot LUNs?
You will need to create one storage group for each host. A host may only be in one storage group, but a LUN can be in many. In each storage group, place the host's boot LUN and present it as HLU 0, then add all the shared volumes. Try to keep the HLU numbering consistent across hosts, particularly if using VCB.
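To make that layout concrete, here is a minimal planning sketch in Python. It does not talk to the array or to Navisphere at all; the `StorageGroup` class, the `plan_storage_groups` function, and the host and LUN names are all made up for illustration. It only shows the shape of the mapping: one group per host, the private boot LUN at HLU 0, and the shared LUNs at the same HLU in every group.

```python
from dataclasses import dataclass, field


@dataclass
class StorageGroup:
    """Planning record for one host's storage group (hypothetical structure)."""
    name: str
    host: str
    # Maps host LUN number (HLU) -> array LUN name.
    hlu_map: dict = field(default_factory=dict)


def plan_storage_groups(hosts, shared_luns):
    """Build one storage group per host.

    HLU 0 is the host's private boot LUN. Shared LUNs get the same HLU
    (1, 2, ...) in every group so numbering stays consistent across hosts.
    """
    groups = []
    for host in hosts:
        group = StorageGroup(name=f"SG_{host}", host=host)
        group.hlu_map[0] = f"boot_{host}"  # boot LUN, visible to this host only
        for hlu, lun in enumerate(shared_luns, start=1):
            group.hlu_map[hlu] = lun  # same LUN, same HLU, in every group
        groups.append(group)
    return groups


if __name__ == "__main__":
    # Host and LUN names below are placeholders.
    for sg in plan_storage_groups(["esx01", "esx02"], ["vmfs_a", "vmfs_b"]):
        print(sg.name, sg.hlu_map)
```

A plan like this can be used as a checklist when building the groups by hand, so that no host ends up with a different HLU for the same shared volume.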
As regards sizing, I tend to use 5 GB for ESXi and 10 GB for full ESX, although arguably you could go much smaller with ESXi. I have seen nothing official from any vendor.
Disk layout is a little more difficult to advise on, as it depends on what disks you have. You don't need massive amounts of I/O for the hypervisor, so if possible use a couple of 15K FC drives in a mirror with 4 or 5 hosts booting from them. I had a customer booting from SATA and it ran fine until one drive failed and the rebuild overhead crippled them, so avoid this.
Essentially, keep it as simple as possible. It is interesting that you are booting ESXi from SAN; had you considered going for the embedded USB option, thus negating this issue?
Missed the last bit: yes, you will need to manually register the host, and also make sure that at the time of installing ESXi the only LUN presented is the boot LUN, presented as HLU 0. There is an EMC document on this that I have attached. Note that it says on page 13 that there is no boot-from-SAN support with ESXi (at the time of writing the document; I think this may have changed now).
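As a rough back-of-the-envelope check of those numbers, the arithmetic can be sketched as below. The function name `boot_lun_plan` and its defaults (5 GB per boot LUN, 5 hosts per mirrored pair) are just my rule of thumb from above turned into parameters, not vendor guidance, and the host count of 16 is only an example.

```python
import math


def boot_lun_plan(host_count, lun_size_gb=5, hosts_per_mirror=5):
    """Return (mirrored pairs needed, total boot capacity in GB).

    Assumes one boot LUN per host and at most `hosts_per_mirror`
    hosts booting from each mirrored pair of drives.
    """
    if host_count < 0 or lun_size_gb <= 0 or hosts_per_mirror < 1:
        raise ValueError("counts and sizes must be positive")
    mirrors = math.ceil(host_count / hosts_per_mirror)
    total_gb = host_count * lun_size_gb
    return mirrors, total_gb


if __name__ == "__main__":
    mirrors, total_gb = boot_lun_plan(16)
    print(f"16 hosts: {mirrors} mirrored pairs, {total_gb} GB of boot LUNs")
```

The capacity is tiny next to the size of a 15K FC drive; the number of hosts per mirror is the figure that matters, because it limits how many hosts are affected when a drive fails and rebuilds.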
Thank you for taking the time to respond and clarifying a few things for me. I will look into the embedded USB option to see if that would be a better option. There seems to be very little documentation around regarding ESXi and booting from SAN. Therefore I posted here just to see what others are doing.
I am traveling extensively right now, doing only Cisco UCS implementations. I am doing a few Clariion installs with QLogic FC switches and came across this post. EVERYONE is booting from SAN with ESXi; I don't have a single customer doing it otherwise right now, and the performance is nice. I have issues with this, given that each UCS blade comes with two nice drives minimum. That said, there are a lot of reasons behind closed doors. For us, UCS was created to be stateless: a service profile holds all identity, and this allows moving the host as well as the guests, which we are already familiar with from vMotion. A lot of applications also require boot from SAN for the hypervisor when you want the guest to boot from SAN and, UC being a widely installed application, it has pushed more SAN boot installs for ESXi.
Previously you had to eject the drives and make the blade truly diskless for this to work; this is fixed in ESXi 4.1 U1 and up. Also, if using UCS, you can download the ESXi 4.1 Cisco OEM image from VMware, which resolves the boot-from-SAN issues pertaining to VIC cards, so there is no more need for QLogic or Emulex chipsets and physical HBAs. Virtual HBAs work great, and the drivers are embedded in the .iso image you can download.
Boot from SAN is popular for many reasons, particularly with Cisco UCS, because of the true stateless nature you get when the bare-metal OS resides on shared storage and the once-physical identities (MAC addresses, WWNNs, WWPNs, and UUIDs) are all contained in soft format within the service profile, assigned at SP creation from pre-defined pools and instantiated at SP association to a compute blade or chassis. Those identities forever follow the service profile, so why not make the entire OS or hard drive portable as well?
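For anyone new to the idea, the "identity follows the service profile" point can be pictured with a small Python sketch. This is only a toy model and not the UCS Manager API: the `ServiceProfile` class, the `create_profile` and `associate` functions, and every address value are invented placeholders. It shows identities being drawn from pools at creation and staying the same when the profile moves to a different blade.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceProfile:
    """Toy model of a service profile: identity lives here, not on the blade."""
    name: str
    uuid: str
    mac: str
    wwnn: str
    wwpn: str
    blade: Optional[str] = None  # None until the profile is associated


def create_profile(name, pools):
    """Draw one identity from each pre-defined pool at profile creation."""
    return ServiceProfile(
        name=name,
        uuid=pools["uuid"].pop(0),
        mac=pools["mac"].pop(0),
        wwnn=pools["wwnn"].pop(0),
        wwpn=pools["wwpn"].pop(0),
    )


def associate(profile, blade):
    """Bind the profile to a blade; the identities themselves do not change."""
    profile.blade = blade
    return profile


if __name__ == "__main__":
    # All values below are placeholders, not real assignments.
    pools = {
        "uuid": ["00000000-0000-0000-0000-000000000001"],
        "mac": ["00:25:B5:00:00:01"],
        "wwnn": ["20:00:00:25:B5:00:00:01"],
        "wwpn": ["20:00:00:25:B5:00:01:01"],
    }
    sp = create_profile("esx01", pools)
    associate(sp, "chassis-1/blade-1")
    associate(sp, "chassis-2/blade-3")  # move the host; WWPN and MAC are unchanged
    print(sp)
```

Because the WWPN never changes, the zoning and the storage group on the array keep working after the move, which is what makes a SAN boot LUN portable along with the profile.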
There are configurations out there for ESXi booting locally with guests on SAN, or ESXi on SAN with local swap space and VMs on SAN, but specific applications and restrictive support policies are pushing much of this to boot from SAN simply because that testing is completed. Other boot-from-SAN issues pertaining to HBA drivers or resident local drives on the LSI bus are resolved in 4.1, and boot from SAN works fine now. However, on this particular Clariion system, being active/passive as it is, I would opt for removing the drives, booting from SAN, AND ENSURING that your active SP target/LUN is first in your BIOS settings; in UCS, just make sure it is the primary for primary, and if not, it is easy to move. There is no need to fuss with the BIOS any longer, although if you have older HBAs or opted for physical Emulex/QLogic chipsets, you still have the boot ROM available for backwards compatibility, for the creature of habit.
I have done A LOT of these and the Clariion system has so far been the biggest pain for me, but honestly that is due to lack of customer knowledge on configuring it. It integrates well into vCenter, making configuration a breeze for many things you'll see in production down the road once the cluster is running. Up front, watch out for auto-registration. In the new system with Unisphere (not the Navisphere mentioned above), ESXi no longer needs to be manually registered, and you must actually disable auto-registration on the EMC side if you don't like it. ESXi can boot the first time, supply all its WWNs, and see the LUNs allowed to it by masking. A customer who didn't know this kept manually registering a single host's 4 WWNs, 13 times on top of the single auto-registration, and corrupted the hell out of it: I was seeing LUNs with a size of 0 that would disappear and come back. EMC used a special login/password only they know, extracted those entries manually from the DB, and all was fine; I saw all LUNs with the correct size.
I had ESXi booted virtually and installed on a local drive to look around while he configured the new EMC SAN; now I will create the boot LUN and re-install to SAN after scrubbing the MBR on the disk. While this generally works, again EMC wants the drives ejected or the LSI Logic controller disabled in BIOS, due to the active/passive nature and to keep the HBA first in the boot path. Again, in UCS it's all a big text file, so I can move those around simply, change the order, and invoke a policy so the HBAs are always presented as 0 and 1 (first seen by the PCI bus on the blade) and never worry about that again. While testing it now (3:30 am), I will apparently be ejecting the drives or disabling the SCSI card in BIOS in order to stay within EMC support for the customer.
Some UC apps do allow ESXi local with swap space and guests on SAN. I'm not saying it's never allowed; each app is at a different phase of testing based on hardware platform, application version, and BU pace. They're getting there. But aside from "special" applications and real-time applications, boot from SAN definitely helps performance and reduces the complexity of disaster recovery/avoidance in most instances, particularly when you don't just move the guest VMs from datacenter to datacenter in an emergency: you can move the host OS/hypervisor too, and do so as often as you must, with no fear of re-licensing or of any "physical" identity attributes changing unless you want them to. That means moving an entire host, let alone guest VMs, within the datacenter with no upstream MAC re-learning, no black holes, and no SAN reconfiguration. Boot from SAN just rocks in data center virtualization, and I'm enjoying getting months of UCS tour duty as well as Vblock/VCE datacenters, which are huge.
The biggest restriction in boot from SAN is again with Clariion here, and with other arrays like it that operate active/passive. You just have to move the active target in the BIOS and make sure VMware has the correct failover/multipathing policy set to avoid path thrashing. Again, with Unisphere and ESXi 4.x it's auto-discovery and auto-registration, and during this process the ESXi kernel scans the EMC arrays and sets the multipathing/failover methods auto-magically. I just completed one, and the server's settings all match what EMC's docs said they would be for that setup.
I would like to hear more about this embedded USB option. I have been in datacenters too long and have neither seen nor tried it. Can it be downloaded like all other images and applications? Can I pull my normal ESXi 4.1 Enterprise Plus licenses out of the pool and assign them to it as well, or has that changed in that build? I'm interested to hear about it, but I have 3 hours until the Clariion must have boot LUNs for 16 ESXi hosts in 2 UCS chassis, and I'm not an EMC storage guy (he left without this part done), so here we go!
PS: If you have UCS, boot one blade locally and one from SAN side by side, with both IP KVMs to the CIMC opened up. If you have a good SAN, quality drives, etc., you'll see the boot from SAN excel past the local drives, since it uses 10G paths to faster RAID-striped LUNs. If using a C-Series UCS, you can run any RAID level with 10 or more drives in those boxes, with 512 GB of RAM and 32 cores and climbing, so they rock in their own right. Once UCS servers are priced like most Dell/HP/IBM, I can run my entire lab infrastructure in VMs on one single server: 2 power cables, 2 network cables, plus a management NIC cable, with monitor/keyboard/floppy/CD/DVD etc. all over IP on a dedicated management link. Even on a cheap management network with 10/100 switches and NICs, I can boot a new UCS out of the box from a virtual CD (an .ISO on the USB HDD attached to my laptop, mapped as its physical drive) and install to local or SAN drives in about 5 minutes. Windows 2k8 R2 takes a bit longer, but I no longer run Windows on bare metal; it's VM or nothing. Putting the entire VM for a Windows server in RAM perks it up a bit and beats having its own dedicated server swapping memory and bogging down. I'm spoiled with these things.
Off to Clariion world. I am making long lists to check things off, but it's still basic SAN stuff, just a different vendor. I prefer active/active, though.