VMware Cloud Community
wayne_hollomby
Contributor

ESXi 4.1 Boot from LUN design

Hi,

We are deploying an ESXi 4.1 platform and will be using blades that have no local storage, so we will be booting from SAN.

We have 16 blades in total that will all need to boot from SAN.

My question is: what is the best way to create the LUNs on the SAN for these ESXi servers? I know each server will need its own dedicated boot LUN, but I wondered if there are any recommendations on the best-practice way to create the underlying RAID sets.

We have 450GB fibre channel disks or 1TB SATA disks available to use.

I am thinking of using the fibre channel disks, as I would be worried about rebuilds on SATA causing a possible performance impact to ESXi. But what is the best way to carve up the LUNs? Use RAID 1 or RAID 5, for example, and how many ESXi servers per RAID group?

If anyone can make some recommendations on what they have done in the real world that would be very helpful.

Thanks in advance

Wayne

Dave_Mishchenko
Immortal

When ESXi boots it creates a RAM disk and most of the system I/O is directed to that drive, so the I/O requirement on physical storage is very minimal. The boot LUN only has to be 5 GB (you can go as small as 900 MB if you allocate another LUN for the scratch partition), so you don't need a lot of storage, nor does it have to be fast.
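To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python; the candidate LUN sizes are just the figures discussed in this thread (the 900 MB minimum, 5 GB, and the 10-20 GB sizes people actually provision), not fixed requirements:

```python
# Rough capacity math for booting 16 blades from SAN. The LUN sizes are
# illustrative figures from this discussion, not VMware-mandated values.
HOSTS = 16
candidate_lun_sizes_gb = [0.9, 5, 10, 20]

for size_gb in candidate_lun_sizes_gb:
    total_gb = size_gb * HOSTS
    print(f"{size_gb:>5} GB boot LUN -> {total_gb:>6.1f} GB total for {HOSTS} hosts")
```

Even the largest option comes to only a few hundred GB in total, which is why speed and capacity of the boot LUNs are rarely the concern.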

Have you considered booting from an internal flash device on the blades?




Dave

VMware Communities User Moderator

Now available - vSphere Quick Start Guide

Do you have a system or PCI card working with VMDirectPath? Submit your specs to the Unofficial VMDirectPath HCL.

wayne_hollomby
Contributor

Thanks for the reply Dave

I was thinking of having at least two RAID sets and then splitting eight servers onto each, to avoid an all-eggs-in-one-basket scenario.

We have to work with what we already have, so unfortunately there is no option to add local flash disks to the blades at this time.

Dave_Mishchenko
Immortal

I was thinking of having at least two RAID sets and then splitting eight servers onto each, to avoid an all-eggs-in-one-basket scenario.

That's a good plan then. What sort of blades are you using?




Dave

VMware Communities User Moderator

Now available - vSphere Quick Start Guide

Do you have a system or PCI card working with VMDirectPath? Submit your specs to the Unofficial VMDirectPath HCL.

wayne_hollomby
Contributor

We are using Cisco UCS B200 M2 blades.

pauly75
Contributor

Hi Wayne,

We are in the same situation as you (we have four B200 M2 blades) and I was searching for any information regarding ESXi 4.1 boot LUN design. Like you, I was wondering what size the LUNs should be (I was thinking 10GB), what type of disk to use (like you, I thought it best to use FC and not SATA) and whether we should split the boot LUNs across different RAID groups (we are using a CX4-120 array). Can I ask what configuration you went for with your implementation? Thanks.

Paul.

wayne_hollomby
Contributor

Hi Paul,

We used FC in the end, with 20GB LUNs just to give us additional space in case we ever need it in the future for upgrades etc.

We just did 2 x RAID 1 sets, carved these up into 16 LUNs of 20GB, and then spread the ESXi hosts across these for the boot LUNs.

We are using a CX4-480 and this is performing fine for us. We did some ESXi host reboots in a single hit and didn't see any performance issues with booting 8 ESXi hosts from a 2-disk RAID 1 set.
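If it helps to visualise it, here is a rough sketch of that layout in Python; it assumes each RAID 1 pair of 450 GB disks gives roughly 450 GB usable, and the host names are made up:

```python
# Sketch of the layout described above: two 2-disk RAID 1 sets carved into
# 20 GB boot LUNs, with the 16 hosts split 8/8 across the two sets.
# Host names and the usable capacity figure are illustrative assumptions.
RAID1_USABLE_GB = 450          # ~usable size of one mirrored pair of 450 GB disks
LUN_GB = 20
hosts = [f"esxi{i:02d}" for i in range(1, 17)]

raid_sets = {"RG1": [], "RG2": []}
for i, host in enumerate(hosts):
    target = "RG1" if i < len(hosts) // 2 else "RG2"   # 8 boot LUNs per RAID set
    raid_sets[target].append(host)

for name, members in raid_sets.items():
    used = len(members) * LUN_GB
    print(f"{name}: {len(members)} x {LUN_GB} GB boot LUNs = {used} GB "
          f"of ~{RAID1_USABLE_GB} GB usable ({RAID1_USABLE_GB - used} GB spare)")
```

Losing one RAID set in that scheme takes out at most half the hosts' boot LUNs, which is the point of the split.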

I would avoid SATA because the RAID rebuild times will be a lot higher than FC, which would put your hosts at risk of running on a single disk during a failed-disk rebuild.

I think it's a personal choice: the number of disks you want to use compared to how many hosts you are willing to lose if a whole RAID set fails, which should be very unlikely if you have hot spares set up correctly, disks spread across different disk shelves, and monitoring of the LUNs' usage.

Hope this helps

Wayne

pauly75
Contributor

Hi Wayne,

Thank you for taking the time to reply. It is good to know what other people are doing with regard to ESXi boot from SAN (especially those using UCS). One final question, if you do not mind answering: did you use PowerPath/VE or did you stick with NMP?

Thanks,

Paul.

wayne_hollomby
Contributor

Yes, we did deploy PowerPath/VE and it works really well at spreading the load across all the HBAs. When running esxtop you can see the difference compared to NMP: PowerPath sends traffic down all the paths.
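To make the difference concrete, here is a toy Python illustration (not real multipathing code, and the path names are invented examples): a fixed-path policy drives every I/O down one path, while a load-spreading policy such as PowerPath/VE or round robin uses them all, which is exactly what shows up in esxtop.

```python
# Toy comparison of path usage: "fixed" sends all I/O down one path,
# "spread" distributes it across every available path. Path names are
# invented examples, not taken from a real host.
from collections import Counter
from itertools import cycle

paths = ["vmhba1:C0:T0:L1", "vmhba1:C0:T1:L1",
         "vmhba2:C0:T0:L1", "vmhba2:C0:T1:L1"]
io_count = 1000

fixed_policy = Counter({paths[0]: io_count})          # everything on one path
spreader = cycle(paths)
spread_policy = Counter(next(spreader) for _ in range(io_count))

print("fixed :", dict(fixed_policy))
print("spread:", dict(spread_policy))
```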

RaymondG
Enthusiast

Hey, I am about to deploy this same setup soon with UCS, and we are still deciding whether to boot off SAN or local. I personally would rather go local, especially if the disks are there anyway! But everyone seems to be into this BFS concept.

Do you feel there are advantages to booting off SAN? Isn't SAN storage more expensive than local disk? Have you run into any issues yet? Do you feel that it will be hard to pinpoint the cause of an issue if all the LUNs are on the SAN? E.g. is it the OS acting up, or did the storage guys do a rescan in the middle of the day!

Also, how do you guys set up your ESX hosts? The cards in these blades only have 2 physical NICs... do you set everything up with only two NICs in VC and just team them on one switch with multiple port groups? Or do you configure many vNICs on the UCS to make the ESX host think it has more than 2 physical NICs?

Also, how is the UCS treating you guys? Are you happy with it?

thanks a lot!

Raymond Golden VCP3, VCP4, MCSA, A+, Net+, SEC+
Dave_Mishchenko
Immortal

Fewer moving parts in the server (or blade) reduce the chances of failure and the need to visit that blade.  Once ESXi boots, most I/O is done in its RAM disk, so local or SAN disk is overkill.  ESXi is heading down the road to stateless booting, so eventually you won't have to worry about any disk (well, some storage for your PXE or gPXE host).  I'd go with a single vSwitch with multiple port groups.  ESXi can balance the traffic and, depending on your licensing level, you can set traffic priorities.  You'll also have NIC redundancy.
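As a sketch only (the port group names and the active/standby choices below are assumptions, not a mandated design), the sort of two-NIC layout this describes looks like:

```python
# One vSwitch, both vmnics as uplinks, and per-port-group teaming overrides
# so each traffic type prefers a different uplink but fails over to the other.
# Names and the active/standby split are illustrative assumptions.
vswitch = {
    "name": "vSwitch0",
    "uplinks": ["vmnic0", "vmnic1"],
    "portgroups": {
        "Management Network": {"active": ["vmnic0"], "standby": ["vmnic1"]},
        "vMotion":            {"active": ["vmnic1"], "standby": ["vmnic0"]},
        "VM Network":         {"active": ["vmnic0", "vmnic1"], "standby": []},
    },
}

for pg, team in vswitch["portgroups"].items():
    print(f"{pg:<20} active={team['active']} standby={team['standby']}")
```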

samansalehi
Enthusiast

Dear friend,

For ESX or ESXi, I read that you need at most 5 GB for logs over five years, plus 5 GB for ESX itself.

On the other hand, when you reinstall ESXi it formats the full LUN, so you need a separate LUN for each ESXi host.

I think 10 GB at most is enough for ESXi, and FC is the best practice.

saman

pauly75
Contributor

Hi Raymond,

We are currently implementing UCS but have not gone "live" yet. It is a bit of a steep learning curve for us, as we are trying to do everything ourselves and the technology (UCS, SAN, VMware) is all fairly new to us.

We decided to boot from SAN so we could make the UCS as stateless as possible, having the ability to assign different service profiles to blades when required (e.g. one minute it is an ESXi server, the next it is an application server). I believe, but I might be incorrect here, that if you install anything on the local disk of the blade then you cannot assign different service profiles, and hence you lose one of the selling points of the UCS. I like the fact that if a blade fails we can swap it out for a new one and then assign a service profile to it (therefore no configuration / installation is required) and it is back up and working.

Cheers,

Paul

pauly75
Contributor

Hi Wayne,

Sorry to keep bothering you, but out of interest what did you do with your datastores? Did you spread them across multiple RAID groups (like the boot LUNs, do you really want all your VMs on disks in the same RAID group)? What size did you make them (there seems to be a common response on these forums of keeping them between 500 and 800GB)? Did you use SATA or FC disks? Did you create datastores for "C" drives (OS) and separate ones for "D" drives (data)?

I hope you do not mind all these questions. We are in the middle of trying to design our environment, and every time we hold meetings to discuss it we end up going away with more questions than when we started. It is good to know what others are doing. Thanks.

Paul.

RaymondG
Enthusiast

Hi, Pauly,  thanks for the response.

I think you are getting UCS profiles confused with VM clones or backups.   A service profile only holds the hardware state of the blade, e.g. BIOS policy, RAID policy, NIC policy etc...   It does nothing for the data that is on the disk.   If you take out an ESX blade, put in another blade and apply that same profile, you will still need to reinstall the OS, applications etc...    But!  If you boot from SAN and have a working ESXi OS on the SAN, then remove the blade and put another in its place... you can zone that OS LUN to the new blade and be able to boot up normally.   But you can also do that by taking the local drives from one blade and putting them in another.
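One way to picture the boot-from-SAN side of this (all names and WWPNs below are made up): the boot LUN is presented to a WWPN, and with UCS the WWPN lives in the service profile, so whichever blade the profile is associated with sees the same boot LUN.

```python
# Toy model: the boot LUN is masked/zoned to a vHBA WWPN, and the WWPN
# belongs to the service profile rather than to a physical blade.
# Every name and WWPN here is an invented example.
profile = {
    "name": "esx-sp-01",
    "vhba_wwpns": ["20:00:00:25:b5:00:00:0a"],
}

# SAN view: which LUNs are presented to which initiator WWPNs.
lun_masking = {"20:00:00:25:b5:00:00:0a": ["boot LUN for esx-sp-01"]}

def visible_luns(profile, masking):
    """LUNs a blade can see once this profile is associated with it."""
    return [lun for wwpn in profile["vhba_wwpns"] for lun in masking.get(wwpn, [])]

for blade in ["blade-1 (failed)", "blade-5 (spare)"]:
    print(f"{profile['name']} on {blade} -> sees {visible_luns(profile, lun_masking)}")
```

With local disk the installed OS stays with the old blade, which is the point above that the profile alone does not move any data.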

Cisco purposely made UCS confusing, in my opinion... and I told them this too.   But hey, we got a few chassis and are getting more, so I'm forced to get with the program.

But I still don't see the advantage of booting from SAN unless there is some replication going on in the SAN for DR.   The service profiles don't affect the data on the disk.

Raymond Golden VCP3, VCP4, MCSA, A+, Net+, SEC+
RaymondG
Enthusiast

Hi, I know this question was for someone else, but I thought I'd share my setup.

I use 750GB VMFS datastores.  VMware used to say smaller sizes were better, but they have since said they can handle 1TB fine now... so I went down the middle!    I use FC drives.   Most of my Windows VMs are for apps that didn't require much space, so I just make them 30-40GB and call it a day!  But if I have a VM that needs, say, 100GB of data, or if an existing VM needs an extra drive, I give it a disk from another datastore.   But I would not say to have a separate datastore for OS and data... the only way I would do that is if I planned to use slower disks or a different RAID group for the data.

All my LUNs are RAID 5.
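For a sanity check on those sizes, here is a small Python sketch of the arithmetic; the per-VM figures (40 GB VMDK, 4 GB vRAM swap file, 20% free-space headroom) are assumptions for illustration, not rules:

```python
# Quick arithmetic behind 500-800 GB datastore sizing: a VM's footprint is
# roughly its VMDK plus a swap file about equal to its vRAM, and some free
# space is usually kept for snapshots and growth. All figures are assumptions.
DATASTORE_GB = 750
VMDK_GB = 40
VSWAP_GB = 4          # swap file ~= vRAM when no memory reservation is set
HEADROOM = 0.20       # fraction kept free for snapshots and growth

usable_gb = DATASTORE_GB * (1 - HEADROOM)
per_vm_gb = VMDK_GB + VSWAP_GB
print(f"~{int(usable_gb // per_vm_gb)} VMs per {DATASTORE_GB} GB datastore "
      f"({per_vm_gb} GB per VM, {int(HEADROOM * 100)}% kept free)")
```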

I think that was all your questions.   If you want more info from me, let me know!

Raymond Golden VCP3, VCP4, MCSA, A+, Net+, SEC+
ildanish
Contributor

Hi Raymond,

Actually there is an advantage in booting from SAN in a UCS environment: if you do, when a blade goes down for any reason you can just associate the service profile with another blade and start it, without even having to access the data centre. If you need to swap the internal disks and your data centre is 200km away, you will start thinking about BFS... ;)

Daniele
