VMware Cloud Community
dinny
Expert

Risks of overwriting a LUN when doing a scripted build on an HP server?

Hiya,

I've just completed a scripted build process for our ESX 3.01 servers.

I am using HP DL580 G4 servers, each containing an HP SAS Smart Array P400 Controller and two Emulex LP10000DC HBAs.

I appreciate that general ESX best practice is to either pull out the SAN fibre cables or disconnect the SAN switch ports before an ESX build.

Some of my servers are in a remote site - and I have no direct access to the SAN management tools - so I would like to be able to build/rebuild my ESX servers with the SAN cables still "live" - as long as I can do so safely.

I can certainly appreciate the need for caution on a server where the local disk and the LUNs are all referred to as sda, sdb etc...

Clearly if the local disk did not exist or was not detected first then the build process could try to overwrite a SAN disk.

As I understand it though, HP servers will always see the local disk(s) as cciss/c0d0 etc - and will always see the VMFS SAN disks as sda, sdb etc.

If this is indeed the case - presumably if I only do a clearpart on the cciss/c0d0 drive - and only create each partition via --ondisk=cciss/c0d0 then there should be no possible way I could accidentally overwrite one of my SAN LUNs.

Is this the case - or am I missing something?

Cheers

Dinny

My kickstart build currently contains the following lines:

# Partitioning

clearpart --all --drives=cciss/c0d0 --initlabel

part /boot --fstype ext3 --size 250 --ondisk=cciss/c0d0

part swap --size 1600 --ondisk=cciss/c0d0

part / --fstype ext3 --size 8192 --ondisk=cciss/c0d0

part /var --fstype ext3 --size 4096 --ondisk=cciss/c0d0

part /tmp --fstype ext3 --size 4096 --ondisk=cciss/c0d0

part /opt --fstype ext3 --size 4096 --ondisk=cciss/c0d0

part /home --fstype ext3 --size 4096 --ondisk=cciss/c0d0

part None --fstype vmkcore --size 100 --ondisk=cciss/c0d0

# The next line partly fails during the kickstart build, but it needs to be there so that vmkfstools can be used on the half-created partition later in the build process...

part None --fstype vmfs3 --size 1 --grow --ondisk=cciss/c0d0

part /vmimages --fstype ext3 --size 10000 --ondisk=cciss/c0d0

18 Replies
bister
Expert

To be sure, you could use "clearpart --linux" instead of "--all". But IMO it's better to install without SAN LUNs attached.

williambishop
Expert

There is an option to leave VMFS-formatted filesystems alone, which is typically what we use. It will not format our LUNs this way. Thinking it through, however, if you have raw file systems, that might be dangerous. You could always just remove the zoning for the server and then reactivate it afterwards. Far easier, however, is what the above user suggested: just unplug it.

--"Non Temetis Messor."
TomHowarth
Leadership

gotta agree here, to cover yourself it is better to just unplug it.

Tom Howarth VCP / VCAP / vExpert
VMware Communities User Moderator
Blog: http://www.planetvm.net
Contributing author on VMware vSphere and Virtual Infrastructure Security: Securing ESX and the Virtual Environment
Contributing author on VCP VMware Certified Professional on VSphere 4 Study Guide: Exam VCP-410
kix1979
Immortal

I agree with everyone else, there is no risk if you leave it unplugged. If you plug it in, the risk, even though small, is still one that could give you a lot of headaches. Granted, a scripted install should be 100% repeatable, but I've sometimes had issues even with a scripted install if I cut and paste wrong etc...

Thomas H. Bryant III
bister
Expert

Another (cosmetic) issue we had was the numbering of SCSI devices in an IBM x366: the onboard SCSI controller was not configured as the first device. ESX configured the two HBAs as the first and second SCSI devices (during installation as well). In our opinion that didn't look very nice.

dinny
Expert

Thanks for everyone's comments.

I do appreciate that to be completely safe I would have to disconnect the SAN cables.

What I was trying to do was to understand exactly what the specific risks of not doing so were - specifically with regard to the use of "cciss" on HP hardware - and then make an informed decision based on that.

At the moment my understanding is that unless I make a typo, or fail to specify cciss in my script, or something similar - it ought to be safe.

Can anyone think of (or has anyone experienced) any scenario whereby, if no typos are made and if cciss is correctly specified, a LUN could still be overwritten?
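
One way to take the typo case off the table is to lint the kickstart file before it is used. What follows is only a minimal sketch in Python, not something taken from the build discussed in this thread: the names `disk_targets`, `unsafe_lines` and `ALLOWED_DISK` are made up for illustration, and it assumes the partitioning lines use the `--ondisk`, `--ondrive` or `--drives` options in the same way as the listing in the first post. It reports every part or clearpart line that is not pinned to cciss/c0d0, including lines that name no disk at all.

```python
# Assumption: cciss/c0d0 is the only disk the build is ever allowed to touch.
ALLOWED_DISK = "cciss/c0d0"

# Kickstart options that name a target disk (or a comma-separated list of disks).
DISK_OPTIONS = ("--ondisk", "--ondrive", "--drives")

# Kickstart commands that create or wipe partitions.
PARTITION_COMMANDS = ("part", "partition", "clearpart")


def disk_targets(line):
    """Return every disk named by a disk option on one kickstart line."""
    targets = []
    tokens = line.split()
    for index, token in enumerate(tokens):
        for option in DISK_OPTIONS:
            if token.startswith(option + "="):
                # Form: --ondisk=cciss/c0d0
                targets.extend(token.split("=", 1)[1].split(","))
            elif token == option and index + 1 < len(tokens):
                # Form: --ondisk cciss/c0d0
                targets.extend(tokens[index + 1].split(","))
    return targets


def unsafe_lines(kickstart_text, allowed=ALLOWED_DISK):
    """Return (line_number, line) for each partitioning line not pinned to `allowed`."""
    problems = []
    for number, raw in enumerate(kickstart_text.splitlines(), start=1):
        line = raw.strip()
        tokens = line.split()
        if not tokens or tokens[0] not in PARTITION_COMMANDS:
            continue  # blank lines, comments and unrelated commands are ignored
        targets = disk_targets(line)
        # A line with no disk option is as risky as one naming the wrong disk.
        if not targets or any(target != allowed for target in targets):
            problems.append((number, line))
    return problems
```

Calling `unsafe_lines(open("ks.cfg").read())` (with `ks.cfg` standing in for whatever the kickstart file is called) and refusing to start the build unless the result is an empty list would catch a mistyped or missing disk name. It only checks the text of the script; it says nothing about which devices the installer actually detects, so it does not answer the question of whether a correctly written script is safe.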

Bister - thanks for the suggestion on "clearpart --linux" - I had read about that setting and thought the only downside would be that it would not overwrite the local VMFS volume. On the other hand, I don't plan to use that volume particularly (it just seemed a shame not to do something with the space), so it could be worth me using that too.

Williambishop - was the option that you use to leave VMFS volumes alone different to Bister's?

(I don't use any raw LUNs currently so that would not be an issue at present).

Dinny

Michelle_Laveri
Virtuoso

I think there may be a third way of protecting yourself - beyond unplugging cables and masking LUNs away...

The kickstart install has a method of switching off the default probing for devices - noprobe, nonet... you can then manually load the correct drivers (cciss, tg3 and so on). If a driver is not loaded by anaconda for QLogic or Emulex then you wouldn't see any SAN LUNs. This shouldn't stop the SAN being visible after an installation. The UDA has the ability to configure this in the hardware section (experimental support only!) ;)

Regards

Mike

Regards
Michelle Laverick
@m_laverick
http://www.michellelaverick.com
dinny
Expert

Hi Mike,

I remember looking at that section in the UDA app a couple of weeks ago - and thinking I'd look into that later...

Of course I then forgot all about it ...

I clearly didn't read your rtfm guide attentively enough either :(

Sounds like that would be a very sensible way for me to approach the issue.

I guess the approach would be to use the "skip detecting storage devices only" option,

then manually load the driver that detects the local cciss disk (you don't happen to know what that is, do you?)

I presume the option below is then to load this local disk driver from NFS/http etc?

Or do any specifically allowed drivers load as normal from the iso - and is the option below where to load the HBA drivers from later?

If not - then presumably I would then need to load the actual HBA driver in the %post install or in rc.local?

I'll give it a try this evening....

Dinny

Michelle_Laveri
Virtuoso

Actually, your drivers are already on the ESX ISO - you only need a driver disk if you have a NIC/HBA which is supported by VMware and not in the ISO... usually fixed by a maintenance release

So the reference to loading drivers is in the kickstart file... I would be interested in your testing this. Carl says it works for him. When I try it I find my qla2000_707.o driver gets loaded anyway, even though I asked kickstart not to... (hence I used the phrase experimental!!!)

I have a feeling that this approach is probably NOT supported by VMware - so beware! :)

Regards

Mike

dinny
Expert

Cheers Mike,

I'll let you know.

Mine are Emulex so perhaps it's different?

I imagine a fair percentage of my build script is unsupported ...

As far as I can tell you'd have to do everything via the VI client if you wanted full support :)

Dinny

Michelle_Laveri
Virtuoso

Yeah, tell me about it...

About the time I wrote the command-line guide full of lovely esxcfg- stuff I was told quite firmly that these tools were support-only, not for configuration. It was very much anti-COS then, VI client only...

Last week I had my hands gently slapped for promoting vimsh to enable VMotion - as this is a largely undocumented, unsupported, support tool...

http://www.rtfm-ed.co.uk/?p=372

Regards

Mike

dinny
Expert

Yup,

About 80% of my build script is made up of esxcfg and vimsh commands.

I can see why they don't necessarily want to endorse them, as they could limit their ability to change their functionality in the future - but on the other hand if you have more than about four ESX servers with a complex configuration, you do desperately need a way of reliably configuring them all consistently and in the same manner.

If they want a wide take up of ESX in datacentres - then they really do need a supported scripting tool...

I guess they would advocate the API/Perl toolkit - and I do intend to have a look at that - but from what I've seen so far - it's a lot more complex and a lot more in depth than single line commands such as vimsh and esxcfg....

Dinny

PS when I first saw the link you posted above I tried to enable VMotion in my script in a similar way - it didn't work because I hadn't followed the esxcfg commands that configure the vswitches with a sleep and a mgmt-vmware restart.

I only realised that I needed to do that when I read the whitepaper you and Gavin J worked on, that was posted on www.xtravirt.com

It may be worth specifying that in your post?

Dinny

Michelle_Laveri
Virtuoso

Yeah - if you speak to Richard Garsengen - he will tell you that you should really learn C# and the VirtualCenter SDK. That's a tall order if you're a lowly CLI admin like me with only bash/bat/cmd scripting abilities...

On the subject of vimsh - this is an odd one. I found I didn't need to restart and sleep for VMotion - but if you want to do more fancy work (which I discussed with Gavin) such as NIC teaming, policy settings and so on, you do... actually we knocked this about when I was in Eindhoven the other week... and xtravirt.com updated the doc appropriately - I didn't really get that involved...

I've been meaning to update the post - but want to retest my kickstart scripts before doing so... I'm not teaching next week so it will be something I will get round to...

Regards

Mike

dinny
Expert

Strange - maybe it was a timing issue...

vimsh for vmotion definitely didn't work without it for me - it was only after I got that working that I added all the other vimsh commands to configure load balancing etc on my portgroups.

Dinny

BUGCHK
Commander

If there is access to the Fibre Channel switches, you or somebody else could selectively disable the server's ports. Then go into the Fibre Channel adapter's BIOS and check if the storage array has become inaccessible - just in case the wrong ports were disabled ;)

williambishop
Expert

"Honestly sir, I can't imagine how your database got offline." ;)

Michelle_Laveri
Virtuoso

How about:

"Er... you know that 2TB worth of Exchange mailboxes we "had"... erm, we do have a backup of that... don't we?"

:)

dinny
Expert

Hiya,

I never really got an answer that made me think there was much of a danger in just using the cciss statement.

However I ended up finally removing the HBA drivers from the build files on the ESX ISO - I am happy that this is the surest way to get round the issue.

I've awarded a few helpful points to those that contributed.

I've now written a whitepaper on exactly how to go about removing the HBA drivers from the ESX 3.01 ISO - and http://www.Xtravirt.com have kindly agreed to host it.

The whitepaper is a lot more specific than the examples given above and covers building ESX 3.01 direct from a CD/ISO as well as PXE BOOT via UDA.

The following link goes straight to their "downloads" section - but please check out the rest of their site...

http://www.xtravirt.com/index.php?option=com_remository&Itemid=75&func=select&id=3

This link goes straight to my whitepaper:

http://www.xtravirt.com/index.php?option=com_remository&Itemid=75&func=fileinfo&id=13

Hope it's useful...

Dinny
