VMware Cloud Community
usr345
Contributor

Add additional memory to an ESXi host

Hello,

I am planning to add memory to a production ESXi 4.1 host (a Dell R910) in a cluster. I am planning to follow these steps (a rough script for steps 2-4 follows the list):

Change DRS to manual

vMotion all VMs off the host

Put the host in maintenance mode

Shut down the host

Add the memory

Start the host

Check that the additional memory is visible

Change DRS back to automatic
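If it helps, here is roughly how I would script steps 2-4 -- an untested pyVmomi sketch, where the vCenter address, host name, and credentials are all placeholders:

# Untested sketch; assumes a pyVmomi install and vCenter access.
import ssl
import time
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def wait_for(task):
    # Poll a vCenter task until it finishes; surface any fault.
    while task.info.state not in (vim.TaskInfo.State.success, vim.TaskInfo.State.error):
        time.sleep(2)
    if task.info.state == vim.TaskInfo.State.error:
        raise task.info.error

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host='vcenter.example.com', user='admin', pwd='***', sslContext=ctx)
content = si.RetrieveContent()
host = content.searchIndex.FindByDnsName(dnsName='esx01.example.com', vmSearch=False)
# Steps 2-3: entering maintenance mode makes a fully automated DRS vMotion the VMs away.
wait_for(host.EnterMaintenanceMode_Task(timeout=0))
# Step 4: shut the host down once it is safely in maintenance mode.
wait_for(host.ShutdownHost_Task(force=False))
Disconnect(si)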

We ran a memory test on the server when it was first put into the cluster.

Is it necessary to run a memory test again when adding the additional memory?

Thanks

Accepted Solution
RParker
Immortal

usr345 wrote:

I am planning to add memory to a production ESXi 4.1 host (a Dell R910) in a cluster. [...]

Is it necessary to run a memory test again when adding the additional memory?

Nope. You don't need to take it out of DRS; just migrate the VMs off (put the host in maintenance mode), then power it off while it is still in maintenance mode. When you add the memory and boot it back up, it is not yet serving the cluster, which gives you a chance to check the memory (to see that it is all visible). The memory test at BIOS POST is sufficient.

Once you see the memory is fine, you can exit maintenance mode.
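If you want to script that check, something like this would do -- a sketch, assuming 'host' is the pyVmomi HostSystem object from a session like the one sketched in the question, and 256 is a placeholder for your expected total:

expected_gb = 256  # placeholder: the total RAM you expect after the upgrade
visible_gb = host.summary.hardware.memorySize / 1024.0 ** 3  # memorySize is reported in bytes
print('Host reports %.1f GB' % visible_gb)
if visible_gb >= expected_gb:
    # All the memory is visible -- let the host rejoin the cluster.
    host.ExitMaintenanceMode_Task(timeout=0)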

6 Replies
gh0stwalker
Enthusiast

It's always best practice to test the memory before placing the server back into production. Whether you actually do it or not is another matter 🙂

rickardnobel
Champion

usr345 wrote:

Change DRS to manual

vmotion all vm's off the host

Put host in maintainance mode

Is your DRS mode "Fully Automated"? Then you should not change it, but use it instead. Just put the host into maintenance mode and DRS will vMotion the VMs to the most suitable remaining hosts.
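You can confirm the automation level first. A small pyVmomi sketch -- 'Cluster01' is a placeholder name, and 'content' and 'host' come from a session like the one sketched in the question:

# 'content' and 'host' as in the connection sketch in the question above.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == 'Cluster01')  # placeholder name
drs = cluster.configurationEx.drsConfig
print('DRS enabled:', drs.enabled, '- mode:', drs.defaultVmBehavior)
if drs.enabled and drs.defaultVmBehavior == 'fullyAutomated':
    # No need to touch DRS: entering maintenance mode triggers the evacuation.
    host.EnterMaintenanceMode_Task(timeout=0)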

My VMware blog: www.rickardnobel.se
RParker
Immortal

gh0stwalker wrote:

It's always best practice to test the memory before placing the server back into production. Whether you actually do it or not is another matter 🙂

Only if you buy cheap, off-the-shelf, non-vendor memory. Besides, memory has error correction (or it should, for a server). Otherwise the testing is not necessary, and it's a CHOICE, not a practice.

When you buy a server, do you test it before you put it into production? If you do, I have to question why you would buy that server from a vendor you can't trust. The good vendors, HP, Dell, IBM, ALL fully test their memory, machines, and components BEFORE they get to the customer.

gh0stwalker
Enthusiast

We actually do check the memory of every physical server before we put it into production, and I agree that the memory should be checked by the vendor before it's delivered to the end customer, but as they say in the classics, stuff happens. It's not unheard of to receive hardware DOA from any of the big vendors.

My motto: better to be safe than sorry, and in the scheme of things one day of running memtest is not going to blow the project timeline or budget out the window.

I have no problem with people not performing the check, so I hope you can respect our decision for doing it, even if you don't agree with it.

Dracolith
Enthusiast

usr345 wrote:

We ran a memory test on the server when it was first put into the cluster.

Is it necessary to run a memory test again when adding the additional memory?

You don't have to, but I would strongly recommend running a good memory test tool for at least a few hours. Even a few hours of testing will be more exhaustive than anything your hardware vendor does just before shipping you the memory.

You should weigh the cost and risk of testing against the cost and risk of not testing. Your hardware vendors are not gods; they may have tested your 'brand new part' six months ago, but when you order it they will get it to you ASAP, which means they aren't going to spend five days stress-testing it before shipping it to you.

Things happen in six months -- things like solar flares. Sometimes things get shaken a bit in shipping; sometimes technicians make errors installing memory; sometimes a speck of dust lands in the wrong place and gets lodged in the DIMM slot as the memory is being installed. You cannot be certain the part is as pristine as when it was tested by the time it arrives in your server.

IF it will not compromise your failover capacity, run memtest86+ or a similarly exhaustive memory test for at least 72 hours after changing the configuration. It doesn't hurt to test memory. It WILL hurt later if, a couple of weeks from now, your server crashes during peak hours due to a detected double-bit memory error.

It doesn't matter whether your memory came directly from Dell, IBM, HP, or your server vendor. Test. Test. Test. I have, on occasion, received parts from a trusted hardware vendor that turned out to be faulty after a burn-in test.

Given all the above, there are really only two reasons not to do additional memory testing:

(1) Laziness/apathy (don't tell your boss)

(2) You can't afford to keep the host down for as long as the test will actually take. E.g., if putting the host in maintenance mode for 72 hours means HA is lost for your cluster, or the performance of Tier 1 apps will be put at risk, that could be unacceptable.
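If the failover-capacity concern is the sticking point, you can roughly estimate the headroom before committing to a long test. A pyVmomi sketch -- 'cluster' as looked up in an earlier reply, 'esx01.example.com' a placeholder, and quickStats only approximate:

# 'cluster' as looked up in an earlier reply; quickStats values are approximate.
down = 'esx01.example.com'  # placeholder: the host you plan to take offline
usage_mb = 0
remaining_mb = 0
for h in cluster.host:
    usage_mb += h.summary.quickStats.overallMemoryUsage or 0  # MB in use on each host
    if h.name != down:
        remaining_mb += h.summary.hardware.memorySize // (1024 * 1024)  # bytes -> MB
print('In use: %d MB; capacity without %s: %d MB' % (usage_mb, down, remaining_mb))
if usage_mb > remaining_mb:
    print('Not enough headroom to keep this host offline for a long burn-in.')

Note this ignores HA admission-control reservations, so treat it as a lower bound on the headroom you need.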
