VMware Cloud Community
tssk
Contributor

Why does VMware ESXi 5 have slow access to an IBM M1015 / LSI 9240-8i RAID?

I am building a server for virtualization and wanted to go with VMware ESXi 5. I configured RAID 10 on 4 disks connected to the internal RAID controller, an IBM M1015 (identical to the LSI 9240-8i), and installed ESXi without any problems. The only problem is that copying data to a datastore (on the RAID array) on the host is slow, around 20 MB/s. I get about the same speed when copying data to a shared folder on a guest virtual machine. The host network autonegotiated 1000/Full with a 1000 Mbit switch, and I used a vmxnet3 NIC in the guest.

When I install MS Hyper-V on the exact same server with the exact same RAID 10 array, I get around 110 MB/s when copying data to the Hyper-V host.
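
(To take the network out of the picture, raw datastore write speed can be checked directly from the ESXi shell. A minimal sketch, assuming a datastore named datastore1; busybox dd on ESXi may not print a throughput line, so note the elapsed time between the date calls:)

    # Write 1 GB of zeros to the datastore, then clean up the test file.
    date
    dd if=/dev/zero of=/vmfs/volumes/datastore1/ddtest.bin bs=1M count=1024
    date
    rm /vmfs/volumes/datastore1/ddtest.bin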

I used the latest available drivers for each system:

ESXi 5 - LSI_5_34-455140.zip\scsi-megaraid-sas-5.34-1vmw.500.0.0.406165.x86_64.vib

Windows - 5.2.112

I even updated the card firmware to 20.10.1-0077.
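
(For reference, a driver VIB like that is installed from the ESXi shell; a minimal sketch, assuming the VIB was copied to a datastore named datastore1, with a reboot afterwards:)

    # Install the megaraid-sas driver VIB from a local datastore path, then reboot the host.
    esxcli software vib install -v /vmfs/volumes/datastore1/scsi-megaraid-sas-5.34-1vmw.500.0.0.406165.x86_64.vib
    reboot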

This card is entry level but recommended in various places for ESXi. What am I doing wrong? Am I missing something?

Thanks

24 Replies
a_p_
Leadership

Welcome to the Community,

How did you configure the RAID set/logical volumes? Are they set to write-through (that's what I assume) or write-back? Write-back should only be enabled if the controller has a battery-backed cache.

VMware does not do caching itself (for data-safety reasons in case of a host failure) but relies fully on the RAID controller's capabilities.
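
(If LSI's MegaCli utility is available, the configured policy can be checked per logical drive; a minimal sketch, assuming the MegaCli64 binary is on the PATH and using the catch-all -LAll/-aAll selectors:)

    # Show the current cache policy (WriteThrough or WriteBack) of every logical drive on every adapter.
    MegaCli64 -LDGetProp -Cache -LAll -aAll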

André

tssk
Contributor

This RAID controller model allows only write-through. Can I ask how that is related?

Thanks

a_p_
Leadership

In write-through mode, the controller acknowledges a write operation back to the OS only after the data has been written to disk, which is significantly slower than write-back mode, where the controller acknowledges the operation as soon as the data is in the write cache. To ensure the data is safe even if the host crashes, write-back mode requires either a battery-backed or flash-backed cache, which keeps the data in the cache until the system is rebooted and the buffered data can be flushed to disk.
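
(On controllers that support both modes, switching between them looks roughly like this with MegaCli; a minimal sketch, assuming MegaCli64 is installed, and remember that write-back without a battery risks losing in-flight data:)

    # Write-through: acknowledge only after the data is on disk.
    MegaCli64 -LDSetProp WT -LAll -aAll
    # Write-back: acknowledge as soon as the data is in the controller cache (expects a healthy BBU).
    MegaCli64 -LDSetProp WB -LAll -aAll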

André

cdc1
Expert

The M1015 card does not support a battery pack, which is why you only have the write-through option.

You need to get either an M5014 plus the optional battery and cable, or an M5015, which comes with a battery as standard.

Another difference between the M5014 and M5015 is the cache size: the M5014 has 256 MB, the M5015 has 512 MB.

tssk
Contributor

@a.p. @cdc You are both talking about possible caching options on controllers, but how does that explain that ESXi is slow and Hyper-V is fast on the same controller and array?

a_p_
Leadership
(Accepted solution)

As I mentioned earlier, ESXi does not do caching at the OS level, whereas Hyper-V does.

André

cdc1
Expert

André stated it earlier ... ESXi relies on the hardware controller for write-back caching. Windows does it in software in the OS; it is not hardware-based in that setup. So, if your Hyper-V system were to go down, you would risk losing data.

Ideally, you're going to want some level of hardware-based write-back cache that's battery-backed.


tssk
Contributor

OK, I understand it now.

Sorry, but I was not able to find the statement that Hyper-V does caching at the OS level in your previous posts.

I still do not understand why people use this controller with ESXi at all.

Thanks

cdc1
Expert

It was more implied: saying that ESXi relies on the hardware to do it implies that Hyper-V doesn't.

I agree ... I never recommend a RAID controller that doesn't have some type of battery-backed cache on it. The performance hit, and the risk of data loss, pretty much rule it out as an option as far as I'm concerned.

They're alright as workstation-level controllers, though, or if you're just doing RAID 1 or RAID 0 and using the local disk for non-critical data like ISO storage or something of that nature.

tssk
Contributor

Another thing: 20 MB/s seems pretty slow to me for RAID 10, even uncached...

a_p_
Leadership

In my experience, 20 MB/s is about the most you can get without cache. With RAID 5 I saw ~10-15 MB/s without write cache and >90 MB/s with it on HP hosts, even with fast SAS disks.

André

caraboy
Contributor

Hello,

I was just searching the internet for a solution to my "dead slow" upload to my System x3550 running ESXi 5 (with the entry-level M1015). :-(

I have 2 x IBM SAS-NL 500 GB disks in RAID 1. I can copy a virtual machine from the server at 60-70 MB/s, but when I try to upload a backup to the server, it's capped at 5-6 MB/s.

I understand I have no cache and the array is in write-through mode, but is writing really so much slower than reading on these near-line HDDs (70 MB/s compared to 5)?

Should I at least enable the "disk cache policy"? I am trying to copy a 200 GB machine back to the server, and it will take forever!

a_p_
Leadership

I don't recommend enabling any write cache that is not backed by a battery. In case of a power failure, you risk file system corruption on all running VMs. You may (although I do not recommend this either) consider enabling write-back operation if your host is backed by a UPS.

André

caraboy
Contributor

Strange, I cannot select write-back from the MegaRAID Storage Manager for the current drive group. I can only enable the disk cache. I have a UPS.

Do I need to drop the group and recreate the RAID?

Thank you!

tssk
Contributor

@caraboy The M1015 knows it does not have a battery (BBU), so it will not let you choose another cache policy, AFAIK. For example, I have another controller (an Adaptec ASR-2405) that also does not have a BBU, and it lets me do it. But I believe the M1015 is trying to protect you from yourself. :-)
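
(The disk cache mentioned above is a separate, drive-level setting, distinct from the controller's write policy. With MegaCli it would look roughly like this; a minimal sketch, and note the drives' own cache is volatile, so a power loss can still cost data:)

    # Show and enable the physical drives' own write cache for all logical drives.
    MegaCli64 -LDGetProp -DskCache -LAll -aAll
    MegaCli64 -LDSetProp EnDskCache -LAll -aAll   # risky without a UPS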

blckgrffn
Enthusiast

You guys are joking about 20 MB/s max without a BBU, right?

RAID 5, maybe, but any non-parity RAID set should fly.

I've got six-year-old drives attached to the SATA2 ports in one of my whiteboxes that will do ~60-70 MB/s writes, no problem, VMFS and all. With SSDs the bus is the limit; VMFS doesn't need a BBU for good write speeds. It just needs good drives.

And a RAID controller that doesn't have issues, evidently.

I am having performance issues with my M1015 (now flashed to 9210-8i IR firmware) and seeing simply horrible write speeds at small block sizes. It took nearly twenty minutes to create a 50 GB eager-zeroed .vmdk! The drives are "only" Momentus XTs in RAID 1, but that is no excuse for performance that bad.
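
(For anyone who wants to reproduce this, the eager-zeroed creation can be timed from the ESXi shell; a minimal sketch, assuming a datastore named datastore1:)

    # Create a 50 GB eager-zeroed disk, noting the elapsed time, then delete the test disk.
    date
    vmkfstools -c 50G -d eagerzeroedthick /vmfs/volumes/datastore1/ezt-test.vmdk
    date
    vmkfstools -U /vmfs/volumes/datastore1/ezt-test.vmdk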

ATTO benchmarks show horrible, I mean horrible, performance on small writes, and it doesn't really improve much until the block size reaches 32 KB and up. It is about an order of magnitude slower than it should be on smaller writes; larger write chunks bring performance up to what it should be.

Latency is also atrocious, with 100 ms+ being the norm for writes, as seen in the VMware client.
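
(esxtop gives a more precise view of that latency than the client graphs; a short sketch of where to look:)

    # Run esxtop from the ESXi shell, press 'd' for the disk-adapter view, and compare
    # DAVG/cmd (latency from the device/array) with KAVG/cmd (time spent in the VMkernel).
    esxtop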

Again, testing in my other whitebox shows performance in line with what it should be.

I am going to try the async driver later to see if that improves performance at all.

If not... well, I am not sure what I'll do at that point. Be very disappointed?

******

More education on my part: I now know what you are talking about.

It appears that with many onboard adapters that *work* with ESXi, write caching using the drives' onboard volatile memory (8-64 MB or so) is left enabled. The same is true with Windows Server.

Now take this particular controller, for example: when you use it, the cache on the drives is disabled entirely.

This severely hurts the performance of the drives in nearly all I/O situations. If you had an SSD RAID group, it would likely still be very functional.

With spindles, however, it is going to perform very poorly, as evidenced by the OP and myself.

Note to the internet: only buy this adapter for use with SSDs. Spindle performance with the IR firmware and RAID is horrible on the 9210-8i.

alexandrucovali
Contributor

Right now I'm looking to buy an M1015 and I'm still confused by some people's posts. This card will be for a home test rig running different labs, and nothing critical will be running on it. My only question is about performance. I have a couple of SATA3 SSDs and a couple of Momentus XT hybrid drives. I know the bottleneck will be the bus, but I want to avoid any performance issues with the RAID card.

So... the question... is the M1015 still good from a performance point of view, or is it better to go with a RAID controller with a BBU on it, like the ServeRAID M5015?

JarryG
Expert

A real problem of the M1015 (aka LSI 9240) is that it does not have a cache. I/O suffers badly, especially under mixed concurrent (read/write) load. So what I recommend above all: get a controller with its own cache!

A BBU has nothing to do with performance (well, at least not directly). It is there just for the case of an unexpected power interruption, to prevent losing the half GB (or full GB) of data that is still in the controller's cache. My LSI controller allows me to activate the write cache even without a BBU. It issues one big fat warning, but allows me to do it. Actually, I have my server backed by quite a strong UPS, so a BBU would be pretty much useless...

So from a performance point of view: get a controller with an onboard cache, and activate both read and write caching.

From a security point of view: get either a BBU or a UPS (or accept the risk of possible data loss)...

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend to be a noob with all those points! 😉
alexandrucovali
Contributor

OK. So for me a BBU is useless, as you said. This is a home test environment and I don't suffer any power outages. I'm concerned about performance and trying to avoid bottlenecks as much as possible.

Can you recommend a good RAID controller that can be used at home without losing performance? I know there are plenty of RAID controllers, one better than another, but I would like to see a customer review from a home lab.

Thanks.
