VMware Cloud Community
rayvd
Enthusiast

Linux NFS server as datastore -- tips?

Hi all; I've spent some time searching through the forums here as well as Google for tips on getting the most performance out of the Linux NFS daemon (in my case the one that comes with RHEL5) when used to back an ESX/ESXi datastore.

This type of setup works fantastically with a NetApp, but performance leaves a lot to be desired -- at least out of the box, with Linux acting as the NFS server.

I've read that exporting with the "async" option set will help significantly, and I am doing so, but is there anything else I should be doing to eke out a bit more performance? I've found a few sysctl settings as well that I'll be giving a try.
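
For reference, here is the sort of thing I'm experimenting with -- the export path, client subnet, and values below are only examples, not a recommendation:

# /etc/exports -- async acknowledges writes before they hit disk (faster, but less safe)
/vmstore  192.168.0.0/24(rw,async,no_root_squash,no_subtree_check)

# /etc/sysconfig/nfs -- run more nfsd threads than the RHEL5 default of 8
RPCNFSDCOUNT=32

# larger TCP buffers for the NFS traffic
sysctl -w net.core.rmem_max=262144
sysctl -w net.core.wmem_max=262144

exportfs -ra    # re-export after editing /etc/exports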

Thanks in advance.

15 Replies
Texiwill
Leadership

Hello,

I would start by disabling unneeded services on the Linux host, tuning cache sizes, and perhaps enabling jumbo frames -- though I'm not sure how much that helps, as ESX would need to support them end to end as well. Also, perhaps use a separate network just for NFS.
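
Something along these lines, for example -- the service and interface names are only placeholders for whatever your host actually runs:

# stop services the NFS host doesn't need
chkconfig cups off && service cups stop
chkconfig sendmail off && service sendmail stop

# if the whole path supports it, raise the MTU on the NFS-facing interface
ifconfig eth1 mtu 9000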

Other than those, there is not much else available.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

Blue Gears and SearchVMware Pro Blogs: http://www.astroarch.com/wiki/index.php/Blog_Roll

Top Virtualization Security Links: http://www.astroarch.com/wiki/index.php/Top_Virtualization_Security_Links

boschb1
Contributor

Yeah, I am having the same problems. Right now I am using RHEL5 and am not happy with my results. I've been reading a lot of posts on various forums about this, and the gist of what I've read is that Linux-based NFS doesn't work very well with ESX. Even the "supported" RHEL3u5 from http://www.vmware.com/resources/compatibility/docs/vi35_san_guide.pdf supposedly has problems, according to a post I read. They said RHEL4 was better, but still not what it should be. NetApp, of course, I have heard nothing but good results about.

It kills me because I have two arrays on my HP DL380 (8 cores, 8GB memory, one 24x 15k SAS disk array and one 12x 15k SAS disk array, 512MB battery-backed PCI-X SAS controller) on a 10Gbps network. Theoretically I should be able to hit the full 3Gbps SAS channel limit, especially for reads. I can definitely copy from array to array at that speed locally, so I can safely say the bottleneck is at NFS (either on my side or the ESX side). So with a 10Gbps network, why shouldn't I be able to get that on my host, or at least close to it? I understand overhead, but there is definitely something else not right here.
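
For anyone who wants to reproduce the local test, something like this works (the paths are just examples; oflag=direct keeps the page cache from inflating the numbers):

# raw write speed to one array, bypassing the page cache
time dd if=/dev/zero of=/array2/ddtest bs=1M count=4096 oflag=direct

# array-to-array copy
time dd if=/array1/bigfile of=/array2/bigfile bs=1M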

I have tried async mode and many other options, and even had Windows Storage Server 2003 going with NFS services. At best I got mediocre results, meaning about 1/4 to 1/3 of the performance I was expecting. I've read that NetApp will push 200+ MB/s; I usually get between 50 and 70 MB/s, and sometimes just 4-5 MB/s on writes, depending on the servers and the setup.

So I guess my only question is: has anyone achieved good results with a Linux-based NFS server? I am talking 200 MB/s or better on a 10Gbps network, or even a teamed 2, 3, or 4Gbps network.

Please share a successful Linux distro/version. Thank you!

einjen
Contributor

I previously used NFS on Debian to back up VMs.

It worked very well on a 1Gb network -- a separate network, though.

Speed was not top notch, but acceptable. I never copied more than 1-1.5 TB once a day, so I didn't need more.

larstr
Champion

NFS will not give you very good performance out of the box unless you have a NetApp box. There are several tricks available for tuning NFS performance, but I'm not sure you'll meet your goals without extensive tuning and testing, as I haven't seen a definitive howto for optimal performance in such an environment.

Lars

rayvd
Enthusiast

The best performance I could eke out was primarily achieved by using the "async" export option -- which I don't think is the best route to go. :) I'm also trying various iSCSI target software on Linux to see if I can get acceptable performance that way. Nothing has come close to our NetApp in terms of performance, either NAS or SAN (via iSCSI).
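
In case anyone wants to follow along, this is roughly the kind of target setup I'm trying with scsi-target-utils (tgtd) on RHEL5 -- the IQN and backing device below are just placeholders:

# create a target, export an LVM volume as LUN 1, and allow any initiator
tgtadm --lld iscsi --op new --mode target --tid 1 --targetname iqn.2009-02.local.lab:vmstore
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 --backing-store /dev/vg0/vmstore
tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address ALL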

I'm curious how Solaris or OpenSolaris's NFS implementation would do as a datastore. Anyone tried it?

boschb1
Contributor

So what I have read is that ESX 3.5 is based on the RHEL3u5 kernel, hence that is the supported version to use for an NFS server. Has anyone successfully tried that? I also read that there is a kind of sub-kernel on top, the VMkernel, which I believe is what actually mounts NFS. I was thinking there might be a way to mount via the Linux side and then tell the VMkernel about the datastore, potentially as a local VMFS partition or something. Basically, log in to ESX as root, do the mounting manually with its Linux base, and then "trick" ESX into discovering another datastore...
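
(For comparison, the supported way to hand an NFS export to the VMkernel as a datastore is esxcfg-nas from the service console -- the host, share, and label here are placeholders:)

esxcfg-nas -a -o nfs-server -s /vmfs01 nfs-datastore
esxcfg-nas -l    # list the NAS datastores the VMkernel knows about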

That's the plan right now; I will let you know how it goes. It will also be nice to test the performance of that mount just to isolate the problem to either the ESX VMkernel or the NFS server.

This is really kind of a buzzkill, since I just ordered more drives and array hardware only to find out that I should have used a NetApp from the start. What the hell is so special about NetApp NFS?!

boschb1
Contributor

UPDATE:

OK, so I added a Service Console interface on my VMkernel NIC (the 10Gig one) so that I can easily communicate with the NFS server directly from the ESX shell. I also had to use esxcfg-firewall to allow nfsClient through the firewall. Finally, after a bunch of poking around, I was able to mount my NFS server directly into the ESX local filesystem with the mount command. Hurray! Now for the test:

For reference, here are the NFS server facts:

Array is 16 disks, all 15k RPM, 72GB, 2.5", RAID10, 128k stripe, on a 512MB battery-backed-cache SAS x4-lane controller (only 1 lane is used)

RHEL5

Filesystem is LVM with ext3, 128MB extent size, mounted at /vmfs01

HP NC510C 10Gig Ethernet (on both servers)

Cisco 3750 Switch with two CX4 mods (10Gig ports)

Here is my export:

/vmfs01 192.168.0.0/24(rw,insecure,async,all_squash,anonuid=-2,anongid=-2)

Here is my mount command (on ESX):

mount -t nfs 192.168.0.9:/vmfs01 /nfs-test -o soft,rsize=131072,wsize=131072

Here are the test commands:

time dd if=/dev/zero of=/nfs-test/test bs=128k count=32768 (write 4GB file)

time dd of=/dev/null if=/nfs-test/test bs=128k count=32768 (read 4GB file)

MAXIMUM THROUGHPUT SHOULD BE ~300MB/s, i.e. 3Gbps (3Gbps is the SAS link limit, which should theoretically be the only limiting factor)

RESULTS:

WRITE:

32768+0 records in

32768+0 records out

real 0m22.491s

user 0m0.000s

sys 0m0.900s

READ:

32768+0 records in

32768+0 records out

real 0m20.747s

user 0m0.000s

sys 0m0.130s

Sooooo... 4GB divided by roughly 20 seconds is ~200MB/s. Not 300MB/s, but at this point I am calling it par for the course, since that seems to be the NetApp maximum as well. Someone please tell me I shouldn't point the finger at the VMware NFS client in the VMkernel?

Next step: see if I can turn this mount into a datastore... and get my 200MB/s from inside a VM.

I just ran the same test on the /vmfs/volumes/nfs-vmfs01 directory (which is the NFS datastore mounted by ESX/vCenter)...

root@itqa-vm-esx24 nfs-vmfs01# time dd if=/dev/zero of=/vmfs/volumes/nfs-vmfs01/test bs=128k count=32768

32768+0 records in

32768+0 records out

real 0m42.424s

user 0m0.000s

sys 0m2.730s

root@itqa-vm-esx24 nfs-vmfs01# time dd of=/dev/null if=/vmfs/volumes/nfs-vmfs01/test bs=128k count=32768

32768+0 records in

32768+0 records out

real 0m43.236s

user 0m0.010s

sys 0m2.640s

IT TAKES TWICE AS LONG! Only ~100MB/s throughput! I guess it's time to open a case... hopefully they can support this, since it's obvious which part is failing...

boschb1
Contributor

Oh, and check this out...

If I run the test in parallel in two shells, I can get up to 270MB/s throughput via the regular mount; however, if I run the same test against the VMkernel-mounted directory, it still sits happily at 100MB/s. I wonder if someone somewhere coded it to cap out at 1Gbps, because that is exactly what it looks like is happening now!

rayvd
Enthusiast

Wow, that's some weird stuff. I'll have to try some tests on my end.

larstr
Champion

Testing from the service console is often quite meaningless, since I/O is done at low priority there. You should instead do your testing from inside a VM to get more useful results.

Lars

boschb1
Contributor

I agree the VM is the place to test, and I have done this; however, I get the same result: 100MB/s from IOMeter. I was just digging around to see if I could find the bottleneck, whether it was the VM, the VMkernel, NFS, or the NFS server. From what I found, it looks like it is the VMkernel NFS implementation that is holding it all back. The only reason I used the service console was so I would have an interface I could use. Anyway, it's all there; you can see for yourself.

boschb1
Contributor

So I have run a few more tests. It seems the performance hit is mostly related to TCP. If I do a normal Linux mount over TCP, I get a similar performance hit. Granted, I can make up some of it by tuning the rsize/wsize options, but it is still about 50% slower than UDP. Anyone know a way to add my own mount options?
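
These are the kinds of mounts I compared from the Linux side (same server and export as in my earlier post; the rsize/wsize values are just what I happened to test):

# UDP mount
mount -t nfs -o udp,rsize=32768,wsize=32768 192.168.0.9:/vmfs01 /nfs-test

# TCP mount
mount -t nfs -o tcp,rsize=32768,wsize=32768 192.168.0.9:/vmfs01 /nfs-test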

Next I am going to turn on jumbo frames to see if that helps; it should take some of the hit off the extra TCP handling. Also, esxtop is a handy tool for showing CPU utilization during the access. Luckily, neither test really sends the CPU above 50-60%.
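
Roughly the plan for the jumbo frames, assuming the switch and both NICs can handle a 9000-byte MTU (the VMkernel IP and port group name below are just placeholders):

# on the NFS server
ifconfig eth1 mtu 9000

# on ESX: raise the vSwitch MTU, then recreate the VMkernel NIC with MTU 9000
esxcfg-vswitch -m 9000 vSwitch1
esxcfg-vmknic -a -i 192.168.0.20 -n 255.255.255.0 -m 9000 "VMkernel-NFS"

# check that large frames make it across to the NFS server
vmkping -s 8972 192.168.0.9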

I have a case open now and hopefully can get some answers, or at least a way to configure my own mount options for the VMkernel to use. I don't know why they don't expose that.

rayvd
Enthusiast

Wow, thanks for all your testing on this. I'd be curious to know whether the NFS mount of a NetApp NFS datastore uses TCP as well (I would guess it does, as you'd think the mounting mechanism wouldn't have the first clue about what type of NFS device it's talking to).

Please keep us posted on what you find out and hear from VMware.

boschb1
Contributor

I can't say for sure, but everything I've read states that VMware NFS is TCP only; you must have a TCP NFS server for it to work. Also, I don't know how to confirm it, but the default rsize/wsize for mount is 32k; I'd imagine the VMkernel is using that, but I could be wrong.
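
I don't know of a way to see what the VMkernel itself negotiated, but for the service-console mount you can at least check what the Linux client ended up with:

nfsstat -m          # shows proto and rsize/wsize per NFS mount
cat /proc/mounts    # same info, straight from the kernel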

Not that my CPU is pegged or anything (it's a 24-CPU server, for crying out loud), but hopefully, as soon as VMware decides that 10Gig networking is important, they might add support for things like TOE, mount options, etc. Mostly I am just a little disappointed in the lack of supported settings/drivers. I mean, VMware saying that TOE is only supported on one NIC, and then only for iSCSI and not the VMkernel, and that NFS only works well with NetApp and only with specific 10Gig cards (though the rest have drivers for ESX), is like saying ESX doesn't support Linux.

my 2cents.

ncolton
Contributor

So what I have read is that ESX 3.5 is based on the RHEL3u5 kernel, hence that is the supported version to use for an NFS server. Has anyone successfully tried that? I also read that there is a kind of sub-kernel on top, the VMkernel, which I believe is what actually mounts NFS. I was thinking there might be a way to mount via the Linux side and then tell the VMkernel about the datastore, potentially as a local VMFS partition or something. Basically, log in to ESX as root, do the mounting manually with its Linux base, and then "trick" ESX into discovering another datastore...

To clarify, ESX is not based on any form of RHEL. If it were, legally, VMware would be required to release source code. The Service Console is a modified version of RHEL; it is the management interface to ESX, but it is not, itself, ESX. On ESX 3.x, you'll see that when the host is booted as an ESX system, instead of in Service Console-only mode, there is no eth0 and related devices, but instead vswif and vmnic adapters. The Service Console gains access to devices through the VMkernel. I find it best to consider the Service Console a highly privileged virtual machine with lots of deep hooks into VMkernel interfaces.
