VMware Cloud Community
deltajoka
Contributor

Per-VM I/O Monitoring On NFS Datastores

I realize this issue has come up on the forums before, but no real alternative solution has been posted. So the purpose of this post is to hopefully stir up some discussion.

We have recently migrated from local VMFS datastores to NFS datastores on an IBM System Storage N3300 Model A20 filer (basically a rebranded NetApp). It is no longer possible to get per-VM I/O statistics in vCenter or with esxtop on the ESX hosts. We have filed a support case with VMware, but there is apparently no solution on the VMware side of things; VMware just refers us to monitoring this on the NFS filer. On the IBM filer, we can only monitor I/O per volume (and having a volume per VM is just not feasible). There is no way (that we know of) to monitor I/O per file on the NFS filer.

We think this is a huge drawback for NFS datastores (which are quite popular). Quite frankly, we are surprised that VMware doesn't seem interested in implementing this.

Has anyone managed to implement an alternative solution to monitor this? We are currently trying to build a solution that promiscuously captures NFS traffic on the storage network with tools like tshark/wireshark, but progress is slow due to the hassle of tracing file handles back to specific files and directories.

Any constructive input is most welcomed!

Regards,

Johan Karlsson

9 Replies
frank_wegner
VMware Employee

> We think this is a huge drawback for NFS datastores (which are quite popular). Quite frankly, we are surprised that VMware doesn't seem interested in implementing this.

AFAIK, VMware is looking into this topic very deeply right now. It has been recognized as a missing feature, and I see more and more customers moving towards NFS-based storage for various reasons. I am not sure exactly when VMware will actually ship a solution, but I'd suggest raising this topic whenever you talk to a VMware representative (Sales, Professional Services, whoever); I wish it would be sooner rather than later. You can help drive the priorities. Until then, I guess we have to live with workarounds.

astorrs
Enthusiast

I had a conversation last month at VMworld with John Blumenthal about this exact issue. He mentioned that VMware is well aware of customer concerns in this area and is working with some of the vendors (presumably NetApp and EMC, perhaps others) - but the implication was that any solution is >12 months out.

The best I've come up with is monitoring I/O per volume and watching which host is talking the most to a specific NFS export; anything more, like you said, requires Wireshark. I greatly miss esxtop every time I'm troubleshooting an NFS environment. :)

deltajoka
Contributor

Thanks for the replies!

Good to know that VMware is aware of the issue and at least seems to be working on a solution. Too bad it seems so far away, though.

I've been able to trace the file handles used in NFS LOOKUP request/replies back to specific files/directories (and thus determine the VMs). I've managed to filter out packets matching these file handles with tshark in a way that lets me see (and measure) the specific traffic when a VM powers on/off. However, further operations to the "disk" in the guest OS of the VMs don't use these file handles. I can't trace the actual I/O, because most of that traffic is NFS FILE_SYNC writes (which have no human-readable payload showing the file or directory, like the LOOKUP packets have), and they use different file handles than the LOOKUP packets. Too bad. Maybe there are some NFS gurus around who can nudge me in the right direction? Or suggestions for other possible workarounds?
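For anyone trying the same approach, here is a minimal sketch of the aggregation half, assuming tshark is exporting tab-separated fields. The field names `nfs.fh.hash`, `frame.len` and `nfs.name` are assumptions to verify with `tshark -G fields` on your build, and the sketch still suffers from the handle-mismatch problem described above: a name is only learned from LOOKUPs, so I/O on a different handle stays anonymous.

```python
from collections import defaultdict

def aggregate(lines):
    """Sum captured NFS bytes per file handle.

    Input lines are tab-separated tshark field output, e.g. from:
      tshark -i eth1 -Y nfs -T fields -e nfs.fh.hash -e frame.len -e nfs.name
    (field names are assumptions; check `tshark -G fields`).
    A file name is only present on LOOKUP request/replies, so we remember
    it per handle and reuse it for later traffic on that same handle.
    """
    bytes_per_fh = defaultdict(int)   # file handle hash -> total bytes seen
    name_for_fh = {}                  # file handle hash -> file name, if learned
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2 or not parts[0]:
            continue  # no file handle on this frame
        fh = parts[0]
        try:
            bytes_per_fh[fh] += int(parts[1])
        except ValueError:
            continue  # malformed frame length
        if len(parts) > 2 and parts[2]:
            name_for_fh[fh] = parts[2]  # learned from a LOOKUP
    return bytes_per_fh, name_for_fh
```

Piping live tshark output into this and printing `bytes_per_fh` sorted by size at least shows which handles are hot, even when no name could be matched.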

karunk
Contributor

Hi,

Is there any method to identify the I/O-intensive VMs running on NFS datastores, or on both NFS and VMFS datastores? If there is something, please let me know.

Thank you! :)

rickardnobel
Champion

kmaster wrote:

Is there any method to identify the I/O-intensive VMs running on NFS datastores, or on both NFS and VMFS datastores?

You should be able to analyze this through esxtop in the "v" (virtual disk) view to see the I/O per VM.

My VMware blog: www.rickardnobel.se
karunk
Contributor

Thanks, but how do we get those results into a CSV file? In batch mode I was able to generate a CSV file, but it contained approximately 1,000 fields, and it was highly difficult to pick out the fields that belong to the VMs' IOPS. If there is any possibility to do this, please let me know.

Thanks! :)
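One workaround for the ~1,000-field batch-mode output is to post-process the CSV and keep only the timestamp column plus the per-VM virtual-disk counters. A minimal sketch in Python, assuming the usual `\\host\Group(Instance)\Counter` header format; the group name "Virtual Disk" and the counter names here are assumptions, so check the first row of your own capture:

```python
import csv
import io

def filter_columns(csv_text, group="Virtual Disk",
                   counters=("Reads/sec", "Writes/sec")):
    """Reduce an esxtop batch-mode (esxtop -b) CSV to the timestamp
    column plus the per-VM virtual-disk I/O counters.

    Column headers are assumed to look like \\host\Group(Instance)\Counter;
    group and counter names vary between ESX versions, so verify them
    against the first row of a real capture.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0]
    keep = [0]  # column 0 is the sample timestamp
    keep += [i for i, h in enumerate(header)
             if group in h and any(c in h for c in counters)]
    # Slice every row down to the surviving columns.
    return [[row[i] for i in keep] for row in rows]
```

Generate the input with something like `esxtop -b -n 60 > capture.csv` on the host, run the filter afterwards, and the handful of surviving columns can be graphed in a spreadsheet.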

mcowger
Immortal

Have you considered vCenter Operations?

--Matt VCDX #52 blog.cowger.us
durganet
Contributor

That sounds like a job for esxplot!

jklick
Enthusiast

The good news is that as of vSphere 4.1, you should be able to see most, if not all, of the I/O data for NFS datastores inside vCenter. However, even then, I think we're all aware that vCenter isn't the easiest solution for aggregating/monitoring storage performance data. These are probably the reports you're looking for:

(attached screenshots: DatastorePerformance.jpeg and DatastoreDrillIn.jpeg)

If so, the latter is simply a drill-in from the former and both are default functionality in VKernel vOPS (feel free to abuse a 30-day free trial). Additionally, keep an eye out for a free tool in the near future which will also help tackle this problem area.

Full disclosure: I can't forget to let everyone know I'm a VKernel employee, or else the powers that be will be unhappy with me. :)

@JonathanKlick | www.vkernel.com