Souad90
Enthusiast

vROPs History is being saved for only 2 months


Hi everyone,

I have this in the global settings:

timeddd.PNG

But when I create a trend view I only get 2 months of data, starting from 1 July. Any explanation?

Thank you.


Accepted Solutions
mark_j
Virtuoso

You're looking at the SINGLE NODE maximum of 3,500,000 metrics. With multiple nodes, including 1 master and 1 data node, each node has a maximum of 2,500,000 metrics and 10k objects. Also, the metric count in your cluster management view isn't the place to look (it is wrong). Open a System Audit view and look at the number of metrics collecting and objects collecting.

Also, you haven't said whether you're using HA and have a Master and a Master Replica. Doing so would HALVE the number of metrics and objects supported.

Basically, just generate the system audit and append it here, along with a more complete screenshot of your cluster management view including role names.

In your screenshot, you can see that a data purge happened on Aug-1. We need more visibility, though, into disk space % utilization on both nodes. Give us this view, going back 90 days (I'm laying it out so you know how to make it):

Screen Shot 2016-08-18 at 9.46.05 AM.png

If you find this or any other answer useful please mark the answer as correct or helpful.


6 Replies
mark_j
Virtuoso

Yup, you don't have enough disk space on your vR Ops nodes. When you hit 85% utilization, it'll do an automatic purge and free up space for you. The smaller the disk you have, the larger chunk of data it'll carve off. I've seen 400 GB vR Ops nodes do a purge at 85% and bring the utilization down to 65%, so ~100-150 GB.
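As a rough check on the purge example above, the space freed is just the disk size times the drop in utilization. This is a minimal sketch: the function name is hypothetical, and the 85% and 65% figures are the ones recalled in this post, not documented thresholds.

```python
def purge_freed_gb(disk_gb: float, start_pct: int, end_pct: int) -> float:
    """Space freed (GB) when utilization drops from start_pct to end_pct."""
    return disk_gb * (start_pct - end_pct) / 100


# Figures from the post: a 400 GB node purging at 85% down to 65%.
freed = purge_freed_gb(400, 85, 65)
print(f"Freed by purge: {freed:.0f} GB")
```

Straight arithmetic gives about 80 GB for a 400 GB node, somewhat below the ~100-150 GB estimate quoted above, so treat these numbers as ballpark.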

vR Ops doesn't just randomly do this - it'll generate alerts first at the 80% or so point (I don't recall the symptom threshold offhand). As long as you have notifications set up for the vR Ops Self Monitoring, you'll know before it happens and have an opportunity to add disk space.

Souad90
Enthusiast

Hi Mark.j,

Thanks a lot for your reply. I don't think that's the main cause. The master node has over 400 GB and more than 150 GB of free space. But yes, the workload is strangely high at 91%.

Same for the Data Node: Capacity Remaining is 75 GB and Workload is 91%.

Is this still not enough? Do you think we should add more space?

Thank you again.

Regards

Souad90
Enthusiast

By the way, which Workload metric should we take into consideration: Disk Space | Workload % or Disk Space - Total Usage | Workload %?

Here is the workload trend for the Master node:

workload.png

Moreover, according to vROps sizing table:

obj.PNG

As we have two large nodes, we are well below the maximum:

size.PNG

Thank you in advance.


Souad90
Enthusiast

Hi Mark.j

I need your help to understand, please. For us, the maximum number of objects per node (multi-node mode) is 10,000, and the maximum number of collected metrics per node is 2,500,000. So, as we have two nodes (Master and Data), our real limit is 20,000 objects and 5,000,000 metrics.

In the Audit report, Resources Configured is 10809 and Resources Collecting is 10780 (that is for both nodes). Regarding the metrics: Metric Configured is 16401370, Metric Collecting is 2173375, Super Metrics is 737683, and vCenter Operations Generated is 105690.

My questions are:

  • What do we take into consideration: the configured metrics/resources or the collecting ones?
  • According to the numbers I provided above, do we have a sizing problem or not?
  • Is having only 2 months of history about sizing or about disk space?
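The comparison in this post can be sketched as a small calculation. This is only an illustration under the assumptions stated in the thread: the per-node limits (10,000 objects, 2,500,000 metrics) are the ones quoted here, the HA halving follows mark_j's remark, and the function and field names are hypothetical. It uses the "collecting" counts from the audit report, as mark_j suggested; whether super metrics count toward the limit is not stated in the thread, so they are left out.

```python
from dataclasses import dataclass

# Per-node limits for a multi-node cluster, as quoted in this thread (assumed, not verified).
MAX_OBJECTS_PER_NODE = 10_000
MAX_METRICS_PER_NODE = 2_500_000


@dataclass
class SizingCheck:
    objects_limit: int
    metrics_limit: int
    objects_used_pct: float
    metrics_used_pct: float
    within_limits: bool


def check_sizing(nodes: int, objects_collecting: int, metrics_collecting: int,
                 ha_enabled: bool = False) -> SizingCheck:
    """Compare audit-report 'collecting' counts against the cluster-wide limits."""
    divisor = 2 if ha_enabled else 1  # assumption: HA halves the supported capacity
    objects_limit = nodes * MAX_OBJECTS_PER_NODE // divisor
    metrics_limit = nodes * MAX_METRICS_PER_NODE // divisor
    return SizingCheck(
        objects_limit=objects_limit,
        metrics_limit=metrics_limit,
        objects_used_pct=100 * objects_collecting / objects_limit,
        metrics_used_pct=100 * metrics_collecting / metrics_limit,
        within_limits=(objects_collecting <= objects_limit
                       and metrics_collecting <= metrics_limit),
    )


# Numbers from the audit report in this post: two nodes, with and without HA.
for ha in (False, True):
    result = check_sizing(2, 10_780, 2_173_375, ha_enabled=ha)
    print(f"HA={ha}: objects {result.objects_used_pct:.1f}% of {result.objects_limit}, "
          f"metrics {result.metrics_used_pct:.1f}% of {result.metrics_limit}, "
          f"within limits: {result.within_limits}")
```

Under these assumptions, the two-node cluster is within both limits without HA, but the object count would exceed the halved limit if HA were enabled.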

Thank you very much.

sxnxr
Commander

The best place to check whether you are sized correctly is the attachment on the VMware KB article "vRealize Operations Manager 6.1, 6.2, and 6.2.1 Sizing Guidelines (2130551)". Download it, go to the Advanced tab, and plug in the numbers from your audit report.

I spent months arguing with GSS about right-sizing our environment because they were using the admin UI to determine what was being collected. Once they realised they were wrong, used the audit report, and conceded that I was right-sized, they fixed the performance problems I was having (the fix didn't include the 6 nodes with 16 vCPUs they originally suggested). Long story short, this guide is very accurate. As Mark stated, make sure you set HA to enabled if you have a master replica.

On data retention, as far as I know it all comes down to disk space: the more history you keep, the more disk space you need. Don't forget to extend all nodes to the same size if you are using HA.