10 Replies, latest reply on Nov 14, 2016 2:30 AM by AntonKr

    How to free up space in VDP when the capacity health check limit is reached?

    racom Enthusiast

      Used capacity on one of our VDP appliances has reached 96.46% and the Backup Scheduler was stopped. I am able to start the Backup Scheduler again, but although I have deleted many backups the used capacity is still the same. Is there any way to free up space, or would I be better off rolling back to the last validated checkpoint?

        • 1. Re: How to free up space in VDP when the capacity health check limit is reached?
          snekkalapudi Expert
          VMware Employees

          See if you can delete some of your older backups and allow a long enough maintenance window for garbage collection to clean up the deleted data.

          Note:

          If you are using VDP 5.1, temporarily increase the blackout window; in 5.1, garbage collection (which cleans deleted backups out of storage) runs during the blackout window.

          If you are using VDP 5.5, just increase the maintenance window temporarily.

           

           

          In the longer term you may have to rework your retention policies so that you do not end up with too many backups.
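
          For example, from the appliance shell you can check where the windows currently sit and whether the maintenance windows scheduler is enabled (a rough sketch, exact output varies by version):

          status.dpn | grep -i "window"     # the tail of status.dpn lists the next backup, blackout and maintenance window start times
          dpnctl status                     # should report "Maintenance windows scheduler status: enabled."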

          • 2. Re: How to free up space in VDP when the capacity health check limit is reached?
            racom Enthusiast

            Thanks for the answer.

             

            I'm using VDP 5.1.10.32, so I've increased the blackout window. Do I have to wait for the next blackout window to begin, or can I trigger garbage collection manually? I can see that it failed this morning:

             

            505923 2013-12-05 08:15:06 CET ERROR   4202  SYSTEM   PROCESS  /      failed garbage collection with error MSG_ERR_DISKFULL

             

            I'm not sure whether the "ConnectEMC is not running." error is related to this as well. Can I try to start it by running "dpnctl start mcs"? I'm not very familiar with Avamar.
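
            For the record, this is what I plan to try before the next blackout window; I'm not sure the avmaint options are exactly right on this build, so I'll double-check them against avmaint's help first:

            dpnctl status                                     # see which services (MCS, backup scheduler) are actually down
            dpnctl start mcs                                  # bring MCS back up if it is down
            avmaint garbagecollect --timelimit=3600 --ava     # try to start garbage collection by hand (option names assumed, verify on your build)
            avmaint gcstatus --ava                            # check how the last garbage collection pass ended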

            • 4. Re: How to free up space in VDP when the capacity health check limit is reached?
              racom Enthusiast

               Thanks again. But it looks like I was caught in a trap.


               The checkpoints are valid but old:

               

              root@vm-vdp:~/#: cplist

              cp.20131202100237 Mon Dec  2 11:02:37 2013   valid rol ---  nodes   1/1 stripes   1916

              cp.20131202103632 Mon Dec  2 11:36:32 2013   valid rol ---  nodes   1/1 stripes   1916
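
               If I read the Avamar documentation correctly, a fresh checkpoint can be requested from the shell roughly like this (I have not verified the exact options on VDP 5.1, so take it as a sketch only):

               avmaint checkpoint --ava     # ask the server to create a new checkpoint
               cplist                       # the new checkpoint should appear here once it completes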

               

               

               There has been no hfscheck yet (since the last reboot?) and the gsan status is degraded:

               

              root@vm-vdp:~/#: status.dpn|less

              Čt pro  5 15:26:42 CET 2013  [vm-vdp.racom.cz] Thu Dec  5 14:26:42 2013 UTC (Initialized Wed Nov

                7 19:55:37 2012 UTC)

              Node   IP Address     Version   State   Runlevel  Srvr+Root+User Dis Suspend Load UsedMB Errlen

              %Full   Percent Full and Stripe Status by Disk

              0.0   192.168.20.17 6.1.81-130  ONLINE fullaccess mhpu+0hpu+0000   2 false   0.28 3594 27424967

              62.6%  62%(onl:644) 62%(onl:648) 62%(onl:642)

              Srvr+Root+User Modes = migrate + hfswriteable + persistwriteable + useraccntwriteable

               

              All reported states=(ONLINE), runlevels=(fullaccess), modes=(mhpu+0hpu+0000)

              System-Status: ok

              Access-Status: admin

               

              No checkpoint yet

              No GC yet

              No hfscheck yet

               

              Maintenance windows scheduler capacity profile is active.

                WARNING: Scheduler is WAITING TO START until Fri Dec  6 08:00:00 2013 CET.

                Next backup window start time: Fri Dec  6 20:00:00 2013 CET

                Next blackout window start time: Fri Dec  6 08:00:00 2013 CET

                Next maintenance window start time: Fri Dec  6 16:00:00 2013 CET

               

              root@vm-vdp:~/#: dpnctl status

              Identity added: /home/dpn/.ssh/dpnid (/home/dpn/.ssh/dpnid)

              dpnctl: INFO: gsan status: degraded

              dpnctl: INFO: MCS status: up.

              dpnctl: INFO: Backup scheduler status: down.

              dpnctl: INFO: axionfs status: up.

              dpnctl: INFO: Maintenance windows scheduler status: enabled.

              dpnctl: INFO: Unattended startup status: enabled.
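
              The backup scheduler itself can be started again with the same dpnctl tool, although I expect backups to keep failing while gsan stays degraded and no space has been freed:

              dpnctl start sched     # restart the backup scheduler
              dpnctl status          # should now report "Backup scheduler status: up."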

               

               

               I've tried to get GC into an active state as suggested in reply 5, but it looks like the used capacity of 96.4% is too high. I suppose GC will not start in the morning, am I right?

               

               root@vm-vdp:~/#: avmaint config --ava | grep disk

                disknocreate="90"

                disknocp="96"

                disknogc="85"

                disknoflush="94"

                diskwarning="50"

                diskreadonly="65"

                disknormaldelta="2"

                freespaceunbalancedisk0="30"

                diskfull="30"

                diskfulldelta="5"

                balancelocaldisks="true"

               

              root@vm-vdp:~/#: avmaint config disknogc=97  --ava

              2013/12/05-13:55:10.94029 [avmaint]  ERROR: <0949> Command failed because these config values do not meet the following criteria:

              2013/12/05-13:55:10.94040 [avmaint]  ERROR: <0001> 0 < diskwarning(50) < diskreadonly(65) < disknogc(97) < disknocreate(90) < disknoflush(94) < disknocp(96) < 100

              ERROR: avmaint: config: server_exception(MSG_ERR_INVALID_PARAMETERS)

               

              root@vm-vdp:~/#: avmaint config disknocp=99  --ava

              2013/12/05-13:55:41.90331 [avmaint]  ERROR: <0949> Command failed because these config values do not meet the following criteria:

              2013/12/05-13:55:41.90342 [avmaint]  ERROR: <0001> disknocp(99) <= diskfulldelta(5 -> 96.5) < diskfull(30 -> 97.0) < poolnocreate(20 -> 98.0) < 100

              ERROR: avmaint: config: server_exception(MSG_ERR_INVALID_PARAMETERS)
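
               As far as I can read these two errors, the thresholds must keep the order diskwarning < diskreadonly < disknogc < disknocreate < disknoflush < disknocp < 100, and disknocp is additionally capped at the 96.5% limit derived from diskfull/diskfulldelta. So to let GC run at 96.46% used, every value from disknogc upwards would have to be squeezed between 96.46 and 96.5, changed one at a time from the top of the chain down, something like the sketch below. I have not verified that fractional values are accepted here, and with a margin this thin it probably would not help anyway, which is why I am leaning towards a rollback or a redeploy instead:

               avmaint config disknocp=96.5 --ava        # upper bound from the diskfulldelta-derived 96.5% limit
               avmaint config disknoflush=96.49 --ava    # must stay below disknocp
               avmaint config disknocreate=96.48 --ava   # must stay below disknoflush
               avmaint config disknogc=96.47 --ava       # must stay below disknocreate and above the 96.46% currently used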

              • 5. Re: How to free up space in VDP when the capacity health check limit is reached?
                basteku73 Enthusiast

                Hi,

                Have you resolved your issue?

                I'm asking because I have the same problem: 96% used capacity. Is the only solution to open a support case?

                 

                Regards,

                Sebastian Ulatowski

                • 6. Re: How to free up space in VDP when the capacity health check limit is reached?
                  racom Enthusiast

                  I've deployed a new VDP appliance and started new backup jobs. It was the simplest and fastest way for me. I didn't open a support case.

                  • 7. Re: How to free up space in VDP when the capacity health check limit is reached?
                    Andre443 Lurker

                    Hi,

                    I have the same problem with VDP 5.5.

                    Is there a solution to free up space?

                    • 8. Re: How to free up space in VDP when the capacity health check limit is reached?
                      racom Enthusiast

                      I'm afraid that is available for VDPA only. Try to contact support if deploying a new VDP isn't possible for you.

                      • 9. Re: How to free up space in VDP when the capacity health check limit is reached?
                        wreigle2 Novice

                        FYI, this is still an issue in VDP 6.1.2.

                        My appliance hit 96.15% capacity. Backups failed. The 'Backup Scheduler' service will not start.

                        I have a case open with VMware. We have spent 2+ hours on a WebEx trying to get this thing back online. Finally escalated the case to EMC. Waiting on a resolution.

                        • 10. Re: How to free up space in VDP when the capacity health check limit is reached?
                          AntonKr Lurker

                          Here is a sad story with a happy ending about VDP 6.1.2.19. One unlucky day I fed a couple of OLAP VMs to VDP and it choked, failing to deduplicate properly. It ended up with the nodes full at 98, 97 and 98% respectively.

                          I do not know exactly what helped, but here is the full list of my actions (add reboots as needed):

                          1. Delete big backups, run manual checkpoint, integrity check and garbage collection. Everything failed with MSG_DISK_FULL.

                          2. Rollback to earlier checkpoint, run manual checkpoint, integrity check and garbage collection. Everything failed with MSG_DISK_FULL.

                          3. Modify the configuration thresholds to allow garbage collection to run (as described above). Same errors when trying to set the values to 99%.

                          4. Expand storage! The wizard completed successfully, but only node1 (/dev/sdc1) expanded. Run manual checkpoint, integrity check and garbage collection. Only the checkpoint succeeded; hfscheck and GC failed with MSG_DISK_FULL.

                          5. At this point I gave up and let the system run over the weekend.

                          6. On Monday it had magically repaired itself. There was a good checkpoint, a good hfscheck and a good GC. Admin mode persisted, though.

                          7. Rebooting several times and running manual checkpoints, including unmounting the disks and running xfs_check, helped at last. Fullaccess mode was back.

                          8. The last bit was using xfs_growfs on /dev/sdb1 and /dev/sdd1 to fix the wrong size of those nodes.
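
                          For reference, that last step looked roughly like this (the device names are the ones from my appliance; xfs_growfs operates on a mounted filesystem, so check with df which device backs which data partition before running it):

                          df -h                   # confirm which device backs which data partition
                          xfs_growfs /dev/sdb1    # grow the XFS filesystem on a node that kept its old size
                          xfs_growfs /dev/sdd1    # same for the other node that did not expand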

                           

                          Edit: I think that checkpoint rollback and waiting was enough...
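
                          If you try the rollback route, cplist (shown earlier in this thread) lists the checkpoints and marks which ones are valid, so you can pick the newest valid one as the rollback target; I won't quote the exact rollback command from memory, so follow the documented procedure for your VDP version.

                          cplist     # the newest checkpoint flagged "valid" is the safest rollback target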