Overnight, one of my VMs stopped. vCenter was reporting "No more space for virtual disk XXXX.vmdk". The datastore where the VM and its VMDKs are stored had over 1TB of free disk space.
Googling only turns up answers telling me to free disk space on my datastore, which clearly isn't the problem. All my VMs are thick provisioned.
With the VM powered off, I just moved it to another host (leaving all the files on the same datastore) and it powered up fine.
What is going on here?
How much memory is this VM assigned? And how large is the swap file it occupies while powered on? I have mostly seen this error when a VM is unable to power on, or stops, because there is no space available for the .vswp file, whose size depends on the allocated memory.
The VM has a 2GB RAM allocation.
I wondered about possible swap file issues, so I tried powering off some other VMs on the same host, but that didn't improve things. There was barely 100GB of allocated VM RAM on the affected host.
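To rule out the .vswp theory directly, you can compare the swap file's size against the datastore's reported free space from the ESXi host shell. This is a sketch: the datastore and VM folder names below are placeholders, and by default the .vswp file is sized at configured memory minus memory reservation (so at most 2 GB for this VM).

```shell
# On the ESXi host shell. "MyDatastore" and "MyVM" are placeholders --
# substitute your own datastore and VM folder names.

# Size of the VM's swap file (only exists while the VM is powered on):
ls -lh /vmfs/volumes/MyDatastore/MyVM/*.vswp

# Free space on that datastore as this host sees it:
df -h /vmfs/volumes/MyDatastore
```

If the .vswp file is a couple of GB and the datastore shows 1TB free, swap sizing is not the culprit.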
You may need to check the 'vmware.log' file in the VM's folder on the datastore then. Also check datastore events and alarms.
The best I can find in vmware.log is:
2017-08-31T07:11:06.828Z| vmx| I125: Msg_Question:
2017-08-31T07:11:06.828Z| vmx| I125: [msg.hbacommon.outofspace] There is no more space for virtual disk XXXX-000001.vmdk. You might be able to continue this session by freeing disk space on the relevant volume, and clicking _Retry. Click Cancel to terminate this session.
2017-08-31T07:11:06.828Z| vmx| I125: ----------------------------------------
2017-08-31T07:11:23.828Z| vcpu-0| I125: Tools: Tools heartbeat timeout.
2017-08-31T07:15:05.052Z| vmx| I125: VigorTransportProcessClientPayload: opID=aa9ffa5-c3-c7d2 seq=2862685: Receiving Bootstrap.MessageReply request.
2017-08-31T07:15:05.053Z| vmx| I125: Vigor_MessageRevoke: message 'msg.hbacommon.outofspace' (seq 92281844) is revoked
2017-08-31T07:15:05.053Z| vmx| I125: VigorTransport_ServerSendResponse opID=aa9ffa5-c3-c7d2 seq=2862685: Completed Bootstrap request.
2017-08-31T07:15:05.053Z| vmx| I125: MsgQuestion: msg.hbacommon.outofspace reply=1
2017-08-31T07:15:05.053Z| vmx| I125: Exiting because of failed disk operation.
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace:
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[0] 000003fff678d450 rip=0000000023b9b9ae rbx=0000000023b9b490 rbp=000003fff678d470 r12=0000000000000000 r13=0000000024683300 r14=000000002403aa3e r15=0000000000000001
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[1] 000003fff678d480 rip=00000000236344da rbx=000000002486eb08 rbp=000003fff678d970 r12=0000000000000001 r13=0000000024683300 r14=000000002403aa3e r15=0000000000000001
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[2] 000003fff678d980 rip=00000000237bac2b rbx=00000000323fb1a0 rbp=000003fff678d9c0 r12=00000000325c0c30 r13=0000000024683300 r14=000000002403aa3e r15=0000000000000001
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[3] 000003fff678d9d0 rip=00000000236f1351 rbx=00000000323fb1a0 rbp=000003fff678d9e0 r12=0000000000000000 r13=000000003227d440 r14=0000000000000001 r15=000003fff678da2c
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[4] 000003fff678d9f0 rip=00000000236434de rbx=000003fff69bd010 rbp=000003fff678da60 r12=0000000000000000 r13=000000003227d440 r14=0000000000000001 r15=000003fff678da2c
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[5] 000003fff678da70 rip=000000002364410d rbx=0000000000000000 rbp=000003fff678db10 r12=000008649cd4a417 r13=000003fff69bd010 r14=000000003227d440 r15=0000000032617550
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[6] 000003fff678db20 rip=0000000023635326 rbx=000000002486eb40 rbp=000003fff678dc80 r12=000000003253ad40 r13=00000000320742d0 r14=000000002486eb08 r15=0000000000000000
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[7] 000003fff678dc90 rip=0000000023631f36 rbx=0000000000000003 rbp=000003fff678dd10 r12=0000000000000000 r13=000000002402b10d r14=0000000000000000 r15=000000002464c860
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[8] 000003fff678dd20 rip=0000000025baa8cd rbx=0000000000000000 rbp=0000000000000000 r12=0000000023632558 r13=000003fff678dde8 r14=0000000000000000 r15=0000000000000000
2017-08-31T07:15:05.053Z| vmx| I125: Backtrace[9] 000003fff678dde0 rip=0000000023632581 rbx=0000000000000000 rbp=0000000000000000 r12=0000000023632558 r13=000003fff678dde8 r14=0000000000000000 r15=0000000000000000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[0] 000003fff678d450 rip=0000000023b9b9ae in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[1] 000003fff678d480 rip=00000000236344da in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[2] 000003fff678d980 rip=00000000237bac2b in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[3] 000003fff678d9d0 rip=00000000236f1351 in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[4] 000003fff678d9f0 rip=00000000236434de in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[5] 000003fff678da70 rip=000000002364410d in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[6] 000003fff678db20 rip=0000000023635326 in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[7] 000003fff678dc90 rip=0000000023631f36 in function main in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[8] 000003fff678dd20 rip=0000000025baa8cd in function __libc_start_main in object /lib64/libc.so.6 loaded at 0000000025b8a000
2017-08-31T07:15:05.053Z| vmx| I125: SymBacktrace[9] 000003fff678dde0 rip=0000000023632581 in function (null) in object /bin/vmx loaded at 0000000023494000
2017-08-31T07:15:05.053Z| vmx| I125: Exiting
Not much information in these logs apart from the same disk space error. Can you check space on the ESXi console using vdf -h?
Tardisk Space Used
sb.v00 140M 140M
s.v00 317M 317M
net_i40e.v00 424K 420K
mtip32xx.v00 244K 241K
ata_pata.v00 40K 37K
ata_pata.v01 28K 26K
ata_pata.v02 32K 28K
ata_pata.v03 32K 29K
ata_pata.v04 36K 33K
ata_pata.v05 32K 30K
ata_pata.v06 28K 26K
ata_pata.v07 32K 30K
block_cc.v00 80K 76K
ehci_ehc.v00 92K 89K
elxnet.v00 456K 452K
emulex_e.v00 24K 22K
weaselin.t00 5M 5M
esx_dvfi.v00 416K 414K
esx_ui.v00 11M 11M
ima_qla4.v00 1M 1M
ipmi_ipm.v00 36K 34K
ipmi_ipm.v01 80K 77K
ipmi_ipm.v02 100K 96K
lpfc.v00 1M 1M
lsi_mr3.v00 272K 272K
lsi_msgp.v00 464K 463K
lsu_hp_h.v00 64K 61K
lsu_lsi_.v00 240K 237K
lsu_lsi_.v01 420K 417K
lsu_lsi_.v02 240K 237K
lsu_lsi_.v03 508K 504K
lsu_lsi_.v04 304K 302K
misc_cni.v00 24K 20K
misc_dri.v00 5M 5M
net_bnx2.v00 280K 276K
net_bnx2.v01 1M 1M
net_cnic.v00 144K 142K
net_e100.v00 308K 305K
net_e100.v01 344K 342K
net_enic.v00 140K 139K
net_forc.v00 120K 117K
net_igb.v00 316K 312K
net_ixgb.v00 400K 397K
net_mlx4.v00 340K 337K
net_mlx4.v01 228K 227K
net_nx_n.v00 1M 1M
net_tg3.v00 304K 303K
net_vmxn.v00 100K 99K
nmlx4_co.v00 576K 575K
nmlx4_en.v00 420K 418K
nmlx4_rd.v00 172K 171K
nvme.v00 172K 171K
ohci_usb.v00 60K 57K
qlnative.v00 2M 2M
rste.v00 796K 794K
sata_ahc.v00 80K 79K
sata_ata.v00 52K 51K
sata_sat.v00 60K 59K
sata_sat.v01 40K 38K
sata_sat.v02 40K 39K
sata_sat.v03 32K 30K
sata_sat.v04 28K 27K
scsi_aac.v00 172K 169K
scsi_adp.v00 428K 425K
scsi_aic.v00 284K 282K
scsi_bnx.v00 272K 268K
scsi_bnx.v01 200K 196K
scsi_fni.v00 228K 226K
scsi_hps.v00 172K 169K
scsi_ips.v00 100K 98K
scsi_meg.v00 92K 91K
scsi_meg.v01 168K 166K
scsi_meg.v02 88K 87K
scsi_mpt.v00 448K 445K
scsi_mpt.v01 492K 489K
scsi_mpt.v02 420K 416K
scsi_qla.v00 272K 271K
uhci_usb.v00 60K 57K
vmware_f.v00 47M 47M
vsan.v00 23M 23M
vsanheal.v00 2M 2M
vsanmgmt.v00 6M 6M
xhci_xhc.v00 228K 226K
xorg.v00 3M 3M
imgdb.tgz 452K 450K
state.tgz 32K 30K
onetime.tgz 60K 58K
-----
Ramdisk Size Used Available Use% Mounted on
root 32M 240K 31M 0% --
etc 28M 296K 27M 1% --
opt 32M 0B 32M 0% --
var 48M 492K 47M 1% --
tmp 256M 4K 255M 0% --
iofilters 32M 0B 32M 0% --
hostdstats 1553M 7M 1545M 0% --
And this looks similar on other hosts in the cluster.
What about df -h? Correlate with the mounted volume where the VM was residing before migration.
Ahhh....
Three nodes report the correct datastore size and usage, whereas two nodes don't. I've had this before, where the hosts disagree on a datastore's size. Worse, the vCenter tools couldn't fix it: I had to use the Windows client to talk directly to the ESXi host and get it to do a rescan.
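For anyone hitting the same thing: the per-host view and the refresh can also be done from the ESXi shell, without the Windows client. A minimal sketch, run on each host that reports the wrong size:

```shell
# Compare what this host believes about its datastores
# (run on each host and diff the Size/Free columns):
esxcli storage filesystem list

# Refresh VMFS volume information on this host:
vmkfstools -V

# Rescan all storage adapters for device/capacity changes:
esxcli storage core adapter rescan --all
```

After the rescan, `esxcli storage filesystem list` should agree across all hosts in the cluster.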
Anyway, problem now solved.
Thanks,
Just out of curiosity, are there any RDM LUNs attached to these hosts? I've seen this sort of thing happen if the RDM LUNs aren't set to Perennially Reserved = True.
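For reference, the perennial reservation flag is set per device, per host, from the ESXi shell. A sketch, assuming you have the LUN's NAA identifier (the `naa.xxxxxxxxxxxx` below is a placeholder):

```shell
# Mark an RDM LUN as perennially reserved on this host
# (naa.xxxxxxxxxxxx is a placeholder -- use your LUN's real NAA ID):
esxcli storage core device setconfig -d naa.xxxxxxxxxxxx --perennially-reserved=true

# Verify the flag took effect:
esxcli storage core device list -d naa.xxxxxxxxxxxx | grep -i perennially
```

Without this flag, hosts can spend a long time trying to interrogate RDM LUNs reserved by other hosts during boot and storage rescans.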