VMware Cloud Community
P4thos
Enthusiast
Enthusiast
Jump to solution

vCenter & Veeam : Linux machines are loosing network due to Checkpoint_Unstun

Dear Commuties,

Since few day we are facing issues.

We are backuping VM using Veeam ( v9.5.0.8823 ) via veeam proxy.

The VM are on a Vmware infrastructure v6. Host run on v6 build 3620759 and vCenter is running version 6 build 3634794

When veeam is perfoming a backup, it create a snapshot, backup the VM and then consolidate the disk. The problem seems related to the consolidation.

In the file vmware.log, we have such log :

2017-02-14T19:11:00.677Z| vmx| I120: SnapshotVMX_TakeSnapshot start: 'VEEAM BACKUP TEMPORARY SNAPSHOT', deviceState=0, lazy=0, logging=0, quiesced=0, forceNative=0, tryNative=1, saveAllocMaps=0 cb=B563C00, cbData=3260D6D0

2017-02-14T19:11:00.705Z| vmx| I120: DISKLIB-LIB_CREATE   : DiskLibCreateCreateParam: vmfsSparse grain size is set to 1 for '/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001.vmdk'

2017-02-14T19:11:00.706Z| vmx| I120: DISKLIB-LIB_CREATE   : DiskLibCreateCreateParam: vmfsSparse grain size is set to 1 for '/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001.vmdk'

2017-02-14T19:11:00.706Z| vmx| I120: SNAPSHOT: SnapshotPrepareTakeDoneCB: Prepare phase complete (The operation completed successfully).

2017-02-14T19:11:00.706Z| vcpu-0| I120: Destroying virtual dev for scsi0:0 vscsi=21917

2017-02-14T19:11:00.706Z| vcpu-0| I120: VMMon_VSCSIStopVports: No such target on adapter

2017-02-14T19:11:00.707Z| vcpu-0| I120: Destroying virtual dev for scsi0:1 vscsi=21918

2017-02-14T19:11:00.707Z| vcpu-0| I120: VMMon_VSCSIStopVports: No such target on adapter

2017-02-14T19:11:00.726Z| vcpu-0| I120: SnapshotVMXTakeSnapshotWork: Transition to mode 0.

2017-02-14T19:11:00.726Z| vcpu-0| I120: Closing all the disks of the VM.

2017-02-14T19:11:00.726Z| vcpu-0| I120: Closing disk scsi0:1

2017-02-14T19:11:00.727Z| vcpu-0| I120: DISKLIB-CBT   : Shutting down change tracking for untracked fid 2265566514.

2017-02-14T19:11:00.727Z| vcpu-0| I120: DISKLIB-CBT   : Successfully disconnected CBT node.

2017-02-14T19:11:02.022Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-flat.vmdk" : closed.

2017-02-14T19:11:02.022Z| vcpu-0| I120: Closing disk scsi0:0

2017-02-14T19:11:02.023Z| vcpu-0| I120: DISKLIB-CBT   : Shutting down change tracking for untracked fid 2393230632.

2017-02-14T19:11:02.023Z| vcpu-0| I120: DISKLIB-CBT   : Successfully disconnected CBT node.

2017-02-14T19:11:03.423Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-flat.vmdk" : closed.

2017-02-14T19:11:03.614Z| vcpu-0| I120: SNAPSHOT: SnapshotConfigInfoReadEx: Creating new snapshot dictionary, '/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine.vmsd.usd'.

2017-02-14T19:11:04.285Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-flat.vmdk" : open successful (1114133) size = 0, hd = 0. Type 3

2017-02-14T19:11:04.285Z| vcpu-0| I120: DISKLIB-LIB_BLOCKTRACK   : Resuming change tracking.

2017-02-14T19:11:04.805Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-flat.vmdk" : closed.

2017-02-14T19:11:04.840Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-flat.vmdk" : open successful (1114133) size = 0, hd = 0. Type 3

2017-02-14T19:11:04.840Z| vcpu-0| I120: DISKLIB-LIB_BLOCKTRACK   : Resuming change tracking.

2017-02-14T19:11:05.236Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-flat.vmdk" : closed.

2017-02-14T19:11:07.858Z| vcpu-0| I120: SNAPSHOT: SnapshotDumperOpenFromInfo: Creating checkpoint file /vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-Snapshot24.vmsn

2017-02-14T19:11:07.913Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-flat.vmdk" : open successful (29) size = 53687091200, hd = 0. Type 3

2017-02-14T19:11:07.913Z| vcpu-0| I120: DISKLIB-LIB_BLOCKTRACK   : Resuming change tracking.

2017-02-14T19:11:07.964Z| vcpu-0| I120: DISKLIB-LIB_CREATE   : DiskLibCreateCreateParam: vmfsSparse grain size is set to 1 for '/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001.vmdk'

2017-02-14T19:11:07.965Z| vcpu-0| I120: DISKLIB-LIB_CREATE   : CREATE CHILD: "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001.vmdk" -- vmfsSparse cowGran=1 allocType=0 policy=''

2017-02-14T19:11:07.965Z| vcpu-0| I120: DISKLIB-LIB_CREATE   : CREATE-CHILD: Creating disk backed by 'default'

2017-02-14T19:11:08.048Z| vcpu-0| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001-delta.vmdk" : success

2017-02-14T19:11:08.053Z| vcpu-0| I120: DISKLIB-DSCPTR: "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001.vmdk" : creation successful.

2017-02-14T19:11:08.066Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001-delta.vmdk" : open successful (17) size = 106496, hd = 0. Type 8

2017-02-14T19:11:12.085Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001-delta.vmdk" : closed.

2017-02-14T19:11:12.086Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-flat.vmdk" : closed.

2017-02-14T19:11:12.100Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-flat.vmdk" : open successful (29) size = 26843545600, hd = 0. Type 3

2017-02-14T19:11:12.100Z| vcpu-0| I120: DISKLIB-LIB_BLOCKTRACK   : Resuming change tracking.

2017-02-14T19:11:12.122Z| vcpu-0| I120: DISKLIB-LIB_CREATE   : DiskLibCreateCreateParam: vmfsSparse grain size is set to 1 for '/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001.vmdk'

2017-02-14T19:11:12.122Z| vcpu-0| I120: DISKLIB-LIB_CREATE   : CREATE CHILD: "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001.vmdk" -- vmfsSparse cowGran=1 allocType=0 policy=''

2017-02-14T19:11:12.122Z| vcpu-0| I120: DISKLIB-LIB_CREATE   : CREATE-CHILD: Creating disk backed by 'default'

2017-02-14T19:11:12.339Z| vcpu-0| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001-delta.vmdk" : success

2017-02-14T19:11:12.669Z| vcpu-0| I120: DISKLIB-DSCPTR: "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001.vmdk" : creation successful.

2017-02-14T19:11:12.705Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001-delta.vmdk" : open successful (17) size = 53248, hd = 0. Type 8

2017-02-14T19:11:14.352Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001-delta.vmdk" : closed.

2017-02-14T19:11:14.353Z| vcpu-0| I120: DISKLIB-VMFS  : "/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-flat.vmdk" : closed.

2017-02-14T19:11:14.360Z| vcpu-0| A115: ConfigDB: Setting displayName = "machine"

2017-02-14T19:11:14.366Z| vcpu-0| A115: ConfigDB: Setting scsi0:0.fileName = "machine-000001.vmdk"

2017-02-14T19:11:14.366Z| vcpu-0| A115: ConfigDB: Setting scsi0:1.fileName = "machine_1-000001.vmdk"

2017-02-14T19:11:14.392Z| vcpu-0| I120: Checkpoint_Unstun: vm stopped for 13685654 us

2017-02-14T19:11:14.392Z| vcpu-0| I120: SCSI: switching scsi0 to push completion mode

2017-02-14T19:11:14.393Z| vcpu-0| A115: ConfigDB: Setting scsi0:0.redo = ""

...

...

...

2017-02-14T19:11:14.539Z| vcpu-0| I120: Creating virtual dev for scsi0:1

2017-02-14T19:11:14.539Z| vcpu-0| I120: DumpDiskInfo: scsi0:1 createType=11, capacity = 52428800, numLinks = 2, allocationType = 0

2017-02-14T19:11:14.539Z| vcpu-0| I120: SCSIDiskESXPopulateVDevDesc: Using FS backend

2017-02-14T19:11:14.539Z| vcpu-0| I120: DISKUTIL: scsi0:1 : geometry=3263/255/63

2017-02-14T19:11:14.540Z| vcpu-0| I120: SnapshotVMXTakeSnapshotWork: Transition to mode 1.

2017-02-14T19:11:14.540Z| vcpu-0| I120: SnapshotVMXTakeSnapshotComplete: Done with snapshot 'VEEAM BACKUP TEMPORARY SNAPSHOT': 24

2017-02-14T19:11:14.540Z| vcpu-0| I120: VigorTransport_ServerSendResponse opID=2c0f8a40-99-9359 seq=597082: Completed Snapshot request.

2017-02-14T19:11:14.603Z| vcpu-2| I120: HBACommon: First write on scsi0:1.fileName='/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine_1-000001.vmdk'

2017-02-14T19:11:14.603Z| vcpu-2| I120: DDB: "longContentID" = "fecb76f4a3d9d08ff500b9c39a2891a7" (was "1b2fded3c6a56db768e5814fa8b00c9d")

2017-02-14T19:11:14.941Z| vcpu-2| I120: DISKLIB-CHAIN : DiskChainUpdateContentID: old=0xa8b00c9d, new=0x9a2891a7 (fecb76f4a3d9d08ff500b9c39a2891a7)

2017-02-14T19:11:14.996Z| vcpu-1| I120: HBACommon: First write on scsi0:0.fileName='/vmfs/volumes/54e46352-93a1a651-6920-18a905c531aa/machine/machine-000001.vmdk'

2017-02-14T19:11:14.996Z| vcpu-1| I120: DDB: "longContentID" = "7cebb95f30cc654eb95c05c9d000893d" (was "1b76481db13142a2b2a04986fe5d9c3f")

2017-02-14T19:11:15.170Z| vcpu-1| I120: DISKLIB-CHAIN : DiskChainUpdateContentID: old=0xfe5d9c3f, new=0xd000893d (7cebb95f30cc654eb95c05c9d000893d)

2017-02-14T19:12:14.542Z| vmx| I120: GuestRpcSendTimedOut: message to toolbox-dnd timed out.

2017-02-14T19:20:00.315Z| vmx| I120: VigorTransportProcessClientPayload: opID=44c3a4c8-62-9a73 seq=597408: Receiving Snapshot.Delete request.

Regarding Checkpoint_Unstun, 13685654 us = 13 secondes. 13 secondes is to much regarding our filesystem ( DRBD ). Whis this time is so long ? What are the recommendation to reduce this time ?

What is GuestRpcSendTimedOut ?

Thanks in advance for help.

0 Kudos
1 Solution

Accepted Solutions
P4thos
Enthusiast
Enthusiast
Jump to solution

After a lot of test, we saw this was due to a wrong way of backuping VM using Veeam.

Virtual appliance mode was used.

We have decided to use network mode which has improved performance and reduce the Checkpoint_Unstun.

View solution in original post

0 Kudos
1 Reply
P4thos
Enthusiast
Enthusiast
Jump to solution

After a lot of test, we saw this was due to a wrong way of backuping VM using Veeam.

Virtual appliance mode was used.

We have decided to use network mode which has improved performance and reduce the Checkpoint_Unstun.

0 Kudos