VMware Cloud Community
continuum
Immortal

Who deleted this datastore? A script? A bug? A misconfigured tool?

I have a rather strange issue: I need to find out who or what deleted a VMFS volume by first wiping the partition table, then recreating it and formatting the volume.

The following entries can be found in hostd.log:

2013-08-05T13:23:39.310Z [FFFB4B90 verbose 'Partitionsvc' opID=291C9D06-000000D0] InvokePartedUtil /sbin/partedUtil "getptbl" "/vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef"

2013-08-05T13:23:39.311Z [FFFB4B90 info 'SysCommandPosix' opID=291C9D06-000000D0] ForkExec(/sbin/partedUtil)  1950716

2013-08-05T13:23:39.364Z [FFFB4B90 verbose 'Partitionsvc' opID=291C9D06-000000D0] Output Stream from partedUtil while getting partitions for /vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef:

--> gpt

--> 240614 255 63 3865468800

-->

2013-08-05T13:23:39.364Z [FFFB4B90 verbose 'Partitionsvc' opID=291C9D06-000000D0] InvokePartedUtil /sbin/partedUtil "getptbl" "/vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef"

2013-08-05T13:23:39.365Z [FFFB4B90 info 'SysCommandPosix' opID=291C9D06-000000D0] ForkExec(/sbin/partedUtil)  1950717

2013-08-05T13:23:39.416Z [FFFB4B90 verbose 'Partitionsvc' opID=291C9D06-000000D0] Output Stream from partedUtil while getting partitions for /vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef:

--> gpt

--> 240614 255 63 3865468800

-->

2013-08-05T13:23:39.416Z [FFFB4B90 verbose 'Partitionsvc' opID=291C9D06-000000D0] InvokePartedUtil /sbin/partedUtil "getUsableSectors" "/vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef"

2013-08-05T13:23:39.417Z [FFFB4B90 info 'SysCommandPosix' opID=291C9D06-000000D0] ForkExec(/sbin/partedUtil)  1946622

2013-08-05T13:23:39.467Z [FFFB4B90 verbose 'Partitionsvc' opID=291C9D06-000000D0] Output Stream from partedUtil while getting usable sectors for /vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef:

--> 34 3865468766

-->

2013-08-05T13:23:39.468Z [FFFB4B90 verbose 'Default' opID=291C9D06-000000D0] Available primary : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, ]

2013-08-05T13:23:39.468Z [FFFB4B90 verbose 'Default' opID=291C9D06-000000D0] Setting start sector to 2048 for 1MB alignment

2013-08-05T13:23:39.486Z [64881B90 info 'TaskManager' opID=291C9D06-000000D1] Task Created : haTask-ha-host-vim.host.DatastoreSystem.createVmfsDatastore-104632042

2013-08-05T13:23:39.487Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] RescanVmfs called

2013-08-05T13:23:39.554Z [64481B90 verbose 'Partitionsvc' opID=291C9D06-000000D1] InvokePartedUtil /sbin/partedUtil "setptbl" "/vmfs/devices/disks/naa.6848f690eeeb95001903ddd4183d53ef" "gpt" "1 2048 3865463909 AA31E02A400F11DB9590000C2911D1B8 0"

2013-08-05T13:23:39.554Z [64C4DB90 verbose 'Hostsvc::DatastoreSystem'] VmfsUpdate: got VMFS message [N11HostdCommon18VmkernelUpdateVmfsE:0x8310ab0] timestamp=3208504497337 specific=0 name= label=

2013-08-05T13:23:39.555Z [64481B90 info 'SysCommandPosix' opID=291C9D06-000000D1] ForkExec(/sbin/partedUtil)  1950719

2013-08-05T13:23:39.572Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] RescanVmfs called

2013-08-05T13:23:39.630Z [64481B90 info 'Hostsvc::DatastoreSystem' opID=291C9D06-000000D1] Formatting vmfs partition

2013-08-05T13:23:39.630Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] FormatVmfs called

2013-08-05T13:23:39.631Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] LookupVmfs: Cannot find VMFS volume with disk name naa.6848f690eeeb95001903ddd4183d53ef and number 1

2013-08-05T13:23:39.631Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] LookupVmfs: Cannot find VMFS volume with id <unset>, path, <unset>, or name data

2013-08-05T13:23:39.631Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] LookupNasByName: Cannot find NasVolume with volume name data

2013-08-05T13:23:42.598Z [64840B90 info 'VmkVprobSource'] VmkVprobSource::Post event: (vim.event.EventEx) {

-->    dynamicType = <unset>,

-->    key = 1693445704,

-->    chainId = 1681851588,

-->    createdTime = "1970-01-01T00:00:00Z",

-->    userName = "",

-->    datacenter = (vim.event.DatacenterEventArgument) null,

-->    computeResource = (vim.event.ComputeResourceEventArgument) null,

-->    host = (vim.event.HostEventArgument) {

-->       dynamicType = <unset>,

-->       name = "de-insevm12.ri.local",

-->       host = 'vim.HostSystem:ha-host',

-->    },

-->    vm = (vim.event.VmEventArgument) null,

-->    ds = (vim.event.DatastoreEventArgument) null,

-->    net = (vim.event.NetworkEventArgument) null,

-->    dvs = (vim.event.DvsEventArgument) null,

-->    fullFormattedMessage = <unset>,

-->    changeTag = <unset>,

-->    eventTypeId = "esx.audit.vmfs.lvm.device.discovered",

-->    severity = <unset>,

-->    message = <unset>,

-->    objectId = "ha-eventmgr",

-->    objectType = "vim.HostSystem",

-->    objectName = <unset>,

-->    fault = (vmodl.MethodFault) null,

--> }

2013-08-05T13:23:42.600Z [64840B90 info 'ha-eventmgr'] Event 300 : One or more LVM devices have been discovered on this host.

2013-08-05T13:23:42.600Z [65046B90 verbose 'Hostsvc::DatastoreSystem'] VmfsVobUpdate: vmfs VOB: discovery: msg: [N11HostdCommon6VobMsgE:0x64385310] timestamp=3208507543638

2013-08-05T13:23:42.601Z [65046B90 verbose 'Hostsvc::DatastoreSystem'] Automount: scheduling thread NOW

2013-08-05T13:23:43.142Z [64C0CB90 verbose 'Cimsvc'] Ticket issued for CIMOM version 1.0, user root

2013-08-05T13:23:56.447Z [64FC2B90 info 'VmkVprobSource'] VmkVprobSource::Post event: (vim.event.EventEx) {

-->    dynamicType = <unset>,

-->    key = 102397564,

-->    chainId = 1681847216,

-->    createdTime = "1970-01-01T00:00:00Z",

-->    userName = "",

-->    datacenter = (vim.event.DatacenterEventArgument) null,

-->    computeResource = (vim.event.ComputeResourceEventArgument) null,

-->    host = (vim.event.HostEventArgument) {

-->       dynamicType = <unset>,

-->       name = "de-insevm12.ri.local",

-->       host = 'vim.HostSystem:ha-host',

-->    },

-->    vm = (vim.event.VmEventArgument) null,

-->    ds = (vim.event.DatastoreEventArgument) null,

-->    net = (vim.event.NetworkEventArgument) null,

-->    dvs = (vim.event.DvsEventArgument) null,

-->    fullFormattedMessage = <unset>,

-->    changeTag = <unset>,

-->    eventTypeId = "esx.audit.vmfs.volume.mounted",

-->    severity = <unset>,

-->    message = <unset>,

-->    arguments = (vmodl.KeyAnyValue) [

-->       (vmodl.KeyAnyValue) {

-->          dynamicType = <unset>,

-->          key = "1",

-->          value = "[data, 51ffa76c-96d6efb1-76e4-90b11c496b72]",

-->       },

-->       (vmodl.KeyAnyValue) {

-->          dynamicType = <unset>,

-->          key = "2",

-->          value = "51ffa76b-46dd7eee-714f-90b11c496b72",

-->       },

-->       (vmodl.KeyAnyValue) {

-->          dynamicType = <unset>,

-->          key = "3",

-->          value = "rw",

-->       }

-->    ],

-->    objectId = "ha-eventmgr",

-->    objectType = "vim.HostSystem",

-->    objectName = <unset>,

-->    fault = (vmodl.MethodFault) null,

--> }

2013-08-05T13:23:56.450Z [64FC2B90 info 'ha-eventmgr'] Event 301 : File system [data, 51ffa76c-96d6efb1-76e4-90b11c496b72] on volume 51ffa76b-46dd7eee-714f-90b11c496b72 has been mounted in rw mode on this host.

2013-08-05T13:23:56.508Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] AttachVmfsExtent called on /vmfs/volumes/51ffa76c-96d6efb1-76e4-90b11c496b72

2013-08-05T13:23:56.508Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] Attach extent is NO-OP, it is already the head of volume.

2013-08-05T13:23:56.508Z [64481B90 info 'Hostsvc::DatastoreSystem' opID=291C9D06-000000D1] Added VMFS volume [/vmfs/volumes/51ffa76c-96d6efb1-76e4-90b11c496b72].

2013-08-05T13:23:56.509Z [64481B90 verbose 'Hostsvc::DatastoreSystem' opID=291C9D06-000000D1] ReconcileVMFSDatastores called: refresh = true, rescan = false

2013-08-05T13:23:56.509Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] RefreshVMFSVolumes called

2013-08-05T13:23:56.540Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] RefreshVMFSVolumes: refreshed volume, id 51ffa76c-96d6efb1-76e4-90b11c496b72, name data

2013-08-05T13:23:56.540Z [64481B90 verbose 'FSVolumeProvider' opID=291C9D06-000000D1] RefreshOneVmfsVolume on /vmfs/volumes/51ffa76c-96d6efb1-76e4-90b11c496b72

2013-08-05T13:23:56.572Z [64481B90 info 'Hostsvc::Datastore' opID=291C9D06-000000D1] Refresh: VMFS Datastore name was changed from '51ffa76c-96d6efb1-76e4-90b11c496b72' to 'data'.

2013-08-05T13:23:56.572Z [64481B90 verbose 'Hostsvc::DatastoreSystem' opID=291C9D06-000000D1] DiscoverNewDatastores: data created.

2013-08-05T13:23:56.572Z [64481B90 verbose 'Hostsvc::DatastoreSystem' opID=291C9D06-000000D1] Removing issue NoDatastoresConfigured

2013-08-05T13:23:56.572Z [64481B90 verbose 'Hostsvc::DatastoreSystem' opID=291C9D06-000000D1] ReconcileVMFSDatastores: Done discovering new filesystem volumes.

2013-08-05T13:23:56.573Z [64481B90 info 'TaskManager' opID=291C9D06-000000D1] Task Completed : haTask-ha-host-vim.host.DatastoreSystem.createVmfsDatastore-104632042 Status success

2013-08-05T13:23:56.573Z [64C4DB90 verbose 'Hostsvc::DatastoreSystem'] VmfsUpdate: got VMFS message [N11HostdCommon18VmkernelUpdateVmfsE:0x64509668] timestamp=3208504573901 specific=0 name= label=

2013-08-05T13:23:56.573Z [648C2B90 verbose 'Hostsvc::DatastoreSystem'] ReconcileVMFSDatastores called: refresh = true, rescan = false

2013-08-05T13:23:56.573Z [648C2B90 verbose 'FSVolumeProvider'] RefreshVMFSVolumes called

2013-08-05T13:23:56.637Z [648C2B90 verbose 'FSVolumeProvider'] RefreshVMFSVolumes: refreshed volume, id 51ffa76c-96d6efb1-76e4-90b11c496b72, name data

2013-08-05T13:23:56.637Z [648C2B90 verbose 'Hostsvc::Datastore'] SetVolume: Datastore 51ffa76c-96d6efb1-76e4-90b11c496b72 has changed provider volume pointer

2013-08-05T13:23:56.637Z [648C2B90 verbose 'Hostsvc::DatastoreSystem'] ReconcileVMFSDatastores: Done discovering new filesystem volumes.

2013-08-05T13:23:56.638Z [FFF10A90 verbose 'FSVolumeProvider'] AutomountVolumes called

2013-08-05T13:23:56.638Z [64CDEB90 warning 'PropertyProvider' opID=291C9D06-000000D3] _RetrieveContents took 12599905 microseconds to lock vim.host.DatastoreSystem:ha-datastoresystem

2013-08-05T13:23:56.655Z [FFF10A90 info 'Hostsvc::DatastoreSystem'] DatastoreSystemImpl.AutomountVolumes completed


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

6 Replies
peetz
Leadership

Hi Ulli,

This looks like a regular VMFS datastore creation to me. The logs do not show the deletion, right?

Is this host managed through vCenter? If so, are there any related messages in vpxa.log or in the vCenter event log and task list?

If this was done remotely through a script, you should see related messages like "User xxx@1.2.3.4 logged in as ..." in hostd.log.

You might also look at /var/log/shell.log to see if there were any commands run interactively in an ESXi shell to re-create the datastore.
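For reference, the login check can be scripted. A minimal sketch; on the host you would point it at /var/log/hostd.log, while the sample file here just reuses lines from the excerpt above so the commands can be shown end-to-end (the file name hostd-sample.log is made up for the example):

```shell
# Minimal sketch of the login check. On the host, set LOG=/var/log/hostd.log;
# the sample below reuses the ha-eventmgr lines from the excerpt above.
LOG=hostd-sample.log
cat > "$LOG" <<'EOF'
2013-08-05T12:58:16.889Z [FFF10A90 info 'ha-eventmgr'] Event 273 : User root@10.49.2.58 logged in
2013-08-05T12:58:17.228Z [64840B90 info 'ha-eventmgr'] Event 274 : User root@10.49.2.58 logged in
2013-08-05T12:58:17.370Z [FFFB4B90 info 'ha-eventmgr'] Event 275 : User root logged out
EOF
# Who logged in/out:
grep -E "logged (in|out)" "$LOG"
# Just the unique user@IP pairs, to see where the sessions came from:
grep -oE "[a-z]+@[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+" "$LOG" | sort -u
```

The second grep is the quick way to collect every source IP that ever authenticated in the relevant time window.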

- Andreas

Twitter: @VFrontDe, @ESXiPatches | https://esxi-patches.v-front.de | https://vibsdepot.v-front.de
continuum
Immortal

Hi Andreas,
I think I can rule out "normal" operations. The same user, task or script deleted all VMs on 3 hosts with 6 datastores before resetting the partition tables, reformatting the VMFS volumes and resetting the ESXi configs. When he or it was done, the 3 hosts rebooted and came up without network config and with blank datastores.
On at least one of those hosts random Windows files were written into the area where we would expect VMFS metadata - I found an MFT mirror in the place where I would expect the .vh.sf.

Weird - really weird. I need to find out whether the commands were entered manually via SSH, or whether it was a cronjob or an interaction via some API.
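One handle that may help distinguish those cases: every hostd task line carries an opID that ties it back to the client operation which issued it. A hedged sketch (the sample file hostd-ops.log is made up and just reuses the createVmfsDatastore lines from the excerpt; on the host you would grep /var/log/hostd.log directly):

```shell
# Correlate a destructive task with everything else done under the same
# client operation, via the opID prefix. Sample reuses lines from the
# excerpt above; on the host, grep /var/log/hostd.log instead.
cat > hostd-ops.log <<'EOF'
2013-08-05T13:23:39.486Z [64881B90 info 'TaskManager' opID=291C9D06-000000D1] Task Created : haTask-ha-host-vim.host.DatastoreSystem.createVmfsDatastore-104632042
2013-08-05T13:23:56.573Z [64481B90 info 'TaskManager' opID=291C9D06-000000D1] Task Completed : haTask-ha-host-vim.host.DatastoreSystem.createVmfsDatastore-104632042 Status success
EOF
# Every line issued under the same client operation:
grep "opID=291C9D06" hostd-ops.log
```

If the same opID prefix also shows up near a session-creation or login message, that points at a remote API client rather than a local cronjob.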

Here is another example:

2013-08-05T12:58:01.192Z [64FC2B90 verbose 'Vmsvc' opID=9042E070-00000053] Released Vm Id: 4.

2013-08-05T12:58:01.193Z [64FC2B90 verbose 'HostsvcPlugin' opID=9042E070-00000053] RemoveEntry '4'

2013-08-05T12:58:01.193Z [64FC2B90 verbose 'HostsvcPlugin' opID=9042E070-00000053] RemoveEntry succeeded

2013-08-05T12:58:01.193Z [64FC2B90 verbose 'ResourcePool ha-root-pool' opID=9042E070-00000053] Removed child 4 from pool

2013-08-05T12:58:01.193Z [64FC2B90 verbose 'HostsvcPlugin' opID=9042E070-00000053] Security domain hostd4 not found

2013-08-05T12:58:01.195Z [64C8FB90 info 'GuestFileTransferImpl'] VmOperationListener: unregister notification received for VM: 4

2013-08-05T12:58:01.195Z [64C8FB90 info 'GuestFileTransferImpl'] VmOperationListener succeeded

2013-08-05T12:58:01.196Z [64C8FB90 info 'Hbrsvc'] Replicator: UnregisterListener triggered for config VM 4

2013-08-05T12:58:01.196Z [64C8FB90 verbose 'Statssvc'] EntityRemovedListener: Deleting stats for entity 4

2013-08-05T12:58:01.210Z [64A58B90 info 'Libs'] Vix: [6796 foundryVMPowerOps.c:973]: FoundryVMPowerStateChangeCallback: /vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx, vmx/execState/val = poweredOff.

2013-08-05T12:58:01.352Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/564d3d8a-33cb-3aeb-1385-5c2b15747102.vmem'.

2013-08-05T12:58:01.352Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/564d3d8a-33cb-3aeb-1385-5c2b15747102.vmem.lck'.

2013-08-05T12:58:01.872Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.kstats'.

2013-08-05T12:58:01.873Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.kstats32'.

2013-08-05T12:58:01.873Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.kstats64'.

2013-08-05T12:58:01.873Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor'.

2013-08-05T12:58:01.874Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor32'.

2013-08-05T12:58:01.874Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor32.1'.

2013-08-05T12:58:01.874Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor64'.

2013-08-05T12:58:01.874Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor64.1'.

2013-08-05T12:58:01.875Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.perf'.

2013-08-05T12:58:01.875Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.perf_kstats'.

2013-08-05T12:58:01.875Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor32.0'.

2013-08-05T12:58:01.875Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor64.0'.

2013-08-05T12:58:01.875Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/status'.

2013-08-05T12:58:01.876Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vprintproxy.log'.

2013-08-05T12:58:01.876Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/autoinst.flp'.

2013-08-05T12:58:01.876Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/autoinst.iso'.

2013-08-05T12:58:01.876Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/quicklook-cache.png'.

2013-08-05T12:58:01.877Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname-7d38e0c8.vswp'.

2013-08-05T12:58:01.877Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-stats.log'.

2013-08-05T12:58:01.888Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware.log'.

2013-08-05T12:58:01.890Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-6.log'.

2013-08-05T12:58:01.909Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-7.log'.

2013-08-05T12:58:01.911Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-5.log'.

2013-08-05T12:58:01.912Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-4.log'.

2013-08-05T12:58:01.914Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-3.log'.

2013-08-05T12:58:01.915Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/vmware-2.log'.

2013-08-05T12:58:01.916Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmxf'.

2013-08-05T12:58:01.917Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.nvram'.

2013-08-05T12:58:01.921Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmsd'.

2013-08-05T12:58:01.922Z [64A58B90 info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx'.

2013-08-05T12:58:01.932Z [64FC2B90 info 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Failed to unset VM medatadata: FileIO error: Could not find file  : /vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname-aux.xml.tmp.

2013-08-05T12:58:01.934Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] DeleteVmDirectory: Deleting vm dir (as superuser) '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname'

2013-08-05T12:58:01.934Z [64F81B90 verbose 'Hostsvc::DatastoreSystem'] Datastore-Vdisk refresh: scheduling thread

2013-08-05T12:58:01.934Z [64FC2B90 info 'TaskManager' opID=9042E070-00000053] Task Completed : haTask-4-vim.ManagedEntity.destroy-104631949 Status success

2013-08-05T12:58:01.934Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Close Handle called

2013-08-05T12:58:01.935Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Shutting down VMDB service...

2013-08-05T12:58:01.935Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Unregistering callback...

2013-08-05T12:58:01.935Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] ...done

2013-08-05T12:58:01.935Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Unsubscribed from events.

2013-08-05T12:58:01.935Z [64FC2B90 info 'Libs' opID=9042E070-00000053] Vix: [20705 foundryVM.c:10650]: Error VIX_E_INVALID_ARG in VixVM_CancelOps(): One of the parameters was invalid

2013-08-05T12:58:01.935Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Canceled outstanding Foundry operations.

2013-08-05T12:58:01.935Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Released VM handle.

2013-08-05T12:58:01.936Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Closed VM handle.

2013-08-05T12:58:01.936Z [64FC2B90 info 'ha-eventmgr' opID=9042E070-00000053] Event 272 : Removed replacedname (Backup) on blablabla.local from ha-datacenter

2013-08-05T12:58:01.936Z [64FC2B90 info 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] State Transition (VM_STATE_DELETING -> VM_STATE_GONE)

2013-08-05T12:58:01.936Z [64FC2B90 verbose 'ha-host' opID=9042E070-00000053] ModeMgr::End: op = normal, current = normal, count = 5

2013-08-05T12:58:01.936Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx' opID=9042E070-00000053] Destroy VM complete

2013-08-05T12:58:01.937Z [64FC2B90 info 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx'] Virtual machine object cleanup

2013-08-05T12:58:01.937Z [64FC2B90 verbose 'vm:/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/replacedname.vmx'] Closed VM handle.

pam_per_user: create_subrequest_handle(): doing map lookup for user "root"

pam_per_user: create_subrequest_handle(): creating new subrequest (user="root", service="system-auth-generic")

Accepted password for user root from 10.49.2.58

2013-08-05T12:58:16.889Z [FFF10A90 info 'Vimsvc'] [Auth]: User root

2013-08-05T12:58:16.889Z [FFF10A90 info 'ha-eventmgr'] Event 273 : User root@10.49.2.58 logged in

pam_per_user: create_subrequest_handle(): doing map lookup for user "root"

pam_per_user: create_subrequest_handle(): creating new subrequest (user="root", service="system-auth-generic")

Accepted password for user root from 10.49.2.58

2013-08-05T12:58:17.228Z [64840B90 info 'Vimsvc'] [Auth]: User root

2013-08-05T12:58:17.228Z [64840B90 info 'ha-eventmgr'] Event 274 : User root@10.49.2.58 logged in

2013-08-05T12:58:17.370Z [FFFB4B90 verbose 'Default'] CloseSession called for session id=520a9289-5eb6-acb7-50ab-b1042027528d

2013-08-05T12:58:17.370Z [FFFB4B90 info 'ha-eventmgr'] Event 275 : User root logged out

...

I have never seen a hostd log entry like this one before:

info 'Libs'] SNAPSHOT: SnapshotDeleteFile deleted '/vmfs/volumes/518ccb24-ba8547cb-132a-90b11c496b72/A4/replacedname/gmon.monitor32'.


This issue has been keeping me busy for a few weeks now - in that time I have restored about 100 critical VMs from unreadable VMFS volumes.

On Monday, ESXi hosts number 7, 8 and 9 were affected. It almost always follows the same pattern: the hosts lose their network config, reboot, and come up with empty datastores or unpartitioned volumes.

Good practice for me, by the way, but the customer is getting sick of it. This is the first time I was able to find the IP of the host that seems to be responsible - needless to say, that IP is no longer active.


peetz
Leadership

This sounds like a very interesting criminal case ...

I cannot think of any bug that would cause such behavior; I really tend to believe that your client is under attack by a rogue human or by some malicious, sophisticated and as yet unknown software.

Are the hosts using shared (SAN) storage or only local disks? Are they managed by vCenter?

Are there any other products interacting with the hosts or vCenter? vCloud Director, Lab Manager or something?

Have you checked the hosts for unusual crontab entries or suspicious VIB files installed?

Have you considered re-installing all hosts?

How about limiting remote access (https for the API, and ssh) through the hosts' firewall to well-known, clean workstations?

JarryG
Expert

"...I need to find out if the commands were entered manually via ssh or wether it was a cronjob or interaction via some API..."

If you look at the timestamps, I think you can exclude the first option: no one could type the corresponding commands manually that fast. I dare say that even a scripted SSH connection or a client-server API connection on the local network could not be that fast.

I think someone gained shell access to the ESXi host (remote or local), uploaded a malicious script, and let it run. Check the time of the first suspicious log entry: if it has a "rounded" timestamp (e.g. 14:25:00.00), it was probably started by cron. If the time looks random (e.g. 13:23:39.310), the script was started manually while the attacker was logged in.
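That timing heuristic can be sketched in a couple of lines. The two-second cutoff after the whole minute is an arbitrary assumption, not a hard rule - cron has some startup latency, so "near zero seconds" rather than "exactly zero" is what you would look for:

```shell
# Classify a hostd timestamp per the heuristic above: cron fires on the
# whole minute, so near-zero seconds hint at a cronjob, anything else at a
# manual or remote trigger. The 2-second cutoff is an assumption.
ts="2013-08-05T13:23:39.310Z"                 # first suspicious entry
secs=$(echo "$ts" | cut -d: -f3 | tr -d 'Z')  # seconds field, e.g. 39.310
awk -v s="$secs" 'BEGIN { print ((s+0 < 2) ? "possibly cron" : "likely manual/remote trigger") }'
```

For the timestamp in the excerpt (13:23:39.310) this prints "likely manual/remote trigger", which fits the SSH/API theory.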

If you still have the original datastore, try to analyse it. In our company the standard procedure is to grab images of all affected disks before doing anything else. The content of the disk images is then parsed, looking for ASCII strings. That might reveal important evidence if the disk was not zeroed, random-wiped, or encrypted. Also check all other log files (I hope you collected them with a central log collector). Find out which ESXi version was running and, if it was not the latest one at the time of the incident, which security problems were known and not yet fixed. I know this is all tedious work, but that's what forensic analysis is...
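A self-contained sketch of that image-and-scan step. The tiny synthetic file here stands in for the real LUN image (on the host you would first image the device, e.g. dd from the affected /vmfs/devices/disks/naa.* path); "FILE" is the NTFS MFT record signature, which is roughly what Ulli's misplaced MFT mirror would show up as:

```shell
# Image-then-scan sketch. The synthetic file stands in for the LUN image;
# on the host you would image the raw device with dd first.
printf 'randomjunk\0FILE\0.vh.sf\0tail' > lun.img
# Hex offsets of recoverable markers: "FILE" is the NTFS MFT record
# signature, ".vh.sf" a VMFS metadata file name.
strings -t x lun.img | grep -E 'FILE|vh\.sf'
```

The offsets that strings reports tell you where on the LUN the foreign (NTFS) data landed relative to where the VMFS metadata should live.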

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend being noob with all those points! 😉
admin
Immortal

Do you see anything in the messages log, since that is where events for SSH access get logged? What do you notice in vmkernel.log or vmkwarning.log at that point in time? Check the vobd log as well.
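A quick way to sweep those logs, sketched below. The paths are the usual ESXi 5.x defaults (adjust to your scratch location) and the grep pattern is the incident window from the excerpts; missing files are simply skipped:

```shell
# Pull the incident window (2013-08-05, 13:2x) out of the other logs
# mentioned above. Paths are ESXi 5.x defaults; absent files are skipped.
for f in /var/log/vmkernel.log /var/log/vmkwarning.log /var/log/vobd.log /var/log/auth.log; do
  echo "== $f =="
  grep "2013-08-05T13:2" "$f" 2>/dev/null || true
done
```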

peetz
Leadership

Just a curious follow-up, Ulli ...

How did this story continue? Were you able to find out what happened (or is still happening?) at this customer?

Were any of the hints given here useful?

Thanks

Andreas
