VMware Cloud Community
davehii
Contributor

ESXi 7.0 virtual machine and datastore error

Hi, I am fairly new to ESXi and to this forum. Background of the problem: I have 4 virtual machines (Windows 7, Windows 10, Linux, etc.). Last night I was copying a 30 GB ISO file to the datastore. When I woke up, the datastore browser in the web client was timing out, so I am not sure whether the 30 GB upload had completed. I then put the host into maintenance mode so that I could restart it. After the restart, all 4 of my virtual machines had become inaccessible; see the screenshot below. What should the next step be? Thanks.

pastedImage_0.png


Accepted Solutions
continuum
Immortal

Update:

Today we met via Skype and I explained how to create a VMFS header dump in a tricky environment using my Linux LiveCD.

We had to split the dump into 3 pieces because we were running out of free space in /tmp.

Once I had downloaded the dump, I analysed it and offered Dave a short list of VMs that looked recoverable.

He told me that a Windows7-64 VM was his top priority.

OK - I created a small archive with the vmx file, nvram, vmdk descriptor file and so on.

Those small files are contained in such a VMFS header dump.

The larger Windows7-64-flat.vmdk obviously is not included in a 2 GB dump, so Dave gets a script like this one:

### Script to manually extract a flat file to a new location
### Please check the IF parameter (source device or partition)
### Please check the OF parameter (output file)
### Put the script into a new directory for the VM - no spaces in the name
### Make the script executable and run it
### If you run the script via putty, launch it as "nohup script &"

IF="/dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288:1"
OF="Windows7-64-flat.vmdk"

### Test run: copy only the very first MB, then stop
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=0 skip=83985 count=1

exit

### Full run: remove the "exit" above once the test looks good
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=0 skip=83985 count=6144
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=6144 skip=81937 count=2048
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=8192 skip=90129 count=24576

echo  "### script finished" >> copy.log

When that script runs, it first extracts only the very first MB.

I do that to check whether the parameters are OK.

If they are, the Windows7-64-flat.vmdk can be tested with the Linux command

sgdisk -p Windows7-64-flat.vmdk

If my numbers are good, sgdisk should display the partition table, and that should match the expected size of the flat.vmdk - 32 GB in this case.
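A cruder sanity check that can run alongside sgdisk: both a classic MBR and the protective MBR of a GPT disk end their first sector with the boot signature bytes 55 aa at offset 510. The sketch below only looks for those two bytes. The function name has_boot_signature is made up for illustration, and the check assumes the flat.vmdk holds a partitioned disk image; it is not part of the procedure described in this post.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): succeed if bytes
# 510-511 of the given file are 55 aa, the MBR / protective-MBR signature.
has_boot_signature() {
    # Read exactly two bytes at offset 510 and print them as hex.
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Usage (file name taken from the script in this thread):
#   has_boot_signature Windows7-64-flat.vmdk && echo "signature found"
```

A missing signature after the 1 MB test run suggests that the skip value does not point at the start of the disk image.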

When the test is successful Dave can remove the exit from the script and run the full version.
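Before running the full version, the seek and count values can be checked by hand or with a few lines of shell. The sketch below uses only the seek= and count= numbers from the three full-run dd lines in the script and confirms that each piece starts where the previous one ended and that the pieces add up to the expected 32 GB. The variable names are illustrative.

```shell
#!/bin/sh
# Each entry is "seek count" in MiB, copied from the full-run dd lines.
expected=0
for seg in "0 6144" "6144 2048" "8192 24576"; do
    set -- $seg
    seek=$1
    count=$2
    # Every piece must begin exactly where the previous one stopped.
    if [ "$seek" -ne "$expected" ]; then
        echo "gap or overlap at seek=$seek (expected $expected)"
    fi
    expected=$((seek + count))
done
echo "total: ${expected} MiB = $((expected / 1024)) GiB"
```

With the numbers from this thread the pieces are contiguous and the total is 32768 MiB, that is 32 GiB.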

Sometimes the flat.vmdk is a tiny bit too short, which can be fixed by adding one or two empty MBs at the end.
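One possible way to add those empty MBs is to append zeros from /dev/zero. This is a minimal sketch, not necessarily the method used in this case; the function name pad_mib is hypothetical, and it is safest to work on a copy of the file.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): append N MiB of
# zero bytes to the end of a file without touching existing content.
pad_mib() {
    file=$1
    mib=$2
    # bs is given in bytes so that it does not depend on suffix support.
    dd if=/dev/zero bs=1048576 count="$mib" 2>/dev/null >> "$file"
}

# Usage (file name taken from the script in this thread):
#   pad_mib Windows7-64-flat.vmdk 1
```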

Hope that helps other folks who consider calling me for help in similar cases ...

Ulli


________________________________________________
Do you need support with a VMFS recovery problem ? - send a message via skype "sanbarrow"
I do not support Workstation 16 at this time ...

26 Replies
Ardaneh
Enthusiast

Hi

A VM flagged as invalid means something is wrong with the VMX file (or possibly other files). In my experience, the cause is usually a corrupted VMX file.

You can try the following steps:

- Browse the datastore and try to upload something to it to make sure everything is OK (or use the cp command to copy a VM folder from the source datastore to another one and see whether an error occurs)

- Remove the VMs from the inventory and register them again (if the VMX file is corrupted, you will see the related error)

- Create a new VM (with a fresh VMX file) for each of your VMs and attach the VMDK files of the old VMs to the new ones (if you find that the VMX file is corrupted)

Hope this is helpful.

davehii
Contributor

Thanks for the reply. Under 'Datastores' I see the view below ('No items to display'?), but as soon as I press 'Datastore browser' I get an error message.

pastedImage_0.png

pastedImage_1.png

Ardaneh
Enthusiast

Your datastore may be corrupted. You can check by connecting via SSH (or WinSCP) and trying to browse /vmfs/volumes/... If there is a problem with the datastore, you will see an error there.

davehii
Contributor

I tried to connect via SSH, but it looks like I cannot do so with the root account. Sorry, I did not manage to capture a screenshot of the error message.

Is there a guide for using SSH with ESXi 7.0? Thanks.

Ardaneh
Enthusiast

You have to enable SSH using the Web Client or the console:

Web Client >>> Manage >>> Services >>> SSH

Console >>> Troubleshooting Options >>> Enable SSH

davehii
Contributor

I have enabled SSH but got a "permission denied" error when I tried to log in from the command prompt and from PowerShell. Please advise.

pastedImage_2.png

pastedImage_1.png

Ardaneh
Enthusiast

Check these items:

- your user name must have the proper privileges to connect to your ESXi host via SSH

- try another SSH tool such as putty or SecureCRT

davehii
Contributor

OK, putty works now; here's the screenshot. What is the next step? Thanks.

pastedImage_0.png

continuum
Immortal

Your datastore is in such bad shape that ESXi no longer detects it as the datastore it remembers.

This can happen after a power failure, for example.

Next, check whether the physical device that was used for your datastore is available at all.

To do that, look in /dev/disks with putty.

If the physical device is listed there, create a VMFS header dump - see

Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay

With such a dump I can tell you what to do next.

By the way ... I highly recommend installing WinSCP now and learning how to navigate the ESXi filesystem.

Datastores are listed under

/vmfs/volumes

and physical devices are listed under

/dev/disks
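To pick out the candidate devices in a long /dev/disks listing, it can help to hide the partition entries. The sketch below assumes that partition entries carry a ":N" suffix (as in the ":1" used later in this thread) and prints only names without a colon. The function name list_whole_disks is made up for illustration; a plain "ls -l /dev/disks" shows the same information unfiltered.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the entries of a
# directory (default /dev/disks) whose names contain no ":" partition suffix.
list_whole_disks() {
    dir=${1:-/dev/disks}
    for path in "$dir"/*; do
        # Skip the unexpanded pattern when the directory is empty or missing.
        [ -e "$path" ] || continue
        name=${path##*/}
        case $name in
            *:*) ;;              # assumed to be a partition entry, skip it
            *) echo "$name" ;;
        esac
    done
}

# Usage on the ESXi host:
#   list_whole_disks
```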


davehii
Contributor

Please see the screenshot below.

I followed the link Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay

but I'm not sure which "device name" I need to use for the "dd" command. Please advise. Thanks.

pastedImage_0.png

continuum
Immortal

It looks like the device used by that datastore is the one starting with t10.ATA ....

So first try to run this command:

dd if=/dev/disks/DEVICE  bs=1M count=2500 skip=0  | gzip -c >  /tmp/fulldump.gz

Replace DEVICE with the full string starting with t10.ATA ....

ESXi often does not have enough free space in /tmp, so watch the command during execution.

If it prints the message "short write", the dump failed.

In that case, delete the created file and create the dump in smaller pieces instead:

Create an empty directory "davehii-esxi7-dump" and connect to ESXi with WinSCP.

Edit all the lines below so that they read dd if=/dev/disks/t10.ATA... - you need to use the full device name here.

Then run these commands one by one. After each command, download the new file to "davehii-esxi7-dump".

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=0  | gzip -c > /tmp/partdump-0.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=250  | gzip -c > /tmp/partdump-250.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=500  | gzip -c > /tmp/partdump-500.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=750  | gzip -c > /tmp/partdump-750.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=1000  | gzip -c > /tmp/partdump-1000.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=1250  | gzip -c > /tmp/partdump-1250.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=1500  | gzip -c > /tmp/partdump-1500.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=1750  | gzip -c > /tmp/partdump-1750.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=2000  | gzip -c > /tmp/partdump-2000.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=2250  | gzip -c > /tmp/partdump-2250.gz

dd if=/dev/disks/DEVICE  bs=1M count=250 skip=2500  | gzip -c > /tmp/partdump-2500.gz  
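The list of piece commands follows a simple pattern: a fixed count=250 and a skip that grows by 250 from 0 to 2500. The sketch below prints the same eleven commands for a given device name, so the full t10.ATA... string only has to be typed once. It prints the commands and does not run them, because each piece still has to be downloaded before the next one is created. The function name print_partdump_cmds is made up for illustration, and DEVICE is the same placeholder as in the post.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print one dd command
# per 250 MB piece, with skip running from 0 to 2500 in steps of 250.
print_partdump_cmds() {
    device=$1
    skip=0
    while [ "$skip" -le 2500 ]; do
        echo "dd if=/dev/disks/$device bs=1M count=250 skip=$skip | gzip -c > /tmp/partdump-$skip.gz"
        skip=$((skip + 250))
    done
}

# DEVICE is a placeholder; pass the full t10.ATA... name instead.
print_partdump_cmds DEVICE
```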

Next time you create a post like this, don't post the content of a putty session as an image file.

Instead, just copy the original text and paste it here.

Then I could have sent commands with the correct filename ... that helps to avoid errors ...

Add a text file to "davehii-esxi7-dump" and give me some details about the content you need most urgently (VM directory name, size of the vmdks, priority list ...).

Then zip the directory and upload it somewhere - make sure I do not need to create an account to access the dump.

Ulli


davehii
Contributor

Hi, I typed the command below but now get an input/output error. Please advise.

[root@localhost:~] dd if=/dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288 bs=1M count=2500 skip=0  | gzip -c >  /tmp/fulldump.gz

dd: /dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288: Input/output error

[root@localhost:~]

davehii
Contributor

Any help? I also tried running the commands one by one, but I see the same input/output error.

[root@localhost:~] dd if=/dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288 bs=1M count=250 skip=0  | gzip -c > /tmp/partdump-0.gz

dd: /dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288: Input/output error

[root@localhost:~]

continuum
Immortal

Hi Dave

Sorry, I was unavailable for a few days ...

Do you have another datastore that is still working, where we could create a Linux VM?

If the dump attempt causes an I/O error, we need a Linux VM to work around the problem.


davehii
Contributor

Hi, the disk holding the datastore with the error is fully partitioned, so there is no storage left. What I can think of is adding another HDD to the machine and installing a Linux VM there?

Or can I use another machine running Linux (not necessarily a VM)?

Reply
0 Kudos
continuum
Immortal
Immortal
Jump to solution

All I need is a Linux VM that has SSH access to the ESXi host. In theory, the Linux system can even be a physical machine on the other side of the planet.

I prefer to use this iso:

http://sanbarrow.com/livecds/moa64-nogui/MOA64-nogui-incl-src-111014-efi.iso

or

http://sanbarrow.com/files/isohybrid-VMsickbay180404-032520-efi.iso


davehii
Contributor

OK, I have set up the Linux VM on another machine, which only has ESXi 5.0. I hope this is OK. What is the next step?

pastedImage_0.png

continuum
Immortal

Call me via skype.

I expect a teamviewer session to your Windows admin host.

Install putty and WinSCP if you don't have those tools already.


davehii
Contributor

Thanks. I have sent you a Skype invite; please accept it.
