Hi, I am fairly new to ESXi and to this forum. Background of the problem: I have 4 virtual machines: Windows 7, Windows 10, Linux, etc. I was copying a 30 GB ISO file to the datastore last night. When I woke up I found the datastore browser in the web client timing out, so I'm not sure whether the 30 GB upload had completed. I then put the host into maintenance mode so I could restart it. However, after the restart I found that all 4 of my virtual machines had become inaccessible; see the screenshot below. Please advise what the next step is. Thanks.
Update:
Today we met via Skype and I explained how to create a VMFS header dump in a tricky environment using my Linux LiveCD.
We had to split the dump into 3 pieces as we were running out of free space in /tmp.
Once I had downloaded the dump I analysed it and offered Dave a short list of VMs that looked recoverable.
He told me that a Windows7-64 VM was his top priority.
OK - I created a small archive with the vmx file, nvram, vmdk descriptor file and so on.
Those small files are contained in such a vmfs header dump.
The larger Windows7-64-flat.vmdk obviously is not included in a 2 GB dump, so Dave gets a script like this one:
### script to manually extract a flat file to a new location
### please check the IF parameter
### please check the OF parameter
### put script into a new directory for the VM - no spaces in the name
### make script executable and run it
### if you run the script via putty launch it as "nohup script &"
IF="/dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288:1"
OF="Windows7-64-flat.vmdk"
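### test run: extract only the first MB to verify the parameters
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=0 skip=83985 count=1
### remove the following exit once the sgdisk test has succeeded
exit
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=0 skip=83985 count=6144
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=6144 skip=81937 count=2048
dd if="$IF" of="$OF" bs=1M conv=notrunc seek=8192 skip=90129 count=24576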
echo "### script finished" >> copy.log
When that script runs it first extracts the very first MB only.
I do that to check if the parameters are ok.
If they are OK, the Windows7-64-flat.vmdk can be tested with the Linux command
sgdisk -p Windows7-64-flat.vmdk
If my numbers are good, sgdisk should display the partition table, and that should match the expected size of the flat.vmdk - 32 GB in this case.
When the test is successful Dave can remove the exit from the script and run the full version.
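As a quick sanity check on numbers like the ones in the script, the full dd lines should describe fragments that follow each other without gaps in the output file and add up to the expected size. Below is a minimal sketch, assuming a plain POSIX shell. check_extents is a hypothetical helper name, not part of any tool mentioned in this thread. It reads "seek count" pairs (in MB) from stdin.

```shell
# Hypothetical helper: verify that each fragment starts where the
# previous one ended, then print the total size in MB.
# Input: one "seek count" pair per line, both values in MB.
check_extents() {
    expected=0
    while read -r seek count; do
        if [ "$seek" -ne "$expected" ]; then
            echo "gap or overlap at seek=$seek (expected $expected)" >&2
            return 1
        fi
        expected=$((seek + count))
    done
    echo "$expected"
}
```

Fed with the seek and count values from the three full dd lines (printf '0 6144\n6144 2048\n8192 24576\n' | check_extents) it prints 32768, which is 32 x 1024 MB.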
Sometimes the flat.vmdk is a tiny bit too short - which can be fixed by adding one or two empty MBs at the end.
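Adding the empty MBs could look like the following sketch. pad_flat is a hypothetical helper name; it assumes a shell with dd and /dev/zero available and simply appends zero-filled megabytes, leaving the existing data untouched. Work on a copy if you are unsure.

```shell
# Hypothetical helper: append N empty (zero-filled) megabytes to a file.
# Usage: pad_flat Windows7-64-flat.vmdk 2
pad_flat() {
    file="$1"
    mb="$2"
    # Appending with >> never overwrites data already in the file.
    dd if=/dev/zero bs=1048576 count="$mb" 2>/dev/null >> "$file"
}
```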
Hope that helps other folks who consider calling me for help in similar cases ...
Ulli
Hi
A VM with the invalid tag means there is something wrong with the VMX file (or maybe other files). In my own experience, it is usually the VMX file that has been corrupted.
You can do the following steps:
- Browse the datastore and try to upload something to it to be sure that everything is OK (or use the cp command to copy a VM folder from the source datastore to another one and see if there is an error)
- Remove the VMs from the inventory and register them again (if the VMX file is corrupted, you will see the related error)
- If you find that a VMX file is corrupted, create a new VM for each affected VM and attach the VMDK files of the old VM to the new one
Hope this is helpful.
Thanks for the reply. I see the below under 'Datastores' (no items to display?), but as soon as I press 'Datastore browser' I see an error message.
Your datastore may be corrupted. You can check by connecting via SSH (or WinSCP) and trying to browse /vmfs/volumes/...; if there is a problem with your datastore, you will see the error there.
I tried to SSH but it looks like I cannot do so with the root account? Sorry, I did not manage to capture a screenshot of the error message.
Is there a guide for enabling SSH on ESXi 7.0? Thanks.
You have to enable SSH by using the Web Client or the console:
Web Client >>> Manage >>> Services >>> SSH
Console >>> Troubleshooting Options >>> Enable SSH
I have enabled SSH but got a "permission denied" error when I tried to log in from Command Prompt and PowerShell. Please advise.
Check these items:
- your user must have the proper privileges to connect to your ESXi host via SSH
- try another SSH tool such as PuTTY or SecureCRT
OK, PuTTY works now; here's the screenshot. Please advise what the next step is. Thanks.
Your datastore is in such bad shape that ESXi no longer recognizes it as the datastore it remembers.
This can happen after a power failure for example.
Next, check whether the physical device that was used for your datastore is available at all.
To do that, look in /dev/disks with PuTTY.
If the physical device is listed there create a VMFS header dump - see
Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay
With such a dump I can tell you what to do next.
By the way ... I highly recommend installing WinSCP now and learning how to navigate inside the ESXi filesystem.
Datastores are listed under
/vmfs/volumes
and physical devices are listed under
/dev/disks
Please see screenshot below:
Following the link Create a VMFS-Header-dump using an ESXi-Host in production | VM-Sickbay
I'm not sure which "device name" I need to use for the "dd" command. Please advise. Thanks.
Looks like the device used by that datastore is the one starting with t10.ATA ....
So first try to run this command:
dd if=/dev/disks/DEVICE bs=1M count=2500 skip=0 | gzip -c > /tmp/fulldump.gz
Replace DEVICE with the full string starting with t10.ATA ....
ESXi often does not have enough free space in /tmp
so watch the command during execution.
If it prints the message "short write", the dump failed.
Delete the created file.
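A short write leaves a truncated archive behind. As a rough cross-check, the sketch below reports how many whole MB a dump decompresses to. It assumes gzip and wc are available wherever you run it (for example on the machine you download the dump to), and dump_mb is a hypothetical helper name. If dd was able to read everything, a complete dump from the command above should report 2500.

```shell
# Hypothetical helper: print the decompressed size of a gzip dump in whole MB.
# Usage: dump_mb /tmp/fulldump.gz
dump_mb() {
    # gzip -dc writes the decompressed data to stdout; wc -c counts the bytes.
    bytes=$(gzip -dc "$1" | wc -c)
    echo $((bytes / 1048576))
}
```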
Create an empty directory "davehii-esxi7-dump" on your admin machine and connect to ESXi with WinSCP.
Edit all of the lines below so that they read dd if=/dev/disks/t10.ATA... - you need to use the full device name here.
Then run these commands one by one - after each command download the new file to "davehii-esxi7-dump"
dd if=/dev/disks/DEVICE bs=1M count=250 skip=0 | gzip -c > /tmp/partdump-0.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=250 | gzip -c > /tmp/partdump-250.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=500 | gzip -c > /tmp/partdump-500.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=750 | gzip -c > /tmp/partdump-750.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=1000 | gzip -c > /tmp/partdump-1000.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=1250 | gzip -c > /tmp/partdump-1250.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=1500 | gzip -c > /tmp/partdump-1500.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=1750 | gzip -c > /tmp/partdump-1750.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=2000 | gzip -c > /tmp/partdump-2000.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=2250 | gzip -c > /tmp/partdump-2250.gz
dd if=/dev/disks/DEVICE bs=1M count=250 skip=2500 | gzip -c > /tmp/partdump-2500.gz
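The list above follows a fixed pattern (250 MB per part, skip increasing by 250), so it can also be generated once the full device name is known. Below is a minimal sketch, assuming a POSIX shell. print_part_commands is a hypothetical helper, and it only prints the commands, so they can still be run one by one.

```shell
# Hypothetical helper: print the split-dump commands for one device.
# Usage: print_part_commands FULL_DEVICE_NAME
print_part_commands() {
    device="$1"
    skip=0
    # Same range as the list above: skip=0 up to skip=2500 in steps of 250.
    while [ "$skip" -le 2500 ]; do
        echo "dd if=/dev/disks/$device bs=1M count=250 skip=$skip | gzip -c > /tmp/partdump-$skip.gz"
        skip=$((skip + 250))
    done
}
```

Pass the full string starting with t10.ATA as the argument; the output is the same eleven lines as above with DEVICE already replaced.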
Next time you create a post like this, don't post the content of a PuTTY session as an image file.
Instead just copy the original text and paste it here.
Then I could have sent commands with the correct device name ... helps to avoid errors ...
Add a text file to "davehii-esxi7-dump" and give me some details about the content you need most urgently (vm-directory-name, size of vmdks, priority list ...).
Then zip the directory and upload it somewhere - make sure I do not need to create an account to be able to access the dump.
Ulli
Hi, I typed the command below but now get an input/output error. Please advise.
[root@localhost:~] dd if=/dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288 bs=1M count=2500 skip=0 | gzip -c > /tmp/fulldump.gz
dd: /dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288: Input/output error
[root@localhost:~]
Any help? I also tried to run the commands one by one, but I see the same 'input/output error':
[root@localhost:~] dd if=/dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288 bs=1M count=250 skip=0 | gzip -c > /tmp/partdump-0.gz
dd: /dev/disks/t10.ATA_____WDC_WD2500BEVT2D22A23T0_______________________WD2DWXC1A4040288: Input/output error
[root@localhost:~]
Hi Dave
Sorry, I was unavailable for a few days ...
Do you have another datastore that still works, where we could create a Linux VM?
If the dump attempt causes an I/O error, we need a Linux VM to work around the problem.
Hi, the datastore that has the error is fully partitioned, so there is no storage left. What I can think of is adding another HDD to the machine and installing a Linux VM.
Or can I use another machine with Linux (not necessarily a VM)?
All I need is a Linux VM that has SSH access to the ESXi host - the Linux in theory can be a physical machine on the other side of the planet.
I prefer to use this iso:
http://sanbarrow.com/livecds/moa64-nogui/MOA64-nogui-incl-src-111014-efi.iso
or
http://sanbarrow.com/files/isohybrid-VMsickbay180404-032520-efi.iso
OK, I have set up the Linux VM on another machine, which only has ESXi 5.0; I hope this is OK. Please advise what the next step is.
Call me via Skype.
I expect a TeamViewer session to your Windows admin host.
Install PuTTY and WinSCP if you don't have those tools already.
Thanks. I have sent you a Skype invite; please accept.