Seventh77's Posts

I'm having an odd issue after having to rebuild my VCSA that crashed. I did a fresh install for stability, and am simply trying to add my hosts to the vCenter as usual. Relevant info:
- All systems have static IP addresses, are on the same subnet, and can all communicate.
- No firewalls or other traffic filtering in between.
- Credentials are 100% correct across the board.
- DNS is working between all hosts (host to VCSA, VCSA to host).
In my VCSA I go through the Add Host dialog as normal. I add the host using FQDN, give it credentials, and it connects to the host and gives me the host summary correctly (showing ESXi version, Virtual Machines w/names, server model, etc). This tells me that the credentials are correct - it pulls all of the info as it should. I then assign the license normally, leave Lockdown mode disabled, and give it a location in the one default datacenter. I get to the "Ready to Complete" screen normally - it shows the summary, datastores on the hypervisor, and networks.
At this point I click Finish, and immediately get:
Operation Failed!
Task name: Add standalone host
Status: Cannot complete login due to an incorrect user name or password.
This happens regardless of whether I try to add the ESXi host by FQDN or IP address. It happens in Firefox, Safari, Microsoft Edge and Chrome, and I have no ad blockers or content filters installed on the browsers I'm trying. I'm using a dedicated machine on the same subnet as the VCSA.
I am 100% positive that my credentials are correct, and I am using the root user to add the host. I have logged in as root to the host via the console and via SSH with these credentials and have verified them. It only happens with one specific host - I have added others normally with no issue.
What could the problem be? Any insight would be appreciated - thank you.
Marco's questions from another thread:
- Have you changed anything on the cluster?
- Added HA/DRS, created more VMs, etc?
- What is your VM growth rate per month?
- Have you changed log/statistics settings for vCenter?
- Can you check if you don't have a lot of snapshots on the environment? (on SSH, do a "find /vmfs/volumes/ -name *delta*")
Answers:
- No, nothing has changed on the cluster.
- No changes to HA/DRS, and no new VMs in the last month.
- Not sure off the top of my head, but my cluster is running at about 50% of its total potential.
- I have not changed logs/stats (or anything else).
- There are actually no snapshots at all on the environment.
I started a new thread for this, so that this one can stay as a complete/answered thread. I'll answer you there - thanks!
Multiple host disconnects with "failed to crossdup fd xxx" errors in vmkernel.log
I lost two hosts in my ESXi 5.0 enterprise cluster a few days ago, and found the solution to a related issue in this thread: https://communities.vmware.com/thread/464597
However, now another host in that same cluster has the same problem. That makes two blades and one physical server, all in the same cluster, all on ESXi 5.0 running the standard VMware build, that have gone down within a matter of a few days. vmkernel.log on each of the offending hosts is full of this:
---
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 12, /vmfs/devices/char/vob/VM type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 13, /vmfs/devices/char/vob/External type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 14, /vmfs/devices/char/vob/iScsi type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 15, /vmfs/devices/char/vob/Migrate type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 16, /vmfs/devices/char/vob/PageReti type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 17, /vmfs/devices/char/vob/Visorfs type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 18, /vmfs/devices/char/vob/Hardware type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 19, /vmfs/devices/char/vob/Vfat type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 3232: Unimplemented operation on 0x4100233874b0/SOCKET_UNIX_SERVER
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 20, /var/run/vmware/vobd-user-ctx.s type SOCKET_UNIX_SERVER: Not implemented
---
Once that happens, the host cannot be reconnected to the cluster until I make the vpxa.cfg edits in the thread I linked above. Obviously this is not an acceptable solution, because the host is down until I am around to make the edits, restart the services and reconnect the host. Why is this happening, and how can I further troubleshoot it?
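For anyone comparing several hosts, a small script can make it easier to see which object types the crossdup warnings hit and how often. This is only a rough sketch in Python: the regular expression is inferred from the log lines quoted above rather than from any documented vmkernel.log format, and the name summarize_crossdup is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in the post; this is an
# assumption, not an official description of the vmkernel.log format.
CROSSDUP = re.compile(
    r"^(?P<ts>\S+) cpu(?P<cpu>\d+):(?P<world>\d+)\)WARNING: UserObj: \d+: "
    r"Failed to crossdup fd (?P<fd>\d+), (?P<path>\S+) "
    r"type (?P<type>\w+): (?P<reason>.+)$"
)


def summarize_crossdup(lines):
    """Count crossdup failures grouped by (object type, reason)."""
    counts = Counter()
    for line in lines:
        match = CROSSDUP.match(line.strip())
        if match:
            counts[(match["type"], match["reason"])] += 1
    return counts


# Three lines copied from the post: two crossdup failures and one
# unrelated warning that the pattern should ignore.
sample = [
    "2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 12, /vmfs/devices/char/vob/VM type CHAR: Busy",
    "2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 3232: Unimplemented operation on 0x4100233874b0/SOCKET_UNIX_SERVER",
    "2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 20, /var/run/vmware/vobd-user-ctx.s type SOCKET_UNIX_SERVER: Not implemented",
]

for (obj_type, reason), n in summarize_crossdup(sample).items():
    print(f"{obj_type}: {reason} x{n}")
```

To run it against a real log, pass an open file handle instead of the sample list. It only counts the warnings; it says nothing about why they occur.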
Update to this - I've now had another server in this same cluster go down with the exact same problem - so that makes two blades and one physical server all with the same issue (and the same fix). While it's nice to know how to fix this - why is this happening? This is way too much downtime.
Aha! I found this KB article which sorted it out: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2031894 It's worth mentioning that (for helpful Googling) you'll need to chmod 777 /etc/vmware/vpxa/vpxa.cfg before you can edit it, and then chmod 444 it once finished. Restarted the vpxa service and I was able to add the host again.
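The chmod / edit / chmod sequence is easy to leave half done, so here is one way to make sure the file always ends up read-only again. It's a minimal Python sketch under assumptions: the helper name temporarily_writable is invented, the modes 777 and 444 are simply the ones mentioned above, and the demo runs against a throwaway temp file rather than the real /etc/vmware/vpxa/vpxa.cfg. The actual change to make inside the file is whatever the KB article describes and is not reproduced here.

```python
import os
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def temporarily_writable(path, open_mode=0o777, final_mode=0o444):
    """Open up permissions on path, then lock it down again afterwards.

    The finally block restores final_mode even if the edit raises.
    """
    os.chmod(path, open_mode)
    try:
        yield path
    finally:
        os.chmod(path, final_mode)


# Demo on a throwaway file standing in for vpxa.cfg (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    cfg = os.path.join(tmp, "vpxa.cfg")
    with open(cfg, "w") as fh:
        fh.write("<config/>\n")
    os.chmod(cfg, 0o444)

    with temporarily_writable(cfg) as path:
        with open(path, "a") as fh:
            fh.write("<!-- edited -->\n")

    mode_after = stat.S_IMODE(os.stat(cfg).st_mode)
    with open(cfg) as fh:
        contents = fh.read()

print(oct(mode_after))
```

The vpxa service still has to be restarted afterwards, as described above.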
I don't see an IP address at all in /etc/vmware/hostd/config.xml, or even a field where it looks like it should be. How do I go about rebuilding it? I'll try anything at this point.
Thanks for the replies. Here's a tail of vpxa.log on one of the two hosts that have the issue:
---
2013-12-02T15:03:34.474Z [3E819B90 verbose 'Default' opID=WFU-4e735eb6] [VpxaInvtHost] Increment master gen. no to (111): VmSnapshot:CreateMoVm
2013-12-02T15:03:34.474Z [3E819B90 verbose 'Default' opID=WFU-4e735eb6] [VpxaInvtHost] Increment master gen. no to (112): VmLayout:CreateMoVm
2013-12-02T15:03:34.474Z [3E819B90 verbose 'Default' opID=WFU-4e735eb6] [VpxaInvtHost] Increment master gen. no to (113): VmStorage:CreateMoVm
2013-12-02T15:03:34.475Z [3E819B90 verbose 'Default' opID=WFU-4e735eb6] [VpxaInvtHost] Increment master gen. no (114): VmAdded
2013-12-02T15:03:34.475Z [3E819B90 info 'Default' opID=WFU-4e735eb6] [VpxaMoHost::QueryOverheadEx] Found file backing info for device 2000 of type vim.vm.device.VirtualDisk, removing vpxd moref vim.Datastore:10.86.254.251:/vol/nfs_fas2020a before passing to hostd
2013-12-02T15:03:34.475Z [3E819B90 info 'Default' opID=WFU-4e735eb6] [VpxaMoHost::QueryOverheadEx] Found network backing info for device 4000 of type vim.vm.device.VirtualE1000, removing vpxd moref vim.Network:HaNetwork-INSOC-W-VLAN before passing to hostd
---
Here's vmkernel.log:
---
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 12, /vmfs/devices/char/vob/VM type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 13, /vmfs/devices/char/vob/External type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 14, /vmfs/devices/char/vob/iScsi type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 15, /vmfs/devices/char/vob/Migrate type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 16, /vmfs/devices/char/vob/PageReti type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 17, /vmfs/devices/char/vob/Visorfs type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 18, /vmfs/devices/char/vob/Hardware type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 19, /vmfs/devices/char/vob/Vfat type CHAR: Busy
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 3232: Unimplemented operation on 0x4100233874b0/SOCKET_UNIX_SERVER
2013-12-02T15:05:29.561Z cpu14:3200)WARNING: UserObj: 675: Failed to crossdup fd 20, /var/run/vmware/vobd-user-ctx.s type SOCKET_UNIX_SERVER: Not implemented
---
No zeroes on the ramdisk, plenty of space open. This is a standard VMware build, and as I said these two hosts have worked fine for months now. They simply showed up disconnected, and I'm trying to re-add them to the cluster. I've tried it with EVC enabled and disabled, no change - same error.
I have a cluster of 15 ESXi 5.0 hosts, with a 5.1 vCenter / Enterprise Plus license. This has been running well for quite some time, but two of my hosts were disconnected tonight and I am troubleshooting it now. When I go to reconnect them, I get an error saying that "A general system error has occurred: Timed waiting for vpxa to start". I did some searching and found that this was generally related to snapshots, but none of the VMs on either of my un-connectable hosts have any snapshots at all.
I've tried:
- Restarting vCenter services
- Rebooting vCenter
- Restarting services on the hosts
- Warm reboot of hosts
- Hard/cold reboot of hosts
- Powering off all VMs on hosts and entering maintenance mode
I've also confirmed:
- DNS is working between hosts and vCenter, and vice versa
- Time is correct on vCenter and hosts
- Network connectivity is good between vCenter and hosts (all on the same switch)
However nothing seems to work, and I still can't add these hosts back to my cluster. What's odd is that I can connect to them directly with vSphere, but I just can't get them back into my vCenter. I get through the usual prompts when adding a host (where it asks you to assign a license, etc) and it sees the VMs on the host as I'm adding it, but it times out with this error after about 5 minutes. Any insight would be very much appreciated.
This helped me as well, thank you.
Also worth noting, if I run a 'netstat -antp' on my VDR appliance, I see a lot of connections from my vCenter IP, with the status FIN_WAIT2. Edit: I've also tried migrating the VDR appliance so that it's on the same physical host as my vCenter, no change.
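In case it helps anyone quantify "a lot of connections", here is a rough Python sketch that tallies TCP states for one remote address from saved 'netstat -antp' output. It assumes the usual Linux net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function name count_states is invented, and the sample text uses made-up documentation-range addresses and ports, not output from my appliance.

```python
from collections import Counter


def count_states(netstat_output, remote_ip):
    """Tally TCP connection states for connections whose peer is remote_ip."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp/tcp6 row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Foreign Address looks like "ip:port"; drop the port.
        foreign_ip = fields[4].rsplit(":", 1)[0]
        if foreign_ip == remote_ip:
            counts[fields[5]] += 1
    return counts


# Hypothetical sample; 192.0.2.10 stands in for the vCenter IP.
sample = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1234/sshd
tcp        0      0 192.0.2.20:9000         192.0.2.10:50111        FIN_WAIT2   -
tcp        0      0 192.0.2.20:9000         192.0.2.10:50112        FIN_WAIT2   -
tcp        0      0 192.0.2.20:22           192.0.2.30:40222        ESTABLISHED 2345/sshd
"""

states = count_states(sample, "192.0.2.10")
print(dict(states))
```

Running it a few minutes apart would show whether the FIN_WAIT2 count keeps growing or stays flat.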
Yes - I reinstalled the plug-in, and even stood up vSphere on a completely fresh client and did a full vSphere/plug-in installation there. No change - I have 3 different systems now (my usual vSphere console, vCenter itself, and the fresh install) and I get the same error message in all three of them.
Hello, I'm having a heck of a time getting VDR working on one of my clusters and have been searching these forums trying just about everything I can think of. Hopefully someone can shed some light on things for me. I'm running vCenter 5 with an Enterprise Plus license, with VDR 2.0.1, connecting with vSphere 5.0.0. My hosts are ESXi 5.0. The error I get in vSphere is "The Data Recovery Service did not start up. If the problem persists, please restart or redeploy the Data Recovery Appliance". Here are some details:
- I have restarted it several times
- Redeployed it about a dozen times
- Tried both version 2.0.0 and 2.0.1 of the appliance itself
- I logged into the VDR appliance and can verify that the service is definitely started
- DNS is working correctly between my vCenter, VDR and vSphere client (both ways)
- VDR has a static IP and I can ping back and forth between vCenter/VDR/vSphere with no problems
- I tried running vSphere with the VDR Plug-in on my actual vCenter machine, no change
- Ensured that the plug-in is the latest, correct version (reinstalled it and installed it on a fresh client, just in case)
I'm looking at the logs, and I do see this: "Rejecting non-SSL connection from client (my vSphere IP)"
I found a VMware KB article that says this error is caused by an outdated plug-in version, but I've already covered that step and both my VDR appliance and my Plug-In are from the same installation media. I'm at a loss as to what I can try next - any insight would be appreciated.
Should also note: My hosts are on an Enterprise Plus license, and are currently running ESXi 4.
Hello, My vCenter is version 5.0.0, running on Windows 2008 Server R2. I recently upgraded my vCenter from 4.1, and am trying to update VUM so that I can patch my hosts. I have downloaded and tried every version of the VIM ISO that I can find:
- 4.0.0
- 4.1.0
- 5.0.0
- 5.1.0
And no matter which of the installers I use, I get the error message "the vcenter server is incompatible with the vmware update manager". If none of those versions are the correct one, which one do I need to update VUM on vCenter 5.0.0? And why aren't updated releases backwards compatible? Very frustrating - any insight would be appreciated. Thanks!
I have the latest VDR running on vCenter 5, with a single NFS mount as a destination. I'm backing up about 100 VMs spread across about a dozen backup jobs. Inside of those jobs, VDR is only backing up about 2/3 of the selected machines. If I look at the Reports tab, it says that the job completed successfully with no errors, but many VMs are missing from the Restore tab. When I check the Virtual Machines section of that tab, I see a ton of the VMs that I have selected in my backup jobs, but "Last Backup" states Never. All of my jobs "completed" according to VDR, but out of 100 VMs I still have about 25 of them that aren't backed up. They are in various states (some are powered off, some are running).
I tried Backup Now > All Sources and Backup Now > Out of Date Sources, but nothing happens - no new backup task starts up, nothing. Even though some of the VMs did not back up at all, the jobs do not have any out of date sources listed. I've tried recreating and re-running the jobs as well, but the same problem persists. My logfiles have no warnings or errors. What am I missing? Why is VDR skipping these VMs? Any insight would be very much appreciated.
I have a two-system network running under ESXi 5, set up as follows:
Physical server > 192.168.1.100 > cabled to a layer 2 switch, Port 10
VM on a vSwitch > 192.168.1.101 > host vmnic > cabled to the same layer 2 switch, Port 11
From the physical machine, I can ping the VM. From the VM, I can't ping the physical machine. I have Wireshark running on both, and I see the ICMP echo request get from the VM to the physical machine, the echo reply gets sent, but the VM never receives it. So:
VM > ICMP > vSwitch > Physical Switch > Server = Good
Reply from Server > Physical Switch > Dies at the vSwitch
Both servers are Windows 2008, with the Windows firewall disabled and the service stopped. Subnet mask is 255.255.255.0 on both. What would stop the replies from my physical server from getting to the VM on the vSwitch? I can ping it from the physical side, so the round trip:
Physical ICMP > Virtual > Reply to Physical = Works
Virtual ICMP > Physical > Reply back to Virtual = Dies there
I have Promiscuous Mode, MAC Address Changes and Forged Transmits all set to Accept in the vSwitch properties. Any insight would really be appreciated - I'm out of ideas.
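As a sanity check on the addressing, the two IPs and the 255.255.255.0 mask from the post can be run through Python's standard ipaddress module. This is just a small sketch to rule out a subnet mismatch; the helper name same_subnet is made up, and it does not explain where the replies are being dropped.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """Return True if both addresses fall in the same network for netmask."""
    # strict=False lets us pass a host address and get its containing network.
    net_a = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{netmask}", strict=False)
    return net_a == net_b


# Addresses and mask taken from the post above.
on_same_net = same_subnet("192.168.1.100", "192.168.1.101", "255.255.255.0")
print(on_same_net)
```

Since both machines sit in 192.168.1.0/24, neither side needs a gateway to reach the other, so the reply should go straight back at layer 2.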
I restarted the appliance, reclaim went through, but now all of my backup jobs won't run... I've verified that the VM selection is correct and that the destinations are good, but any attempt to run a job gets me this error: "This backup job does not contain any sources"
Hi folks, I've had nothing but problems with VDR since I upgraded my licenses specifically to use it. I have two brand new 10Gb SAN targets, and I can't get this software to work for more than 2-3 days in a row without having to troubleshoot some random problem. Right now one of my targets has been stuck at Reclaim 0% for two days, and the other target is at 0% on an Integrity Check after the same amount of time. For every two good backups I get, I have to sit through one or two integrity checks that take almost 24 hours. My backup store is about 80 VMs total, and none of them are over 100GB in size. What's the procedure for stopping the reclaim/integrity check so that my backups can proceed? I tried stopping the process, but after 24 hours it still does nothing.
I managed to get past this (ignored the add hard disk part, just added the destination). VDR picked up that there was an existing catalog, but now I am stuck again. The add of the catalog immediately fails the integrity check, saying that the destination index is invalid/damaged, and that it will be locked until the integrity check succeeds. So I found this KB: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1018060
But again... I cannot log into the (just deployed) console using root/vmw@re, as the instructions say. I just get a login incorrect at the console. What the heck am I missing here? I just deployed the OVF 5 minutes ago and this is my first login. From the 2.0 admin guide: "If this is the first time logging on to the backup appliance, the default credentials are username: root, password: vmw@re." Why isn't that working? I can log into the web console with those credentials, but not the actual Linux console so that I can remove the locks.
(Edit: I got past this part... If I use the vSphere console with those credentials, I can't log in. If I SSH to the appliance remotely, I can. Hopefully this helps someone, but that makes no sense to me.)
However, I'm back to my original problem. The KB says to go to the location below, but I don't have that location on my appliance, presumably because I can't find the right hard drive to add to the machine... I'm stuck again.
# cd /SCSI-0\:1/VMwareDataRecovery/BackupStore/
# rm -rf store.lck
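If the path from the KB doesn't exist, one option is to search for the lock file rather than guess where the destination is mounted. Below is a minimal Python sketch, assuming the lock file is named store.lck as in the commands above; the function name find_lock_files and the directory layout in the demo are made up for illustration. It only lists matches and deletes nothing.

```python
import tempfile
from pathlib import Path


def find_lock_files(root, name="store.lck"):
    """Return every regular file called `name` under root, as sorted paths."""
    return sorted(p for p in Path(root).rglob(name) if p.is_file())


# Demo against a throwaway tree; "dest" is a hypothetical mount point name.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "dest" / "VMwareDataRecovery" / "BackupStore"
    store.mkdir(parents=True)
    (store / "store.lck").touch()
    (store / "other.dat").touch()
    found = [p.relative_to(tmp).as_posix() for p in find_lock_files(tmp)]

print(found)
```

On a real appliance the root to search would be wherever destinations get mounted, which I have not been able to confirm. Whether Python is even available on the VDR appliance is also an assumption.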