TheBobkin's Accepted Solutions

@CarlPower, the latest Witness appliance is 7.0 U3l, available here: https://customerconnect.vmware.com/downloads/details?downloadGroup=VC70U3O&productId=974#drivers_tools Witness appliances don't get a new package for every ESXi release, so to run a later build than that you will need to patch the Witness with a standard VMware ESXi patch.
@MikeSmarz If it is a 3-node cluster and all nodes have Disk-Groups/Storage-Pools then that is already sufficient for creating RAID1,FTT=1 objects and doesn't require any additional node or Witness.
Have a read through this - it covers a lot more detail than the VMware docs page: https://core.vmware.com/resource/vsan-encryption-services
"What happens if vCenter is offline/failed?" vCenter is only used for initial configuration and KMS trust establishment - after this the hosts communicate directly with the KMS, so vCenter being down has no consequence other than that you can't make changes to the KMS configuration.
"What happens if 1 KMS is offline/failed?" This depends entirely on the KMS-side configuration - ideally it should be a properly configured redundant KMS cluster with all nodes able to provide all keys. However, I have seen situations where administrators thought this was the case but it was not: keys were unavailable because the one KMS that was down was the only one holding those specific keys.
"What happens if both KMS are offline/failed?" Nothing, unless vSAN nodes are rebooted or some other change unmounts and remounts Disk-Groups. Avoid doing this if at all possible until the KMS issue is resolved - if it is done, the affected Disk-Groups will be locked until the keys are available again.
@MJMVCIX Looking at the screenshot more closely (not being on a phone helps), I don't think this is due to the issue I mentioned, as that issue resulted in the R5 and R6 calculations being 1. the same and 2. approximately what an FTT=0 usage would be.
Can you confirm 100% that the policy 'R5 FTT1 Thin' does actually have Object Space Reservation set to 0? I ask because policy names are just labels and the policy rules might differ from what the name suggests (they should match, of course, but anyone can change the rules so that they don't). The reason I am asking is that if you treat everything as OSR=100 then the calculations look roughly accurate. Can you also confirm which policies all the data in this cluster is currently using - you indicated likely RAID1,FTT=1, but please check.
"How can there be effective free space of 298.66TB for new Workloads with RAID5 when there is only 193.86 TB Free?" This is saying how much space would be free in the cluster if you used the 'R5 FTT1 Thin' policy for all current data instead of what is currently used.
You mentioned "142.2 TB provisioned Disk" - just to clarify, 'provisioned' generally means the full size of the vmdk as the VM sees it, e.g. a 500GB vmdk. This says nothing about the actual physically written data, which could be 1MB or 500GB, and that matters as you indicated you want to store this with a thin policy.
The percentages for Host rebuild reserve and Operations reserve, and what they are based on, are well explained in the link below. These won't change when you move more data to this cluster, so 142.2 x 1.33 = approximately 189TB is how much that new data should use (assuming by provisioned you meant physically written): https://core.vmware.com/blog/understanding-reserved-capacity-concepts-vsan
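For anyone wanting to redo the arithmetic above with their own numbers, here is a minimal Python sketch. Only the 1.33 multiplier for RAID5,FTT=1 comes from the post; the RAID1 (2x) and RAID6 (1.5x) multipliers are the usual figures added for comparison, and the function and dictionary names are made up for illustration. It ignores Host rebuild reserve, Operations reserve and any Deduplication&Compression savings, so treat the result as a rough estimate only.

```python
# Rough raw-capacity estimate for data stored with a given vSAN policy.
# Assumption: written_tb is physically written data, not provisioned size.

POLICY_MULTIPLIER = {
    "RAID1_FTT1": 2.0,   # two full replicas (added for comparison)
    "RAID5_FTT1": 1.33,  # 3 data + 1 parity, rounded as in the post
    "RAID6_FTT2": 1.5,   # 4 data + 2 parity (added for comparison)
}


def raw_capacity_tb(written_tb: float, policy: str) -> float:
    """Return the approximate raw capacity consumed by written_tb of data."""
    return written_tb * POLICY_MULTIPLIER[policy]


if __name__ == "__main__":
    for policy in POLICY_MULTIPLIER:
        print(f"{policy}: {raw_capacity_tb(142.2, policy):.1f} TB")
```

The same data would need noticeably more raw space under RAID1,FTT=1, which is why confirming the policy actually in use matters before sizing.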
@MrCheesecake, a couple of options that might suit:
- Create a VM, attach all of these vmdks to it, back it up, then detach the vmdks.
- Create a VM, attach all of these vmdks to it, clone it, SvMotion the clone to the other vSAN cluster, then detach the vmdks from both VMs.
It is also possible to use vmkfstools to clone a vSAN object to a file or to another object (though not across vsanDatastores, unless HCI Mesh is used, I guess).
@SamArun "Can we size a vSAN Cluster having 5-Nodes with RAID6 and FTT=1?" There is no such thing as RAID6,FTT=1 - RAID6 is an FTT=2 policy and requires a minimum of 6 nodes in the cluster for component placement, whereas RAID5,FTT=1 requires a minimum of 4 nodes.
"But the design is currently with 4-Nodes which allows us to use RAID-5 whereas my intention is to use RAID-6 by adding 1-additional host" You would need to add at least 2 nodes (6 nodes in total) to use RAID6,FTT=2. Also note that you should always have N+1 where possible, e.g. if a policy requires a minimum of 6 nodes then ideally you should have 6+1 (7) nodes in the cluster.
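The node-count rules above can be written down as a small lookup, sketched here in Python. The table only contains the combinations mentioned in these posts (RAID1,FTT=1 on 3 nodes, RAID5,FTT=1 on 4, RAID6,FTT=2 on 6); the names `MIN_NODES`, `Verdict` and `check_policy` are hypothetical and not part of any VMware tool.

```python
from typing import NamedTuple

# Minimum nodes (with storage) needed for component placement.
# Assumption: only the policy combinations discussed above are listed.
MIN_NODES = {
    ("RAID1", 1): 3,
    ("RAID5", 1): 4,
    ("RAID6", 2): 6,
}


class Verdict(NamedTuple):
    valid_policy: bool  # is this RAID/FTT combination in the table at all?
    min_nodes: int      # minimum nodes for placement (0 if not in the table)
    placeable: bool     # cluster meets the minimum
    has_spare: bool     # cluster meets the N+1 recommendation


def check_policy(raid: str, ftt: int, nodes: int) -> Verdict:
    """Check a cluster size against the minimum and N+1 node counts."""
    minimum = MIN_NODES.get((raid, ftt))
    if minimum is None:
        return Verdict(False, 0, False, False)
    return Verdict(True, minimum, nodes >= minimum, nodes >= minimum + 1)


if __name__ == "__main__":
    print(check_policy("RAID6", 1, 5))  # not a valid combination
    print(check_policy("RAID6", 2, 5))  # valid policy, too few nodes
    print(check_policy("RAID6", 2, 7))  # placeable with a spare node
```

A 5-node cluster therefore fails the RAID6 check either way: RAID6,FTT=1 is not a real combination, and RAID6,FTT=2 needs 6 nodes.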
@abaucom555 ESXi licenses are applied at the host level, so if the hosts are running ESXi 7 then a version 7 license is fine. vSAN, however, is licensed at the cluster level, and the cluster is a vCenter entity, so the required license version for vSAN is governed by the vCenter version - you cannot apply a version 7 vSAN license if the vCenter managing the cluster is on version 8. Your issue here is completely expected behaviour.
@cemalettin, the vsanDatastore showing 0B usable might indicate the cluster was never fully formed, e.g. the nodes were not able to communicate with one another over the vSAN network (perhaps it was not configured, or not configured properly).
If the disks still have vSAN partitions on them, you don't need to remove these if the Disk-Groups are intact and usable - this is easily checked either via Cluster > Configure > vSAN > Disk Management or by running this on the node via SSH:
# vdq -Hi
If the Disk-Group is not intact (e.g. the Cache-tier device has had its partitions wiped but the Capacity-tier devices still have partitions) and you are sure you don't need any data from that Disk-Group, then these can be erased by multiple means, such as 'dd' or esxcli:
# esxcli vsan storage remove -u <UUIDofDisk>
or for a Capacity-tier disk:
# esxcli vsan storage remove -d <NaaofDisk>
or for a Cache-tier disk:
# esxcli vsan storage remove -s <NaaofDisk>
or from the UI: Host > Configure > Storage Devices > select the device > Erase partitions
@JulienR33, No, there is no license-level exception for 2-node for any feature other than 'stretched-cluster'. If you had an Enterprise or Evaluation license on this cluster previously then it would of course have allowed enabling Deduplication&Compression and creating Disk-Groups with it enabled. If you then put a Standard license on the cluster, it will not remove the existing Disk-Groups that have this enabled; however, creating/recreating Disk-Groups or adding a disk to an existing one will not be able to use Deduplication&Compression. So if you are planning on staying on a Standard license, you should proactively disable this feature and recreate the Disk-Groups (which is part of the automated workflow after you disable it and remediate). Ensure you have enough space to do so, as the data will likely take up more space when 'rehydrated'.
@orddie, Cache:Capacity ratios refer to the size of the Cache-tier device relative to the total Capacity-tier size of the Disk-Group, not to the number of Capacity-tier devices in the Disk-Group.
E.g. a 400GB Cache-tier device + 2x 2TB Capacity-tier devices in a Disk-Group would be a 10% ratio.
Not to over-complicate things, but do note that these ratios are usually based on used capacity, not available capacity.
If you are going with stripe-width=2 and a single Disk-Group per node in a 3-node cluster, then that would require a minimum of 2 Capacity-tier devices in each Disk-Group.
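As a quick illustration of the ratio described above, here is a small Python sketch. It assumes decimal units (2TB = 2000GB) and the function name is made up for the example. Pass used capacity rather than raw device sizes if you want the ratio on the basis the post says is usually used.

```python
from typing import Sequence


def cache_ratio_percent(cache_gb: float, capacity_gb: Sequence[float]) -> float:
    """Cache-tier size as a percentage of the total Capacity-tier size.

    capacity_gb holds one entry per Capacity-tier device in the Disk-Group;
    only their sum matters, not how many devices there are.
    """
    total = sum(capacity_gb)
    if total <= 0:
        raise ValueError("total capacity must be greater than zero")
    return 100.0 * cache_gb / total


if __name__ == "__main__":
    # The example from the post: 400GB cache + 2x 2TB capacity devices.
    print(f"{cache_ratio_percent(400, [2000, 2000]):.0f}%")
```

Splitting the same 4TB across four 1TB devices gives the same ratio, which is the point being made about device count not mattering.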
@dbutch1976 Was the cluster shutdown wizard run at some point here? If nodes are not getting unicastagent entries pushed to them then maybe they are set to ignore these updates - this can be checked with the first two commands below, and both settings put back to their default value of 0 with the last two:
# esxcfg-advcfg -g /VSAN/DOMPauseAllCCPs
# esxcfg-advcfg -g /VSAN/IgnoreClusterMemberListUpdates
# esxcfg-advcfg -s 0 /VSAN/DOMPauseAllCCPs
# esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
After setting these to 0, try moving the node out of the cluster and then back in again. These lists can also be populated manually, but getting vCenter to do this is preferable.
@mak14 the fastest method would be to connect the hosts in this new cluster to the IBM storage datastore and Storage vMotion all the data from that datastore onto the vsanDatastore. If that storage cannot be connected to the new hosts and is connected to the old hosts only, then you can do a shared-nothing vMotion of the data and VMs from one to the other.
@cemalettin Yes, otherwise you won't be able to vMotion between the sites and would instead have to shut down, unregister and re-register VMs whenever you want to do maintenance on nodes.
@Tibmeister "the NIC needs to be on the vSAN HCL and not just the standard ESXi HCL" - only if using RDMA for vSAN network, otherwise regular ESXi HCL/VCG is all that needs to be met.
@microlytix, (unless there was some change I am unaware of) No it won't vMotion the VM, it will just read from the remaining replica in the other site, start repairing the other replica (assuming it is marked as degraded), then switch to reading from that new local replica once it is active.   If you ever want a good smashy smashy place to test failure behaviour, you can always use VMware HOL labs (though the vSAN ones are still on 7.0 U2 currently unfortunately).
@AndreaScarabell, I can't see any issues with that - ReadyNode allows adding additional ESXi-supported NICs, and you aren't using them for vSAN traffic, so there should be no strict requirements on bandwidth/throughput etc.
@lspin, Odd one, but I have seen it a few times. Were the disks hot-swapped/hot-removed, or removed with the server powered off? If it was not done with the server off then there is potentially still some process trying to do something with a disk that is no longer there. If you can test with one node, put it in Maintenance Mode and cold reboot it (e.g. from iLO/iDRAC, not from the vSphere client).
@anandgopinath "will we loose a VM for ever (recoverable only from backups) in case of a cache disk / capacity disk or full disk group failure?" Yes, as you have just lost the only copy of the data.
"as per the stretched cluster guide, it says vm will survive a disk or diskgroup failure by moving data to the other disk group, disks. is this true" How would it move anything? The only replica is gone, so there is nothing to read from. What you read was likely referring to PFTT=1 or SFTT=1 data, where there is still an available replica and somewhere valid to recreate a second replica to replace the lost one.
@anandgopinath Yes it is fine, assuming the policy you are changing it to can be compliant with the cluster (e.g. a RAID6,FTT=2 policy requires All-Flash + Advanced or higher vSAN license + a minimum of 6 nodes with storage) and has adequate redundancy (e.g. don't change it to FTT=0). This also shouldn't incur much resync as these objects typically don't consume much actual space.
@usmabison and anyone else interested in such things - I created a KB article providing more information on this topic and also some troubleshooting tips: https://kb.vmware.com/s/article/91689