I deployed a 3-node Medium cluster and had it running for a few days before I started to notice warnings in vROps that storage was running low.
Upon checking the available storage in the LI System Monitor for all 3 nodes, I noticed that one was showing only 7.3GB available, compared to 233GB on the other 2 nodes.
I ran a support bundle and, looking through the files, checked the df output and noticed that the /storage/core and /storage/var mounts were missing from the node with 7.3GB available.
I have put the node into Maintenance mode and rebooted it, but the mounts have not automatically returned.
Is there any way to restore them?
Is this a known issue?
---
df details for reference.
df.11422.txt
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 8115648 2798036 4905352 37% /
udev 8233676 112 8233564 1% /dev
tmpfs 8233676 60 8233616 1% /dev/shm
/dev/sda1 130888 38204 85926 31% /boot
/dev/mapper/data-var 20642428 1525004 18068848 8% /storage/var
/dev/mapper/data-core 258026884 12947700 231972192 6% /storage/core
df.9537.txt
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 8115648 3604504 4098884 47% /
udev 8233676 112 8233564 1% /dev
tmpfs 8233676 8 8233668 1% /dev/shm
/dev/sda1 130888 38204 85926 31% /boot
/dev/mapper/data-var 20642428 3685404 15908448 19% /storage/var
/dev/mapper/data-core 258026884 12847568 232072324 6% /storage/core
df.28260.txt
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 8115648 5030908 2672480 66% /
udev 8233676 104 8233572 1% /dev
tmpfs 8233676 56 8233620 1% /dev/shm
/dev/sda1 130888 38204 85926 31% /boot
There is a known issue in Log Insight 3.0, fixed in 3.3. What version are you running? Do the LVM devices exist? Run lvdisplay, pvdisplay, and vgdisplay.
I am running 3.0 (it was deployed just before 3.3 was released).
lvdisplay returned nothing
Comparing this against the other nodes, it looks like the LVs do not exist.
pvdisplay
--- Physical volume ---
PV Name /dev/sdb5
VG Name data
PV Size 270.00 GiB / not usable 2.00 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 69119
Free PE 69119
Allocated PE 0
PV UUID XiW35p-EU78-C4XB-2z9B-XsUc-PURS-tQAlHy
vgdisplay
--- Volume group ---
VG Name data
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 270.00 GiB
PE Size 4.00 MiB
Total PE 69119
Alloc PE / Size 0 / 0
Free PE / Size 69119 / 270.00 GiB
VG UUID iKfYJc-tlNL-GjPS-647X-kUB4-Pqpu-wVLHpn
Ok, so we have 270GB of PV/VG and no LVs. This is the known issue in 3.0. A transient storage problem during LVM provisioning would result in Log Insight creating /storage on the root filesystem without the mount points. Log Insight 3.3 improves this by catching LVM provisioning errors and retrying.
We can remediate this in your environment without data loss. The process is a bit lengthy and has a lot of moving pieces, so stop and reply if you see something you don't expect.
# Stop the Log Insight service
service loginsight stop
# Move existing data aside
mv /storage /storage-orig
# Create two LVs
lvcreate -L 20G -n var data
lvcreate -l 100%FREE -n core data
# List them - you should see something like this:
lvs
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
var data -wi-ao--- 20.00g
core data -wi-ao--- 250.00g
# Create a pair of filesystems
mkfs -t ext3 /dev/data/var
mkfs -t ext3 /dev/data/core
# Open /etc/fstab in an editor (vi) and append two lines, one for each filesystem. Make sure the file ends with a newline, or mount will complain.
/dev/data/var /storage/var ext3 defaults 0 2
/dev/data/core /storage/core ext3 defaults 0 0
# Make the two mount point directories
mkdir -p /storage/core /storage/var
# Mount the new filesystems
mount -a
# Verify they're mounted and sizes are as expected
df -h
# Move your existing data into the new filesystems
mv /storage-orig/var/loginsight /storage/var/loginsight
mv /storage-orig/core/loginsight /storage/core/loginsight
# Remove the empty directories
rmdir -p -v /storage-orig/var /storage-orig/core
# Start services back up
service loginsight start
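Before that last restart, it may be worth confirming the moves actually completed. A minimal sketch of such a check (verify_moved is a hypothetical helper for illustration, not part of Log Insight):

```shell
#!/bin/sh
# Sketch: confirm a directory move completed -- the source path should be
# gone and the destination directory should exist.
verify_moved() {
  src=$1; dst=$2
  if [ ! -e "$src" ] && [ -d "$dst" ]; then
    echo "moved: $dst"
  else
    echo "check: $src still present or $dst missing" >&2
    return 1
  fi
}

# e.g. verify_moved /storage-orig/var/loginsight /storage/var/loginsight
```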
When running lvcreate -L 100%FREE -n core data, it displays:
Invalid argument for --size: 100%FREE
Error during parsing of command line.
Is there an alternative command, or does the size have to be explicit in GB?
Had a look at the files with WinSCP and came across /opt/vmware/bin/li-disk-utils.sh. In there I can see the 2 lines you mentioned, one with a -L and one with a -l:
lvcreate -L "$LV_SIZE" -n "$LV_NAME" "$VG_NAME" 2>&1 | logger -t $0
lvcreate -l "100%FREE" -n "$LV_NAME" "$VG_NAME" 2>&1 | logger -t $0
With this I'm going to continue with a -l instead of the -L.
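That matches how lvcreate behaves: uppercase -L expects an absolute size (e.g. 20G), while lowercase -l takes a count of physical extents and is the form that accepts the %FREE syntax. You can sanity-check what 100%FREE works out to from the vgdisplay numbers above (Free PE × PE Size); a quick sketch:

```shell
#!/bin/sh
# From the vgdisplay output above: Free PE = 69119, PE Size = 4.00 MiB.
free_pe=69119
pe_size_mib=4
free_mib=$((free_pe * pe_size_mib))
echo "100%FREE resolves to ${free_mib} MiB (~270 GiB)"
```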
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 7.8G 4.6G 2.8G 63% /
udev 7.9G 112K 7.9G 1% /dev
tmpfs 7.9G 60K 7.9G 1% /dev/shm
/dev/sda1 128M 38M 84M 31% /boot
/dev/mapper/data-var 20G 173M 19G 1% /storage/var
/dev/mapper/data-core 247G 188M 234G 1% /storage/core
Thank you, this has resolved the issue.
Glad to hear it! I'll edit the answer above to change the uppercase -L to a lowercase -l for future readers.