VMware Cloud Community
Rick14
Contributor

ESXi memory intensive programs on host

Hello everyone,

I'm trying to run a memory-intensive program directly on the ESXi host over SSH. I've read every blog post I could find, but nothing I've tried lets the program run. The host should have 12 GB of memory, and there are no VMs installed. The host's memory looks like this:

# vsish -e get /memory/comprehensive
Comprehensive {
Physical memory estimate:12581800 KB
Given to VMKernel:12581800 KB
Reliable memory:0 KB
Discarded by VMKernel:1600 KB
Mmap buddy overhead:3084 KB
Kernel code region:22528 KB
Kernel data and heap:16384 KB
Other kernel:896876 KB
Non-kernel:256920 KB
Reserved memory at low addresses:60816 KB
Free:11384408 KB
}
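
For reference, the KB figures above can be converted to MiB with a small filter. This is only a minimal sketch: `field_mib` is a hypothetical helper name, not an ESXi tool, and it assumes the `Name:value KB` line layout shown in the output above.

```shell
# Hypothetical helper: read "Name:value KB" lines on stdin and print the
# value of the named field in whole MiB (truncated, not rounded).
field_mib() {
    awk -F: -v name="$1" '
        { key = $1; sub(/^[ \t]+/, "", key) }      # tolerate indented lines
        key == name { printf "%d\n", $2 / 1024 }   # "11384408 KB" is read as 11384408
    '
}

# Example with the "Free" line from the output above.
printf 'Free:11384408 KB\n' | field_mib Free

# On the host itself the output could be piped in directly (not run here):
#   vsish -e get /memory/comprehensive | field_mib Free
```

With the value posted above this prints 11117, i.e. a little under 11 GiB reported as free.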

But when I try to run my program, dmesg shows the following:

2021-04-15T15:40:42.962Z cpu1:264640)Admission failure in path: host/vim/vimuser/terminal/ssh:afl-qemu-trace.264640:uw.264640
2021-04-15T15:40:42.962Z cpu1:264640)UserWorld 'afl-qemu-trace' with cmdline './afl-qemu-trace ./test-instr'
2021-04-15T15:40:42.962Z cpu1:264640)uw.264640 (13947) extraMin/extraFromParent: 4610/4610, ssh (669) childEmin/eMinLimit: 202010/204800

This seems to point to a memory limit. The documentation below describes how to update the memory limits allocated to a group of processes. Since I start the program from SSH, I assumed it applied to the ssh group:

https://vdc-download.vmware.com/vmwb-repository/dcr-public/c14c9304-a9ef-4e2c-8c9e-332426ceba9c/bc49...

So I ran:

# grpID=$(vsish -e set /sched/groupPathNameToID host vim vimuser terminal ssh|cut -d' ' -f 1)
# vsish -e set /sched/groups/$grpID/memAllocationInMB max=unlimited minLimit=unlimited
# vsish -e get /sched/groups/$grpID/memAllocationInMB
memsched-allocation {
min:0
max:-1
shares:-3
minLimit:-1
units: 4 -> mb
}
#

But even after this change, the result is almost the same as the above output:

2021-04-15T15:49:57.400Z cpu1:264693)UserWorld 'afl-qemu-trace' with cmdline './afl-qemu-trace ./test-instr'
2021-04-15T15:49:57.400Z cpu1:264693)uw.264693 (14325) extraMin/extraFromParent: 4610/4610, host (0) childEmin/eMinLimit: 3126604/3129846
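
One way to read the two admission lines is as a simple budget check. The sketch below assumes (this is not confirmed anywhere in the thread) that the values are counts of 4 KiB pages and that a request is refused when `childEmin + extraMin` would exceed `eMinLimit`. `admission_check` is a hypothetical helper name.

```shell
# Hypothetical helper: admission_check EXTRA_MIN CHILD_EMIN EMIN_LIMIT
# Assumption: all three values are counts of 4 KiB pages.
admission_check() {
    extra_min=$1
    child_emin=$2
    emin_limit=$3
    need=$((child_emin + extra_min))        # pages reserved if the request were admitted
    limit_mib=$((emin_limit * 4 / 1024))    # 4 KiB pages -> MiB (truncated)
    if [ "$need" -gt "$emin_limit" ]; then
        echo "over by $((need - emin_limit)) pages (limit ${limit_mib} MiB)"
    else
        echo "fits with $((emin_limit - need)) pages spare (limit ${limit_mib} MiB)"
    fi
}

# Before the change: failure reported against the ssh group (669).
admission_check 4610 202010 204800
# After the change: failure reported against the host group (0).
admission_check 4610 3126604 3129846
```

Under those assumptions the first call prints `over by 1820 pages (limit 800 MiB)` and the second prints `over by 1368 pages (limit 12225 MiB)`. That would mean the ssh limit is no longer the one being hit, and the check that now fails is at the host group, whose budget is already almost fully claimed by its children.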

strace confirms that no memory is granted:

mmap(0x58f000, 18878584, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOSPC (No space left on device)
mmap(0x590000, 18878584, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOSPC (No space left on device)
mmap(0x591000, 18878584, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOSPC (No space left on device)
mmap(0x592000, 18878584, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOSPC (No space left on device)

I'm very confused: there are no VMs running or configured on this machine, and it has 12 GiB of memory. The web interface shows 1.2 GiB as used and the rest as 'free'.

Does anyone know how I can increase the limits so that I can run these memory intensive programs on the ESXi host?

It is 100% acceptable to leave no memory for guest VMs; I'm not trying to run a guest at this time.

3 Replies
vbondzio
VMware Employee

Why are you trying to run AFL on ESXi in the first place? 🙂 I'm assuming that isn't a secret since you didn't bother to rename the binary.

Anyhow, can you pastebin me the output of:

 memstats -r group-stats -g0 -l5 -s gid:name:parGid:nChild:min:max:minLimit:conResv:availResv:memSize -u mb 2> /dev/null | sed -n '/^-\+/,/.*\n/p'

?

Rick15
Contributor

It's not a secret, no 🙂 I'm doing some security research on ESXi. A number of organizations I do security consulting for have critical systems running on it, and I want to see how easy it would be, and how long it would take, to find memory corruption bugs in any of the exposed services.

But to be honest, I'm completely clueless about the architecture of ESXi, and kind of stuck 🙂

I have the output here:

https://pastebin.com/raw/iPsSRaUk

 

vbondzio
VMware Employee

I wouldn't bet too much money on it running even once you are no longer memory limited (vmkernel / ESXi is POSIX-like and somewhat compatible, but it isn't Linux). If it does run and you discover anything (on the newest build), don't forget to write to security at vmware dot com.

Back to your scenario though: did you reboot after you changed the resource settings? I see the ssh max set to 800. vsi is ephemeral and doesn't persist across a reboot (or the vsish command to get the GID was mucked up by something that wasn't pasted into this thread).

You can check the limits (max) on the resource path e.g. like this:

memstats -r group-stats -g0 -l5 -s gid:name:parGid:nChild:min:max:minLimit:conResv:availResv:memSize -u mb 2> /dev/null | sed -n '/^-\+/,/.*\n/p' | awk 'NR == 2 || $2 ~ /(vim|vimuser|terminal|ssh)$/ {print $0}'

 
