Test scenario:

- Host Hardware: HP EliteBook 8470p, Intel(R) Core(TM) i7-3630QM CPU @ 2.40 GHz, 16 GB RAM
- Host OS: Windows 8.1 Pro
- VMware: Workstation 9, 10, 11, 12
- PXE server: Serva
- Boot Manager: Syslinux 6.03
- Client Preboot Environment: EFI 64 (VMware FW)
When PXE booting EFI guest OSs under Workstation (e.g. ubuntu-15.04-desktop-amd64.iso),
syslinux.efi initially TFTP transfers vmlinuz (6.5 MB) and initrd.lz (22.5 MB).
These transfers (even when they do not present any error) are extremely slow:
| File | v9 | v10 | v11 | v12 | HP 2570p (EFI hardware) |
|---|---|---|---|---|---|
| vmlinuz (6.5 MB) | 8s | 16s | 9s | 9s | 2s |
| initrd.lz (22.5 MB) | 62s | 121s | 115s | 116s | 27s |
The last column represents a hardware client booting exactly the same OS
files from the same PXE server.
Under the VMware EFI FW, all TFTP transfers handled by syslinux.efi are very slow.
Of course, syslinux.efi uses EFI FW resources for these transfers, such as instances of
EFI_UDP4_SERVICE_BINDING_PROTOCOL, EFI_UDP4_PROTOCOL, etc.
Paradoxically, the Workstation version with the best UEFI PXE performance is the old v9,
but it is still far slower than an EFI notebook like the HP EliteBook 2570p booting the
same test OS files against the same PXE server.
I have run Wireshark traffic captures (this excerpt is from Workstation 11):
43128 235.451388000 192.168.64.1 192.168.64.128 TFTP 1454 Data Packet, Block: 16388 Delta 0.015367
43129 235.466755000 192.168.64.128 192.168.64.1 TFTP 46 Acknowledgement, Block: 16388 Delta 0.000128
43130 235.466883000 192.168.64.1 192.168.64.128 TFTP 1454 Data Packet, Block: 16389 Delta 0.019316
43131 235.486199000 192.168.64.128 192.168.64.1 TFTP 46 Acknowledgement, Block: 16389
where the pattern shows a high delay (~15 ms) between the arrival of a data packet and the sending of its ACK.
From an EFI FW point of view, a possible reason for this behavior is e.g. not handling
event priorities correctly; syslinux.efi relies on these events to know when a data packet
has arrived so it can then send the corresponding ACK.
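For context, the receive-then-ACK cycle in question looks roughly like this when built on EFI_UDP4_PROTOCOL (a simplified sketch based on the UEFI spec, not syslinux.efi's actual code; the instance configuration, ACK construction, and most error paths are elided):

```c
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/Udp4.h>

STATIC volatile BOOLEAN  mRxDone;

// Completion callback: the firmware signals this event once a datagram has
// been delivered to our token. If event dispatch inside the FW is slow or
// mis-prioritized, the ACK below goes out late.
STATIC VOID EFIAPI
OnRxComplete (IN EFI_EVENT Event, IN VOID *Context)
{
  mRxDone = TRUE;
}

EFI_STATUS
ReceiveBlockAndAck (IN EFI_UDP4_PROTOCOL *Udp4)
{
  EFI_UDP4_COMPLETION_TOKEN  RxToken;
  EFI_STATUS                 Status;

  mRxDone = FALSE;
  Status = gBS->CreateEvent (EVT_NOTIFY_SIGNAL, TPL_CALLBACK,
                             OnRxComplete, NULL, &RxToken.Event);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  Status = Udp4->Receive (Udp4, &RxToken);
  if (EFI_ERROR (Status)) {
    gBS->CloseEvent (RxToken.Event);
    return Status;
  }

  // Drive the stack until the data packet arrives; the latency measured
  // above accumulates in this loop.
  while (!mRxDone) {
    Udp4->Poll (Udp4);
  }

  // ... parse RxToken.Packet.RxData->FragmentTable[0], build the TFTP ACK,
  // and send it with Udp4->Transmit() using a similar completion token ...

  // Return the receive buffer to the firmware.
  gBS->SignalEvent (RxToken.Packet.RxData->RecycleSignal);
  gBS->CloseEvent (RxToken.Event);
  return RxToken.Status;
}
```

If the event notification or the Poll() processing inside the firmware is slow, the latency shows up exactly as the inter-packet gaps in the capture above.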
Please let me know if you guys need more info.
Best,
Patrick
Hi again patpat!
Which virtual NIC were you using for the tests? Also, due to a quirk in the way we need to share resources with the host, some vNICs perform better in UEFI when the virtual machine has only one virtual CPU, so it might be informative to try with one vCPU and with more than one vCPU and compare the results.
Cheers,
--
Darius
Hi Darius!

| File | v12 (1 vCPU) | v12 (2 vCPU) |
|---|---|---|
| vmlinuz (6.5 MB) | 9s | 10s |
| initrd.lz (22.5 MB) | 116s | 71s |
The results are repeatable.
Let me know.
Best,
Patrick
Hmmm... I did get that the wrong way around: 1 vCPU VMs can perform worse than 2 vCPU VMs during UEFI PXE boot.
If you add to the VM's configuration file:
vnet.recvClusterSize = "1"
does the 1 vCPU VM's performance improve?
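For reference, the option goes straight into the VM's .vmx file next to the existing entries (a hypothetical fragment; the ethernet0 lines are illustrative, not required):

```
ethernet0.present = "TRUE"
ethernet0.virtualDev = "e1000e"
vnet.recvClusterSize = "1"
```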
It probably still won't approach the performance of a physical machine, though. The architecture of UEFI does not permit us to use interrupts for NIC events -- we must use polling -- and polling is very bad when we need to steal CPU time from the host OS in a shared environment. We've made a few compromises to achieve a balance between TFTP throughput and not causing too much host CPU usage while the firmware is running (i.e. limiting the polling rate), and those compromises do make it somewhat unlikely that we will be able to match the performance of a native firmware implementation (which realistically can use just as much CPU as it wants... no one will care).
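To make the constraint concrete, here is a minimal sketch of the timer-driven polling pattern UEFI imposes on network drivers (EDK2-style C; the 10 ms period and the drain loop are assumptions for illustration, not VMware's actual implementation):

```c
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>

// Timer callback: poll the NIC for received frames. Without interrupts this
// runs at a fixed rate whether or not any packet is pending, which burns
// host CPU in a VM -- hence the rate-limiting compromise.
STATIC VOID EFIAPI
PollNic (IN EFI_EVENT Event, IN VOID *Context)
{
  EFI_SIMPLE_NETWORK_PROTOCOL  *Snp = Context;
  UINT8                        Frame[1514];
  UINTN                        Size = sizeof (Frame);

  // Drain anything the (virtual) NIC has queued since the last tick.
  while (Snp->Receive (Snp, NULL, &Size, Frame, NULL, NULL, NULL) == EFI_SUCCESS) {
    // ... hand the frame up to the UDP4/PXE stack ...
    Size = sizeof (Frame);
  }
}

EFI_STATUS
StartPolling (IN EFI_SIMPLE_NETWORK_PROTOCOL *Snp, OUT EFI_EVENT *Timer)
{
  EFI_STATUS  Status;

  Status = gBS->CreateEvent (EVT_TIMER | EVT_NOTIFY_SIGNAL, TPL_CALLBACK,
                             PollNic, Snp, Timer);
  if (EFI_ERROR (Status)) {
    return Status;
  }
  // 10 ms period, expressed in 100 ns units -- an assumed figure; the real
  // firmware's rate is the tunable trade-off described above.
  return gBS->SetTimer (*Timer, TimerPeriodic, 10 * 10000);
}
```

Every tick costs host CPU whether or not a packet is pending, which is exactly the throughput-versus-host-load trade-off described above.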
There's probably more that we can do here to improve our virtual UEFI firmware's TFTP performance if we have the time and opportunity to dig deeper...
Cheers,
--
Darius
Hi there
Default settings:

| File | v12 (1 vCPU) | v12 (2 vCPU) |
|---|---|---|
| vmlinuz (6.5 MB) | 9s | 10s |
| initrd.lz (22.5 MB) | 116s | 71s |

With vnet.recvClusterSize = "1":

| File | v12 (1 vCPU) | v12 (2 vCPU) |
|---|---|---|
| vmlinuz (6.5 MB) | 11s | 13s |
| initrd.lz (22.5 MB) | 129s | 74s |
Unfortunately vnet.recvClusterSize = "1" does not help.
I have also seen other vnet variables like
vnet.bwInterval =
vnet.dontClusterSize =
vnet.maxRecvClusterSize =
vnet.maxSendClusterSize =
vnet.noPollCallback =
vnet.pollInterval =
vnet.recalcInterval =
vnet.recvClusterSize =
vnet.recvThreshold =
vnet.sendClusterSize =
vnet.sendThreshold =
vnet.useSelect =
Some of them (e.g. vnet.pollInterval) sound interesting, but I wonder if a description and their default values could be made available.
I understand what you say about the need for compromises when balancing EFI, the NIC,
polling, CPU time, etc., but the approach can probably still be improved; today EFI PXE
is just too slow. For example, I think those FW compromises probably should not be the
same before and after calling ExitBootServices().
I do not know if you guys already take that difference into account today.
Best,
Patrick
I don't have a ready reference with a description of all the options and their defaults, sorry.
After ExitBootServices, the OS "owns" the platform and can do as it chooses, and it is not constrained by the UEFI restrictions -- it is free to reinitialize the PIC/APIC/LAPIC and MSI/MSI-X and use hardware interrupts as it chooses. The compromises are made solely in the firmware (in the DXE environment, where the firmware's own NIC drivers are in use) and are only relevant prior to ExitBootServices.
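To make the boundary concrete, this is roughly the standard hand-off sequence an OS loader performs (a bare-bones sketch; real loaders retry in a loop, since allocating the buffer can invalidate MapKey):

```c
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/MemoryAllocationLib.h>

EFI_STATUS
HandOffToOs (IN EFI_HANDLE ImageHandle)
{
  UINTN                  MapSize = 0;
  UINTN                  MapKey;
  UINTN                  DescSize;
  UINT32                 DescVersion;
  EFI_MEMORY_DESCRIPTOR  *Map = NULL;
  EFI_STATUS             Status;

  // First call only sizes the map; it is expected to fail with
  // EFI_BUFFER_TOO_SMALL and update MapSize.
  gBS->GetMemoryMap (&MapSize, Map, &MapKey, &DescSize, &DescVersion);
  MapSize += 4096;               // slack for the allocation we are about to make
  Map = AllocatePool (MapSize);
  Status = gBS->GetMemoryMap (&MapSize, Map, &MapKey, &DescSize, &DescVersion);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  // Up to this line the firmware's polled NIC drivers (and their timers) are
  // live. After it, boot services are gone and the OS is free to program the
  // APIC/MSI-X and take NIC interrupts directly.
  return gBS->ExitBootServices (ImageHandle, MapKey);
}
```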
I've been contemplating setting the firmware up to use interrupts in a way that is "concealed" from the rest of the DXE environment, so that we can achieve better performance while not visibly breaking the UEFI specification... :smileydevil: Haven't actually implemented anything along those lines yet.
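For the curious, the PI spec does give DXE code a way to register an interrupt handler, which is the kind of hook such a scheme might build on (a purely speculative sketch; the vector number is invented and this is in no way VMware's design):

```c
#include <PiDxe.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/Cpu.h>

#define NIC_INT_VECTOR  0x30   // made-up vector; would come from the virtual NIC's setup

// Set by the handler, checked by the normal polling path: instead of doing
// real work at interrupt time (which UEFI drivers must not assume is safe),
// just note that the NIC has something pending so the next poll can run
// immediately rather than waiting for the timer tick.
STATIC volatile BOOLEAN  mNicWork;

STATIC VOID EFIAPI
NicInterruptHandler (IN EFI_EXCEPTION_TYPE Type, IN EFI_SYSTEM_CONTEXT Context)
{
  mNicWork = TRUE;
}

EFI_STATUS
HookNicInterrupt (VOID)
{
  EFI_CPU_ARCH_PROTOCOL  *Cpu;
  EFI_STATUS             Status;

  Status = gBS->LocateProtocol (&gEfiCpuArchProtocolGuid, NULL, (VOID **)&Cpu);
  if (EFI_ERROR (Status)) {
    return Status;
  }
  return Cpu->RegisterInterruptHandler (Cpu, NIC_INT_VECTOR, NicInterruptHandler);
}
```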
Cheers,
--
Darius
Well, without more variables to tweak, I can only hope you guys can somehow do something about this issue.
Something I have noticed: when bootmgfw.efi has to TFTP transfer Boot.wim (270 MB) with windowsize=8:
| File | v12 | HP 2570p (EFI hardware) |
|---|---|---|
| Boot.wim (270 MB) | 47s | 36s |
OK, windowsize=8 means 8 times fewer ACKs, but in this case the performance difference is not that big.
I wonder if it could be something else besides driver performance, e.g. the use of
EFI_UDP4_SERVICE_BINDING_PROTOCOL and EFI_UDP4_PROTOCOL
(bootmgfw.efi does not use these protocols).
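For reference, windowsize is negotiated as a TFTP option appended to the initial read request (RFC 7440, using the RFC 2347 option format). A minimal sketch of such an RRQ, using the Boot.wim transfer above as the example (illustrative only):

```c
#include <stdint.h>
#include <string.h>

/* Build a TFTP read request advertising windowsize=8: a plain RRQ with an
 * option pair appended. With windowsize=8 the server streams 8 data blocks
 * per ACK, so the per-ACK latency is amortized 8x. */
static size_t
build_rrq_windowsize8 (uint8_t *buf, const char *filename)
{
  size_t off = 0;

  buf[off++] = 0;  buf[off++] = 1;                        /* opcode 1 = RRQ   */
  off += strlen (strcpy ((char *)buf + off, filename)) + 1;
  off += strlen (strcpy ((char *)buf + off, "octet")) + 1;
  off += strlen (strcpy ((char *)buf + off, "windowsize")) + 1;
  off += strlen (strcpy ((char *)buf + off, "8")) + 1;
  return off;                                             /* UDP payload size */
}
```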
Best,
Patrick