A problem I've had ever since I started using an ESX server, and have never been able to get rid of, is the incredibly slow speed I get when copying VMs to it.
The server (Dell PowerEdge 1950) has two 1 Gb Broadcom NetXtreme II NICs, of which I'm only using one at the moment.
When I copy files to the ESX server, the transfer starts at 100-200 kb/s, but after a few seconds it drops to 75-100 kb/s and stays there. I've read some topics on these forums saying that autonegotiation could cause some of these problems.
Sounds fair enough, I've had the same problems with a few W2003 servers that needed to be set to 100 or 1000 full duplex before I got decent speeds to and from them.
The weird thing is, whatever I try with ethtool, I can't turn off Autonegotiation. I can change speed and duplex, but whenever I try to turn off autonegotiation I get the following message:
[root@vmware sbin]# sudo ./ethtool -s vmnic0 autoneg off
Cannot set new settings: Function not implemented
  not setting autoneg
Trying to turn it on gives me no errors whatsoever.
For more info about the NIC:
[root@vmware sbin]# ./ethtool vmnic0
Settings for vmnic0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: d
    Wake-on: d
Cannot get message level: Function not implemented
    Link detected: yes
If this rings a bell for anyone I'd love to hear about it. I'm trying to put some VMs on the server and the smallest one (an 8 GB file) already takes over 30 hours, which is totally unacceptable, even more so when you consider that it often times out after several hours.
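As a sanity check on those figures, the average rate implied by an 8 GB file taking 30 hours can be worked out directly. This is only a minimal sketch: it assumes 1 GB means 1024^3 bytes and a steady rate over the whole copy, and the function name is purely illustrative.

```python
def average_rate_kib_per_s(size_bytes: int, duration_hours: float) -> float:
    """Average transfer rate in KiB/s for a copy of the given size and duration."""
    seconds = duration_hours * 3600
    return size_bytes / seconds / 1024


# Figures from the post: an 8 GB VM file taking roughly 30 hours.
# Assumption: 1 GB is taken as 1024**3 bytes.
size = 8 * 1024**3
rate = average_rate_kib_per_s(size, 30)
print(f"{rate:.1f} KiB/s")  # prints "77.7 KiB/s"
```

That works out to roughly 78 KiB/s, which is in line with the 75-100 kb/s the copy settles at.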
try this:
sudo ethtool -s eth0 speed 1000 duplex full autoneg off
Sorry for not being clear enough. I've already tried that, it gives me the following error:
[root@vmware sbin]# sudo ./ethtool -s vmnic0 speed 1000 duplex full autoneg off
Cannot set new settings: Function not implemented
  not setting speed
  not setting duplex
  not setting autoneg
I can change speed and duplex together, or one at a time, and the setting changes without any errors. Yet as soon as I include autoneg I get the "Cannot set new settings: Function not implemented" error.
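To confirm what the NIC actually ended up with after each attempt, the ethtool report can be checked programmatically instead of by eye. The following is a rough sketch only: `parse_ethtool` is a hypothetical helper, not part of ethtool or ESX, and the sample text is an abbreviated copy of the output posted earlier in the thread.

```python
def parse_ethtool(output: str) -> dict:
    """Collect the single-line 'Key: value' fields from ethtool output."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip continuation lines (no colon) and headers with an empty value.
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings


# Abbreviated sample taken from the ethtool output posted earlier in the thread.
sample = """Settings for vmnic0:
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: on
Link detected: yes"""

info = parse_ethtool(sample)
print(info["Speed"], info["Duplex"], info["Auto-negotiation"])
```

Comparing the parsed values before and after an `ethtool -s` call shows whether a change really took effect, whatever the command printed.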
Well now you see the reason why you don't use internal NIC's.
If you use REAL network cards like Intel 1000 you will get much better performance
Next time get PCI-X (4x) NIC's, and turn internal NICS off.
I can imagine there's some performance loss with certain NIC's, but you can't possibly be serious when you tell me that it's perfectly understandable that Gbit NIC's perform below 100kb/s.....
The server's CPUs are at about 25% utilization and memory at about 40-50%. I really doubt this is a hardware issue, but if you can explain why the current hardware would not be functioning properly I'm all ears.
Not sure from what host you're copying a VM to your ESX server, or how you're doing the copying, but on my WinXP machine I frequently copy ISOs (or any arbitrarily large file) to my ESX servers using a free product called FastSCP from Veeam.
I've timed the copy using FastSCP and one using plain ol' SCP and the FastSCP leaves it dead in the water.
http://www.veeam.com/veeam_fast_scp.asp
No, I don't work for Veeam but I find the tool very useful cause it saves me from having to manually type "scp blahblah"
You need to provide more information about the actual copy operation. Are you using WinSCP or another SCP tool?
Instead of using an SCP copy, I usually (temporarily) disable the firewall on my ESX boxes and copy using smbclient (Samba client).
For example:
service firewall stop
smbclient //server/share -U user
(asks for password)
Then you can use MGET, GET, etc. (like FTP) to pull files from a Windows box. Generally, a *lot* faster than SCP copies.
Turn the firewall back on when you're done: service firewall start
You can also open the Samba client port on the firewall instead of shutting the firewall down... it's your choice.
Paul
I'm indeed using WinSCP, version 3.8.2.
I've stopped the firewall and tried to share the directory on my workstation which contains the VM I'm trying to copy but I'm getting an error:
session setup failed: NT_STATUS_LOGON_FAILURE
I've shared the directory on my computer and added my own user account to have full access rights. Any ideas?
I'll see if I can find what's going wrong.
-edit-
Silly me, in the past I've tried FastSCP, I still have it installed on my workstation. I just tried to use it and while the speed is doubled/tripled (200-300 kb/s) this means it'll still take over a complete workday to copy 1 VM.
It's better, but still not acceptable
I really think it has something to do with the network settings, which is why I was trying to put it on 1000Mb/s without autoneg.
Since FTP isn't encrypted I'm willing to give it a go to see if the performance increases. If I "only" get a 1-2mb/s transfer rate I'll be pretty damn happy. I've got several VM's to copy and I just cannot wait for 1-2 days for one to be copied to the ESX server. Let alone if they time out every now and then.
Thanks for the tips/ideas so far, I hope I can get this solved
-edit2-
The NIC in question is a Broadcom NetXtreme II. If anyone thinks this is a notoriously bad NIC let me know please
Is your workstation a member of a domain?
smbclient //server/share -U domain/user
I've also found that sometimes I need to substitute the server name with its IP address:
smbclient //1.2.3.4/share -U user
Paul
Both the ESX server and the workstation are in the same domain. I tried with domain/user but still get the same error.
I was already using the IP address. It did work when I used the hostname! Cheers!
Is there any way to see the transferring speed while the 8GB file is still being transferred?
Honestly, the easiest way to check the throughput is to open Task Manager on the Windows box, select the Networking tab, and watch the nice line graph plot your network I/O as ESX is pulling the file down!
Paul
I can't believe I didn't think of that, thanks for the pointer
Well, it's a day later and I've transferred roughly 2GB overnight. The network activity on my share (which has a 100Mbit NIC) is 0.2-0.3%, so I think we can safely conclude the problem is not the way it's being transferred.
WinSCP/FastSCP/FTP all give horribly slow speeds.
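The utilization figure from Task Manager can be turned into an approximate transfer rate, since it is a percentage of the link speed. A minimal sketch, assuming the percentage is relative to the full 100 Mbit/s and using 1 kB = 1000 bytes; the function name is illustrative only.

```python
def utilization_to_kb_per_s(link_mbit: float, utilization_pct: float) -> float:
    """Convert a link utilization percentage into kB/s (1 kB = 1000 bytes)."""
    bits_per_s = link_mbit * 1_000_000 * utilization_pct / 100
    return bits_per_s / 8 / 1000


# Figures from the post: a 100 Mbit NIC showing 0.2-0.3% utilization.
low = utilization_to_kb_per_s(100, 0.2)
high = utilization_to_kb_per_s(100, 0.3)
print(f"{low:.1f}-{high:.1f} kB/s")  # prints "25.0-37.5 kB/s"
```

At 25-37.5 kB/s the link is almost idle, so the workstation's 100Mbit NIC is nowhere near being the limit.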
I'll see if I can force the ESX server's NIC and the switch to 100Mbit, see if that'll go any quicker.
-edit-
I've messed around some more with the NIC settings and finally got it to turn autonegotiate off. It's on 100Mbit full duplex now and speeds are up to multiple MB/s now.
I'm confident I'll get it to work on the Gb settings as well. Thanks for all the advice!
There have been some very firm statements thrown around in the past that there isn't really a "half" and "full" duplex side to Gigabit - it's "full" duplex or bust. The official spec apparently also states that you cannot force "full" duplex mode, since there is no choice.
This may be why ESX is not allowing you to set the duplex mode on a Gigabit link. (You can set the advertised speed, but not the duplex.)
This was true for ESX 2.x as well. Interesting 2.x article (relevant to 3.x as well): http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1564. Still doesn't give a clear answer as to why things are slow, but may shed light on some of your other questions.
Finally, have you tried just replacing the patchlead between the server and the switch? Or maybe it's just a dud switch port and you're getting lots of corrupted frames on the network? A lead that works well at 100Mbps may not do so at Gigabit speeds.
I've found a thread on an HP forum that says the same. According to the IEEE standards, you can only enable Gigabit by using autonegotiation.
Link: http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=979491&admit=-682735245117751058422328353475
With autonegotiation on it won't function properly. I'll replace the patch cable tomorrow to see if that helps. I think it's a CAT5 cable so I'll see if I can find a CAT6 one.
If it still doesn't work properly on autonegotiation I'll plug it into our old Gigabit switch to see if that gives me more speed. It may indeed be a dud port, I'll find out.
Thanks again for the helpful advice
How is your network switch port set up?
In other scenarios, I've set up the switch ports forcing 1000-Auto and the server side 1000-Auto, so speed is forced and duplex is negotiated.