Virtualization is a first step, because server consolidation reduces the number of physical servers.
Note that older servers usually draw more power than newer ones. For example, see this Dell document:
Dell PowerEdge R710 solution with VMware ESX vs. Dell PowerEdge 2850 solution - http://www.dell.com/downloads/global/products/pedge/en/server-poweredge-r710-vmware-esx-initial-investment-payback-vs-hp-proliant-dl385-solution-en.pdf
In a virtual environment based on vSphere there are different approaches to saving power, and they are not mutually exclusive:
Power Management on a single host using Dynamic Voltage and Frequency Scaling (DVFS)
Power Management on a DRS enabled cluster using VMware Distributed Power Management
Note that the first approach can cause some issues: for example, VMware FT best practices require that the hosts are set to the maximum power (full performance) profile.
Dynamic Voltage and Frequency Scaling (DVFS)
It is possible to use CPU features such as Intel SpeedStep or AMD PowerNow!, but they must first be enabled in the BIOS and then enabled in ESX/ESXi through the advanced configuration option "Power.CpuPolicy" (change it from the default "static" to "dynamic").
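To see why lowering voltage and frequency together saves power, here is a minimal Python sketch based on the common approximation for CMOS dynamic power, P = C * V^2 * f. It is not taken from VMware or Dell material. The function names and the 80% operating point are hypothetical, and the model ignores static (leakage) power, so real savings will differ.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Approximate CMOS dynamic power: P = C * V^2 * f (leakage ignored)."""
    return capacitance * voltage ** 2 * frequency


def relative_power(voltage_scale, frequency_scale):
    """Dynamic power at a scaled operating point, relative to full speed.

    Both arguments are fractions of the nominal value (1.0 = unchanged).
    """
    nominal = dynamic_power(1.0, 1.0, 1.0)
    scaled = dynamic_power(1.0, voltage_scale, frequency_scale)
    return scaled / nominal


if __name__ == "__main__":
    # Hypothetical operating point: voltage and frequency both at 80%.
    ratio = relative_power(0.8, 0.8)
    print(f"Dynamic power relative to full speed: {ratio:.3f}")
```

Under this simplified model, running at 80% of nominal voltage and frequency needs roughly half the dynamic power, which is why the "dynamic" policy can pay off on lightly loaded hosts.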
Distributed Power Management (DPM)
This solution (introduced in ESX 3.5) requires only the DPM license and a DRS cluster, and does not need any special CPU-level function. It works by putting hosts into standby when cluster load is low and resuming them when more cluster capacity is required.
On small clusters with few nodes and VMware HA enabled, this feature may not make sense (for example, a 2-node cluster with HA enabled cannot use DPM, because HA failover capacity would no longer be guaranteed).
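The small-cluster limitation can be shown with a rough capacity calculation. The Python sketch below is a simplified model, not VMware's actual DPM or HA admission control algorithm. The function name, the parameters, and the assumption that all hosts have equal capacity are hypothetical.

```python
import math


def hosts_available_for_standby(total_hosts, host_capacity,
                                cluster_demand, ha_failover_hosts):
    """Estimate how many hosts could be put in standby.

    Simplified model: hosts are identical, at least one host must stay
    powered on, and HA reserves `ha_failover_hosts` extra powered-on hosts.
    """
    if host_capacity <= 0:
        raise ValueError("host_capacity must be positive")
    # Hosts needed just to carry the current load (never fewer than one).
    hosts_for_load = max(1, math.ceil(cluster_demand / host_capacity))
    # HA needs spare powered-on hosts on top of that.
    hosts_required = hosts_for_load + ha_failover_hosts
    return max(0, total_hosts - hosts_required)


if __name__ == "__main__":
    # 2-node cluster, HA tolerating one host failure: nothing can sleep.
    print(hosts_available_for_standby(2, 100, 40, 1))
    # 6-node cluster with the same HA setting and a moderate load.
    print(hosts_available_for_standby(6, 100, 250, 1))
```

In this model the 2-node cluster never has a host to spare, while the 6-node cluster can put two hosts in standby at the assumed load.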