Anyone who has used devices with several energy-saving modes is familiar with the latencies they introduce: disk drives that take some time to spin up, or programs that become unexpectedly sluggish while memory chips wake from their naps. High-performance servers are usually configured more cleverly than desktop machines, but finding the right tradeoff between saving energy and delivering acceptable performance has required manually tuning threshold values for each server application. Thresholds tuned for one application can be inappropriate for another program or workload. This well-written and well-organized paper introduces techniques and algorithms that eliminate the need for manual tuning while honoring the performance guarantees specified for the energy-managed computer system.
Section 1 describes the problem context and sketches out the proposed solutions. Section 2 outlines the power models used for memory and for disk, and then lists some of the existing control algorithms against which the new ones are compared. Sections 3 and 4 are the core of the paper: they establish the precise meaning of “performance guarantee” for memory (relatively straightforward) and disk (more complex, given the number of parameters), and then provide efficient algorithms for setting device energy management parameters that take into account both the guarantees and current device usage. Sections 5 and 6 present and analyze experimental results obtained from simulators. Sections 7, 8, and 9 are devoted to some work on a hybrid “static + dynamic” algorithm, brief summaries of related work, and a discussion of future work.
There are several key ideas presented here. One is that of “slack,” or “the amount of allowed execution delay”: the delay that energy management may introduce while still permitting performance guarantees to be met. This idea is combined with that of an “epoch,” a bounded interval of device activity (about a million accesses for memory, and around 100 seconds for disk drives). Unused slack in one epoch can be used in a later epoch, and this allows energy management settings to differ from epoch to epoch (for performance-directed static (PS) control algorithms) or within an epoch (for performance-directed dynamic (PD) control algorithms). At the start of each epoch, the PS algorithm determines settings via optimization so that the energy savings over all devices are maximized while ensuring that the sum of energy management-induced delays across all devices is less than the available slack. The PD algorithm computes threshold values at the start of each epoch, again chosen for all devices such that available slack is not exceeded. Both PS and PD use earlier epochs as predictions of expected device usage.
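The interplay between slack, epochs, and per-device settings described above can be illustrated with a small Python sketch. It is not the paper's algorithm: the function names, the fixed slack allowance per epoch, the mode tables, and the brute-force search are all assumptions made for illustration, and predicted delays are simply taken as given and treated as accurate rather than derived from earlier epochs.

```python
from itertools import product


def choose_modes(devices, available_slack):
    """Pick one mode per device to maximize total energy saved while
    keeping the summed predicted delay within available_slack.

    devices: list of per-device mode lists; each mode is a hypothetical
    (energy_saved, predicted_delay) tuple. Each list should include
    (0.0, 0.0), meaning "stay fully active", so a feasible choice exists.
    """
    best_choice, best_saving = None, -1.0
    # Exhaustive search: fine for a toy example, not for many devices.
    for choice in product(*devices):
        delay = sum(d for _, d in choice)
        saving = sum(e for e, _ in choice)
        if delay <= available_slack and saving > best_saving:
            best_choice, best_saving = choice, saving
    return best_choice, best_saving


def run_epochs(epochs, slack_per_epoch):
    """Choose settings at the start of each epoch, carrying unused
    slack forward so a later epoch can afford deeper power modes."""
    carried = 0.0
    results = []
    for devices in epochs:
        available = carried + slack_per_epoch
        choice, saving = choose_modes(devices, available)
        used = sum(d for _, d in choice)
        carried = available - used  # unused slack rolls over
        results.append((choice, saving, carried))
    return results


if __name__ == "__main__":
    # Two hypothetical devices with made-up (energy_saved, delay) modes.
    devices = [
        [(0.0, 0.0), (5.0, 2.0), (9.0, 5.0)],
        [(0.0, 0.0), (3.0, 1.0)],
    ]
    for choice, saving, carried in run_epochs([devices, devices], 4.0):
        print(choice, saving, carried)
```

In this toy run, the first epoch leaves one unit of slack unused, and that carried-over unit is what allows the second epoch to select the deepest mode on the first device. A real implementation would need a scalable optimizer in place of the exhaustive search and would have to reconcile predicted delays with the delays actually observed.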
Both PS and PD produce better results than other schemes, although PS is better for memory devices, while PD is better for disk devices. A hybrid scheme combining PS and PD is also investigated, but the authors report mixed results.