Power Management
Comprehensive guide to hardware power management capabilities and techniques for optimizing energy consumption in HPC systems. Learn how to configure CPU frequency scaling, idle power states, and runtime power management systems to balance performance and energy efficiency.
Prerequisites
Basic understanding of HPC architecture and CPU fundamentals
Familiarity with Linux command-line tools and sysfs interface
Knowledge of system administration and performance monitoring
Understanding of basic power/energy concepts (Watts, Joules)
Episodes
Quiz
Reference
Description
This module introduces hardware power management in HPC systems. It explores the mechanisms that modern CPUs provide to control power consumption, from frequency scaling and idle states to hardware-managed performance optimization.
Episode 0: Power Management Hardware Knobs covers the fundamental concepts of power management. Topics include the physics of power consumption (the P=CV²f relationship), Dynamic Voltage and Frequency Scaling (DVFS), CPU performance states (P-states), idle power saving states (C-states), thermal throttling (T-states), and system sleep states (S-states). The episode explains why power management matters for HPC: cost, performance constraints, reliability, and sustainability.
Episode 1: Power Management Implementation and Runtime Systems takes a technical deep dive into the Linux interfaces and mechanisms that enable power management. It covers scaling drivers (intel_pstate, acpi-cpufreq), scaling governors (performance, powersave, ondemand, conservative, userspace), register-level frequency control through model-specific registers (MSRs), Intel Turbo Boost technology, Hardware P-State (HWP/SpeedShift) for autonomous frequency selection, GPU frequency management, frequency transition latency, and practical runtime systems for automatic power optimization.
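As a first taste of the Linux interfaces covered in Episode 1, the following is a minimal sketch of how the cpufreq attributes of one CPU could be read from sysfs. It assumes the standard cpufreq layout under /sys/devices/system/cpu/cpu0/cpufreq; the function name read_cpufreq and the choice of attributes are illustrative, and which files actually exist depends on the scaling driver in use.

```python
from pathlib import Path

# Standard cpufreq policy directory for CPU 0; other CPUs use cpu1, cpu2, ...
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

# Attribute files exposed by the Linux cpufreq subsystem.
# Not every driver provides all of them.
ATTRIBUTES = (
    "scaling_driver",
    "scaling_governor",
    "scaling_available_governors",
    "scaling_min_freq",
    "scaling_max_freq",
    "scaling_cur_freq",
)


def read_cpufreq(policy_dir=CPUFREQ_DIR):
    """Return a dict of cpufreq attributes; missing or unreadable files map to None."""
    info = {}
    for name in ATTRIBUTES:
        try:
            info[name] = (Path(policy_dir) / name).read_text().strip()
        except OSError:
            # File absent (driver does not expose it) or not readable.
            info[name] = None
    return info


if __name__ == "__main__":
    for name, value in read_cpufreq().items():
        # Frequency attributes are reported in kHz.
        print(f"{name}: {value}")
```

Reading these files normally needs no special privileges. Changing the governor or the frequency limits means writing to the same files, which typically requires root access.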
Course Topics
Power-Performance Trade-off: Understanding the physics (P=CV²f+V·I_leak)
Dynamic Voltage and Frequency Scaling (DVFS): How modern CPUs adjust power
Performance States (P-states): Frequency-voltage pairs from turbo to minimum
Idle Power States (C-states): Sleep states and their energy efficiency
Thermal Throttling (T-states): Automatic frequency reduction under thermal stress
ACPI Standard: Industry-standard power management framework
Linux Scaling Drivers: intel_pstate vs acpi-cpufreq interfaces
Scaling Governors: Policies for automatic frequency selection (ondemand, powersave, etc.)
Model-Specific Registers (MSRs): Direct hardware frequency control mechanisms
Intel Turbo Boost: Opportunistic frequency boosting beyond nominal
Hardware P-State (HWP): Autonomous hardware frequency management
GPU Frequency Scaling: NVIDIA and AMD GPU power control
Runtime Power Management Systems: Automatic optimization frameworks
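To make the first topic concrete, here is a small worked example of the dynamic power relationship P=CV²f. It is only a sketch: the capacitance, voltage, and frequency values are hypothetical numbers chosen for illustration, and leakage power is ignored.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic (switching) power in watts: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


# Hypothetical operating points, chosen only for illustration.
C_EFF = 1.0e-9                 # effective switched capacitance in farads (assumed)
BASE_V, BASE_F = 1.0, 3.0e9    # 1.0 V at 3.0 GHz (assumed)
LOW_V, LOW_F = 0.8, 2.4e9      # voltage and frequency both lowered by 20 % (assumed)

p_base = dynamic_power(C_EFF, BASE_V, BASE_F)
p_low = dynamic_power(C_EFF, LOW_V, LOW_F)
ratio = p_low / p_base

print(f"baseline: {p_base:.2f} W, scaled: {p_low:.2f} W, ratio: {ratio:.3f}")
```

Because voltage enters quadratically and frequency linearly, lowering both by 20 % scales dynamic power by 0.8³, or roughly half, while the clock rate drops by only 20 %. This is the core reason DVFS can save energy on workloads that are not limited by CPU frequency.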
Target Audience
Level: Intermediate to Advanced
Prerequisites: Comfortable with Linux systems, basic Python, understanding of CPU architectures, and familiarity with sysfs and /proc filesystems. System administration experience is helpful but not required.
Language: English
Technical Requirements
Linux system with frequency scaling support (intel_pstate or acpi-cpufreq driver)
Root or power management group access for sysfs writes
Python 3.7+ with NumPy for analysis (optional)
Access to HPC system or multi-core workstation for experiments
Instructors
Ondrej Vysocky is a senior researcher at the Infrastructure Research Laboratory within the IT4Innovations National Supercomputing Center. His work focuses primarily on reducing the energy consumption of supercomputers to lower operating costs. It has achieved annual savings in the millions of crowns and a significant reduction in the carbon footprint of computations.
Learning outcomes
This module prepares HPC practitioners, system administrators, and performance engineers to understand and optimize power consumption through hardware power management techniques.
By the end of this module, learners should be able to:
Understand power physics: Explain the relationship between voltage, frequency, and power consumption (P=CV²f), and discuss why power becomes a performance-limiting constraint
Navigate the power management hierarchy: Explain P-states (performance), C-states (idle), T-states (thermal), and S-states (system sleep) and their appropriate use cases
Control CPU frequency: Use Linux sysfs interfaces to query and adjust CPU frequency scaling parameters
Select appropriate scaling governors: Choose between performance, powersave, ondemand, conservative, and userspace governors based on workload characteristics
Interpret frequency scaling data: Read frequency information from sysfs and understand register-level frequency control through MSRs
Apply Turbo Boost effectively: Understand the relationship between instruction set (SSE vs AVX-512) and turbo frequency, and decide when to enable/disable turbo
Leverage Hardware P-State: Understand how HWP (SpeedShift) improves frequency management responsiveness and explain its benefits over OS-controlled scaling
Manage GPU frequency: Use vendor tools (nvidia-smi, rocm-smi) to monitor and control GPU frequency scaling
Design power management strategies: Create appropriate power management policies for different workloads (batch, interactive, accelerated)
Implement runtime power systems: Deploy automatic power management systems that balance energy and performance constraints
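For the GPU outcome above, the following sketch shows one way to collect clock and power readings from nvidia-smi in Python. It assumes an NVIDIA driver with nvidia-smi on the PATH and uses the query fields clocks.sm, clocks.max.sm, and power.draw; the helper names parse_gpu_csv and query_gpus are illustrative, and some fields may be reported as unsupported depending on GPU and driver version.

```python
import csv
import io
import subprocess

# Query fields; the full list is printed by `nvidia-smi --help-query-gpu`.
FIELDS = ("clocks.sm", "clocks.max.sm", "power.draw")


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if record:
            # Values stay strings because unsupported fields are not numeric.
            rows.append(dict(zip(FIELDS, (value.strip() for value in record))))
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed clock (MHz) and power (W) readings."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    try:
        for index, gpu in enumerate(query_gpus()):
            print(index, gpu)
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"nvidia-smi not usable on this system: {exc}")
```

Keeping the parsing separate from the subprocess call makes it possible to test the logic on machines without a GPU. A comparable script for AMD hardware would wrap rocm-smi instead.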
See also
Credit
Don’t forget to check out the additional course materials from the CASTIEL Energy Efficient Computing webinar series.
License
CC BY-SA for media and pedagogical material
Copyright © 2025 XXX. This material is released by XXX under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Canonical URL: https://creativecommons.org/licenses/by-sa/4.0/
You are free to
Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
Adapt — remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms
Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
This deed highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. You should carefully review all of the terms and conditions of the actual license before using the licensed material.
MIT for source code and code snippets
MIT License
Copyright (c) 2026, EVITA project, Ondrej Vysocky
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Note
To module authors: For code you may use any OSI-approved license as mentioned in https://spdx.org/licenses/, such as Apache License 2.0, GNU GPLv3, MIT. Please make sure to update the deed above and the LICENSE.code file accordingly.