Instructor guide
Why we teach this lesson
Power management is essential for modern HPC because:
Cost Reduction: Energy costs are 20-30% of HPC operational budgets. Understanding power management enables targeted cost reduction through frequency scaling and power capping—often yielding 10-30% power savings without sacrificing application performance.
Performance Is Power-Limited: Modern CPUs and accelerators are power-limited devices. You cannot run all cores at maximum frequency indefinitely—thermal and power delivery constraints enforce a power envelope. Understanding this relationship is critical for performance modeling and prediction.
Thermal Management: Data centers operate at capacity limits for cooling. Poor power management leads to thermal throttling, system failures, and operational issues. Students must understand how frequency scaling prevents thermal emergencies.
System Reliability: Power delivery infrastructure and thermal systems are reliability bottlenecks. In a facility drawing about 2 MW, a 10% reduction in average power cuts the cooling load by roughly 200 kW, which can avoid infrastructure upgrades and improve MTBF.
Heterogeneous Architectures: Modern HPC systems (CPU+GPU, multiple CPU types) require heterogeneous power strategies. Students need to understand how different components interact in power-constrained environments.
Regulatory and Sustainability: Increasing environmental regulations and carbon footprint accountability make power efficiency a business requirement. Understanding power management is now a professional skill for HPC practitioners.
Timing
Episode 0: Power Management Hardware Knobs
Power-Performance trade-off physics: 15 min (with equations and intuition)
CPU frequency scaling (DVFS) fundamentals: 15 min
P-states, C-states, T-states, S-states overview: 20 min (with examples)
ACPI standard and OS role: 10 min
Why power management matters: 10 min
Total: ~70 minutes
Episode 1: Implementation and Runtime Systems
Scaling drivers (intel_pstate vs acpi-cpufreq): 15-20 min
Scaling governors (performance, powersave, ondemand, conservative): 20-25 min
MSR-level frequency control: 10 min
Intel Turbo Boost: 10-15 min
Hardware P-State (HWP): 10-15 min
GPU frequency management: 10 min
Frequency transition latency and practical implications: 10 min
Runtime power management systems: 15 min
Total: ~100-120 minutes
Quiz and Exercises: 60-90 minutes depending on depth
Recommended structure: Teach Episode 0 on Day 1 (fundamentals), Episode 1 on Day 2 (implementation), with hands-on exercises on actual systems.
Hardware requirements
Minimum Setup for Demonstrations:
Linux workstation or HPC compute node with:
Intel (Haswell+) or AMD (Zen+) CPU
Linux kernel 4.0+ with cpufreq driver enabled
4+ GB RAM
10 GB disk space
Essential Access:
Root or group permissions for /sys/devices/system/cpu/*/cpufreq/ writes
Ability to read MSR registers (may require kernel module)
cpupower or turbostat tools available
Optional but Recommended:
Intel Xeon Scalable or AMD EPYC processor (for realistic HPC hardware)
Multiple sockets/cores for demonstrating heterogeneous frequency management
NVIDIA or AMD GPU for accelerator frequency scaling demos
Thermal monitoring (lm-sensors) for demonstrating thermal throttling
Software Stack:
Linux kernel with intel_pstate or acpi-cpufreq driver
linux-tools package containing cpupower, turbostat
Python 3.7+ with NumPy for data analysis
Optional: perf, likwid for power measurement
Virtual Environment Compromise: If no appropriate hardware is available:
Use pre-recorded sysfs traces from actual systems
Provide simulation of frequency switching with synthetic data
Use vendor documentation with concrete examples
Demo with frequency limiting on available CPU (even if limited range)
Preparing exercises
1 Week Before:
Inventory your hardware:
# Check CPU
cat /proc/cpuinfo | grep "model name"
# Check supported frequencies
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
# Check available governors
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
Verify scaling driver:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
Test sysfs access:
# Ensure you can read and write
cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
echo 90 > /sys/devices/system/cpu/intel_pstate/max_perf_pct  # Test write
Prepare measurement tools:
sudo apt install linux-tools-generic  # Provides cpupower, turbostat
# Install likwid for power measurement (if available)
# Pre-compile or document installation procedure
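The inventory checks above can also be collected in one pass so that results from several machines are easy to compare. The following is a minimal sketch, not part of the lesson material: the function names `read_cpufreq_inventory` and `khz_to_ghz` are hypothetical, and it assumes the standard cpufreq attribute files are present in the directory you pass in.

```python
from pathlib import Path

# Standard cpufreq attribute files; frequencies are reported in kHz.
CPUFREQ_FIELDS = (
    "scaling_driver",
    "scaling_governor",
    "scaling_available_governors",
    "cpuinfo_min_freq",
    "cpuinfo_max_freq",
)


def read_cpufreq_inventory(cpufreq_dir):
    """Return a dict of cpufreq attributes; missing or unreadable files map to None."""
    inventory = {}
    for name in CPUFREQ_FIELDS:
        try:
            inventory[name] = (Path(cpufreq_dir) / name).read_text().strip()
        except OSError:
            inventory[name] = None
    return inventory


def khz_to_ghz(value):
    """Convert a sysfs frequency string in kHz to GHz (None passes through)."""
    return None if value is None else int(value) / 1e6
```

On a live system the directory would typically be /sys/devices/system/cpu/cpu0/cpufreq. A None entry is a quick signal that a file an exercise depends on is missing on that machine.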
Day Before:
Set up clean baseline:
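# Reset to default power settings
echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo
echo 100 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
# Note: min_perf_pct=100 pins the minimum at full performance;
# write back your system's original value to restore its true default
echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct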
Create demonstration scripts:
Script to measure frequency vs power (run application at different frequencies)
Script to monitor thermal throttling
Script to compare governors
Prepare example data:
Baseline frequency/power measurements for your system
Example traces showing frequency changes over time
Thermal throttling event logs
Test all exercises in advance:
Run through Episode 0 hands-on activities
Test governor switching on all CPUs
Verify turbostat output format
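The example traces and the governor comparison script listed above both need a way to record frequency over time. One possible starting point is sketched below. The function names `sample_frequency` and `summarize` are hypothetical, and the sketch assumes a file that reports the current frequency in kHz, as scaling_cur_freq does.

```python
import time
from pathlib import Path


def sample_frequency(freq_file, samples=5, interval_s=0.2):
    """Read a kHz frequency file repeatedly; return a list of (elapsed_s, mhz)."""
    trace = []
    start = time.monotonic()
    for i in range(samples):
        khz = int(Path(freq_file).read_text().strip())
        trace.append((time.monotonic() - start, khz / 1000))
        if i < samples - 1:
            time.sleep(interval_s)  # wait between samples, not after the last one
    return trace


def summarize(trace):
    """Return (min, mean, max) frequency in MHz for a trace."""
    mhz = [f for _, f in trace]
    return min(mhz), sum(mhz) / len(mhz), max(mhz)
```

Running `sample_frequency` on /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq once per governor, under the same workload, gives traces that can be saved as the pre-shared example data.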
Day Of:
System warmup: Boot 30 minutes early, run sustained load to establish thermal state
Verify kernel parameters: Confirm all expected sysfs files exist and are readable
Check frequency stability: Run measurement script and verify no throttling occurring
Network setup: Ensure students can access sysfs or have pre-shared results
Other practical aspects
Teaching Approach:
Start with physics (P=CV²f) to build intuition before implementation details
Use real numbers: “Reducing frequency by 20% saves ~35% power” is concrete
Compare to real systems: “Your laptop does this—here’s how to see it”
Hands-on first, then explanation: Let students observe frequency changes via sysfs, then explain what they’re seeing
Lab Exercise Ideas:
Frequency scaling exercise:
Run application at different frequencies
Measure time and energy consumption
Plot performance vs frequency
Governor comparison:
Run identical workload with different governors
Compare frequency transitions and power usage
Discuss trade-offs
Thermal throttling simulation:
Stress-test CPU to trigger throttling
Observe frequency drops via sysfs
Measure performance impact
Power budget simulation:
Given a 100W budget for a 4-socket node
Determine allowable frequency per-socket
Implement via sysfs settings
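For the power budget exercise, students need some model that links socket power to frequency. The sketch below assumes a simple one, static power plus a cubic dynamic term, with made-up coefficients chosen only to fit the 100 W figure in the exercise. The function names and all parameter values are hypothetical; real coefficients should come from measurements on your own hardware.

```python
import math


def allowed_frequency_ghz(node_budget_w, sockets, static_w, k_w_per_ghz3,
                          f_min_ghz, f_max_ghz):
    """Highest per-socket frequency that fits an even split of the node budget.

    Assumed model (not a vendor formula): P_socket = static_w + k * f^3.
    """
    per_socket_w = node_budget_w / sockets
    dynamic_w = per_socket_w - static_w
    if dynamic_w <= 0:
        raise ValueError("budget does not cover static power")
    freq = (dynamic_w / k_w_per_ghz3) ** (1 / 3)
    if freq < f_min_ghz:
        raise ValueError("budget is below the power needed at minimum frequency")
    return min(freq, f_max_ghz)


def perf_pct(freq_ghz, f_max_ghz):
    """Round the frequency cap down to a whole percentage of the maximum."""
    return math.floor(100 * freq_ghz / f_max_ghz)


# Illustrative values only: 9 W static and 2 W/GHz^3 per socket.
cap = allowed_frequency_ghz(100, 4, static_w=9.0, k_w_per_ghz3=2.0,
                            f_min_ghz=1.0, f_max_ghz=3.2)
print(f"cap per socket: {cap:.2f} GHz, about {perf_pct(cap, 3.2)}% of maximum")
```

The percentage is only an approximation of what a setting such as max_perf_pct would need, because that file is defined in terms of performance levels rather than exact frequencies. Students should verify the resulting power with a measurement tool.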
Engaging Discussion Topics:
“Why would you ever use the performance governor?” (Answer: when you have no power budget constraints)
“What happens if you set minimum frequency > maximum?” (Reveal common configuration error)
“Why do different instruction sets have different turbo frequencies?” (AVX-512 power intensity)
“Who controls frequency on your laptop vs supercomputer?” (OS vs administrator)
Interesting questions you might get
Q: “If lower frequency saves power, why not just run at 1 GHz always?”
A: Because frequency directly affects execution time. A compute-bound job that takes 1 hour at 3 GHz takes about 3 hours at 1 GHz. Static (leakage) power is drawn for that whole time, so the power saved per second can be outweighed by the longer runtime, and a tight deadline or per-hour rental charges make the slowdown costlier still. There’s always a trade-off.
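The trade-off in this answer can be made concrete with a small energy-to-solution calculation. This is a sketch under stated assumptions: the job is fully compute-bound, so runtime scales as 1/f, and power follows a hypothetical model of 40 W static plus 2 W/GHz³ dynamic. Neither number comes from a real CPU.

```python
def energy_wh(freq_ghz, static_w=40.0, k=2.0, hours_at_3ghz=1.0):
    """Energy in watt-hours for a compute-bound job under the assumed power model."""
    runtime_h = hours_at_3ghz * 3.0 / freq_ghz  # runtime scales as 1/f
    power_w = static_w + k * freq_ghz ** 3      # static + dynamic (assumed cubic)
    return power_w * runtime_h


for f in (1.0, 2.0, 3.0):
    print(f"{f:.1f} GHz: {energy_wh(f):.0f} Wh")
```

With these illustrative parameters the job uses about 126 Wh at 1 GHz, 84 Wh at 2 GHz, and 94 Wh at 3 GHz. The lowest frequency is the most expensive here because static power is paid for three times as long.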
Q: “Can we predict power from frequency alone?”
A: No. Power depends on: frequency, voltage, workload type (SSE vs AVX-512), memory bandwidth, I/O activity, temperature, and other factors. The P=CV²f equation is simplified; real CPUs have leakage power, memory subsystem power, etc.
Q: “Why does my CPU frequency keep changing when I run the same application?”
A: The ondemand governor is responding to observed utilization. If your application has I/O waits or irregular parallelism, frequency bounces around. This is normal and usually fine, but can be problematic for latency-critical applications.
Q: “What’s the difference between intel_pstate and acpi-cpufreq?”
A: intel_pstate is newer, firmware-independent, with finer control. acpi-cpufreq is older, relies on ACPI firmware tables, more portable. Most modern Intel systems use intel_pstate.
Q: “Does limiting frequency affect reliability?”
A: No—if anything, it improves reliability by reducing heat/stress. You’re operating within the CPU’s safe power envelope, just at a lower level.
Q: “Can I change frequency of individual cores?”
A: Newer Intel CPUs (Skylake+) support per-core frequency with HWP. Older CPUs change frequency for all cores together. Check your CPU documentation.
Q: “What’s the relationship between power management and performance monitoring?”
A: Power management changes CPU frequency, which affects how performance counters are interpreted. A count of 2M cycles corresponds to 2 ms at 1 GHz but only 1 ms at 2 GHz. Students must understand this relationship for accurate performance analysis.
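The conversion from cycle counts to wall-clock time can be shown in a few lines. The helper below is a hypothetical illustration. It assumes the cycle counter ticks at the core's current frequency and that you know how many cycles were spent at each frequency.

```python
def elapsed_seconds(segments):
    """Sum wall-clock time over (cycles, freq_hz) segments, one per frequency."""
    return sum(cycles / freq_hz for cycles, freq_hz in segments)


# The same 2M cycles at two fixed frequencies, then split across both.
at_1ghz = elapsed_seconds([(2_000_000, 1e9)])
at_2ghz = elapsed_seconds([(2_000_000, 2e9)])
mixed = elapsed_seconds([(1_000_000, 1e9), (1_000_000, 2e9)])
print(f"{at_1ghz * 1e3:.2f} ms, {at_2ghz * 1e3:.2f} ms, {mixed * 1e3:.2f} ms")
```

The mixed case shows why a single nominal frequency is not enough once the governor changes frequency during a run.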
Q: “Can we use power management for security (preventing side-channel attacks)?”
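A: Potentially, but it cuts both ways. Variable frequency can add noise to timing measurements, yet frequency scaling can itself leak information: the Hertzbleed attack showed that data-dependent frequency changes can be observed through timing. This is active research, not standard practice.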
Typical pitfalls
Misconception 1: “Frequency reduction saves power proportionally”
The Problem: Students assume “50% frequency = 50% power”
Reality: Power ∝ V² × f, not linear. Reducing frequency by 50% also allows a lower voltage, yielding ~60-70% power saving
How to catch it: Show mathematical derivation and empirical data
Teaching tip: Use a concrete example: “Reducing frequency from 3.0 GHz to 2.4 GHz (20%) saves ~35% power”
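The relation Power ∝ V² × f can be evaluated directly to show students why the saving is not proportional. The sketch below covers dynamic power only and treats capacitance as constant. The voltage reductions are assumed values for illustration, since the real voltage-frequency curve is specific to each CPU.

```python
def relative_power(freq_ratio, voltage_ratio=1.0):
    """Dynamic power relative to baseline under P = C * V^2 * f (C held constant)."""
    return voltage_ratio ** 2 * freq_ratio


def savings_pct(freq_ratio, voltage_ratio=1.0):
    """Percentage of dynamic power saved relative to the baseline."""
    return 100.0 * (1.0 - relative_power(freq_ratio, voltage_ratio))


# A 20% frequency reduction combined with three assumed voltage scenarios.
for label, v in (("voltage unchanged", 1.0),
                 ("voltage -10% (assumed)", 0.9),
                 ("voltage -20% (assumed)", 0.8)):
    print(f"{label}: {savings_pct(0.8, v):.1f}% saving")
```

Under these assumptions the saving ranges from 20% with voltage unchanged to about 49% when voltage falls in step with frequency. A 10% voltage reduction gives about 35%.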
Misconception 2: “The performance governor is always best”
The Problem: Students assume maximum frequency = best performance always
Reality: The performance governor wastes power during I/O and memory waits. For loosely coupled jobs, ondemand often delivers the same performance with 15-20% power savings
How to catch it: Run actual measurements comparing governors on their workload
Teaching tip: Emphasize that performance and energy efficiency don’t always conflict
Misconception 3: “Turbo Boost is free frequency”
The Problem: Students think turbo comes without power cost
Reality: Turbo operates within the same power envelope as non-turbo. Reaching the highest turbo frequency requires either (a) fewer active cores, or (b) accepting a lower turbo frequency when all cores are active
How to catch it: Ask “If all cores boost to 4.5 GHz, what’s the total power?” and lead to power limit reasoning
Teaching tip: Explain that turbo is a reallocation of power budget, not additional power
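The question about all cores boosting to 4.5 GHz can be worked through with a toy package power model. Everything numeric here is an assumption made for illustration: a 16-core package, a 150 W power limit, 20 W of uncore power, and per-core dynamic power of 0.15 W/GHz³. The function names are hypothetical.

```python
import math


def core_power_w(freq_ghz, k=0.15):
    """Per-core dynamic power under the assumed cubic model."""
    return k * freq_ghz ** 3


def package_power_w(active_cores, freq_ghz, uncore_w=20.0, k=0.15):
    """Total package power with the given number of cores at one frequency."""
    return uncore_w + active_cores * core_power_w(freq_ghz, k)


def max_cores_at(freq_ghz, limit_w, uncore_w=20.0, k=0.15):
    """How many cores can run at freq_ghz without exceeding the package limit."""
    return math.floor((limit_w - uncore_w) / core_power_w(freq_ghz, k))


print(f"16 cores at 4.5 GHz: {package_power_w(16, 4.5):.0f} W")
print(f"cores that fit in 150 W at 4.5 GHz: {max_cores_at(4.5, 150.0)}")
```

In this toy model 16 cores at 4.5 GHz would need roughly 239 W, well over the limit, and only 9 cores fit within 150 W. This matches the budget reallocation argument: high turbo bins are available only when some cores are idle.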
Misconception 4: “Thermal throttling means your CPU is broken”
The Problem: Students panic when they observe frequency drops under load
Reality: Thermal throttling is a normal protection mechanism. Brief throttling under heavy load is expected; only persistent throttling indicates that cooling is genuinely insufficient
How to catch it: Distinguish between normal thermal management (fine) and persistent throttling (cooling insufficient)
Teaching tip: Show temperature vs frequency correlation; explain BIOS thermal limits
Misconception 5: “Changing frequency requires restarting the application”
The Problem: Students think frequency must be set before execution
Reality: Modern CPUs change frequency while running. Applications don’t need to be aware
How to catch it: Demo: change frequency via sysfs while application runs; show it responding to frequency change
Teaching tip: Use the watch command to continuously display frequency while the application runs
Misconception 6: “HWP is slower than OS control”
The Problem: Students assume hardware control is less responsive
Reality: HWP responds in ~1 microsecond vs ~50 microseconds for OS. Hardware wins decisively
How to catch it: Show latency numbers; explain hardware has specialized logic
Teaching tip: Emphasize that HWP enables low-latency adaptation that OS control cannot achieve
Misconception 7: “Scaling governors are mutually exclusive”
The Problem: Students think you must pick one globally
Reality: Different cores can use different governors (with limitations). System-wide policies are typical but not enforced
How to catch it: Show sysfs structure: /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
Teaching tip: Clarify that a tool such as cpupower frequency-set -g sets the governor on all CPUs, but individual control is possible through each CPU’s scaling_governor file
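Per-CPU governor files can also be written from a script, which makes the per-core structure visible to students. The helper below is a hypothetical sketch: `set_governor` is not an existing tool, it does not check the requested name against scaling_available_governors, and on a real system it needs root privileges.

```python
from pathlib import Path


def set_governor(governor, cpu_root="/sys/devices/system/cpu", cpus=None):
    """Write `governor` to scaling_governor for the selected CPUs.

    cpus=None means every CPU found; returns the sorted list of CPU ids updated.
    """
    updated = []
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        cpu = int(gov_file.parent.parent.name[3:])  # "cpu12" -> 12
        if cpus is not None and cpu not in cpus:
            continue
        gov_file.write_text(governor)
        updated.append(cpu)
    return sorted(updated)
```

Calling `set_governor("powersave", cpus={0, 1})` would change only those two CPUs and leave the rest on their current governor. Whether mixed governors behave as expected depends on the scaling driver and on how CPUs are grouped into policies.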
Misconception 8: “Overclocking and frequency scaling are related”
The Problem: Students confuse DVFS with overclocking
Reality: DVFS operates within CPU specifications. Overclocking exceeds them (risky). Different concepts
How to catch it: Clarify: DVFS = manufacturer-supported, safe. Overclocking = pushing beyond spec, risky
Teaching tip: Emphasize that we’re teaching manufacturer-supported features only