Quiz: HPC Power Management (Episodes 0 & 1)
Episode 0: Power Management Hardware Knobs - Fundamentals and Concepts
Multiple choice, single answer
Power Physics Fundamentals:
What is the primary component of CPU power consumption that scales with frequency?
A) Leakage power only
B) Dynamic power (CV²f)
C) Thermal power
D) I/O power
According to the power equation P = CV²f + I_leak·V, which change provides the greatest power savings?
A) Reducing frequency by 20%
B) Reducing voltage by 20%
C) Reducing capacitance by 20%
D) Increasing leakage current by 20%
Why is dynamic voltage and frequency scaling (DVFS) more effective at lower frequencies?
A) Lower frequencies consume less energy per operation
B) Power savings are quadratic with voltage reduction at lower frequencies
C) Cooling is more efficient
D) Memory bandwidth increases
CPU Frequency Scaling (P-states):
What is the primary purpose of P-states in CPU power management?
A) Control CPU temperature
B) Select the frequency and voltage for CPU cores
C) Manage idle power consumption
D) Coordinate with GPU frequency
How many P-states do Intel processors typically support?
A) 2-5
B) 10-15
C) 20-40
D) 100+
What does P0 represent in the P-state hierarchy?
A) Minimum frequency
B) Turbo boost frequency
C) Baseline/nominal frequency
D) Shutdown state
Idle Power Management (C-states):
Which C-state represents the CPU actively executing instructions?
A) C0
B) C1
C) C2
D) C3
Approximately what percentage of power can be saved by transitioning from C0 to C3?
A) 5-10%
B) 20-30%
C) 50%+
D) 100%
What is the trade-off when using deeper C-states (C3+)?
A) Higher frequency requirements
B) Increased latency to wake up and resume execution
C) More power consumption
D) Inability to access memory
Thermal and System Power Management:
What are T-states used for in CPU power management?
A) Temperature sensing
B) Reducing frequency during thermal stress (thermal throttling)
C) Enabling turbo boost
D) Managing core voltage
What does S5 represent in the system power state (S-state) hierarchy?
A) System running state
B) System suspended
C) System fully powered off
D) System in thermal shutdown
ACPI and Scaling Drivers:
Which of the following is NOT a scaling driver mentioned in Episode 0?
A) acpi-cpufreq
B) intel_pstate
C) amd-cpufreq
D) gpu-pstate
What is the primary role of a scaling governor?
A) Measure CPU temperature
B) Decide which P-state (frequency) to use based on workload conditions
C) Manage system fans
D) Control memory frequency
Why Power Management Matters in HPC:
Approximately what share of a data center's operational costs is attributable to power consumption (facility-level)?
A) 5-10% of operational costs
B) 20-30% of operational costs
C) 50-70% of operational costs
D) 90%+ of operational costs
Approximately what fraction of power in a large HPC system goes to cooling?
A) 10%
B) 25%
C) 50%
D) 75%
Which of the following is NOT a benefit of power management in HPC?
A) Reduced electricity costs
B) Improved multi-tenancy isolation
C) Guaranteed faster execution time
D) Decreased environmental impact
Conceptual questions
Power Equation Analysis: A processor running at 2.0 GHz with voltage 0.8V consumes P₁ Watts of dynamic power. If you reduce frequency to 1.6 GHz and can also reduce voltage to 0.7V, calculate the power reduction ratio P_new/P₁. Assume capacitance C remains constant. Show all steps and explain why voltage scaling is crucial.
Workload and Power Interaction: Consider two workloads: (1) an all-reduce collective communication pattern (memory-bound) and (2) a dense matrix multiplication (compute-bound). For each workload, explain:
Whether frequency reduction will hurt performance significantly
Why it might be safe/unsafe to reduce frequency
What monitoring would you use to validate your strategy
Power Management Strategy Design: You are responsible for power management on a 1000-node HPC cluster where 30% of jobs are batch (deadline-loose), 50% are interactive (need low latency), and 20% are GPU-accelerated (compute-intensive). Design a power management strategy that:
Identifies which power management knobs to use
Proposes different settings for each job type
Explains trade-offs between performance and energy
Episode 1: Power Management Implementation and Runtime Systems
Multiple choice, single answer
Scaling Drivers and Interfaces:
Which driver provides more responsive CPU frequency control on modern Intel processors?
A) acpi-cpufreq
B) intel_pstate
C) ondemand-governor
D) They are equivalent
What does the max_perf_pct parameter in intel_pstate sysfs control?
A) Maximum percentage of cores to activate
B) Maximum allowed P-state as a percentage of maximum frequency
C) Maximum percentage of memory bandwidth
D) Maximum core temperature
To disable turbo boost on an intel_pstate system, which sysfs file should be written to?
A) scaling_max_freq
B) turbo_boost
C) no_turbo
D) turbo_disabled
Scaling Governors (Policies):
Which scaling governor always runs at maximum frequency?
A) powersave
B) performance
C) ondemand
D) conservative
What is the main advantage of the ondemand governor compared to conservative?
A) Lower power consumption
B) Faster response to load increases
C) Better for real-time systems
D) Requires less configuration
Which governor allows direct user/application control of CPU frequency?
A) performance
B) powersave
C) userspace
D) ondemand
In the ondemand governor, what does up_threshold control?
A) CPU utilization threshold for scaling frequency up
B) Maximum frequency cap
C) Minimum frequency floor
D) Temperature threshold
Hardware Frequency Control (MSR):
What is the MSR (Model-Specific Register) address for IA32_PERF_CTL?
A) 0x199
B) 0x770
C) 0x610
D) 0x620
Which bit field in MSR 0x199 specifies the target P-state?
A) Bits [7:0]
B) Bits [15:8]
C) Bits [31:16]
D) Bits [63:32]
Intel Turbo Boost:
What is the key difference between SSE and AVX-512 boost frequencies?
A) AVX-512 has the highest boost frequency
B) SSE has the highest boost frequency due to lower power density
C) They are identical
D) AVX-512 doesn’t support boost
Why does Intel implement instruction-set-specific frequency levels?
A) To comply with firmware limitations
B) To allow higher compute throughput within power and thermal budgets
C) To maintain thermal stability
D) To prevent privilege escalation
Frequency Transition Latency:
What is a typical frequency transition latency on modern Intel processors?
A) 1-5 microseconds
B) 5-20 microseconds
C) 100-500 microseconds
D) 1-10 milliseconds
For which type of application is frequency transition latency most critical?
A) Batch HPC jobs (hours-long)
B) Real-time embedded systems
C) General server workloads
D) Interactive web applications
GPU Frequency Management:
What command-line tool is used to control NVIDIA GPU frequency?
A) rocm-smi
B) nvidia-smi
C) gpu-control
D) pstate-set
Which AMD tool is used for GPU frequency management on AMD GPUs?
A) nvidia-smi
B) rocm-smi
C) amd-power
D) frequency-control
What is the typical frequency granularity for NVIDIA GPUs?
A) 1 MHz
B) 5-10 MHz
C) 25-50 MHz
D) 100+ MHz
Hardware P-State (HWP) and SpeedShift:
How does HWP differ from OS-controlled frequency scaling?
A) HWP is slower
B) Hardware autonomously selects P-states within OS-specified range
C) HWP only works for memory
D) They are the same thing
What is the approximate latency improvement of HWP over OS control?
A) 2-5× faster
B) 5-10× faster
C) 10-100× faster
D) HWP is actually slower
Energy-Performance Preference (EPP):
What does MSR IA32_ENERGY_PERF_BIAS (0x1B0) allow?
A) Setting exact frequency values
B) Specifying energy-performance trade-off preference (0-15 scale)
C) Disabling frequency scaling
D) Monitoring power consumption
On modern Intel (Skylake+), which MSR provides finer-grained EPP control with 0-255 scale?
A) 0x199
B) 0x620
C) 0x774
D) 0x1B0
CPU Uncore Frequency:
Approximately what percentage of CPU chip area does the uncore subsystem consume?
A) 10%
B) 20%
C) 30%
D) 50%
What is MSR_UNCORE_RATIO_LIMIT used for?
A) Controlling core frequency only
B) Setting limits on uncore (shared subsystem) frequency
C) Measuring power consumption
D) Detecting thermal issues
Workload Characterization:
Which workload type would most benefit from frequency reduction without performance loss?
A) Dense matrix multiplication
B) Sparse linear solver (memory-latency-bound)
C) Real-time signal processing
D) Latency-sensitive trading system
What metric helps predict whether a workload is compute-bound or memory-bound?
A) Latency
B) Throughput
C) Arithmetic intensity (operations per memory access)
D) Frequency
Intel RAPL Power Capping:
How many power domains does Intel RAPL typically support?
A) 1-2
B) 2-3
C) 3-5
D) 5+
Which RAPL domain is specific to server architectures?
A) Package
B) Core (PP0)
C) DRAM
D) Graphics (PP1)
What are the two time windows in the Intel RAPL Package domain?
A) 1 ms and 10 ms
B) Short (~1.2× TDP, ms) and Long (~TDP, seconds)
C) 1 second and 10 seconds
D) 100 ms and 1 second
What does MSR_PKG_POWER_LIMIT (0x610) control?
A) Current power consumption
B) Maximum and minimum frequency
C) Power capping limits and time windows
D) Thermal shutdown temperature
Case Studies and Advanced Platforms:
In the Cascade Lake case study, what percentage of CPU energy was saved with frequency scaling?
A) 5-10%
B) 18%
C) 30%
D) 50%
Why is power management challenging on Grace Hopper?
A) No frequency scaling available
B) Multiple power domains (CPU, GPU, interconnects) require coordination
C) GPU power cannot be measured
D) Frequency is fixed at 2.2 GHz
On RIKEN Fugaku’s A64FX, what does FPU elimination in ECO mode do?
A) Disables floating-point calculations entirely
B) Reduces frequency by 50%
C) Uses one of two FPU pipelines only, reducing power
D) Moves computations to GPU
Runtime Systems and Strategies:
What does a power-capping runtime system do?
A) Forces all jobs to use the same frequency
B) Ensures total node power doesn’t exceed a limit while maximizing performance
C) Measures power for accounting purposes only
D) Disables turbo boost
Which power management strategy is most suitable for a tightly power-constrained HPC facility?
A) Fixed frequency for all jobs
B) Per-application tuning
C) Dynamic runtime control with power budgeting
D) No power management (always maximum)
What is the typical energy savings range for dynamic runtime power management?
A) 5-10%
B) 10-20%
C) 20-40%
D) 50%+
Coding and analysis questions
MSR-Based Frequency Control: The IA32_PERF_CTL register (MSR 0x199) controls CPU frequency. Assume a system has P-states 0-39, where:
P0 = 3.8 GHz (turbo)
P1 = 3.6 GHz (nominal)
P39 = 0.8 GHz (minimum)
Given that target P-state is specified in bits [15:8]:
a) Write the MSR value in hex to set CPU to P24 (2.0 GHz)
b) Explain how to derive the current frequency from the P-state number
c) Design pseudocode to linearly scale frequency from current to 1.8 GHz over 10 steps
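The bit manipulation behind this question can be sketched in Python. This is a minimal illustration that follows the question's simplified model (a P-state index stored in bits [15:8]); on real Intel hardware that field holds a target ratio rather than a P-state index. The helper names `encode_perf_ctl`, `decode_perf_ctl`, and `ramp` are hypothetical, and the usage values are deliberately not the ones asked for above.

```python
from __future__ import annotations


def encode_perf_ctl(pstate: int) -> int:
    """Place a P-state index in bits [15:8], per the question's simplified model."""
    if not 0 <= pstate <= 0xFF:
        raise ValueError("P-state index must fit in 8 bits")
    return (pstate & 0xFF) << 8


def decode_perf_ctl(value: int) -> int:
    """Extract bits [15:8] from a register value."""
    return (value >> 8) & 0xFF


def ramp(start: int, target: int, steps: int) -> list[int]:
    """Evenly spaced P-state indices from start to target; the last entry is target."""
    return [round(start + (target - start) * i / steps) for i in range(1, steps + 1)]


# Illustrative values only (not the quiz's P24 case).
print(hex(encode_perf_ctl(10)))   # 0xa00
print(decode_perf_ctl(0x0A00))    # 10
print(ramp(10, 20, 5))            # [12, 14, 16, 18, 20]
```

Actually writing such a value to an MSR requires privileged access (for example through the Linux `msr` driver), which this sketch intentionally leaves out.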
Scaling Governor Selection: You have three workloads:
Scientific simulation: CPU-intensive, 8-hour runtime, deadline = 9 hours
Data processing pipeline: 30% communication, 70% compute, continuously running
Interactive visualization: Variable workload, < 100ms latency requirement
For each workload:
a) Recommend a scaling governor (performance, powersave, ondemand, conservative, userspace)
b) Justify your choice with reasoning about workload characteristics
c) Propose specific tuning parameters (e.g., up_threshold, sampling_rate)
Power Equation Application: A CPU core operates at V = 0.9V, f = 2.5 GHz with effective switched capacitance C ≈ 27.7 nF. Dynamic power = CV²f ≈ 56W. Leakage power ≈ 4W. Total P = 60W.
a) If you reduce to f = 2.0 GHz and V can be scaled to 0.8V, calculate new power
b) What is the energy savings per 1-hour job?
c) At $0.10/kWh, what is the annual cost savings if running 500 jobs/day?
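One way to set up this kind of calculation is to scale a measured baseline rather than recompute from C directly. The sketch below assumes the Episode 0 model P = CV²f + I_leak·V with C and I_leak held constant, which is a simplification since real leakage also varies with voltage and temperature. The function names are hypothetical and the usage values are illustrative, not the ones in the question.

```python
def total_power(p_dyn_base, p_leak_base, v_base, f_base, v_new, f_new):
    """Scale baseline dynamic and leakage power to a new (V, f) operating point.

    Assumes P = C*V^2*f + I_leak*V with C and I_leak unchanged.
    """
    p_dyn = p_dyn_base * (v_new / v_base) ** 2 * (f_new / f_base)
    p_leak = p_leak_base * (v_new / v_base)
    return p_dyn + p_leak


def energy_kwh(power_w, hours):
    """Energy in kWh for a constant power draw over the given duration."""
    return power_w * hours / 1000.0


# Illustrative baseline: 40 W dynamic + 5 W leakage at 1.0 V, 3.0 GHz.
p_new = total_power(40.0, 5.0, v_base=1.0, f_base=3.0, v_new=0.9, f_new=2.4)
print(round(p_new, 2))                    # 30.42
print(round(energy_kwh(p_new, 2.0), 5))   # 0.06084
```

Note that a lower frequency may lengthen the job, so an energy comparison should use each configuration's actual runtime.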
Intel RAPL Analysis: Assume you read the following RAPL MSR values:
| Measurement | Value |
|---|---|
| MSR_RAPL_POWER_UNIT (0x606) | 0xA1003 |
| MSR_PKG_ENERGY_STATUS (0x611) at t₀ = 0s | 0x2A5B0E00 |
| MSR_PKG_ENERGY_STATUS (0x611) at t₁ = 60s | 0x3F7F0E00 |
Using the MSR format: energy_unit = 2^(-ESU) Joules, where ESU is the value stored in bits [12:8] of MSR_RAPL_POWER_UNIT
a) Decode MSR_RAPL_POWER_UNIT to get energy unit in Joules
b) Calculate energy consumed between t₀ and t₁
c) Calculate average power during this interval
d) If power was capped at 200W, was it exceeded?
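The decoding steps can be sketched as follows. The field layout used here (power units in bits [3:0], energy status units in bits [12:8], time units in bits [19:16]) is the one documented for MSR_RAPL_POWER_UNIT; the function names are hypothetical, and the example register values are illustrative rather than the ones in the table above.

```python
def rapl_units(power_unit_msr: int) -> dict:
    """Decode the three unit fields of MSR_RAPL_POWER_UNIT into SI values."""
    return {
        "power_w": 1.0 / (1 << (power_unit_msr & 0xF)),
        "energy_j": 1.0 / (1 << ((power_unit_msr >> 8) & 0x1F)),
        "time_s": 1.0 / (1 << ((power_unit_msr >> 16) & 0xF)),
    }


def energy_delta_j(raw_start: int, raw_end: int, energy_unit_j: float) -> float:
    """Energy between two reads of the 32-bit energy counter.

    Masking the difference to 32 bits tolerates a single counter wraparound.
    """
    ticks = (raw_end - raw_start) & 0xFFFFFFFF
    return ticks * energy_unit_j


units = rapl_units(0xA0E03)          # illustrative value, not the quiz's
print(units["energy_j"])             # 6.103515625e-05  (2^-14 J)

# Counter wrapped between the two reads: 0x4000 ticks elapsed.
joules = energy_delta_j(0xFFFFF000, 0x00003000, units["energy_j"])
print(joules, joules / 10.0)         # 1.0 J over a 10 s window -> 0.1 W average
```

Average power is the energy delta divided by the elapsed time, so the sampling interval must be short enough that the counter wraps at most once.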
HWP vs OS Control Comparison: You are optimizing a 10-second latency-sensitive application.
Scenario A: OS-controlled frequency, 50 μs per frequency change, 4 changes needed
Scenario B: HWP hardware-controlled, 2 μs per P-state selection, 4 changes needed
a) Calculate total latency overhead for each scenario
b) Compute performance impact as percentage of 10-second deadline
c) Explain why HWP is preferable for this application
Case Study Analysis - Cascade Lake: From Episode 1, Cascade Lake achieves 18% CPU energy savings with frequency scaling on AVX-512 workloads at arithmetic intensity 8.
a) If a node’s baseline power is 100W CPU + 50W memory, calculate new CPU power with 18% savings
b) Total node power = 350W baseline. Assume 15% node savings. Calculate new total node power
c) For 1000 nodes running 24/7, compute annual energy cost savings at $0.12/kWh
d) Estimate payback period if implementing power management costs $500,000 in software development
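The cost and payback arithmetic in parts c) and d) can be wrapped in two small helpers. This is a sketch assuming a constant per-node saving and 24/7 operation over a 365-day year; the names are hypothetical and the usage numbers are illustrative, not the ones from the question.

```python
HOURS_PER_YEAR = 24 * 365  # assumes 24/7 operation, non-leap year


def annual_cost_savings(watts_saved_per_node, nodes, price_per_kwh):
    """Yearly cost saved for a constant per-node power reduction."""
    kwh_saved = watts_saved_per_node * nodes * HOURS_PER_YEAR / 1000.0
    return kwh_saved * price_per_kwh


def payback_years(upfront_cost, annual_savings):
    """Years until cumulative savings cover the upfront cost."""
    return upfront_cost / annual_savings


# Illustrative: 10 W saved on each of 100 nodes at $0.10/kWh.
savings = annual_cost_savings(10.0, 100, 0.10)
print(round(savings, 2))                          # 876.0
print(round(payback_years(1752.0, savings), 2))   # 2.0
```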
Runtime System Design - Power Budget Allocation: Design a runtime system that allocates a 500W power budget among 4 cores, each running different workload types:
| Core | Workload Type | Compute Intensity | Baseline Power |
|---|---|---|---|
| Core 0 | Compute-bound | 8 ops/byte | 60W |
| Core 1 | Memory-bound | 0.5 ops/byte | 40W |
| Core 2 | I/O-bound | 0.1 ops/byte | 30W |
| Core 3 | Balanced | 2 ops/byte | 50W |
a) Estimate how much frequency reduction each core can tolerate without significant performance loss
b) Design an algorithm to allocate the 500W budget to maximize throughput
c) Write pseudocode for dynamic reallocation if a core becomes idle
d) Propose how to handle thermal constraints (max 90°C core temperature)
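As a starting point for part b), here is one possible greedy heuristic, not a definitive design. It gives every core a guaranteed floor and then hands the remaining budget to cores in order of compute intensity, on the assumption that compute-bound cores lose the most performance when power-limited. The `floor_frac` parameter, the dictionary keys, and the function name are all hypothetical, and the two-core usage example is illustrative.

```python
def allocate(budget_w, cores, floor_frac=0.5):
    """Greedy power allocation.

    cores: list of dicts with 'name', 'baseline_w', and 'intensity' (ops/byte).
    Each core first receives floor_frac * baseline_w; leftover budget then goes
    to the most compute-intensive cores first, capped at each core's baseline.
    """
    alloc = {c["name"]: c["baseline_w"] * floor_frac for c in cores}
    remaining = budget_w - sum(alloc.values())
    if remaining < 0:
        raise ValueError("budget is below the sum of per-core floors")
    for c in sorted(cores, key=lambda c: c["intensity"], reverse=True):
        grant = min(remaining, c["baseline_w"] - alloc[c["name"]])
        alloc[c["name"]] += grant
        remaining -= grant
    return alloc


cores = [
    {"name": "a", "baseline_w": 20.0, "intensity": 4.0},
    {"name": "b", "baseline_w": 10.0, "intensity": 0.2},
]
print(allocate(24.0, cores))  # {'a': 19.0, 'b': 5.0}
```

For part c), a simple approach is to call `allocate` again with the idle core removed from the list, so its share is redistributed to the remaining cores.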
Workload Characterization and Power Optimization: Given profile data for a Lattice Boltzmann Method (LBM) simulation:
| Metric | Value |
|---|---|
| Memory accesses per 1000 cycles | 450 |
| Floating-point operations per 1000 cycles | 1800 |
| L3 cache hit rate | 85% |
| Memory latency (cache miss) | 200 cycles |
| Frequency | 2.5 GHz |
a) Calculate arithmetic intensity (flops per byte accessed)
b) Is this workload compute-bound or memory-bound? Justify.
c) Estimate the performance impact of 20% frequency reduction
d) Propose a power management strategy specific to this workload type
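Parts a) and c) can be explored with two small helpers. The profile above does not state bytes per access, so the sketch assumes 8 bytes (one double-precision value) as a default. The slowdown estimate uses a simple two-part model in which only the compute-bound fraction of runtime scales with 1/f; that fraction is an assumption to be estimated from profiling. Function names are hypothetical and the usage values are illustrative.

```python
def arithmetic_intensity(flops, mem_accesses, bytes_per_access=8):
    """Flops per byte; bytes_per_access=8 assumes double-precision operands."""
    return flops / (mem_accesses * bytes_per_access)


def slowdown_from_freq_cut(freq_cut, compute_fraction):
    """Runtime ratio (new/old) after cutting frequency by freq_cut (0..1).

    Two-part model: the compute-bound fraction stretches by 1/(1 - freq_cut),
    while the memory-bound remainder is assumed unaffected by core frequency.
    """
    return compute_fraction / (1.0 - freq_cut) + (1.0 - compute_fraction)


# Illustrative values, not the LBM profile above.
print(arithmetic_intensity(1000, 250))     # 0.5 flops/byte
print(slowdown_from_freq_cut(0.5, 0.5))    # 1.5x runtime
```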