Energy calculation uses stale resource utilization due to refresh ordering #2446
Description
Kepler Version
0.10.0 or later (Current/Supported)
Bug Description
Hi, while working on energy measurement and validating results, I noticed an inconsistency in how resource utilization is accounted for during power calculation.
From my measurements and testing, the current implementation computes energy based on stale resource utilization data.
In firstReading, resources are refreshed after the initial node read:
func (pm *PowerMonitor) firstReading(newSnapshot *Snapshot) error {
    // First read for node
    if err := pm.firstNodeRead(newSnapshot.Node); err != nil {
        return fmt.Errorf(nodePowerError, err)
    }
    // Resources are refreshed AFTER the initial node read
    if err := pm.resources.Refresh(); err != nil {
        pm.logger.Error("snapshot rebuild failed to refresh resources", "error", err)
        return err
    }
    // ...
}
However, in calculatePower, node power is computed before refreshing resources:
func (pm *PowerMonitor) calculatePower(prev, newSnapshot *Snapshot) error {
    // Node power is calculated BEFORE resources are refreshed
    if err := pm.calculateNodePower(prev.Node, newSnapshot.Node); err != nil {
        return fmt.Errorf(nodePowerError, err)
    }
    if err := pm.resources.Refresh(); err != nil {
        pm.logger.Error("snapshot rebuild failed to refresh resources", "error", err)
        return err
    }
    // ...
}
Problem
Because pm.resources.Refresh() happens after calculateNodePower, the energy calculation is effectively based on the previous snapshot’s resource utilization, not the current one.
In practice, this introduces a one-snapshot lag in energy accounting, which becomes a measurable inaccuracy whenever resource usage changes between snapshots.
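A minimal, self-contained sketch of one possible fix: refresh resources before computing node power, mirroring the ordering that firstReading already uses. All types below are illustrative stubs, not Kepler's actual code; the toy calculateNodePower errors out when utilization is stale just to make the required ordering explicit.

```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative stubs standing in for the real Kepler types.
type Node struct{}
type Snapshot struct{ Node *Node }

type resourceTracker struct{ refreshed bool }

func (r *resourceTracker) Refresh() error { r.refreshed = true; return nil }

type PowerMonitor struct{ resources *resourceTracker }

// In this toy model, node power can only be computed from refreshed
// utilization; otherwise the ordering bug surfaces as an error.
func (pm *PowerMonitor) calculateNodePower(prev, cur *Node) error {
	if !pm.resources.refreshed {
		return errors.New("stale resource utilization")
	}
	return nil
}

// Proposed ordering: Refresh first, then calculate node power, so the
// energy calculation sees the current interval's utilization.
func (pm *PowerMonitor) calculatePower(prev, cur *Snapshot) error {
	if err := pm.resources.Refresh(); err != nil {
		return err
	}
	return pm.calculateNodePower(prev.Node, cur.Node)
}

func main() {
	pm := &PowerMonitor{resources: &resourceTracker{}}
	err := pm.calculatePower(&Snapshot{Node: &Node{}}, &Snapshot{Node: &Node{}})
	fmt.Println("calculatePower error:", err)
}
```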
Steps to Reproduce
- Start the system with power monitoring enabled.
- Establish a baseline: let the system run under low or idle resource utilization and record a few consecutive energy/power measurements.
- Introduce a sudden change in resource utilization, e.g., start a CPU-intensive workload (stress test).
- Observe power/energy measurements across snapshots: compare the timestamp when resource utilization increases with the timestamp when the energy measurement reflects the increase.
- Identify the lag. You will notice that:
  - Resource utilization increases at snapshot N
  - Energy measurement reflects this increase only at snapshot N+1

Energy measurements lag behind actual resource utilization by one snapshot cycle.
Expected Behavior
Energy measurements should reflect resource utilization changes within the same snapshot cycle.
Environment
- OS: Ubuntu 22.04.5 LTS (Jammy Jellyfish)
- Kubernetes Version: v1.33.5+k3s1 (k3s single-node setup)
- Container Runtime: containerd 2.1.4 (k3s)
- Hardware: Intel Xeon Silver 4309Y CPU (RAPL supported), 32 CPUs
- Deployment Method: Kubernetes DaemonSet (Kepler, 1 node)
Logs and Error Messages
From the collected data, node-active-energy appears to be computed using the previous interval's node-CPUUsageRatio rather than the current one.
For example:
- At 2026-03-23T18:33:30.960Z: node-CPUUsageRatio ≈ 0.007805
- At 2026-03-23T18:33:35.978Z: node-CPUUsageRatio ≈ 0.020210 and node-rapl-delta-energy ≈ 607.13
If energy were computed using the current ratio: Expected ≈ 607.13 × 0.020210 ≈ 12.27
However, the observed node-active-energy is ≈ 4.74, which matches: 607.13 × 0.007805 ≈ 4.74
This indicates that node-active-energy at 2026-03-23T18:33:35.978Z is derived from node-CPUUsageRatio at 2026-03-23T18:33:30.960Z, i.e., from the previous interval.
Additionally, a request was issued at:
2026-03-23 18:33:32.408387 UTC
which falls between these two samples. This causes the CPU usage increase to appear in the later snapshot, while the energy attribution is still based on the previous (lower) CPU usage, leading to stale accounting.