-
Notifications
You must be signed in to change notification settings - Fork 5
Open
Description
distributed currently runs (simplified):
knock_knock = KnockKnock(polling_interval_micros=1000)
knock_knock.start()
cumulative_gil_contention = 0
prev_ts = now = time()
def system_monitor_periodic_callback():
global cumulative_gil_contention, prev_ts, now
prev_ts, now = now, time()
# snip: fairly long list of pure-python code
cumulative_gil_contention += (now - prev_ts) * knock_knock.contention_metric
knock_knock.reset_contention_metric() This can be problematic, as a user's function can acquire the GIL either between the sampling of time() and the sampling of contention_metric, or between sampling contention_metric and calling reset_contention_metric().
This is a screenshot of a torture test (code here) where:
- phase 1: task runs a C extension that completely blocks the GIL for 1 second; wait 10~100ms for the next task; repeat
- phase 2: task runs a C extension that completely blocks the GIL for 4 seconds; wait 10~100ms for the next task; repeat
- phase 3: task runs a C extension that completely blocks the GIL for 15 seconds; wait 10~100ms for the next task; repeat
- phase 4: task runs a hot pure-python for loop constantly
Those points at 0 in the middle of phases 1, 2, and 3 are incorrect.
Proposed design
GILKnocker should offer an API to read its own cumulative metric. This metric should hold the GIL between the moment it samples the timestamp and the moment it samples the GIL use. reset_contention_metric should not clear it.
milesgranger
Metadata
Metadata
Assignees
Labels
No labels
