i#7854 wholesys traces: Invariant checks for hw cxt markers#7880
abhinav92003 merged 9 commits into master
Augments the invariant checker to identify the TRACE_MARKER_TYPE_HARDWARE_EVENT and TRACE_MARKER_TYPE_HARDWARE_CONTEXT_RETURN markers. This involves treating them similarly to the existing TRACE_MARKER_TYPE_KERNEL_EVENT and TRACE_MARKER_TYPE_KERNEL_XFER markers, and keeping track of the enclosed part of the trace in a separate context.
Issue: #7854
    gen_instr(TID_A, /*pc=*/1),
    gen_marker(TID_A, TRACE_MARKER_TYPE_SYSCALL, 123),
    gen_marker(TID_A, TRACE_MARKER_TYPE_SYSCALL_TRACE_START, 123),
    gen_marker(TID_A, TRACE_MARKER_TYPE_HARDWARE_EVENT, 5),
But shouldn't this jump from pc=1 to pc=5 be a discontinuity? How did it get to 5 with no branch or a size=4 instruction?
Some syscalls have TCG discontinuity events with a from_pc like this; I've seen it for nanosleep in particular. Since the TCG traces already show some apparent discontinuities simply from switching threads (currently we don't have a way to identify the tid), I assumed these are not a problem per se.
Added a comment to say: "known discontinuity due to some whole-system event like possibly switching to a different thread"
// Control is expected to return to some PC other than the fallthrough
// of the syscall. This is a known discontinuity at the syscall event itself
// due to some whole-system event like possibly switching to a different
// thread.
But if we can't detect discontinuities from switching threads (this example should probably change TID then?) how can we detect any other discontinuity since it could also switch threads?
Wouldn't a software thread switch go through the kernel though? That should involve a timer interrupt for a preempt or a voluntary switch and go through context switch code. Not understanding this.
> How can we detect any other discontinuity since it could also switch threads?
Other discontinuities that cause an abrupt PC change are different and would show up as an error. This one in particular is at syscalls. My understanding is that, for some of these instances, TCG knows at the syscall itself that control won't be coming back to the syscall fallthrough.
> Wouldn't a software thread switch go through the kernel though?
Correct, and I believe that's what this discontinuity at the syscall is, for voluntary switches.
I'm not following this. Any switch, including a voluntary one, will run some kernel code to actually accomplish the switch. Yet here we have the PC changing suddenly from one instruction to the next in a whole-system trace. That is an error in the trace unless there was a trap or interrupt there and we're now running the handler; I don't see how that can possibly be a software thread switch.
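To make the rule being debated here concrete, below is a minimal sketch of a PC-continuity check in which a jump is accepted only when a branch, the previous instruction's size, or an intervening event marker explains it. The `rec_t` type, the `rec_kind_t` enum, and `find_unexplained_discontinuity` are hypothetical names made up for illustration; this is not the invariant checker's actual code or the real trace record API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified record types (assumption: not the real trace API).
enum class rec_kind_t { INSTR, HW_EVENT_MARKER };

struct rec_t {
    rec_kind_t kind;
    uint64_t pc;    // For INSTR: the instruction's PC. Unused for markers.
    uint64_t size;  // For INSTR: the instruction length in bytes.
    bool is_branch; // For INSTR: whether it may legitimately change the PC.
};

// Returns the index of the first instruction whose PC is not explained by
// the previous instruction's fallthrough, a branch, or an event marker
// seen in between. Returns -1 if there is no such instruction.
int
find_unexplained_discontinuity(const std::vector<rec_t> &trace)
{
    bool have_prev = false;
    bool event_since_prev = false;
    rec_t prev = {};
    for (size_t i = 0; i < trace.size(); ++i) {
        const rec_t &cur = trace[i];
        if (cur.kind == rec_kind_t::HW_EVENT_MARKER) {
            // A trap or interrupt explains the next PC change.
            event_since_prev = true;
            continue;
        }
        if (have_prev && !prev.is_branch && !event_since_prev &&
            cur.pc != prev.pc + prev.size)
            return static_cast<int>(i);
        prev = cur;
        have_prev = true;
        event_since_prev = false;
    }
    return -1;
}
```

Under this sketch, going from pc=1 to pc=5 is accepted if the first instruction has size 4 or if an event marker sits between the two instructions, and is reported otherwise.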
Augments the invariant checker to identify the TRACE_MARKER_TYPE_HARDWARE_EVENT and TRACE_MARKER_TYPE_HARDWARE_CONTEXT_RETURN markers.
This involves treating them similarly to the existing TRACE_MARKER_TYPE_KERNEL_EVENT and TRACE_MARKER_TYPE_KERNEL_XFER markers, and keeping track of the enclosed part of the trace in a separate context.
Also adds unit tests for the various hardware context marker related scenarios.
Issue: #7854
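As a rough illustration of "keeping track of the enclosed part of the trace in a separate context", the sketch below saves the interrupted context on a stack when a hardware event marker is seen and restores it on the matching context-return marker. All names here (`kind_t`, `rec_t`, `context_t`, `check_trace`) are hypothetical and the model is deliberately simplified: it assumes straight-line code with no branches and assumes that control resumes at the interrupted fallthrough PC. It is not the checker's actual implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified record kinds (assumption: not the real marker API).
enum class kind_t { INSTR, HW_EVENT, HW_CONTEXT_RETURN };

struct rec_t {
    kind_t kind;
    uint64_t pc;   // For INSTR: the instruction's PC. Unused for markers.
    uint64_t size; // For INSTR: the instruction length in bytes.
};

// Per-context state: what the checker remembers about the previous instruction.
struct context_t {
    bool have_prev = false;
    uint64_t next_pc = 0; // Expected fallthrough PC of the previous instruction.
};

// Returns an empty string if the trace passes, else a short error description.
std::string
check_trace(const std::vector<rec_t> &trace)
{
    std::vector<context_t> saved; // Interrupted contexts, innermost last.
    context_t cur;
    for (const rec_t &r : trace) {
        switch (r.kind) {
        case kind_t::HW_EVENT:
            // Enter the handler: park the interrupted state and start fresh.
            saved.push_back(cur);
            cur = context_t();
            break;
        case kind_t::HW_CONTEXT_RETURN:
            if (saved.empty())
                return "return without event";
            // Leave the handler: resume checking the interrupted context.
            cur = saved.back();
            saved.pop_back();
            break;
        case kind_t::INSTR:
            if (cur.have_prev && r.pc != cur.next_pc)
                return "pc discontinuity";
            cur.have_prev = true;
            cur.next_pc = r.pc + r.size;
            break;
        }
    }
    return "";
}
```

Using a stack rather than a single saved slot lets nested events (an interrupt arriving inside a handler) be checked the same way, although whether the real checker needs nesting is not stated in this PR.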