Commit fd402bb

[MemProf] Fix callee guid for non-leaf frame (llvm#172502)
When matching callsite profile info, we synthesize VP metadata for matched indirect calls from the CalleeGuids recorded with the CallSite profile info. However, those are the callee GUIDs of the leaf-most frame in the callsite. When we match only a portion of the frames that does not include the leaf, the callee GUID should instead come from the frame just before the matched portion, i.e. the next frame toward the leaf, whose function is the callee. This addresses the case where indirect call promotion was applied in the profiled binary during SamplePGO matching in a ThinLTO backend, where we didn't have VP metadata.
1 parent 99d63c6 commit fd402bb
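
The selection rule described in the commit message can be sketched in isolation. This is a minimal standalone illustration, not the LLVM code: `Frame`, `CallSiteInfo`, and `calleeGuidsForSlice` are hypothetical stand-ins, and frames are assumed to be ordered leaf-most first, as in the profile. Note that the patch tests `Idx == 1` and reads `CS.Frames[Idx - 2]` because `Idx` has already been post-incremented; the sketch uses the slice start index directly, so the same checks appear as `Start == 0` and `Frames[Start - 1]`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the MemProf types; not the LLVM definitions.
using GUID = std::uint64_t;

struct Frame {
  GUID Function; // GUID of the function this frame belongs to
};

struct CallSiteInfo {
  std::vector<Frame> Frames;     // ordered leaf-most first
  std::vector<GUID> CalleeGuids; // callees recorded for the leaf-most frame
};

// Returns the callee GUIDs for the slice of CS.Frames that begins at Start.
// The full slice (Start == 0) uses the recorded CalleeGuids. Any shorter
// slice omits the leaf, so its callee is the function of the frame just
// before the slice.
std::vector<GUID> calleeGuidsForSlice(const CallSiteInfo &CS,
                                      std::size_t Start) {
  assert(Start < CS.Frames.size() && "slice must be non-empty");
  if (Start == 0)
    return CS.CalleeGuids;
  return {CS.Frames[Start - 1].Function};
}
```

For a two-frame callsite such as the one added to the test (a leaf frame inlined into its caller), the slice starting at index 1 would get the leaf frame's function GUID as its single callee, rather than the recorded CalleeGuids.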

2 files changed: +27 -4 lines changed

llvm/lib/Transforms/Instrumentation/MemProfUse.cpp

Lines changed: 8 additions & 2 deletions
@@ -702,8 +702,14 @@ readMemprof(Module &M, Function &F, IndexedInstrProfReader *MemProfReader,
     for (auto &StackFrame : CS.Frames) {
       uint64_t StackId = computeStackId(StackFrame);
       ArrayRef<Frame> FrameSlice = ArrayRef<Frame>(CS.Frames).drop_front(Idx++);
-      ArrayRef<GlobalValue::GUID> CalleeGuids(CS.CalleeGuids);
-      LocHashToCallSites[StackId].push_back({FrameSlice, CalleeGuids});
+      // The callee guids for the slice containing all frames (due to the
+      // increment above Idx is now 1) comes from the CalleeGuids recorded in
+      // the CallSite. For the slices not containing the leaf-most frame, the
+      // callee guid is simply the function GUID of the prior frame.
+      LocHashToCallSites[StackId].push_back(
+          {FrameSlice, (Idx == 1 ? CS.CalleeGuids
+                                 : ArrayRef<GlobalValue::GUID>(
+                                       CS.Frames[Idx - 2].Function))});
 
       ProfileHasColumns |= StackFrame.Column;
       // Once we find this function, we can stop recording.

llvm/test/Transforms/PGOProfile/memprof_annotate_indirect_call.test

Lines changed: 19 additions & 2 deletions
@@ -30,6 +30,17 @@ HeapProfileRecords:
           - { Function: _Z3barv, LineOffset: 3, Column: 5, IsInlineFrame: true }
           - { Function: _Z3foov, LineOffset: 10, Column: 7, IsInlineFrame: false }
         CalleeGuids: [0x3456, 0x4567]
+      # This set of frames is for the second indirect call below. We should match
+      # the interior non-leaf frame with the call. In this case the synthesized
+      # VP metadata should have _Z3xyzv (GUID 15367485273663173088 aka
+      # -3079258800046378528) as the target, not the one from CalleeGuids which
+      # is the callee of the leafmost frame. This simulates the case where sample
+      # PGO performed ICP during matching in the profiled compile, without using
+      # VP metadata.
+      - Frames:
+          - { Function: _Z3xyzv, LineOffset: 1, Column: 1, IsInlineFrame: true }
+          - { Function: _Z3barv, LineOffset: 4, Column: 10, IsInlineFrame: false }
+        CalleeGuids: [0x5678]
 ...
 
 ;--- basic.ll
@@ -38,12 +49,17 @@ entry:
   %fp = alloca ptr, align 8
   %0 = load ptr, ptr %fp, align 8
   call void %0(), !dbg !5
-; CHECK-ENABLE: call void %0(), {{.*}} !prof !6
+; CHECK-ENABLE: call void %0(), {{.*}} !prof ![[VP1:[0-9]+]]
 ; CHECK-DISABLE-NOT: !prof
+  call void %0(), !dbg !6
+; CHECK-ENABLE: call void %0(), {{.*}} !prof ![[VP2:[0-9]+]]
   ret void
 }
 
-; CHECK-ENABLE: !6 = !{!"VP", i32 0, i64 6, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1, i64 4660, i64 1, i64 9029, i64 1, i64 13398, i64 1, i64 17767, i64 1}
+; CHECK-ENABLE: ![[VP1]] = !{!"VP", i32 0, i64 6, i64 1311768467463790320, i64 1, i64 2541551405711093505, i64 1, i64 4660, i64 1, i64 9029, i64 1, i64 13398, i64 1, i64 17767, i64 1}
+;; The second call above gets a single target synthesized from the callee frame,
+;; to the GUID of _Z3xyzv, see comments in the profile above.
+; CHECK-ENABLE: ![[VP2]] = !{!"VP", i32 0, i64 1, i64 -3079258800046378528, i64 1}
 
 !llvm.module.flags = !{!2, !3}
 
@@ -53,6 +69,7 @@ entry:
 !3 = !{i32 2, !"Debug Info Version", i32 3}
 !4 = distinct !DISubprogram(name: "bar", linkageName: "_Z3barv", scope: !1, file: !1, line: 1, unit: !0)
 !5 = !DILocation(line: 4, column: 5, scope: !4)
+!6 = !DILocation(line: 5, column: 10, scope: !4)
 
 ;--- fdo_conflict.yaml
 ---
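
The test refers to the same GUIDs in three notations: hex in the YAML `CalleeGuids`, unsigned decimal in the profile comment, and signed decimal in the `!prof` CHECK lines. The sketch below checks that these agree. The helper name `asUnsignedGuid` is hypothetical and only serves the illustration; it relies on signed-to-unsigned conversion being defined modulo 2^64 in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reinterpret a signed 64-bit value, as it appears in
// the CHECK lines, as the unsigned 64-bit GUID it encodes. The conversion
// wraps modulo 2^64.
constexpr std::uint64_t asUnsignedGuid(std::int64_t Printed) {
  return static_cast<std::uint64_t>(Printed);
}

// The _Z3xyzv target in the VP2 CHECK line is the same GUID that the
// profile comment gives in unsigned form.
static_assert(asUnsignedGuid(-3079258800046378528LL) ==
                  15367485273663173088ULL,
              "signed and unsigned spellings of the _Z3xyzv GUID agree");

// Hex CalleeGuids from the YAML show up as decimal entries in the VP1 line.
static_assert(asUnsignedGuid(13398) == 0x3456, "0x3456 is 13398");
static_assert(asUnsignedGuid(17767) == 0x4567, "0x4567 is 17767");
```

The `static_assert` lines are evaluated at compile time, so the file compiling at all is the check.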
