Skip to content

Commit 4004957

Browse files
authored
Add HOWTO.md
Add a document explaining the app features. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
1 parent d57af1a commit 4004957

File tree

1 file changed

+215
-0
lines changed

1 file changed

+215
-0
lines changed

HOWTO.md

Lines changed: 215 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,215 @@
1+
The bpfvv app is in early stages of development, and you should expect
2+
bugs, UI inconveniences and significant changes from week to week.
3+
4+
If you're writing BPF programs and you think this tool (or a better
5+
version of it) would be useful, feel free to use it and don't be shy
6+
to report issues and request features via github. Thanks!
7+
8+
# How to use bpfvv
9+
10+
Go here: https://theihor.github.io/bpfvv/
11+
12+
Load a log by pasting it into the text box or choosing a file.
13+
14+
The app expects BPF verifier log of `BPF_LOG_LEVEL1`[^1]. This is a log
15+
that you get when your BPF program has failed verification on load
16+
attempt.
17+
18+
Here is a small example:
19+
```
20+
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
21+
ERROR: Error loading BPF program for usdt___a_out_test_struct_by_val_reg_pair_loc0_2.
22+
Kernel error log:
23+
0: R1=ctx() R10=fp0
24+
; @ bpftrace.bpf.o:0
25+
0: (b7) r2 = 1 ; R2_w=1
26+
1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1
27+
2: (79) r3 = *(u64 *)(r1 +32) ; R1=ctx() R3_w=scalar()
28+
3: (07) r3 += -16 ; R3_w=scalar()
29+
4: (bf) r1 = r10 ; R1_w=fp0 R10=fp0
30+
5: (07) r1 += -8 ; R1_w=fp-8
31+
6: (b7) r2 = 16 ; R2_w=16
32+
7: (85) call bpf_probe_read_user#112
33+
invalid indirect access to stack R1 off=-8 size=16
34+
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
35+
ERROR: Loading BPF object(s) failed.
36+
```
37+
38+
This log represents a particular trace through the BPF program, that
39+
led to an invalid state (as judged by the BPF verifier). It contains a
40+
lot of information about the interpreted state of the program on each
41+
instruction. The app parses the log and re-constructs program states
42+
in order to display potentially useful information in interactive way.
43+
44+
There are two main views of the program:
45+
* (on the left) formatted log, instruction stream
46+
* (on the right) program state: known values of registers and stack slots
47+
48+
![2025-03-20_14-35](https://github.com/user-attachments/assets/2dfebc04-2f96-402d-b6c5-25abeafd0fc2)
49+
50+
51+
## What's in the log
52+
53+
Notice that the displayed text has different content than the raw log.
54+
For example, consider this line:
55+
```
56+
1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1
57+
```
58+
In the log view you will only see:
59+
```
60+
*(u64 *)(r10 -24) = r2
61+
```
62+
And program counter (pc) in a spearate column on the left.
63+
64+
This is intentional, as the comment on the right in the original line
65+
is the state of values as reported by BPF verifier. Since it is
66+
captured and displayed in a separate view, printing it in the log view
67+
is redundant.
68+
69+
Some instructions also are printed differently to facilitate
70+
interactive features. Notable example is call instructions.
71+
72+
For example, consider the following raw log line:
73+
```
74+
23: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=3,map=eventmap,ks=4,vs=2452)
75+
```
76+
77+
It is displayed like this:
78+
```
79+
r0 = call bpf_map_lookup_elem#1(r1, r2, r3, r4, r5)
80+
```
81+
82+
Notice also that the lines not recognized by the parser are greyed
83+
out. If you notice an unrecognized instruction, please submit a bug
84+
report.
85+
86+
## What can you do?
87+
88+
### Step through the instruction stream
89+
90+
The most basic feature of the visualizer is "stepping" through the
91+
log, similar to what you'd do in a debugger.
92+
93+
You can select a line by clicking on it, or by navigating with arrows
94+
(you can also use pgup, pgdown, home and end). The selected line has
95+
light-blue background.
96+
97+
When a line is selected, current state of known values is displayed in
98+
the panel on the right. By moving the selected line up/down the log,
99+
you can see how the values change with each instruction.
100+
101+
In the "state panel", the values that are written by selected
102+
instruction are marked with light-red background and the previous
103+
value is also often displayed, for example:
104+
```
105+
r6 0 <= scalar(id=1)
106+
```
107+
Means that current instruction changes the value of `r6` from
108+
`scalar(id=1)` to `0`.
109+
110+
The values that are read by current instruction have light-green
111+
background.
112+
113+
Note that for "update" instructions (such as `r1 += 8`), the slot
114+
will be marked as written.
115+
116+
### View data dependencies
117+
118+
The app computes a use-def analysis [^2] and you can interactively
119+
view dependencies between the instructions.
120+
121+
The concept is simple. Every instruction may read some slots
122+
(registers, stack, memory) and write to others. Knowing these sets
123+
(verifier log contains enough information to compute them), it is
124+
possible to determine for a slot used by current instruction, where
125+
its value came from (from what slot in what instruction).
126+
127+
You can view the results of this analysis by clicking on some
128+
instruction operands (registers and stack slots).
129+
130+
The selected slot is identified by a box. This selection changes the
131+
log view, greying out "irrelevant" instructions, and leaving only
132+
data-dependent instructions in the foreground.
133+
134+
<img src="https://github.com/user-attachments/assets/928e16b4-e75d-49c6-ac5a-b23841d053e1" width="640">
135+
136+
#### What's clickable?
137+
138+
Registers r0-r9 and stack accesses such as `*(u32 *)(r10 -8)`.
139+
140+
r10 (stack frame pointer) is not clickable because it's effectively a
141+
constant [^3].
142+
143+
#### How deep is the displayed dependency chain?
144+
145+
It depends, but usually not deep.
146+
147+
The problem with showing all dependencies is that it's too much
148+
information, which renders it useless.
149+
150+
Currently the upstream instruction is highlighted if it's an
151+
unambiguous dependency. For example:
152+
```
153+
42: r1 = 13
154+
43: r7 = 0
155+
44: r2 = r1
156+
```
157+
158+
Instruction 42 is an unambiguous dependency of instruction 44, because
159+
r1 is the only read slot, and there were no modifications to it along
160+
the way.
161+
162+
All such direct dependencies up the chain are shown.
163+
164+
However, when more than one value is read in the upstream instruction,
165+
the UI will stop highlighting at that instruction.
166+
167+
Consider an example:
168+
```
169+
42: r1 = r2
170+
43: r3 = *(u32 *)(r10 -16)
171+
44: r1 += r3
172+
45: *(u32 *)(r10 -64) = r1
173+
```
174+
175+
If you select `r1` at instruction 45, only instruction 44 will be
176+
highlighted, even though 42 and 43 are its transitive dependencies
177+
(`r1 += r3` reads both `r1` and `r3`).
178+
179+
The reason for this UI behavior is that showing all dependencies (both
180+
r1 and r3 and in turn all their dependencies) may very quickly cover
181+
most of the instructions. This is especially true for call
182+
instructions, which read up to 5 registers.
183+
184+
On the other hand the app can't know what the user is looking for, and
185+
there is no point in guessing. So, for an instruction like `r1 += r3`,
186+
the user must choose specific operand (r1 or r3 in this case) to
187+
expand the dependency chain further.
188+
189+
#### Note on memory stores and loads
190+
191+
Currently non-stack memory access is a "black hole" from the point of
192+
view of use-def analysis in this app. The reason is that it's
193+
impossible to be sure what is the value of a memory slot between
194+
stores and loads from it, because it may have been written outside of
195+
BPF program, and because it's not always simple to identify a specific
196+
memory slot.
197+
198+
So, when you see a store/load instruction to/from something like
199+
`*(u32 *)(r8 +0)` you can only click on r8 to check it's
200+
dependencies. If you see `*(u32 *)(r8 +0)` down the instruction
201+
stream, even if value of r8 hasn't changed, the analysis does not
202+
recognize these slots as "the same".
203+
204+
205+
## Footnotes
206+
207+
[^1]: `BPF_LOG_LEVEL2` can be parsed, and the app can handle big input
208+
(100+ Mb). However since level 2 log contains all states of the BPF
209+
program explored by the verifier, and the app does not distinguish
210+
between them (yet), the accumulated state at a particular log line is
211+
likely to be wrong.
212+
213+
[^2]: https://en.wikipedia.org/wiki/Use-define_chain
214+
215+
[^3]: https://docs.cilium.io/en/latest/reference-guides/bpf/architecture/

0 commit comments

Comments
 (0)