Skip to content

Interpreter work#1765

Draft
Bike wants to merge 30 commits intomainfrom
interpreter-work
Draft

Interpreter work#1765
Bike wants to merge 30 commits intomainfrom
interpreter-work

Conversation

@Bike
Copy link
Copy Markdown
Member

@Bike Bike commented Apr 22, 2026

  • bytecode_vm now maintains its own stack of dynenvs instead of calling itself recursively
  • bytecode_vm uses computed goto rather than a big switch
  • fixes up trampolines (i.e. native stubs for bytecode functions, so they show up in DWARF metadata for profiling etc.) to use copied object code rather than recompiling llvm-ir anew for each trampoline

very much a draft - i'm just filing it as a PR to review it

Christian Schafmeisterr added 24 commits April 18, 2026 08:41
A static variable was declared in potential_number_p code that
we inherited from ECL.  In ECL they fixed it nine years ago.
Fixes snapshot load race condition
They copy the CIE and FDE from the llvm compilation output
and interleave it with the trampolines.

Added a message about building this on macOS.
On MacOS it should still build but the trampolines are not available.
When CLASP_FLAME_PROFILE=1 is set
when the user sends SIGUSR2 to the process a ten
second profile is generated and a flamegraph is written
to either /tmp/clasp-PID.svg or to a file specified as the
artument to CLASP_FLAME_PROFILE
This is so cando can add command line arguments
Copy link
Copy Markdown
Member Author

@Bike Bike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still has plenty of hardcoding, just more subtly

// matching exit opcodes pop; an outer try/catch(Unwind&) in bytecode_vm walks
// the stack to run cleanups / resume at a saved pc on non-local exits.
//
// Currently only the type and the stack exist — no opcodes are migrated yet.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire parallel structure of dynenvs is pointless duplication. Even if we do want to keep the parallel stack of dynenvs, which I don't think we do, the kind, frame, slots, and all are apparent from the existing dynenv classes (TagbodyDynEnv_O etc.). A separate enum for what are types, padding bytes (??), slots that are punned, this is all really unacceptable. Only the bytecode specific bits like the sp mark and target pc should really need recording anyway, and we could probably just add them to the dynenv classes or make new ones.

Adding to dynenv classes might also remove the need for VMFrameDynEnv which would be good.

unsigned char* _pc;

// Dynamic-environment record stack. Root-allocated so GC scans the
// T_O*/T_O** slots conservatively. _dynRecordTop points one past the last
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the slots in the parallel dynenv structures already exist in the dynenv classes, which are already reachable from the control stack or from the thread local state, so GC should not be a concern here

// If true, this ObjectFile is transient arena-init scaffolding (shared
// trampoline / stub template) that must not be serialized into snapshots.
// The ObjectFile is still registered in _AllObjectFiles normally — LLVM's
// link layer plugin looks it up by name during materialization, so it must
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our link layer plugin looks it up by name during materialization, not LLVM. And it does so for reasons trampolines don't need - so that we can look up DWARF from an instruction pointer. Since trampolines are for bytecode functions, we can use the bytecode debug info mechanisms instead of using DWARF at all. So I don't think trampolines need to go in _AllObjectFiles.

EDIT: Okay, so actually these changes use the trampoline to get closure etc, for some reason, so we do need DWARF. However to figure out if a PC is in an arena we can use arena_lookup_by_pc so we're still not going through object files.

#pragma once

/*
File: trampoline.h
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filename is incorrect

Comment thread include/clasp/core/lisp.h
std::atomic<T_sp> _AllObjectFiles;
std::atomic<T_sp> _AllCodeBlocks;
std::atomic<T_sp> _AllBytecodeModules;
// Every GFBytecodeSimpleFun ever made (atomic-pushed list of cons cells).
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"list of cons cells" indicates that this is a list whose elements are conses, but actually they are GFBytecodeSimpleFuns as suggested by the name.

// exactly one uwtable function (the template), with the _end marker
// attributed as no-uwtable — so LLVM emits one CIE + one FDE. Any other
// count (zero, or two+ of either) is a sign that something upstream
// changed and silent copy-of-wrong-bytes is about to happen: we fail loud.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"a sign that something upstream changed". This is an admission that this code relies on undocumented behavior of libraries

// need no relocations, so what's in the unlinked ObjectFile is already
// correct. The only field that requires a relocation is the FDE's PC
// begin; its pre-relocation value is typically zero, and the caller
// patches it for the slot layout.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we ARE patching the bytes llvm gives us, or what?

}
if (pos >= n) return false;
uint8_t fde_enc = cie[pos];
// DW_EH_PE_pcrel = 0x10, DW_EH_PE_sdata4 = 0x0B. Combined = 0x1B.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it looks like what we're doing is having LLVM generate the DWARF, but then failing if it generates anything other than a specific sequence of bytes?

if (fde_enc != kExpectedFdeEncoding) {
fprintf(stderr,
"[trampoline-arena] CIE FDE encoding = 0x%02x (expected 0x%02x = "
"pcrel|sdata4). LLVM may have switched .eh_frame defaults — the "
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another reference to libraries (LLVM) changing the undocumented behavior we're relying on

}

// Patch the FDE's PC begin field for the slot layout
// [code | CIE | FDE | terminator]. The CIE uses DW_EH_PE_pcrel|sdata4
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, we're relying on specific bytes again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants