[LiveRangeEdit][AIEX] Fix rematerialization crash#877

Open
andcarminati wants to merge 3 commits into aie-public from andreu.fix.remattable

Conversation

@andcarminati
Collaborator

Improve rematerialization validation for register class compatibility
During register allocation, instructions may be rewritten in ways that change
the register class of their operands. For example, the SplitEditor may transfer
a def to a split product, and target-specific passes (like super-register
rewriters) may later modify the instruction, changing register classes.

This patch adds validation to scanRemattable() to prevent rematerialization
when the defining instruction's register class requirements are incompatible
with the target register:

  1. Finds the correct def operand by tracing through VirtRegMap to identify
    which operand corresponds to the original register being rematerialized

  2. Validates register class compatibility: checks that the instruction's
    required register class for the def operand is compatible with the
    original register's class

  3. Handles subreg defs correctly: when a def operand has a subreg index
    (indicating a partial register definition), register class validation
    is skipped since the partial def's class may legitimately differ from
    the full register's class

  4. Refactors validation logic into isRematerializableDefInstr() helper
    function for better code organization

This prevents machine verifier failures and incorrect code generation on
targets with complex register hierarchies like AIE, while maintaining
correct behavior for subreg rematerialization on all targets.
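
The four steps above can be sketched with a small self-contained toy model. This is only an illustration under stated assumptions, not the code from the patch: the body of isRematerializableDefInstr() is not shown in this conversation, all `Toy*` types and `isRematerializableDefSketch` are hypothetical names, and "compatible" is assumed here to mean that the two classes share at least one register.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Toy stand-ins for LLVM concepts. None of these types are the real LLVM API.
struct ToyRegClass {
  std::set<unsigned> PhysRegs; // physical registers contained in the class
};

struct ToyOperand {
  unsigned VirtReg;   // virtual register number
  unsigned SubRegIdx; // 0 means "no subregister index"
  bool IsDef;
};

struct ToyInstr {
  std::vector<ToyOperand> Operands;
  // Class the instruction requires for each operand (parallel to Operands);
  // nullptr means "unconstrained".
  std::vector<const ToyRegClass *> RequiredClass;
};

// Maps a split product back to the register it was split from.
using ToyOriginalMap = std::map<unsigned, unsigned>;

// Registers that were never split are their own original.
inline unsigned getOriginal(const ToyOriginalMap &M, unsigned VirtReg) {
  auto It = M.find(VirtReg);
  return It == M.end() ? VirtReg : It->second;
}

// Assumed notion of "compatible": the classes share at least one register.
inline bool haveCommonRegister(const ToyRegClass &A, const ToyRegClass &B) {
  for (unsigned R : A.PhysRegs)
    if (B.PhysRegs.count(R))
      return true;
  return false;
}

inline bool isRematerializableDefSketch(const ToyInstr &DefMI,
                                        unsigned Original,
                                        const ToyRegClass &OrigRC,
                                        const ToyOriginalMap &M) {
  assert(DefMI.Operands.size() == DefMI.RequiredClass.size());
  for (std::size_t I = 0; I < DefMI.Operands.size(); ++I) {
    const ToyOperand &MO = DefMI.Operands[I];
    // Step 1: find the def operand that maps back to Original.
    if (!MO.IsDef || getOriginal(M, MO.VirtReg) != Original)
      continue;
    // Step 3: a subreg def is a partial def, so the class check is skipped.
    if (MO.SubRegIdx != 0)
      return true;
    // Step 2: a full def must be class-compatible with the original register.
    const ToyRegClass *Required = DefMI.RequiredClass[I];
    return Required == nullptr || haveCommonRegister(*Required, OrigRC);
  }
  return false; // no def operand corresponds to Original
}
```

In this model, a full def whose required class is disjoint from the original register's class is rejected, while the same def with a nonzero subreg index is accepted without a class check.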

Comment thread llvm/test/CodeGen/AIE/aie2ps/ra/liverangeedit-avoid-cross-class-remat.ll Outdated
Comment thread llvm/lib/Target/AIE/AIESuperRegUtils.cpp
Comment thread llvm/lib/CodeGen/LiveRangeEdit.cpp Outdated
Comment on lines +120 to +122
// However, if the def operand has a subreg index, this indicates a
// partial def of the original register, which is always valid for
// rematerialization regardless of register class constraints.
Collaborator

Why is that? Wouldn't this fail if we have split the original register definition in two and later try to rematerialize the entire original register? This check would conclude that it is fine to rematerialize with DefMI because it writes to a subregister of OriginalDef, but that would be incorrect, because the other part of the register would never be rematerialized. Am I missing something?

Collaborator Author

AFAIK, the answer is no, and here's why. The function scanRemattable() doesn't decide whether to rematerialize an entire register. Instead, it examines each individual value definition (represented by VNInfo objects) and asks: "Can this specific value be recreated by re-executing its defining instruction?"

When we encounter a defining instruction with a subreg index, like %reg.sub_20bit = MOV ..., this instruction created a specific value for that subregister at that point in the program. We're not asking "can we use this to rematerialize the full register?" We're asking "can we use this to rematerialize the value it originally created?" And the answer is yes, because that's exactly what it did originally.

If the register allocator later needs the full register value somewhere, it won't try to use just this one subreg def to recreate it. The register allocator tracks values at a finer granularity. If a full register value is needed, either that full value was defined somewhere (and has its own VNInfo that can be checked for rematerialization), or the register allocator will need to keep the value live rather than rematerializing it.

The subreg check is simply recognizing that when an instruction defines a subregister, the register class of that subregister definition naturally differs from the full register's class, and that's perfectly fine. We're not bypassing safety checks; we're correctly handling the fact that partial definitions have different register class requirements than full definitions.

Think of it this way: if an instruction originally wrote to the low 16 bits of a 32-bit register, we can safely re-execute that same instruction to recreate those same low 16 bits. We're not claiming we can use it to recreate the full 32 bits.

Also, in our fine-grained staged RA, when we rewrite an unallocated register, we use a full composite register class instead of subregisters.

Am I missing something?

Collaborator

@mludevid mludevid Mar 26, 2026

It's just that bypassing the register class check if the original register definition had a subregister index feels unsafe to me.
E.g. imagine the original code had something like this:

%0.high_16 = G_CONSTANT ...
...
%1 = COPY %0.high_low_8

if this is later rewritten to something like:

%2 = G_CONSTANT ...
...
%1 = COPY %2.low_8

Even though the definition of the original register (%0.high_16) has a subregister index it is not safe to rematerialize %1 with the G_CONSTANT. I think this is related to this discussion: https://discourse.llvm.org/t/rematerialization-bug-in-greedy-regalloc/89128.

Maybe this problem cannot occur on AIE, but we are touching generic code here, so we had better be sure this works in the general case.
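
One possible reading of the example above can be checked with plain integers. This is a sketch under assumptions: `low_8` is taken to select the low 8 bits of the value produced by G_CONSTANT, the constant 0xABCD is made up, and all function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical 16-bit constant standing in for the G_CONSTANT operand.
constexpr std::uint16_t materializeConstant() { return 0xABCD; }

// Models `COPY %2.low_8`: reads only the low 8 bits of the 16-bit value.
constexpr std::uint8_t copyLow8(std::uint16_t V) {
  return static_cast<std::uint8_t>(V & 0xFF);
}

// What %1 holds in the rewritten program.
constexpr std::uint16_t expectedValue() {
  return copyLow8(materializeConstant());
}

// What %1 would hold if the COPY were replaced by re-executing G_CONSTANT.
constexpr std::uint16_t naiveRematValue() { return materializeConstant(); }

// Re-executing the constant does not reproduce the value that the COPY reads.
static_assert(expectedValue() != naiveRematValue(),
              "naive rematerialization yields a different value");
```

Under this reading, the subreg index on the original def says nothing about whether the rewritten def still produces the value that the use expects.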

Collaborator Author

As for the other targets, the behavior is unchanged for the cases where the def operand has a subregister index; see where this function is used. We are only enforcing an additional constraint for the case where the MO has no subregister index. Also note that we are not bypassing any other existing check here.

Collaborator

I agree we don't seem to be breaking anything existing, but in your comment you claim that

rematerializing is always valid if the def operand has a subreg index

My question is regarding the comment, not the implementation. I don't think it is correct. I think it's fine if we focus on fixing our specific problem, but if we leave gaps we should document them properly.

Collaborator Author

@andcarminati andcarminati Mar 26, 2026

Ohhh, now I see your point. I changed the wording to be clearer. Please take a look!

…er-reg rewrite"

We are not reverting the original commit because it would remove one important lit test.
@andcarminati andcarminati force-pushed the andreu.fix.remattable branch from 0852b47 to eed6e03 Compare March 26, 2026 13:51
…ss compatibility

@andcarminati andcarminati force-pushed the andreu.fix.remattable branch from eed6e03 to 048fa16 Compare March 26, 2026 15:15
mludevid
mludevid previously approved these changes Mar 27, 2026
Collaborator

@mludevid mludevid left a comment

LGTM.
One last thing: after commit 2 we have failing tests. Can you mark them as XFAIL there to make it clear that that commit breaks them, and then re-enable them in commit 3 to make it clear that they are fixed again there? That also makes it easier to find changes in the future when traversing the git history of these files.

@niwinanto
Collaborator

I could not reproduce the crash on the second commit. Is this known? Is some other change already hiding the issue?

@mludevid
Collaborator

That is very surprising to me! I'd have expected llvm/test/CodeGen/AIE/aie2p/ra/staged-ra-stale-remat.mir to fail after reverting the changes from #845. @andcarminati, can you look into this?

// the instruction's def class.
//
// However, if the def operand has a subreg index, this indicates a
// partial def of the original register, which may be valid for
Collaborator

This sounds like a gamble. Perhaps we can strengthen the check by verifying that all subreg indices connect to the same class.

return true;
}

void LiveRangeEdit::scanRemattable() {
Collaborator

Is this a table, or is this abbreviating 'rematerializable'? In both cases, the spelling is horrible.

@niwinanto
Collaborator

I am playing with https://github.com/Xilinx/llvm-aie/blob/aie-public/llvm/test/CodeGen/AIE/aie2p/ra/staged-ra-stale-remat.mir without the last commit to understand the problem, and I am able to reproduce the crash. I feel the problem is not just the rematerialization of the composite register copy, but that OrigVNI->def points to the wrong value number when we rematerialize.

bb0:
  undef %36.sub_dim_size:eds = MOV_PD_imm11_pseudo 0
  %36.sub_dim_stride:eds = MOV_PD_imm11_pseudo 128
  undef %35.sub_dim_size:eds = COPY %36.sub_dim_size:eds {
    internal %35.sub_dim_stride:eds = COPY %36.sub_dim_stride:eds
  }
bb2:
  %35.sub_mod:eds = COPY %35.sub_dim_size:eds
  %117:spill_edj_to_er = COPY %35.sub_dim_size:eds

After the unallocated virtual register rewrite:

bb0:
  %125:edn_as_32bit = MOV_PD_imm11_pseudo 0
  %35.sub_dim_size:eds = COPY %125:edn_as_32bit
  %126:magusrc_and_magudst_and_spill_edj_to_er = MOV_PD_imm11_pseudo 128
  %35.sub_dim_stride:eds = COPY %126:magusrc_and_magudst_and_spill_edj_to_er
bb2:
  %35.sub_mod:eds = COPY %35.sub_dim_size:eds
  %117:spill_edj_to_er = COPY %35.sub_dim_size:eds

Greedy then rematerializes to:

bb0:
  ...
bb2:
  %142:eds = MOV_PD_imm11_pseudo 128
  %142.sub_mod:eds = COPY %142.sub_dim_size:eds
  %117:spill_edj_to_er = COPY %142.sub_dim_size:eds

In this test, there is no composite register involved; these are subregister copies from the beginning.
Look at %35.
I added some debug prints and got:

-------Start line-------
SplitEditor: Attempt to rematerialize before MI
  %35.sub_mod:eds = COPY %35.sub_dim_size:eds
Orig Inst(RM.OrigMI = LIS.getInstructionFromIndex(OrigVNI->def);)
  %126:magusrc_and_magudst_and_spill_edj_to_er = MOV_PD_imm11_pseudo 128
Rematerialized new Inst
  %142 = UNKNOWN 128
-------End line-------

OrigVNI->def points to the wrong value number: it should have been MOV_PD_imm11_pseudo 0, not MOV_PD_imm11_pseudo 128. Could this be a live interval calculation mismatch in the unallocated virtual register rewrite? I can see that the live interval corresponding to the rewritten register is removed, but it is not really recalculated for the replacement register.

@niwinanto
Collaborator

I have more findings regarding the problem.

I could spot the bad remat in https://github.com/Xilinx/llvm-aie/blob/aie-public/llvm/test/CodeGen/AIE/aie2p/ra/staged-ra-stale-remat.mir, which comes from SplitEditor invoked by doRegionSplit in greedy. I added more debug prints around the region where the illegal remat happens.

VNInfo *SplitEditor::defFromParent(unsigned RegIdx, const VNInfo *ParentVNI,
                                   SlotIndex UseIdx, MachineBasicBlock &MBB,
                                   MachineBasicBlock::iterator I) {
  SlotIndex Def;
  LiveInterval *LI = &LIS.getInterval(Edit->get(RegIdx));

  // We may be trying to avoid interference that ends at a deleted instruction,
  // so always begin RegIdx 0 early and all others late.
  bool Late = RegIdx != 0;

  // Attempt cheap-as-a-copy rematerialization.
  Register Original = VRM.getOriginal(Edit->get(RegIdx));
  LiveInterval &OrigLI = LIS.getInterval(Original);
  VNInfo *OrigVNI = OrigLI.getVNInfoAt(UseIdx);
  Register Reg = LI->reg();
  bool DidRemat = false;
  if (OrigVNI) {
    LiveRangeEdit::Remat RM(ParentVNI);
    RM.OrigMI = LIS.getInstructionFromIndex(OrigVNI->def);
    if (Edit->canRematerializeAt(RM, OrigVNI, UseIdx, true)) {

      dbgs() << "\nSplitEditor-------Start line-------\n";
      dbgs() << "Attempt to materialize before MI\n";
      LIS.getInstructionFromIndex(UseIdx)->dump();
      dbgs() << "materialize MI\n";
      LIS.getInstructionFromIndex(OrigVNI->def)->dump();
      dbgs() << LIS.getInterval(Original) << " " << UseIdx << "\n";
      Def = Edit->rematerializeAt(MBB, I, Reg, RM, TRI, Late);
      ++NumRemats;
      DidRemat = true;
    }
  }

This dumps:

SplitEditor-------Start line-------
Attempt to materialize before MI
  %35.sub_mod:eds = COPY %35.sub_dim_size:eds
materialize MI
  %126:magusrc_and_magudst_and_spill_edj_to_er = MOV_PD_imm11_pseudo 128
%0 [16r,48r:0)[48r,688r:1)[688r,940B:2)[940B,1760r:8)[1760r,1776r:3)[1776r,1792r:6)[1792r,1808r:4)[1808r,2016r:7)[2016r,2480r:5) 0@16r 1@48r 2@688r 3@1760r 4@1792r 5@2016r 6@1776r 7@1808r 8@940B-phi  L0000000000000400 [16r,2480r:0) 0@16r  L0000000000000800 [48r,2432B:0) 0@48r  L0000000000200000 [688r,2432B:0) 0@688r  L0001000000000000 [1760r,2016r:0) 0@1760r  L0000800000000000 [1792r,2016r:0) 0@1792r  L0000000000000200 [1776r,2016r:1)[2016r,2368r:0) 0@2016r 1@1776r  L0000400000000000 [1808r,2016r:1)[2016r,2016d:0) 0@2016r 1@1808r  weight:5.725086e-01 688B

The important info is the output of dbgs() << LIS.getInterval(Original), which is %0 [16r,48r:0)[48..... But if I look at the MIR, %0 has already been removed (precisely, in the first greedy run; we are now at the 4th run of greedy, after the unallocated register rewriter). Register Original = VRM.getOriginal(Edit->get(RegIdx)); consults the VRM, which holds the Virt2SplitMap, and this map has not been updated to reflect the removal of the register and its LI from LIS. Hence, we get the wrong live interval, and the VNIs linked to it are not valid.

Also, our solution of bailing out of the remat is not enough. For example, OrigLI, which we get from Original, is used in the same function for purposes other than remat. Basically, all users of VRM.getOriginal could end up in bad situations.
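
The stale-mapping failure mode described above can be modeled in a few lines. This is a toy sketch, not LLVM code and not a fix proposed in this PR: `ToyVirtRegMap`, `ToyLiveIntervals`, and `getLiveOriginal` are hypothetical names, and the guard only illustrates one way a caller could detect that the recorded original no longer has a live interval.

```cpp
#include <map>
#include <set>

// Toy stand-in for the VirtRegMap's split-product-to-original mapping.
struct ToyVirtRegMap {
  std::map<unsigned, unsigned> Virt2Split;

  // Registers without an entry are their own original.
  unsigned getOriginal(unsigned Reg) const {
    auto It = Virt2Split.find(Reg);
    return It == Virt2Split.end() ? Reg : It->second;
  }
};

// Toy stand-in for LiveIntervals: only tracks which registers have an interval.
struct ToyLiveIntervals {
  std::set<unsigned> RegsWithInterval;

  bool hasInterval(unsigned Reg) const {
    return RegsWithInterval.count(Reg) != 0;
  }
  // Removing an interval does not touch the VirtRegMap, which is how the
  // mapping goes stale in this model.
  void removeInterval(unsigned Reg) { RegsWithInterval.erase(Reg); }
};

// Hypothetical guard: succeeds only if the recorded original still has a
// live interval. On failure, Original is left untouched.
inline bool getLiveOriginal(const ToyVirtRegMap &VRM,
                            const ToyLiveIntervals &LIS, unsigned Reg,
                            unsigned &Original) {
  unsigned Candidate = VRM.getOriginal(Reg);
  if (!LIS.hasInterval(Candidate))
    return false; // stale mapping: the original was removed from LIS
  Original = Candidate;
  return true;
}
```

In this model, once the original's interval is removed, `getOriginal` keeps returning the old register, so every caller of `getOriginal` needs the guard, not only the rematerialization path.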
