AArch64: add acq/rel load ordering to CAS instructions #9076
liamwhite wants to merge 1 commit into NationalSecurityAgency:master
I'm not very familiar with the acquire release semantics, so I'm getting myself brought up to speed here. But from the manual (https://developer.arm.com/documentation/ddi0602/2025-12/Base-Instructions/CAS--CASA--CASAL--CASL--Compare-and-swap-word-or-doubleword-in-memory-) it looks like the acquire should only be generated if Rather than having both
Good call on the zr case - yes, if the result of the load is discarded (placed into zr), then the acquire op does not occur architecturally.
Nothing is ever written to Wt here, so the issue isn't the value being discarded. Wt represents the new value to be stored. Is no acquire generated when the zero register is the value potentially stored because the store never depends on the value of previous memory accesses, since the value is always 0? Or is it a hint to the processor that the value in memory is essentially being discarded?
Okay, I misread and you are correct. That does seem a bit odd. It doesn't really make sense to me that they would define it this way, because the useful property is that if the load result can't be used, then there are no observable side effects if the acquire is not performed. But the load result is Rs, and the acquire op is for Rt != 31, so...
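To make the Rs/Rt roles concrete, here is a minimal C11 sketch of a compare-and-swap with acquire ordering on the load and release ordering on a successful store. The wrapper name `cas_acq_rel` is hypothetical and exists only for illustration; `expected` plays the role of Rs (compare value in, loaded value out) and `desired` plays the role of Rt (the value that may be stored). A compiler targeting AArch64 with LSE may lower this to a CASAL-style instruction, but that is an assumption about code generation, not something this thread confirms.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrapper, for illustration only.
 * expected ~ Rs: compared against memory, and ends up holding the loaded value.
 * desired  ~ Rt: the new value, stored only if the comparison succeeds.
 */
static uint32_t cas_acq_rel(_Atomic uint32_t *mem, uint32_t expected,
                            uint32_t desired)
{
    /* Success ordering: acquire + release. Failure ordering: acquire only,
     * because no store happens on failure, so there is nothing to release. */
    atomic_compare_exchange_strong_explicit(mem, &expected, desired,
                                            memory_order_acq_rel,
                                            memory_order_acquire);
    /* On failure, expected was overwritten with the current memory value.
     * On success, it already equals the old memory value. */
    return expected;
}
```

Note that `desired` is never written by the operation, which matches the observation that nothing is written to Wt.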
I looked into it and it's actually just common decode for all LSE instructions. The acquire is architecturally relaxed when Rt=31 (and for the other LSE instructions, this generally makes sense because Rt is the load result).
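The decode rule as stated in the comment above can be sketched in C as follows. This is only a model of what the thread describes, not code from Ghidra or from the Arm pseudocode; the function name `lse_acquire_emitted` and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the common LSE decode described in this thread:
 * a requested acquire is relaxed (not performed) when Rt is register 31. */
static bool lse_acquire_emitted(bool acquire_requested, uint8_t rt)
{
    return acquire_requested && rt != 31;
}
```

Under this model, `lse_acquire_emitted(true, 31)` is false, while `lse_acquire_emitted(true, 0)` is true.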
For this function:
The load ordering pcode ops used by other LSE instructions aren't emitted.
For CAS, the acquire op always occurs if specified, and a specified release op only occurs on successful store. The output after applying this PR looks like this: