Adapt to latest EQM to remove complex couplings by zhanglw0521 · Pull Request #300 · ACEsuit/ACEpotentials.jl

zhanglw0521 · 2025-09-26T17:24:44Z

This is to fix #299, by using the latest version of EQM and RepLie, where the complex couplings are avoided.

One thing worth noting is that the fix seems to require Julia1.11.

zhanglw0521 · 2025-09-26T17:31:24Z

Even if the tests pass, it remains to be checked whether we still have inexact error constructing ace1model with high correlation order and polynomial degrees, i.e.

julia> using ACEpotentials
julia> md = ace1_model(elements = [:Si],
                               order = 4,
                               totaldegree = 16,
                               rcut = 6.0)

I will paste the output below.

cortner · 2025-09-26T18:04:14Z

I'm curious why the tests are failing.

cortner · 2025-09-26T18:05:40Z

EquivariantModels: git object 0e3f3b6c2b858aed4d6442b54f0bae674e6bb432 could not be found

I forgot to push EM, will now rerun tests.

zhanglw0521 · 2025-09-26T18:16:21Z

The above command returns a model that has A2Bmap

julia> md.model.tensor.A2Bmap
1445×8638 SparseArrays.SparseMatrixCSC{Float64, Int64} with 9734 stored entries:
⎡⠙⠲⠤⢤⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎤
⎢⠀⠀⠀⠀⠈⠉⠙⠒⠶⢤⣄⡦⣴⠤⣄⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠉⠉⠉⠙⠒⠒⠒⠒⠲⠦⢤⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠉⠓⠚⠒⠓⠲⠤⠤⠤⣄⣀⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⎥
⎢⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠉⠛⠋⠓⠒⠲⠤⢤⣤⣤⣀⣀⠠⣤⣄⣀⠠⣤⣀⡀⠀⠀⠀⠀⎥
⎣⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠉⠉⠀⠀⠉⠉⠀⠈⠉⠉⠛⠒⠒⎦

zhanglw0521 · 2025-09-26T18:21:49Z

One of the tests fails because the rmse slightly goes beyond the preset tolerance...

cortner · 2025-09-26T18:29:54Z

I wonder what changed.

cortner · 2025-09-26T18:31:16Z

But on 1.10 there are several more accuracy problems. very strange.

cortner · 2025-09-26T18:32:44Z

Is there a chance that the coupling coefficients are no longer orthogonal?

zhanglw0521 · 2025-09-26T18:40:54Z

It is indeed very strange. I think the orthogonality still hold, but the normality may not. I will go back to RepLie and try both 1.10 and 1.11 there to see if they consist.

zhanglw0521 · 2025-09-29T17:02:41Z

I am so very confused...

I tested four cases: ACEpot in the main branch, with Julia 1.10 and 1.11, ACEpot in this PR, with Julia 1.10 and 1.11.

For the same version of ACEpot, 1.10 and 1.11 return exactly the same couplings, meaning that calling old EM or new EM for coupling coeffs is not affected by Julia version.

I also tested the coupling coefficients generated by the two versions of ACEpot (the A2Bmap's) - they are full rank and they span the same space. Both of them are orthogonal but not orthonormal, with comparable condition numbers.

The weirdest part lies in the evaluation of the energy, ie, the function energy_forces_virial performs differently in 1.10 and 1.11 even when keeping the same at (an eight-atom system with 5 Si and 3 O, fixed bulk), ps, st. This can be checked by running the following script:

using ACEbase 
using ACEpotentials
M = ACEpotentials.Models

using AtomsBase, AtomsBuilder
# using EmpiricalPotentials
AB = AtomsBuilder

import AtomsCalculators

using Random, LuxCore, StaticArrays, LinearAlgebra, JLD2, Unitful
rng = Random.MersenneTwister(1234)

elements = (:Si, :O)
level = M.TotalDegree()
max_level = 15
order = 3
E0s = Dict( :Si => -158.54496821u"eV", 
            :O => -2042.0330099956639u"eV")
NZ = length(elements)

function make_rin0cut(zi, zj) 
   r0 = ACEpotentials.DefaultHypers.bond_len(zi, zj)
   return (rin = 0.0, r0 = r0, rcut = 6.5)
end

rin0cuts = SMatrix{NZ, NZ}([make_rin0cut(zi, zj) for zi in elements, zj in elements])


model = M.ace_model(; elements = elements, order = order, Ytype = :solid, 
                      level = level, max_level = max_level, maxl = 8, 
                      pair_maxn = 15, 
                      init_WB = :glorot_normal, init_Wpair = :glorot_normal,
                      E0s = E0s,
                      rin0cuts = rin0cuts
                      )

ps, st = LuxCore.setup(rng, model)

# read saved ps
WB = load("psWB.jld2")["WB"]
ps.WB[:] .= WB[:]
Wpair = load("psWpair.jld2")["Wpair"]
ps.Wpair[:] .= Wpair[:]

calc = M.ACEPotential(model, ps, st)   

# compare the A2Bmap - test RepLie for different Julia versions
A2B110 = load("A2Bmap_110.jld2")["A2B"]
A2B111 = load("A2Bmap_111.jld2")["A2B"]
calc.model.tensor.A2Bmap == A2B110 == A2B111

# splinification
lin_calc = M.splinify(calc, ps)
ps_lin, st_lin = LuxCore.setup(rng, lin_calc)
ps_lin.WB[:] .= ps.WB[:] 
ps_lin.Wpair[:] .= ps.Wpair[:]

at = AB.rattle!(AB.bulk(:Si, cubic=true), 0.0) # a fixed bulk

E = M.energy_forces_virial(at, calc, ps, st).energy
E_lin = M.energy_forces_virial(at, lin_calc, ps_lin, st_lin).energy
difference = abs(E - E_lin)

with the attached parameters/coefficients I saved.

Even for the main branch, the above difference between Julia versions remains. This difference caused the extra failing cases in this PR with Julia 1.10, and I think it is something beyond the changes I made.

Or have I missed anything else that can make the calc's different?

saved_ps.zip

cortner · 2025-09-30T16:23:43Z

I'm re-running all tests locally on different branches to see things for myself.

zhanglw0521 · 2025-09-30T16:27:38Z

Thanks. And just to share information: locally I've gotten the same results as the CI results here.

cortner · 2025-09-30T16:30:31Z

yes, I would expect that but I want to run the tests on different branches to see where things break down.

cortner · 2025-09-30T17:39:18Z

These are the updates when I go from main (where all tests pass) to the current branch.

  [73ee3e68] ↑ EquivariantModels v0.0.5 ⇒ v0.0.6
  [f07d36f2] ↑ RepLieGroups v0.0.3 ⇒ v0.1.1
    Updating `~/gits/ACEpotentials.jl/Manifest.toml`
  [023d0394] - DecoratedParticles v0.0.7
  [73ee3e68] ↑ EquivariantModels v0.0.5 ⇒ v0.0.6
  [0e77f7df] + LazilyInitializedFields v1.3.0
  [89398ba2] + LocalRegistry v0.5.7
  [793d2195] + PartialWaveFunctions v0.2.0
  [2792f1a3] + RegistryInstances v0.1.0
  [d1eb7eb1] + RegistryTools v2.3.0
  [f07d36f2] ↑ RepLieGroups v0.0.3 ⇒ v0.1.1

Same failure as in CI:

Test silicon: Test Failed at /Users/ortner/gits/ACEpotentials.jl/test/test_silicon.jl:41
  Expression: (rmse[config])[obs] <= (expected[config])[obs]
   Evaluated: 0.0003602567072731953 <= 0.00035

I'm not so worried about Julia versions right now but about different results within the same Julia version. Here, the only thing that can be different are the coupling coefficients.

zhanglw0521 · 2025-09-30T18:36:17Z

If we checked that the coupling coefficients from the two branches (main and this PR) have the same size, generate the same space, and keep the energy invariant, then can we say the different accuracy is just because of some tricky differences in fitting itself?

cortner · 2025-09-30T18:49:35Z

I can't quite decide whether to just accept this, tag a new version but then file an issue with a detailed summary and let the tests fail to make sure we come back to it?

zhanglw0521 · 2025-09-30T20:32:49Z

I really can't say for certain, but the very last test I ran, which was specified for this failed test, seems to give a reason to accept it.

I stored the coupling coeffs generated by the two branches (attached below), from which we can see that each of their corresponding column differs only by a scaling. I then compared the fitting errors. The results actually indicate that the overall errors they have are comparable, while new couplings, somehow, perform better for force and virial, but slightly worse for energy, and hence the failed example.

Error in main

err_blr["rmse"]
Dict{String, Any} with 5 entries:
  "isolated_atom" => Dict("V"=>0.0, "E"=>0.0, "F"=>0.0)
  "dia"           => Dict("V"=>0.030166, "E"=>0.00158392, "F"=>0.0294047)
  "liq"           => Dict("V"=>0.0344579, "E"=>0.000335032, "F"=>0.178791)
  "set"           => Dict("V"=>0.0652927, "E"=>0.00262894, "F"=>0.138453)
  "bt"            => Dict("V"=>0.0886698, "E"=>0.00348342, "F"=>0.0713924)

Error in this PR

err_blr["rmse"]
Dict{String, Any} with 5 entries:
  "isolated_atom" => Dict("V"=>0.0, "E"=>0.0, "F"=>0.0)
  "dia"           => Dict("V"=>0.0299322, "E"=>0.00158943, "F"=>0.0288154)
  "liq"           => Dict("V"=>0.0339324, "E"=>0.000360257, "F"=>0.178246)
  "set"           => Dict("V"=>0.0649321, "E"=>0.00259179, "F"=>0.138032)
  "bt"            => Dict("V"=>0.0882134, "E"=>0.00342114, "F"=>0.071385)

I think this is something acceptable, though I have no idea why it is the case (perhaps it is that the scaling happened to behave like another weighting or a preconditioner or so...).

A2Bmaps.zip

cortner · 2025-09-30T21:13:23Z

Thank you for the thorough test. Then let's merge and tag but keeping the failing test for future discussion.

cortner · 2025-09-30T21:14:30Z

Actually I might change the tolerance a bit but still open an issue

zhanglw0521 · 2025-09-30T21:20:13Z

Sounds good to me. Let's see if the new version resolves the complex couplings issue #299 and can hence enable larger basis.

cortner · 2025-10-01T19:44:09Z

this is now registered as 0.9.0. It feels like a bugfix but it is technically breaking.

Adapt to latest EQM to remove complex couplings

cd7ed2e

A minor change so that the AA2BB_map has the same size as before

171f4ba

adjust test tolerance

5777a9a

cortner merged commit d5c7816 into main Oct 1, 2025
2 of 3 checks passed

cortner deleted the FixComplexCoupling branch October 1, 2025 19:10

cortner mentioned this pull request Oct 1, 2025

Regression Tests - Error Variations #302

Open

Conversation

zhanglw0521 commented Sep 26, 2025

Uh oh!

zhanglw0521 commented Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cortner commented Sep 26, 2025

Uh oh!

cortner commented Sep 26, 2025

Uh oh!

zhanglw0521 commented Sep 26, 2025

Uh oh!

zhanglw0521 commented Sep 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cortner commented Sep 26, 2025

Uh oh!

cortner commented Sep 26, 2025

Uh oh!

cortner commented Sep 26, 2025

Uh oh!

zhanglw0521 commented Sep 26, 2025

Uh oh!

zhanglw0521 commented Sep 29, 2025

Uh oh!

cortner commented Sep 30, 2025

Uh oh!

zhanglw0521 commented Sep 30, 2025

Uh oh!

cortner commented Sep 30, 2025

Uh oh!

cortner commented Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zhanglw0521 commented Sep 30, 2025

Uh oh!

cortner commented Sep 30, 2025

Uh oh!

zhanglw0521 commented Sep 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cortner commented Sep 30, 2025

Uh oh!

cortner commented Sep 30, 2025

Uh oh!

zhanglw0521 commented Sep 30, 2025

Uh oh!

Uh oh!

cortner commented Oct 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

zhanglw0521 commented Sep 26, 2025 •

edited

Loading

zhanglw0521 commented Sep 26, 2025 •

edited

Loading

cortner commented Sep 30, 2025 •

edited

Loading

zhanglw0521 commented Sep 30, 2025 •

edited

Loading