-
Notifications
You must be signed in to change notification settings - Fork 31
Description
I think that closures should not be allowed after ')' terms.
To start, I think there's a bug in how the deepsmiles implementation handles this case. Consider COPNBCl))=3.
I expected the )) to transform COPNBCl)) to COPN with a branch on the N, then followed by the =3 to give the final SMILES CO1PN=1(BCl).
But the code does the following:
>>> decode = deepsmiles.Converter(True, True).decode
>>> decode("COPNBCl))=3")
'COPN1BCl=1'
>>> decode("COPN=3")
'CO1PN=1'
The =3 acts on the Cl, linking it to N, even though the )) "should" have popped those two atoms.
This can be a problem because the closures allow multiple branch-pops in a sequence. My code expected that each atom would have at most a single close ')' after it. As a result, I get
>>> cdeepsmiles.decode("COPNB)=3)N")
'CO1P(N=1(B)N\x00'
when I think I "should: have gotten:
'CO1P(N=1(B))N'
Closures after branch-pops are unneeded, and SMILES doesn't support them.
Rather than trying to handle this case correctly (and the fact that two different code bases got it wrong suggests that it isn't easy to get right), how about making it not be allowed?