Skip to content

disallow closures after branches #8

@adalke

Description

@adalke

I think that closures should not be allowed after ')' terms.

To start, I think there's a bug in how the deepsmiles implementation handles this case. Consider COPNBCl))=3.

I expected the )) to transform COPNBCl)) to COPN with a branch on the N, then followed by the =3 to give the final SMILES CO1PN=1(BCl).

But the code does the following:

>>> decode = deepsmiles.Converter(True, True).decode
>>> decode("COPNBCl))=3")
'COPN1BCl=1'
>>> decode("COPN=3")
'CO1PN=1'

The =3 acts on the Cl, linking it to N, even though the )) "should" have popped those two atoms.

This can be a problem because the closures allow multiple branch-pops in a sequence. My code expected that each atom would have at most a single close ')' after it. As a result, I get

>>> cdeepsmiles.decode("COPNB)=3)N")
'CO1P(N=1(B)N\x00'

when I think I "should: have gotten:

'CO1P(N=1(B))N'

Closures after branch-pops are unneeded, and SMILES doesn't support them.

Rather than trying to handle this case correctly (and the fact that two different code bases got it wrong suggests that it isn't easy to get right), how about making it not be allowed?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions