GPU easy-cluster and easy-linclust write out ASCII headers but binary sequences to _rep_seq.fasta #1063

@jecorn

Description

Using MMseqs2 Version: 8cc5ce3 (release)
Running on an RTX 4090 GPU or CPU only.

When run on the CPU, easy-cluster and easy-linclust write a normal _rep_seq.fasta. Running the exact same command on the GPU writes the headers in ASCII but the sequences as binary data.

GPU command (same result with easy-cluster):
> mmseqs easy-linclust <input.fasta> <output> tmp --min-seq-id 0.9 --threads 72 --gpu 1

Result:

>tr|A0A8X7MI69|A0A8X7MI69_9BASI Uncharacterized protein (Fragment) OS=Tilletia controversa OX=13291 GN=A4X06_0g9490 PE=4 SV=1
����C�?�����ə?���

CPU command (same result with easy-cluster):
> mmseqs easy-linclust <input.fasta> <output> tmp --min-seq-id 0.9 --threads 72

Result:

>tr|A0A0X8HR55|A0A0X8HR55_9SACH HDL273Cp OS=Eremothecium sinecaudum OX=45286 GN=AW171_hschr42364 PE=4 SV=1
MSDDAGEIYLEKSVSDELFGRLNSNPENKICFDCGNKNPTWTSVPFGIMLCIQCSGEHRKLGVHITFVKSSNLDKWTLNNLRRFKVGGNHRARAFFLKNNGKQFLDYKTDKNVKYTSQVAKNYKAHLDRKAARDREQHPSEIVFSTEDEVESSDSGSSKNNSVDDFFSSWEKPAASPSNTKLLTPTSTSGSQKTGRSSILSAPSNRRRTPLASGNSSSGGRNHPILSSSRKPISRAGAKKVDADMFDQFEKEAQEERETAAIARSTNSISGEGFKPSQKPTYSAVQFHPTSSESSLNAKDYDVEENPYNDGIKFDQVRAGGVVPSVDDVQPKLAKLSFGMTKNDAKKLADDSKPAARAPTGPKYTGQIAAKYGSQKAISSDQVFGRGGYDEGTSRAAQERLKSNFGNATSISSASYFGEDSAEQAQTGRSVDQGNNLIEVTLGKDEDIELVKQALELGAEKLGSYLRDYLRK
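
To check how many records in a _rep_seq.fasta are affected, something like the following could be used. It is only a minimal sketch and not part of MMseqs2: the function name `find_binary_records` is hypothetical, and the allowed alphabet (ASCII letters plus `*` and `-`) is an assumption about what a valid sequence line may contain. It reads the file as raw bytes and reports the header of every record whose sequence contains a byte outside that alphabet.

```python
import string

# Assumed sequence alphabet: ASCII letters plus '*' (stop) and '-' (gap).
ALLOWED = frozenset((string.ascii_letters + "*-").encode("ascii"))


def find_binary_records(fasta_bytes):
    """Return the headers of records whose sequence lines contain
    bytes outside the assumed alphabet (hypothetical helper)."""
    bad = []
    header = None
    flagged = False
    for line in fasta_bytes.split(b"\n"):
        line = line.rstrip(b"\r")
        if line.startswith(b">"):
            # New record: remember its header and reset the flag.
            header = line[1:].decode("ascii", errors="replace")
            flagged = False
        elif header is not None and not flagged:
            # Iterating over bytes yields ints, matching ALLOWED's members.
            if any(b not in ALLOWED for b in line):
                bad.append(header)
                flagged = True
    return bad


# Small in-memory demonstration with one deliberately corrupted record.
example = b">ok1\nMSDDAGE\n>broken\n\xff\xfeC\x01?\n>ok2\nACDE\nFGH\n"
print(find_binary_records(example))
```

For a real file, the bytes can be read with `open(path, "rb").read()` and passed to the function. One limitation: a binary sequence that happens to contain a newline byte followed by `>` would be misread as a new header, so the reported count is approximate.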
