Question about the DETR pretraining process #18

@jinsingsangsung

Description

Thanks for the impressive work.
I have one question about the pretraining process of DETR (which you mentioned here: https://github.com/amazon-science/tubelet-transformer#training).

From #4 (comment),
I gather that you took DETR weights trained on the COCO dataset and re-trained them on AVA to detect human instances.
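For context, here is a rough sketch of what I imagine that adaptation might look like, assuming the standard DETR design where a linear class head maps each query embedding to the COCO classes plus a "no object" slot. The variable names and the 2-way "person vs. no object" head are my own guesses, not necessarily your exact recipe:

```python
import torch
import torch.nn as nn

# Sketch (my assumption): adapt a COCO-pretrained DETR to detect only humans.
# DETR's class head maps each query embedding to 91 COCO classes + 1 "no object".
# One plausible adaptation for AVA actor detection is to replace that head with
# a 2-way head ("person" vs. "no object") and fine-tune on AVA boxes.

hidden_dim, num_queries = 256, 100

# Stand-in for DETR's transformer decoder output: (batch, num_queries, hidden_dim)
query_embeddings = torch.randn(2, num_queries, hidden_dim)

# Original COCO head (91 classes + no-object slot) ...
coco_class_head = nn.Linear(hidden_dim, 92)
# ... replaced by a person-only head before fine-tuning on AVA.
person_class_head = nn.Linear(hidden_dim, 2)

# The box-regression head in DETR is class-agnostic, so presumably it can
# be kept as-is along with the backbone and transformer weights.
logits = person_class_head(query_embeddings)
print(logits.shape)  # torch.Size([2, 100, 2])
```

Is something along these lines what you did, or was the procedure different (e.g., keeping the full head and masking non-person classes)?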

  1. Could you describe this process in more detail? (e.g., how did you modify the DETR structure to detect only humans? What exactly were the input, position embeddings, etc.?)
  2. Was the intention of this pretraining to let the queries focus more on classification once TubeR's DETR architecture has learned to localize actors well enough?
  3. Have you tried training the whole architecture without the pretrained DETR weights? I've tried several times but could not find a configuration that makes the model actually learn.

Thanks in advance.
