Mismatch in Audio Encoder Choice: wav2vec vs. Whisper #184

@Seeeeeeeeeeeea

Hi, thanks for the great work and for releasing the code.

I noticed a discrepancy between the paper and the implementation regarding the audio encoder.
In the paper, wav2vec is described as the audio feature extractor, while in the released code the audio encoder seems to be Whisper.

Could you please clarify the motivation behind this choice?
