[Feature] LLM-as-a-Judge: evaluate review alignment #3

@SiegfriedZhen

Description

The current review is done by human experts, which takes a lot of time. It also has to be redone for every new set of results.

Possible solutions:

  1. Evaluation by top models such as grok3, grok3 thinking ...
  2. Evaluation by thinking models: o3-mini-high, Gemini Flash Thinking
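
Whichever model is picked, we would first need to check how well its verdicts line up with the existing expert review. A minimal sketch of that check, assuming reviews can be reduced to discrete labels (the `"pass"`/`"fail"` labels, `judge_outputs`, and the stub `judge_fn` are all hypothetical; a real `judge_fn` would wrap a call to one of the models above):

```python
from collections import Counter
from typing import Callable, Sequence


def judge_outputs(outputs: Sequence[str], judge_fn: Callable[[str], str]) -> list[str]:
    """Label every output with the (hypothetical) judge callable."""
    return [judge_fn(text) for text in outputs]


def agreement_rate(human: Sequence[str], judge: Sequence[str]) -> float:
    """Fraction of items where the judge label equals the human label."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(h == j for h, j in zip(human, judge)) / len(human)


def cohen_kappa(human: Sequence[str], judge: Sequence[str]) -> float:
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(human)
    p_observed = agreement_rate(human, judge)
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    labels = set(human_counts) | set(judge_counts)
    # Chance agreement from the two raters' marginal label frequencies.
    p_expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Stub judge for illustration only; replace with a real model call.
def stub_judge(text: str) -> str:
    return "pass" if "ok" in text else "fail"


outputs = ["ok answer", "bad answer", "wrong answer", "ok summary"]
human_labels = ["pass", "fail", "pass", "pass"]  # made-up expert labels
judge_labels = judge_outputs(outputs, stub_judge)

rate = agreement_rate(human_labels, judge_labels)   # 0.75
kappa = cohen_kappa(human_labels, judge_labels)     # 0.5
```

Injecting the judge as a plain callable keeps the alignment metrics independent of any particular provider, so the same check can be rerun on every new batch of results and for each candidate model.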

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
