
Add support for PaddleOCRv5 models for character recognition.#747

Open
tjanczak wants to merge 2 commits into main from paddle_ocr_v5

Conversation

@tjanczak
Contributor

@tjanczak tjanczak commented Apr 2, 2026

Description

Add converters for PaddleOCR v5 recognition models; add support for downloading models from the HuggingFace repository and loading the vocabulary from the HF config file.

Fixes # (issue)

Any Newly Introduced Dependencies

N/A.

How Has This Been Tested?

Validated locally; to be added to CI tests as part of a new sample.

Checklist:

  • I agree to use the MIT license for my code changes.
  • I have not introduced any 3rd party components incompatible with MIT.
  • I have not included any company confidential information, trade secret, password or security token.
  • I have performed a self-review of my code.

double exp_sum = 0.0;
for (size_t v = 0; v < vocab_size; ++v)
exp_sum += std::exp(static_cast<double>(row[v] - row_max));
log_conf_sum += std::log(1.0 / exp_sum + 1e-10);
Contributor
I see this line computes log(1/exp_sum + 1e-10) as a log-softmax approximation. The 1e-10 is added after the division to prevent log(0), but this is numerically odd: the standard log-softmax of the max entry would be -log(exp_sum), not this form.
Please check for correctness.

Contributor Author

This is to avoid a divide-by-zero error; changed it to an explicit non-zero check.


export_ppocr_v5_model() {
local MODEL_NAME=$1
MODEL_DIR="$MODELS_PATH/public/$MODEL_NAME"
Contributor
@oonyshch oonyshch Apr 2, 2026

MODEL_DIR should be declared as local.
The other variables in this function (MODEL_NAME, DST_FILE1, DST_FILE2) all use local, but MODEL_DIR leaks into the parent scope.
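The suggestion above can be sketched as follows; MODELS_PATH and the model name are placeholder values here, not the repository's actual configuration:

```shell
#!/usr/bin/env bash
# Sketch of the suggested fix: MODEL_DIR declared local, matching the other
# variables in the function. MODELS_PATH is a placeholder value.
MODELS_PATH="/tmp/models"

export_ppocr_v5_model() {
    local MODEL_NAME=$1
    local MODEL_DIR="$MODELS_PATH/public/$MODEL_NAME"
    echo "$MODEL_DIR"
}

export_ppocr_v5_model "PP-OCRv5_rec" > /dev/null
# Because MODEL_DIR is local, it does not leak into the calling scope:
echo "${MODEL_DIR:-unset}"   # prints "unset"
```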


// Output shape: [batch_size, seq_len, vocab_size]
const auto &dims = blob->GetDims();
const size_t vocab_size = (dims.size() == 3) ? dims[2] : 0;
Contributor
@oonyshch oonyshch Apr 2, 2026

These lines could set vocab_size and seq_len to 0 if the tensor rank is wrong, but the subsequent check at line 224 then produces the misleading error message "Unexpected vocabulary size".
It would be clearer to add an explicit tensor rank check, which would give a clear error when the model output shape is wrong, rather than defaulting and failing later.

Contributor
@oonyshch oonyshch left a comment

So far, everything I haven't mentioned in the comments LGTM.
