Text Recognition Fails on Rotated Text (Architectural Drawings) – Detection Box Corner Order Seems Wrong #17325

bananaback · 2025-12-10T17:04:26Z

bananaback
Dec 10, 2025

Hi developers,
I’m working on an OCR system for architectural drawings, where text appears at many different angles (not only horizontal).

Horizontal text works fine, but rotated text often becomes flipped or mis-read.
Here is an example of how the rotated labels look in my dataset:
(see image)

And here is how the recognition goes wrong when the text rotates further:

What I Tried

1. Finetuning the recognition model

I first tried finetuning the recognition model, but it did not help.
So I moved on to investigating the detection stage.

2. Investigating the detection model using PPOCRLabel

During annotation in PPOCRLabel, I noticed something important:

When PPOCRLabel auto-draws the box (top-left → bottom-right),
→ Recognition is wrong for rotated text.

When I manually draw the box carefully from bottom-left → top-right,
then click Re-Recognition,
→ The recognition becomes correct.

This made me think the ordering of the 4 points affects the angle/direction, so I labeled my rotated dataset very carefully using this bottom-to-top direction.

I labeled ~500 rotated samples and finetuned the detection model.

3. Testing the finetuned detection model

After finetuning, the real prediction results are still not good — rotated text is still mis-recognized.

So I inspected the output boxes from predict(), and this is what I found:

Every predicted box always follows the same point order

top-left

top-right

bottom-right

bottom-left

This happens for all rotations — vertical, diagonal, upside down, etc.
It seems the detector normalizes every box back into TL–TR–BR–BL and does not preserve any rotation information.

Because of this, the recognition model receives no real “angle” signal.

Current Problems / Questions

1. How can I correctly finetune the detection model for rotated text?

In my dataset, text can rotate freely clockwise.
But when it rotates beyond 270° clockwise, it effectively becomes a 90° counter-clockwise rotation (as shown in the first attached image).

During labeling, for vertical or highly rotated text, I manually draw the box from bottom-left → top-right, then click Re-Recognition — and the recognition becomes correct.
This is why I believed the point order helps the model learn direction.

I labeled ~500 such rotated samples carefully, finetuned the detection model, but the final predictions are still angle-neutral (always TL → TR → BR → BL).
So the recognition model never receives orientation information.

I’m not sure whether:

my finetuning approach is wrong,
my dataset is too small,
or the detection model simply cannot learn angle information at all.

2. Which module in PaddleOCR is actually responsible for orientation?

Because the detection outputs always normalized corner order, I am unsure where rotation is supposed to be handled:

Is angle predicted by another module?
Is there an angle classifier that needs to be trained?
Or is the recognition model expected to infer rotation directly from the cropped patches?
Am I misunderstanding how the PaddleOCR pipeline handles orientation?

I’ve searched documentation and online discussions for days, but I still can’t find a clear explanation of which component is responsible for rotation or what the correct training process is for multi-angle text.

Right now I don’t know whether:

I trained the wrong module,
I labeled incorrectly,
or I misunderstood the intended pipeline.

liuhongen1234567 · 2025-12-11T09:58:09Z

liuhongen1234567
Dec 11, 2025
Collaborator

Hello, I tested the demo image and found that the recognition results are not particularly problematic. Below is my inference code:

from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_doc_orientation_classify=False, # 通过 use_doc_orientation_classify 参数指定不使用文档方向分类模型
    use_doc_unwarping=False, # 通过 use_doc_unwarping 参数指定不使用文本图像矫正模型
    use_textline_orientation=True, # 通过 use_textline_orientation 参数指定不使用文本行方向分类模型
)
result = ocr.predict("/paddle/ppocr_vl_model/issue_test/dd893599980caf804c1d85ebb7ce5581.png")
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")

dd893599980caf804c1d85ebb7ce5581_ocr_res_img

8 replies

bananaback Dec 18, 2025
Author

And there is another case: for text that is originally in a vertical orientation, it gets rotated 90 degrees, which causes the recognition model to become confused.

bananaback Dec 18, 2025
Author

bananaback Dec 18, 2025
Author

In theory, I should enable use_text_line_orientation,
and fine-tune the text line orientation model with a dataset that matches my specific use case.

bananaback Dec 18, 2025
Author

However, due to limited time, I used a workaround: I disabled text line orientation and modified the library code to rotate the text 90 degrees clockwise instead.

This allowed me to avoid spending a lot of time creating a dataset and fine-tuning the model.

bananaback Dec 18, 2025
Author

And the model works well in most cases, but there are still many difficult edge cases.

We are about to move into a more serious development phase, and I’m concerned that in the long run I may encounter cases where the text is vertically aligned downward. In such situations, this workaround would no longer work, and I’m also unsure whether my current solution is truly reliable.

I’m therefore considering a different approach: instead of customizing PaddleOCR’s source code, I would enable use_text_line_orientation and fine-tune the orientation model according to the official guidelines. In this case, I have a few questions, such as how large the dataset needs to be to reliably distinguish between 0° and 180° orientations, and what an effective fine-tuning strategy would look like.

Thank you to the developers, and I look forward to your response.

MelloCaique · 2026-03-05T13:50:34Z

MelloCaique
Mar 5, 2026

Any update on this? I'm having the exact same problem, but with the added complication that my text can be rotated at any angle.
When using PPOCRLabel, if I draw the RectBox with the right text orientation and not only TL–TR–BR–BL as mentioned by @bananaback, the the text is extracted correctly. The question is:

How can I properly fine-tune the rotation model?

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Text Recognition Fails on Rotated Text (Architectural Drawings) – Detection Box Corner Order Seems Wrong #17325

Uh oh!

{{title}}

Uh oh!

Replies: 2 comments 8 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Text Recognition Fails on Rotated Text (Architectural Drawings) – Detection Box Corner Order Seems Wrong #17325

Uh oh!

bananaback Dec 10, 2025

What I Tried

1. Finetuning the recognition model

2. Investigating the detection model using PPOCRLabel

3. Testing the finetuned detection model

Current Problems / Questions

1. How can I correctly finetune the detection model for rotated text?

2. Which module in PaddleOCR is actually responsible for orientation?

Replies: 2 comments · 8 replies

Uh oh!

liuhongen1234567 Dec 11, 2025 Collaborator

Uh oh!

Uh oh!

bananaback Dec 18, 2025 Author

Uh oh!

bananaback Dec 18, 2025 Author

Uh oh!

bananaback Dec 18, 2025 Author

Uh oh!

Uh oh!

bananaback Dec 18, 2025 Author

Uh oh!

bananaback Dec 18, 2025 Author

Uh oh!

MelloCaique Mar 5, 2026

bananaback
Dec 10, 2025

Replies: 2 comments 8 replies

liuhongen1234567
Dec 11, 2025
Collaborator

bananaback Dec 18, 2025
Author

bananaback Dec 18, 2025
Author

bananaback Dec 18, 2025
Author

bananaback Dec 18, 2025
Author

bananaback Dec 18, 2025
Author

MelloCaique
Mar 5, 2026