Table is considered as picture #359

I'm trying to parse a PDF to Markdown, and the PDF contains several tables. The issue I'm facing is that some of the tables are extracted properly, whereas others are treated as pictures.

The PDF file I used is attached:
[Different Table formats.pdf](https://github.com/user-attachments/files/24717716/Different.Table.formats.pdf)

When a table is treated as a picture, its content appears in the output enclosed in the following placeholders:
**----- Start of picture text -----**<br> 
My table content here
**----- End of picture text -----**<br>
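
To check quickly which parts of the output were treated as picture text, the Markdown string can be scanned for the two placeholders above. This is only a minimal sketch: it assumes the marker strings appear exactly as quoted in this report, and `find_picture_text_blocks` and the `sample` string are hypothetical names and data made up for illustration, not part of pymupdf4llm.

```python
import re

# Marker strings copied from this report; their exact form is an assumption.
PICTURE_BLOCK = re.compile(
    r"\*\*----- Start of picture text -----\*\*(?:<br>)?\s*"
    r"(.*?)"
    r"\s*\*\*----- End of picture text -----\*\*(?:<br>)?",
    re.DOTALL,
)


def find_picture_text_blocks(md_text: str) -> list[str]:
    """Return the text found between each pair of picture-text markers."""
    return [m.group(1).strip() for m in PICTURE_BLOCK.finditer(md_text)]


# Made-up sample: one real Markdown table followed by one picture-text block.
sample = (
    "| a | b |\n|---|---|\n| 1 | 2 |\n\n"
    "**----- Start of picture text -----**<br>\n"
    "Name Qty<br>Apple 3<br>\n"
    "**----- End of picture text -----**<br>\n"
)

blocks = find_picture_text_blocks(sample)
print(f"{len(blocks)} block(s) treated as picture text")
for block in blocks:
    print(block)
```

Running the same function on the real `md_text` would list every table that ended up as picture text, which makes it easier to compare against the pages of the PDF.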

**Table text treated as a picture**
**Screenshot of the first page of the PDF, attached for convenience**

<img width="966" height="1250" alt="Image" src="https://github.com/user-attachments/assets/b5b6c736-02ce-414b-a17b-014d47fc7512" />

**Parsed text I got**
<img width="2804" height="978" alt="Image" src="https://github.com/user-attachments/assets/805a7086-d1ac-47c9-ab40-4ce47b142c38" />


**Correct parsing of table text**
**Screenshot of page 2 of the PDF**
<img width="1050" height="1590" alt="Image" src="https://github.com/user-attachments/assets/ccdddfef-23d4-4571-bad2-9c6e60a20134" />

**Correct Table parsing**
<img width="1444" height="982" alt="Image" src="https://github.com/user-attachments/assets/1200bb6c-c261-48d8-9931-15e5ad4e1f0f" />


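**Code used**

```python
import pymupdf.layout  # imported before pymupdf4llm, as in the original report
import pymupdf4llm

# Path to the attached PDF; adjust to where the file is saved locally.
input_path = "Different Table formats.pdf"

md_text = pymupdf4llm.to_markdown(input_path, footer=False)
print(md_text)
```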

**Versions**
- pymupdf-layout : 1.26.6
- pymupdf4llm : 0.2.9
- tesseract : 5.5.1
- opencv-python : 4.13.0.90
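
For anyone reproducing this, the Python package versions can be collected programmatically rather than typed by hand. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper name, and the distribution names are assumed to match the names listed in this report. Tesseract is a separate program rather than a Python package, so it is not covered here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            result[name] = None
    return result


# Distribution names as written in this report (assumed to be the pip names).
names = ["pymupdf-layout", "pymupdf4llm", "opencv-python"]
for name, ver in installed_versions(names).items():
    print(f"{name} : {ver or 'not installed'}")
```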

