ORC Predicate Pushdown

### Describe the enhancement requested

Arrow's ORC reader already supports **column projection** (reading only selected columns), but lacks **row-level predicate pushdown**. Currently, filtering rows from ORC files requires:
1. Reading all rows from selected columns (all stripes)
2. Applying filters post-read using Arrow compute

This is inefficient for large ORC files where only a small subset of rows matches the filter criteria. ORC files store min/max statistics at the stripe level, which can be used to skip entire stripes that cannot contain matching rows, avoiding I/O for data that would be filtered out anyway.
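
The stripe-skipping idea can be sketched in plain Python. Everything here is hypothetical: `StripeStats` and `stripes_to_read` are illustrative names, not an existing Arrow or ORC API, and the statistics are hard-coded instead of being read from a file footer.

```python
from dataclasses import dataclass


@dataclass
class StripeStats:
    """Hypothetical min/max summary of one column within one stripe."""
    stripe: int
    min_value: int
    max_value: int


def stripes_to_read(stats, lower_bound):
    """Return indices of stripes that may contain rows with value > lower_bound.

    A stripe whose maximum is <= lower_bound cannot contain a match,
    so it can be skipped without reading any of its data.
    """
    return [s.stripe for s in stats if s.max_value > lower_bound]


# Made-up statistics for a three-stripe file.
stats = [
    StripeStats(stripe=0, min_value=0, max_value=499),
    StripeStats(stripe=1, min_value=500, max_value=999),
    StripeStats(stripe=2, min_value=1000, max_value=1499),
]

selected = stripes_to_read(stats, 989)
```

With these statistics, stripe 0 is skipped for the predicate `value > 989`. Statistics can only rule stripes out, so rows in the remaining stripes would still need the filter applied after they are read, for example per stripe via `ORCFile.read_stripe`.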

### Use Cases

1. **Data Lake Queries**: Efficiently query large ORC datasets with selective predicates
2. **PyIceberg Integration**: Enable predicate pushdown for Iceberg tables stored in ORC format
3. **Parity with Parquet**: Match the filtering capabilities already available for Parquet files

### Component(s)

Python

ORC Predicate Pushdown #48986
