Normalize ORCRecordReader BOOLEAN and LIST output types #18222

@xiangfu0

Description

Motivation

pinot-plugins/pinot-input-format/pinot-orc/src/main/java/org/apache/pinot/plugin/inputformat/orc/ORCRecordReader.java documents two unresolved mappings:

  • BOOLEAN -> String with TODO -> Boolean?
  • LIST -> Object[] of the supported types with TODO -> List?

These TODOs suggest that Pinot's ORC reader still returns Java types that users may not expect for common ORC schemas: a String for BOOLEAN and an Object[] for LIST.
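To make the concern concrete, here is a minimal sketch of how the documented mappings could trip up a consumer that type-checks row values. It assumes the Javadoc mappings describe the actual runtime types. The class and its `documentedBoolean()` / `documentedList()` helpers are hypothetical stand-ins, not Pinot code.

```java
import java.util.List;

// Hypothetical illustration only; this does not call ORCRecordReader.
final class OrcTypeSurpriseSketch {
    // Stand-in for a BOOLEAN field under the documented "BOOLEAN -> String" mapping.
    static Object documentedBoolean() {
        return "true";
    }

    // Stand-in for a LIST field under the documented "LIST -> Object[]" mapping.
    static Object documentedList() {
        return new Object[] {1, 2, 3};
    }

    // Checks a downstream consumer might plausibly perform.
    static boolean isJavaBoolean(Object value) {
        return value instanceof Boolean;
    }

    static boolean isJavaList(Object value) {
        return value instanceof List;
    }

    public static void main(String[] args) {
        System.out.println(isJavaBoolean(documentedBoolean())); // prints "false"
        System.out.println(isJavaList(documentedList()));       // prints "false"
    }
}
```

Both checks fail because a String is not a Boolean and an Object[] is not a List, which is the kind of mismatch the TODOs hint at.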

Scope

  • Confirm the intended Pinot row representation for ORC BOOLEAN and LIST fields.
  • Align the implementation and documentation.
  • Add tests that lock down the chosen output types.
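Whichever representation is chosen, the tests in the last item could pin it with an explicit runtime-type assertion. The following is a sketch in plain Java, not tied to Pinot's test framework; `assertOutputType` is a hypothetical helper, and the sample values are stand-ins for values read from an ORC file.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a type-contract check; not part of Pinot.
final class OrcOutputTypeContractSketch {
    // Fails when the value's runtime type is not the agreed output type.
    static void assertOutputType(String orcType, Object value, Class<?> expected) {
        if (!expected.isInstance(value)) {
            String actual = (value == null) ? "null" : value.getClass().getName();
            throw new AssertionError(
                orcType + ": expected " + expected.getName() + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Option A: keep the documented mappings.
        assertOutputType("BOOLEAN", "true", String.class);
        assertOutputType("LIST", new Object[] {"a", "b"}, Object[].class);

        // Option B: adopt the alternatives named in the TODOs.
        assertOutputType("BOOLEAN", Boolean.TRUE, Boolean.class);
        assertOutputType("LIST", Arrays.asList("a", "b"), List.class);

        System.out.println("all type contracts hold");
    }
}
```

A real test would keep only the option that matches the confirmed representation, so that a later change to the reader's output type fails loudly instead of silently altering ingested rows.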

Notes

Observed on upstream/master on April 15, 2026.

Metadata

Assignees

No one assigned

Labels

  • enhancement: Improvement to existing functionality
  • ingestion: Related to data ingestion pipeline
  • java: Pull requests that update Java code
  • plugins: Related to the plugin system
  • priority: low: Nice to have, can wait
  • user-experience: Related to user experience
