StringLookup layer with output_mode="one_hot" produces incorrect symbolic output shape for nested input tensors #22336

@MomoPhD

Description

System Information

  • Keras Version: 3.13.2
  • Backend: TensorFlow 2.20.0
  • OS: Linux

Code to Reproduce

import os

# Disable GPU for reproducibility (set before importing Keras so the backend sees it at startup)
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import numpy as np
import keras

# Layer configuration
layer = keras.layers.StringLookup(
    max_tokens=20,
    num_oov_indices=4,
    oov_token="[UNK]",
    vocabulary=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'],
    output_mode="one_hot",
    pad_to_max_tokens=True,
    sparse=False,
    encoding="utf-8",
)

# Input data with shape (2, 2, 2)
input_data = np.array([[['a', 'b'], ['c', 'd']], [['a', 'b'], ['c', 'd']]])

# 1. Eager execution (Correct behavior)
eager_output = layer(input_data)

# 2. Symbolic execution (Incorrect behavior)
symbolic_input = keras.Input(shape=input_data.shape[1:], dtype="string")
symbolic_output = layer(symbolic_input)

# --- Verification ---
print("Keras Version:", keras.__version__)
print("Input data shape:", input_data.shape)
print("-" * 20)

# Expected: (2, 2, 2, 20)
print("Eager output shape:", eager_output.shape)

# Expected: (None, 2, 2)
print("Symbolic input shape:", symbolic_input.shape)

# Expected: (None, 2, 2, 20)
# Actual: (None, 20) -> This is incorrect.
print("Symbolic output shape:", symbolic_output.shape)

Expected Behavior

The symbolic output shape should be consistent with the eager execution output shape. Given an input tensor with a shape of (batch_size, d1, d2, ..., dN), the StringLookup layer with output_mode="one_hot" should produce an output tensor with a shape of (batch_size, d1, d2, ..., dN, max_tokens).

For the provided example, the symbolic output shape should be (None, 2, 2, 20).
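
To make the expected shape rule concrete, here is a minimal sketch in plain NumPy. It is independent of Keras. The helper `expected_one_hot_shape` is a hypothetical name introduced only for illustration, and the integer indices are stand-ins for looked-up tokens. The sketch encodes only the rule stated above and does not model any special handling Keras may apply to particular input shapes.

```python
import numpy as np


def expected_one_hot_shape(input_shape, depth):
    """Hypothetical helper: append the one-hot depth to an input shape."""
    return tuple(input_shape) + (depth,)


# Illustrative integer indices with the same shape as the input, (2, 2, 2).
indices = np.array([[[4, 5], [6, 7]], [[4, 5], [6, 7]]])

# Indexing an identity matrix one-hot encodes every element and
# appends a trailing axis of size `depth`.
depth = 20
one_hot = np.eye(depth, dtype="float32")[indices]

print(one_hot.shape)                                # (2, 2, 2, 20)
print(expected_one_hot_shape((None, 2, 2), depth))  # (None, 2, 2, 20)
```

One-hot encoding each element only adds a trailing axis, so all leading dimensions should survive in both eager and symbolic mode.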

Actual Behavior

Symbolic execution infers an output shape of (None, 20), dropping the intermediate dimensions (2, 2) that the eager output preserves.
