
Resolution mismatch - The size of tensor a (height_1) must match the size of tensor b (height_2) at non-singleton dimension 3 #130

@Gnefil

Hi,

I want to report a resolution mismatch error when using odak.learn.perception.MetamericLoss().

Failure case

Example code that reproduces the failure:

import torch
from odak.learn.perception import MetamericLoss
import matplotlib.pyplot as plt
import numpy as np
from torchvision.io import read_image

def visualise_img_tensor(tensor):
    tensor = tensor[0].detach().cpu().numpy()
    tensor = np.transpose(tensor, (1, 2, 0))  # Change from (C, H, W) to (H, W, C)
    plt.imshow(tensor)
    plt.axis('off')
    plt.show()

target_tensor_path = "examples/indian_head_test_pattern.png"
target_tensor = read_image(target_tensor_path).float() / 255  # Normalize to [0, 1]
target_tensor = target_tensor.unsqueeze(0)  # Add batch dimension
visualise_img_tensor(target_tensor)

tensor_shape = target_tensor.shape
random_tensor = torch.rand(tensor_shape)
visualise_img_tensor(random_tensor)


if tensor_shape[1] > 3:
    target_tensor = target_tensor[:, :3, :, :]  # Keep only the first 3 channels if more than 3 exist
    random_tensor = random_tensor[:, :3, :, :]  # Keep only the first 3 channels if more than 3 exist

print("Tensor shape:", target_tensor.shape)
print("Random tensor shape:", random_tensor.shape)

epochs = 100
random_tensor = random_tensor.requires_grad_()  # Enable gradient computation
optimizer = torch.optim.Adam([random_tensor], lr=1)
criterion = MetamericLoss()
for epoch in range(epochs):
    optimizer.zero_grad()
    loss = criterion(random_tensor, target_tensor)
    loss.backward()
    optimizer.step()
    
    if epoch % 20 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")
        visualise_img_tensor(random_tensor)
visualise_img_tensor(random_tensor)

Output:

Tensor shape: torch.Size([1, 3, 2400, 4094])
Random tensor shape: torch.Size([1, 3, 2400, 4094])

RuntimeError                              Traceback (most recent call last)
-->  loss = criterion(random_tensor, target_tensor)

odak/learn/perception/metameric_loss.py:235, in MetamericLoss.__call__(self, image, target, gaze, image_colorspace, visualise_loss)
--> line 235     self.target_stats = self.calc_statsmaps(
     ...
     )

odak/learn/perception/metameric_loss.py:144, in MetamericLoss.calc_statsmaps(self, image, gaze, alpha, real_image_width, real_viewing_distance, mode)
    ...
--> line 144     output_stats.append(means * periphery_mask)
    ...

RuntimeError: The size of tensor a (1024) must match the size of tensor b (1023) at non-singleton dimension 3
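
To make the message easier to read, here is a minimal sketch of the broadcasting rule it refers to, written in plain Python so it runs without PyTorch. The helper name `broadcast_mismatch_dim` and the batch, channel, and height values are hypothetical; only the widths 1024 and 1023 come from the traceback. It says nothing about why odak produces two different widths, only why multiplying them fails.

```python
def broadcast_mismatch_dim(shape_a, shape_b):
    """Return the trailing-most dimension index that cannot be broadcast, or None."""
    rank = max(len(shape_a), len(shape_b))
    # Broadcasting aligns trailing dimensions, so left-pad the shorter shape with 1s.
    a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    b = (1,) * (rank - len(shape_b)) + tuple(shape_b)
    # Scan from the last dimension backwards.
    for dim in reversed(range(rank)):
        # Two sizes are compatible if they are equal or if either one is 1.
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim
    return None


# Hypothetical shapes: only the widths 1024 and 1023 are taken from the traceback.
means_shape = (1, 3, 600, 1024)
mask_shape = (1, 3, 600, 1023)
mismatch = broadcast_mismatch_dim(means_shape, mask_shape)
print("Incompatible dimension:", mismatch)
```

Under these assumed shapes the incompatible dimension is the width axis, which matches the "non-singleton dimension 3" wording in the error.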

The image used is attached too.


Other cases

The initial assumption was that the image resolution must be a scaled version of 1080 x 1920 (the same 16:9 aspect ratio), such as 540 x 960, which works. However, 432 x 768 has the same aspect ratio and still fails, while resolutions such as 790 x 1264 and 1024 x 1024, which do not share that aspect ratio, work fine.
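
The cases above could be checked systematically with a small sweep like the sketch below. The helpers `sweep_resolutions`, `run_metameric_case`, and `fake_case` are hypothetical names introduced here. `run_metameric_case` assumes that random inputs trigger the same error as a real image, and `fake_case` is a stand-in used only to show the helper running; it is not odak's actual failure rule.

```python
def sweep_resolutions(run_case, resolutions):
    """Call run_case(height, width) for each resolution and record any RuntimeError."""
    results = {}
    for height, width in resolutions:
        try:
            run_case(height, width)
            results[(height, width)] = None  # None means the case ran without error
        except RuntimeError as error:
            results[(height, width)] = str(error)
    return results


def run_metameric_case(height, width):
    # Imports are local so that the sweep helper itself has no dependencies.
    import torch
    from odak.learn.perception import MetamericLoss

    # Assumption: random tensors of the given size reproduce the shape error.
    image = torch.rand(1, 3, height, width)
    target = torch.rand(1, 3, height, width)
    MetamericLoss()(image, target)


def fake_case(height, width):
    # Stand-in failure rule for demonstration only; NOT how odak behaves.
    if width % 2:
        raise RuntimeError(f"stand-in failure for width {width}")


demo = sweep_resolutions(fake_case, [(4, 8), (4, 7)])
for resolution, error in demo.items():
    print(resolution, "ok" if error is None else f"failed: {error}")
```

With odak installed, `sweep_resolutions(run_metameric_case, [(540, 960), (432, 768), (790, 1264), (1024, 1024), (2400, 4094)])` would cover the resolutions mentioned in this report.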
