Open
Description
Hi,
I want to report a resolution mismatch error when using odak.learn.perception.MetamericLoss().
Failure case
Example code to reproduce the failure:
import torch
from odak.learn.perception import MetamericLoss
import matplotlib.pyplot as plt
import numpy as np
from torchvision.io import read_image
def visualise_img_tensor(tensor):
    tensor = tensor[0].detach().cpu().numpy()
    tensor = np.transpose(tensor, (1, 2, 0))  # Change from (C, H, W) to (H, W, C)
    plt.imshow(tensor)
    plt.axis('off')
    plt.show()
target_tensor_path = "examples/indian_head_test_pattern.png"
target_tensor = read_image(target_tensor_path).float() /255 # Normalize to [0, 1]
target_tensor = target_tensor.unsqueeze(0) # Add batch dimension
visualise_img_tensor(target_tensor)
tensor_shape = target_tensor.shape
random_tensor = torch.rand(tensor_shape)
visualise_img_tensor(random_tensor)
if tensor_shape[1] > 3:
    target_tensor = target_tensor[:, :3, :, :]  # Keep only the first 3 channels if more than 3 exist
    random_tensor = random_tensor[:, :3, :, :]  # Keep only the first 3 channels if more than 3 exist
print("Tensor shape:", target_tensor.shape)
print("Random tensor shape:", random_tensor.shape)
epochs = 100
random_tensor = random_tensor.requires_grad_() # Enable gradient computation
optimizer = torch.optim.Adam([random_tensor], lr=1)
criterion = MetamericLoss()
for epoch in range(epochs):
    optimizer.zero_grad()
    loss = criterion(random_tensor, target_tensor)
    loss.backward()
    optimizer.step()
    if epoch % 20 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item()}")
        visualise_img_tensor(random_tensor)
visualise_img_tensor(random_tensor)
Output:
Tensor shape: torch.Size([1, 3, 2400, 4094])
Random tensor shape: torch.Size([1, 3, 2400, 4094])
RuntimeError Traceback (most recent call last)
--> loss = criterion(random_tensor, target_tensor)
odak/learn/perception/metameric_loss.py:235, in MetamericLoss.__call__(self, image, target, gaze, image_colorspace, visualise_loss)
--> line 235 self.target_stats = self.calc_statsmaps(
...
)
odak/learn/perception/metameric_loss.py:144, in MetamericLoss.calc_statsmaps(self, image, gaze, alpha, real_image_width, real_viewing_distance, mode)
...
--> line 144 output_stats.append(means * periphery_mask)
...
RuntimeError: The size of tensor a (1024) must match the size of tensor b (1023) at non-singleton dimension 3
The image used is also attached.
Other cases
The initial assumption was that the image resolution must be a scaled version of 1080 x 1920, such as 540 x 960. However, 432 x 768 is also a scaled version of 1080 x 1920 and it still fails. Meanwhile, resolutions such as 790 x 1264 and 1024 x 1024, which are not scaled versions of 1080 x 1920, work fine.
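To narrow down which resolutions trigger the mismatch, a small probing helper may be useful. The sketch below is only an illustration: `probe_resolutions`, `summarise` and the `run_once` callable are hypothetical names that are not part of odak. The helper calls a user-supplied function once per resolution and records whether a `RuntimeError` was raised.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Resolution = Tuple[int, int]  # (height, width)


def probe_resolutions(
    run_once: Callable[[int, int], object],
    resolutions: Iterable[Resolution],
) -> Dict[Resolution, Optional[str]]:
    """Call run_once(height, width) for every resolution.

    Returns a dict mapping each resolution to None on success, or to the
    error message when a RuntimeError was raised.
    """
    results: Dict[Resolution, Optional[str]] = {}
    for height, width in resolutions:
        try:
            run_once(height, width)
            results[(height, width)] = None
        except RuntimeError as error:
            # Record the message instead of stopping at the first failure.
            results[(height, width)] = str(error)
    return results


def summarise(results: Dict[Resolution, Optional[str]]) -> List[str]:
    """Format the probe results as one readable line per resolution."""
    lines = []
    for (height, width), error in results.items():
        status = "ok" if error is None else f"FAILED: {error}"
        lines.append(f"{height} x {width}: {status}")
    return lines
```

With odak installed, `run_once` could be, for example, `lambda h, w: MetamericLoss()(torch.rand(1, 3, h, w), torch.rand(1, 3, h, w))`, which only reuses the calls from the reproduction script above. Passing the resolutions mentioned in this report, (2400, 4094), (540, 960), (432, 768), (790, 1264) and (1024, 1024), would then list which ones fail and with which size mismatch.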