I found what I think is a strange but potentially critical bug. The code below runs two linear layers over an input tensor. As you can see, there is a shape mismatch between the input tensor (last dimension 384) and bad_layer (in_features=128),
so you would expect the call bad_layer(input_tensor)
to fail, BUT IT DOESN'T FAIL ON HPU DEVICES, while it does fail on NVIDIA GPUs.
Please share your thoughts.
I am using this docker image: vault.habana.ai/gaudi-docker/1.13.0/ubuntu20.04/habanalabs/pytorch-installer-2.1.0:latest
import torch
device = "hpu"
if device == "hpu":
    import habana_frameworks.torch as htorch
    import habana_frameworks.torch.core as htcore
input_tensor = torch.rand(8, 16, 384).to(device)
good_layer = torch.nn.Linear(in_features=384, out_features=64, bias=True).to(device)
bad_layer = torch.nn.Linear(in_features=128, out_features=64, bias=True).to(device)
good_out = good_layer(input_tensor)
bad_out = bad_layer(input_tensor)
if device == "hpu":
    htorch.hpu.synchronize()
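For reference, here is a minimal sketch of the behavior I would expect, using the same shapes on CPU (this is my own reproduction, not part of the original script; CUDA behaves the same way in my tests):

```python
import torch

# Same shapes as in the HPU script above, but on CPU.
input_tensor = torch.rand(8, 16, 384)
bad_layer = torch.nn.Linear(in_features=128, out_features=64, bias=True)

# The last dimension of the input (384) does not match in_features (128),
# so the matmul inside the linear layer cannot be performed eagerly.
try:
    bad_layer(input_tensor)
    print("no error raised")
except RuntimeError:
    print("RuntimeError raised as expected")
```

On CPU this prints "RuntimeError raised as expected" (a "mat1 and mat2 shapes cannot be multiplied" style error). My understanding is that the HPU backend builds a lazy graph, so I would at least expect the error to surface at htorch.hpu.synchronize(), but it does not.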