Linear Layer Inconsistency

I found what I think is a strange but potentially critical bug. The code below runs two linear layers over an input tensor. As you can see, there is a mismatch between the input tensor's last dimension (384) and the in_features of bad_layer (128), so you would expect the call bad_layer(input_tensor) to fail, BUT IT DOES NOT FAIL ON HPU DEVICES, WHILE IT DOES FAIL ON NVIDIA GPUS.

Please share your thoughts.
I am using this docker image: vault.habana.ai/gaudi-docker/1.13.0/ubuntu20.04/habanalabs/pytorch-installer-2.1.0:latest

import torch

device = "hpu"
if device == "hpu":
    import habana_frameworks.torch as htorch
    import habana_frameworks.torch.core as htcore  # HPU-specific PyTorch bindings

input_tensor = torch.rand(8, 16, 384).to(device)  # last dimension is 384
good_layer = torch.nn.Linear(in_features=384, out_features=64, bias=True).to(device)  # in_features matches the input's last dimension
bad_layer = torch.nn.Linear(in_features=128, out_features=64, bias=True).to(device)  # in_features=128 does NOT match 384

good_out = good_layer(input_tensor)  # works: 384 -> 64
bad_out = bad_layer(input_tensor)    # should fail (384 != 128), but silently runs on HPU

if device == "hpu":
    htorch.hpu.synchronize()  # wait for the queued HPU work to complete
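
For comparison, the same mismatched call on the CPU (or a CUDA device) raises a RuntimeError about incompatible matrix shapes. A minimal sketch of what I would expect (the exact error text depends on the PyTorch version):

cpu_input = torch.rand(8, 16, 384)
cpu_bad_layer = torch.nn.Linear(in_features=128, out_features=64, bias=True)
try:
    cpu_bad_layer(cpu_input)  # expected to fail: last dim 384 vs in_features 128
except RuntimeError as e:
    print("CPU rejects the mismatched input:", e)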

Yes indeed, the bad_layer call should crash. We'll look into this and update here.
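
Until the fix is available, a possible workaround (just a sketch, not an official API) is to check the input's last dimension against the layer's in_features before calling it:

def checked_forward(layer, x):
    # Guard against a mismatched input being silently accepted on HPU
    if x.shape[-1] != layer.in_features:
        raise ValueError(f"expected last dim {layer.in_features}, got {x.shape[-1]}")
    return layer(x)

# checked_forward(bad_layer, input_tensor) raises instead of silently succeeding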


This issue should be fixed in r1.16 (i.e., an appropriate error message will be shown if the sizes are incompatible).
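
A quick way to verify after upgrading (a sketch, assuming the fixed behavior matches CPU/CUDA and raises a RuntimeError):

try:
    bad_out = bad_layer(input_tensor)
    if device == "hpu":
        htorch.hpu.synchronize()  # the error may only surface once execution is forced
except RuntimeError as e:
    print("HPU now reports the size mismatch:", e)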
