I found what I think is a strange but potentially critical bug. The code below runs two linear layers over an input tensor. As you can see, there is a shape mismatch between the input tensor (last dimension 384) and bad_layer (in_features=128),
so you would expect the call bad_layer(input_tensor)
to fail, BUT IT DOESN'T FAIL ON HPU DEVICES, while it does fail on NVIDIA GPUs.
Please share your thoughts.
I am using this docker image: vault.habana.ai/gaudi-docker/1.13.0/ubuntu20.04/habanalabs/pytorch-installer-2.1.0:latest
import torch
device = "hpu"
if device == "hpu":
    import habana_frameworks.torch as htorch
    import habana_frameworks.torch.core as htcore
input_tensor = torch.rand(8, 16, 384).to(device)
good_layer = torch.nn.Linear(in_features=384, out_features=64, bias=True).to(device)
bad_layer = torch.nn.Linear(in_features=128, out_features=64, bias=True).to(device)
good_out = good_layer(input_tensor)
bad_out = bad_layer(input_tensor)
if device == "hpu":
    htorch.hpu.synchronize()
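For reference, here is a minimal sketch of the behavior I would expect, using the same shapes on CPU (this is my own reproduction, not part of the original script; CUDA behaves the same way in my tests):

```python
import torch

# Same shapes as in the HPU script above, but on CPU.
input_tensor = torch.rand(8, 16, 384)
bad_layer = torch.nn.Linear(in_features=128, out_features=64, bias=True)

# The last dimension of the input (384) does not match in_features (128),
# so the matmul inside the linear layer cannot be performed eagerly.
try:
    bad_layer(input_tensor)
    print("no error raised")
except RuntimeError:
    print("RuntimeError raised as expected")
```

On CPU this prints "RuntimeError raised as expected" (a "mat1 and mat2 shapes cannot be multiplied" style error). My understanding is that the HPU backend builds a lazy graph, so I would at least expect the error to surface at htorch.hpu.synchronize(), but it does not.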