Habana Gaudi HPUs: training time improvement

When posting a technical issue, please describe it in as much detail as possible. You can include things like:
• What was the expected behavior:
Faster training. Training currently takes longer than expected compared with other vendors' hardware.

• What is the observed result:
Slow training.

• Is the issue consistently reproducible? How long does it take to reproduce:
Yes, it can be reproduced with the files at
https://drive.google.com/drive/folders/1Vq5f9lk_jiRlbkcwpqN_ecS3GC4gKO3W?usp=sharing
• If you are using an AWS DL1 instance, please report the AMI name that you are using:
Deep Learning AMI Habana TensorFlow 2.9.1 SynapseAI 1.5.0 (Ubuntu 20.04) 20220714

What is the minimal script/command to reproduce the issue:
Please include any error message or stack trace observed:
Please run the Snapshot for Debug tool and post the output to the issue:
• git clone https://github.com/HabanaAI/Snapshot_For_Debug (snapshot scripts for gathering information about the model and Habana training session for Habana analysis and debug)
• touch OUT_DOCKER.txt
• python src/gather_info_docker.py --lite --cmd=<command_script> -s OUT_DOCKER.txt
• post the generated tar file (gather_info_docker.tar.gz) after checking its contents
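
The Snapshot for Debug steps above can be sketched as a single script. This is a rough sketch, not an official Habana script. It makes several assumptions: the repository URL is inferred from the repository name HabanaAI/Snapshot_For_Debug, the `cd` into the cloned directory is assumed so that `src/gather_info_docker.py` resolves, and `CMD_SCRIPT` (default `./run_training.sh`) is a hypothetical stand-in for `<command_script>`. The `DRY_RUN` switch is also an addition for illustration: by default the script only prints the commands, so they can be checked before anything runs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical placeholder for <command_script>: point this at the script
# that launches your training run.
CMD_SCRIPT="${CMD_SCRIPT:-./run_training.sh}"

# Illustration-only switch: with DRY_RUN=1 (the default) commands are printed,
# not executed. Set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

collect_snapshot() {
  # Assumed URL, inferred from the repository name in the steps above.
  run git clone https://github.com/HabanaAI/Snapshot_For_Debug
  # Assumption: the gather script is invoked from the repository root.
  run cd Snapshot_For_Debug
  run touch OUT_DOCKER.txt
  run python src/gather_info_docker.py --lite --cmd="$CMD_SCRIPT" -s OUT_DOCKER.txt
}

collect_snapshot
```

To run it for real, something like `DRY_RUN=0 CMD_SCRIPT=./my_train.sh bash collect_snapshot.sh` should work, where both file names are hypothetical. Check the contents of gather_info_docker.tar.gz before posting it.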

Thanks for posting. We will take a look at your issue and get back to you.