We’re excited to announce the release of Intel® Gaudi® software version 1.15.0, which brings numerous enhancements and updates for an improved user experience.
We added support for PyTorch Fully Sharded Data Parallel (FSDP). FSDP enables distributed training of large-scale models with a reduced per-device memory footprint. See Using Fully Sharded Data Parallel (FSDP) with Intel Gaudi.
We have further improved Text Generation Inference (TGI) support for Gaudi. For more details, see GitHub - huggingface/tgi-gaudi: Large Language Model Text Generation Inference on Habana Gaudi.
We have improved performance and support for the Llama model family. This includes Llama 2 70B BF16 for pre-training and Llama 2 7B/70B BF16/FP8 for inference. In addition, we optimized Mixtral 8x7B for inference.
The Gaudi Megatron-DeepSpeed fork has been moved to a separate repository and rebased onto PR #307. You can find the new repository here: GitHub - HabanaAI/Megatron-DeepSpeed: Intel Gaudi's Megatron DeepSpeed Large Language Models for training.
You can now use CRI-O with Intel Gaudi processors, in addition to the existing support for Docker Engine and containerd. You can find more information here: Intel Gaudi Software Stack and Driver Installation — Gaudi Documentation 1.15.1 documentation.
In this release, we’ve also upgraded the validated versions of several libraries and platforms, including PyTorch 2.2.0, DeepSpeed 0.12.4, PyTorch Lightning 2.2.0, Kubernetes 1.27, 1.28, and 1.29, OpenShift 4.14, RHEL 9.2, and Megatron-DeepSpeed PR #307.
Lastly, a reminder that TensorFlow support is deprecated starting from Intel Gaudi software 1.15.0. You can find more information on the Intel Gaudi software 1.15.0 release notes page.