Why does Gaudi provide 10 x 100GbE ports? How does this compare to other options like NV-scale?

Greg_S · June 30, 2021, 6:39am

Because 100GbE RoCE is integrated natively on the Gaudi training processor, customers avoid the performance and throughput bottlenecks inherent in off-chip RoCE implementations, which require each processor to connect through a separate NIC.

In HLS-1, 7 of each Gaudi's 10 ports are used for all-to-all connections within the server, and the other 3 are used for scaling out of the server. Scale-out ports in one server can connect directly to scale-out ports in another server. GPU-based systems, by contrast, require separate NICs attached over PCIe, which can create additional performance bottlenecks.
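To make the 7 + 3 port split concrete, here is a rough back-of-the-envelope sketch in Python. It assumes one port per peer inside the server, which implies 8 processors per HLS-1; that count is inferred from the 7 all-to-all ports, not stated above. The bandwidth figures are raw line rates per direction, not measured throughput, and all names in the code are illustrative.

```python
from itertools import combinations

# Figures taken from the post above.
PORTS_PER_CHIP = 10
INTRA_PORTS = 7        # all-to-all inside the server
SCALE_OUT_PORTS = 3    # leave the server
PORT_GBPS = 100        # 100GbE line rate per port

# Assumption: one dedicated port per peer, so 7 intra-server
# ports imply 7 peers, i.e. 8 processors per server.
CHIPS_PER_SERVER = INTRA_PORTS + 1


def all_to_all_links(n_chips):
    """Return every unordered pair of chips as one point-to-point link."""
    return list(combinations(range(n_chips), 2))


links = all_to_all_links(CHIPS_PER_SERVER)

# Count how many intra-server ports each chip consumes.
ports_used = {chip: 0 for chip in range(CHIPS_PER_SERVER)}
for a, b in links:
    ports_used[a] += 1
    ports_used[b] += 1

# Raw line-rate totals (per direction), ignoring protocol overhead.
intra_gbps_per_chip = INTRA_PORTS * PORT_GBPS
scale_out_gbps_per_chip = SCALE_OUT_PORTS * PORT_GBPS
scale_out_gbps_per_server = CHIPS_PER_SERVER * scale_out_gbps_per_chip

print("intra-server links:", len(links))                       # 28
print("intra-server Gb/s per chip:", intra_gbps_per_chip)      # 700
print("scale-out Gb/s per chip:", scale_out_gbps_per_chip)     # 300
print("scale-out Gb/s per server:", scale_out_gbps_per_server) # 2400
```

Under these assumptions every chip uses exactly 7 ports for its peers, which leaves the remaining 3 of its 10 ports free for scale-out.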

See the Habana website for more information on our RoCE implementation.
