RDMA (Remote Direct Memory Access) enables two servers to directly read from and write to each other's memory without involving the CPU, caches, or the operating system. By bypassing these traditional processing layers, RDMA significantly reduces CPU overhead, minimizes latency, and accelerates data transfer rates. These characteristics give RDMA a distinct advantage across a wide range of application areas, including networking, storage, and high-performance computing.
Historically, RDMA deployment was primarily restricted to High-Performance Computing (HPC) environments—specifically supercomputing projects—with limited adoption in general cloud computing and enterprise data centers. However, the landscape shifted dramatically at the end of 2022 as Artificial Intelligence and Machine Learning (AI/ML) became the central focus of technology investment.
As data center spending rapidly pivoted toward AI/ML deployments, RDMA—originally designed for the massive parallel computing tasks of HPC clusters—emerged as a critical enabler. Its inherent ability to handle large-scale parallel processing efficiently made it indispensable for modern AI/ML workloads.
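To make the idea of direct memory access concrete, the sketch below uses the Linux libibverbs API to register a local buffer with an RDMA NIC and print the address and remote key (rkey) a peer would need in order to read or write that buffer without involving this host's CPU. It is a minimal illustration rather than production code: the buffer size, access flags, and build command are assumptions, and error handling is largely omitted.

```c
/* Minimal sketch: register a buffer for remote access with libibverbs.
 * Assumes an RDMA-capable NIC (InfiniBand or RoCE) is present.
 * Build (assumption): gcc reg.c -libverbs */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) { fprintf(stderr, "no RDMA device found\n"); return 1; }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    size_t len = 1 << 20;                 /* 1 MiB buffer; size is arbitrary */
    void *buf = malloc(len);

    /* Registration pins the memory and gives the NIC a translation,
     * so a peer can later read/write it without this CPU's involvement. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

    /* These two values would be sent to the peer out of band. */
    printf("addr=%p rkey=0x%x\n", mr->addr, mr->rkey);

    ibv_dereg_mr(mr);
    free(buf);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```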


The scale and speed of this transformation are unprecedented. By the end of 2023, annual RDMA-based network deployments exceeded the combined totals of 2021 and 2022, signaling a rapid democratization of the technology. This widespread adoption has solidified RDMA as a foundational component of AI/ML infrastructure. Market forecasts suggest that by 2028, the RDMA networking market will exceed $22 billion.
There are two primary implementations of RDMA:
InfiniBand: Offers a dedicated, purpose-built RDMA solution.
RoCEv2 (RDMA over Converged Ethernet): Provides RDMA services over standard Ethernet networks, leveraging existing Ethernet infrastructure.
While earlier versions of RoCE required specialized "converged" Ethernet, modern iterations can operate on standard Ethernet. The industry is currently focused on improving Ethernet congestion control, which is vital for reducing packet loss and supporting RoCE in high-performance environments. With over 400 million Ethernet switch ports already installed globally, Ethernet is expected to play an increasingly dominant role in AI/ML networking. Consequently, a growing share of RDMA operations will likely be executed over Ethernet in the near future.
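As a concrete way to see how the two transports appear to software, the sketch below uses the Linux libibverbs API to list local RDMA devices and report whether each one carries RDMA over Ethernet (RoCE) or native InfiniBand. The single-port assumption and the build command are illustrative.

```c
/* Hypothetical probe: list RDMA devices and report their link layer.
 * Build (assumption): gcc probe.c -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num;
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list) { perror("ibv_get_device_list"); return 1; }

    for (int i = 0; i < num; i++) {
        struct ibv_context *ctx = ibv_open_device(list[i]);
        if (!ctx)
            continue;

        struct ibv_port_attr port;
        /* Port 1 is assumed; multi-port NICs would be queried per port. */
        if (ibv_query_port(ctx, 1, &port) == 0)
            printf("%s: %s\n", ibv_get_device_name(list[i]),
                   port.link_layer == IBV_LINK_LAYER_ETHERNET
                       ? "RDMA over Ethernet (RoCE)"
                       : "native InfiniBand");
        ibv_close_device(ctx);
    }
    ibv_free_device_list(list);
    return 0;
}
```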
The server market is undergoing a major transformation as general-purpose servers are rapidly replaced by systems specifically designed for AI/ML workloads. Projections indicate that the number of AI/ML servers will soar from 1 million units in 2023 to over 6 million by 2028, with the total market size approaching $300 billion.
Key trends in this evolution include:
Backend Connectivity: Most of these 6 million servers will feature backend networks or AI-specific architectures to interconnect computing nodes.
Increased GPU Density: While 8-GPU servers are currently common, configurations with 16 or 32 GPUs are expected to become the standard.
Memory Scaling: As AI models grow from billions to trillions of parameters, the memory capacity per GPU must increase accordingly.
In this context, RDMA is vital for improving data transfer efficiency between servers, which is essential for system scaling and achieving the ambitious goals of large-scale AI model training.
Directly accessing the memory of other servers significantly boosts AI model performance. RDMA ensures data is delivered to the GPU faster, which shortens Job Completion Time (JCT) and improves overall performance metrics.
In the early stages of AI/ML cluster development, a major challenge was "GPU idling," where expensive cores sat idle due to packet loss or data transfer delays. This could cause entire clusters to stall, leading to underutilization of resources. RDMA effectively resolves these bottlenecks, optimizing JCT and maximizing performance. While minor performance differences may exist between Ethernet and InfiniBand, RDMA represents a massive leap forward compared to traditional networking technologies.
While all InfiniBand Network Interface Cards (NICs) support RDMA, not all Ethernet NICs currently possess RDMA/RoCE capabilities. For traditional Ethernet NIC manufacturers, integrating RoCE is now a strategic necessity to compete in the AI/ML space. Key implications include:
Standardization: As NIC speeds reach 400Gbps and 800Gbps+, RoCE support is expected to become a standard feature.
Average Selling Price (ASP): The addition of advanced features and higher port speeds will likely drive up the ASP of Ethernet NICs.
Performance Variability: The quality of RoCE integration depends on processor types, offload engines, and R&D expertise, leading to performance variance among vendors. However, as the technology matures, these gaps are expected to close, improving interoperability and giving users more flexibility.
Most AI/ML servers rely on a "backend network" that operates independently of the primary data center network. These networks—whether based on InfiniBand or Ethernet—are designed specifically to interconnect servers within an AI/ML cluster, providing high-speed connectivity between GPUs and between GPUs and memory. This architectural shift significantly expands the market potential for networking equipment by adding more ports per server.


Furthermore, AI/ML applications often involve multiple backend networks tailored for specific tasks. While RDMA runs on Ethernet and InfiniBand, some GPU or ASIC providers may utilize proprietary or alternative network types to create even higher-performance solutions.
Prior to 2021, the RDMA market was primarily driven by HPC, with annual revenues between $400 million and $700 million. Driven by AI/ML demand, the market surpassed $6 billion in 2023 and is on track to exceed $22 billion by 2028.
The RDMA market can be categorized into two areas:
Technical Implementation: Currently, RDMA is largely deployed via InfiniBand, but RDMA over Ethernet is poised for significant growth as adoption broadens.
Hardware-Led Sales: The market centers on NICs and switches. In InfiniBand environments, these are typically purchased together; in Ethernet environments, they are often sourced by separate teams (server vs. network) on different cycles. This organizational separation persists even as AI/ML architectures become more integrated.
RDMA and RoCE have become indispensable technologies for AI/ML networking. Without them, the rapid scaling required to meet the demands of modern AI infrastructure would be impossible. As the server market continues its pivot from traditional computing to AI/ML, the application of RDMA and RoCE will only accelerate.
In the future, Ethernet and InfiniBand will likely coexist. Most customers will not choose one over the other exclusively but will instead deploy RDMA across both types of networks. RDMA’s flexibility ensures its role in hybrid environments, allowing organizations to focus on optimizing workloads—from foundational training to inference—without being hindered by underlying networking limitations.
Traditional TCP/IP networking relies heavily on the CPU to process data packets and move them between memory buffers, which creates high latency and "CPU overhead". In AI training, where GPUs must constantly exchange massive datasets, this bottleneck causes "GPU idling," where expensive processors wait for data to arrive. RDMA bypasses the CPU and operating system, allowing servers to read/write directly to each other's memory. This minimizes latency and optimizes Job Completion Time (JCT), ensuring that computational resources are fully utilized.
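A minimal sketch of this mechanism, using the Linux libibverbs API: it posts a one-sided RDMA WRITE that places data directly into a peer's registered memory, with no receive operation or CPU involvement on the remote side. It assumes the caller has already created and connected a reliable (RC) queue pair, registered the local buffer, and exchanged the peer's buffer address and rkey out of band; all names are illustrative.

```c
/* Sketch of a one-sided RDMA WRITE with libibverbs. Assumes an already
 * connected RC queue pair `qp`, a registered local buffer `mr`, and the
 * peer's address/rkey learned out of band. Names are illustrative. */
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

int rdma_write_once(struct ibv_qp *qp, struct ibv_cq *cq,
                    struct ibv_mr *mr, uint32_t len,
                    uint64_t remote_addr, uint32_t remote_rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)mr->addr,   /* local source buffer */
        .length = len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr, *bad = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.opcode              = IBV_WR_RDMA_WRITE;  /* one-sided: peer posts no receive */
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.send_flags          = IBV_SEND_SIGNALED;  /* request a local completion */
    wr.wr.rdma.remote_addr = remote_addr;        /* peer's registered buffer */
    wr.wr.rdma.rkey        = remote_rkey;

    if (ibv_post_send(qp, &wr, &bad))
        return -1;

    /* Busy-poll the local completion queue; the remote CPU never runs. */
    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)
        ;
    return wc.status == IBV_WC_SUCCESS ? 0 : -1;
}
```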
While both InfiniBand and RoCEv2 facilitate RDMA, they differ in their underlying infrastructure:
InfiniBand: A purpose-built, lossless networking architecture designed specifically for RDMA from the ground up. It is often sold as a unified bundle of NICs and switches.
RoCEv2: An implementation that allows RDMA to run over standard Ethernet. It is more flexible and can leverage existing Ethernet switch ports—of which there are over 400 million globally. Historically, InfiniBand was the standard for HPC, but RoCEv2 is gaining ground as Ethernet congestion control improves.
Not every Ethernet NIC is RDMA-capable. While modern versions of RoCE can run on standard Ethernet, many legacy Ethernet Network Interface Cards (NICs) lack RDMA/RoCE functionality. As network speeds move toward 400Gbps and 800Gbps, RoCE support is becoming a standard feature in new NICs. Additionally, high-performance RoCE deployment requires advanced Ethernet switches with sophisticated congestion control to prevent packet loss, which is critical for RDMA performance.
Although RDMA-capable NICs have a higher Average Selling Price (ASP) due to integrated offload engines and higher speeds, they significantly lower the total cost of ownership (TCO) of the entire AI cluster. By reducing JCT and preventing GPU idle time, RDMA allows organizations to complete model training faster and with fewer total servers. Given that the AI/ML server market is expected to reach $300 billion by 2028, the efficiency gains from RDMA are essential for justifying these massive capital expenditures.
AI/ML servers typically use a backend network specifically to interconnect compute nodes (GPUs and ASICs) within a cluster. This network is physically or logically independent of the primary data center network used for general traffic. Its sole purpose is to provide the high-bandwidth, low-latency pipes needed for RDMA data exchange between GPUs. This "split" architecture is why a single AI server often contains many more network ports than a traditional general-purpose server.
The market is shifting, but InfiniBand and Ethernet are expected to coexist for the foreseeable future. InfiniBand currently leads in RDMA deployments, but Ethernet's massive installed base and the integration of RoCE into new 400G/800G hardware are driving rapid growth. Most large-scale customers are moving toward hybrid environments, deploying RDMA across both network types to maintain flexibility and avoid vendor lock-in.