Engineering is about tradeoffs—performance versus cost, proprietary versus open-source, efficiency versus flexibility. These dichotomies apply as much to data center computing technologies as to anything else, from the infrastructure to the interconnects that stitch the data center together.

Two competing philosophies persist: integration versus modularity. An integrated, often proprietary, approach offers distinct benefits and challenges compared with assembling a functioning system from modules supplied by different vendors.

Open-source collaboration, on the other hand, enables shared technological innovation at the risk of cutting into revenue from proprietary, licensable intellectual property. Technology companies have traditionally fallen on one side or the other of that philosophical divide.

CXL Bridges the Gap

Compute Express Link (CXL), a data protocol created by the CXL Consortium, attempts to bridge the gap between the integration approach and open-source collaboration. By providing an open-standard cache-coherent link between processors, memory buffers, and accelerators, CXL enables data center fabrics composed of interoperable elements from different vendors to freely share resources to tackle tough computational problems.

The protocol enables resource sharing for efficient processing of the petabytes of data generated by emerging, processing-intensive technologies. If CXL becomes the industry standard, it will change how data centers enable emerging technologies such as artificial intelligence, machine learning, edge computing, and many others.

What is CXL?

CXL is a relatively new interface technology that uses the physical (electrical) layer of Peripheral Component Interconnect Express (PCIe) but defines its own link and transaction layer protocols. CXL accomplishes this by taking advantage of Alternate Protocol Negotiation, a feature introduced in PCIe 5.0.

The new protocols provide high-bandwidth, low-latency links for computational accelerators and memory buffers. CXL is, in effect, a specialized offshoot of PCIe tailored to high-performance computing applications. CXL links facilitate resource pooling between a CPU and specialized endpoint hardware for process-specific workloads while using the same familiar PCIe form factors.

The CXL Consortium is an industry-backed open standards development group that consists of dozens of member companies from across the technology industry. Members include the biggest names in the semiconductor, memory, data center networking, and test and measurement industries. Like similar standards bodies such as the PCI-SIG and USB-IF, the member companies of the CXL Consortium are invested in defining an interface that enables hosts and endpoint devices to work together seamlessly.

Three Protocols and Device Types

CXL has three main protocols. CXL.io, required for all CXL devices, handles device discovery, configuration, and interrupts, much like the PCIe transaction layer.

CXL.cache enables a CXL accelerator to access host (CPU) memory while keeping its onboard cache coherent, a prerequisite for two devices to share computational resources (Figure 1). CXL.memory allows the host to use memory on expansion devices (buffers), increasing available capacity, including persistent memory that operates at near-DRAM speeds with NAND-like non-volatility (Figure 2).

CXL devices come in three flavors (summarized in the sketch following this list):

  • Type 1 devices are hardware accelerators that implement CXL.cache only (in addition to the mandatory CXL.io).
  • Type 2 devices are accelerators with onboard memory that implement both CXL.cache and CXL.memory.
  • Type 3 devices are memory expansion devices that implement CXL.memory only.
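
As a quick reference, the sketch below captures the protocol mix for each device type. It is a convenience summary of my own devising (the dictionary and function names are not taken from the CXL specification), with CXL.io included everywhere because it is mandatory for every device.

```python
# Quick-reference mapping of CXL device types to the protocols they implement.
# CXL.io is mandatory for every device type; this lookup is a convenience
# summary, not a structure defined by the CXL specification.

CXL_DEVICE_TYPES = {
    "Type 1": {"CXL.io", "CXL.cache"},                 # accelerators without device memory
    "Type 2": {"CXL.io", "CXL.cache", "CXL.memory"},   # accelerators with onboard memory
    "Type 3": {"CXL.io", "CXL.memory"},                # memory expansion devices (buffers)
}

def protocols_for(device_type: str) -> set[str]:
    """Return the set of CXL protocols a given device type implements."""
    return CXL_DEVICE_TYPES[device_type]

print(sorted(protocols_for("Type 2")))   # ['CXL.cache', 'CXL.io', 'CXL.memory']
```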

Figure 1. CXL.cache allows cache-sharing between a host and an acceleration device. Image used courtesy of Keysight

Figure 2. CXL.memory allows a host to access memory on an attached memory buffer device. Image used courtesy of Keysight

What Are the Benefits of CXL?

The main goal of CXL is to enable data center capacity expansion to handle increasing workload demands from emerging technologies. CXL’s innovations make disaggregating complex computing tasks more feasible and efficient by sharing memory and processing resources, all while keeping them coherent at low latency.

CXL benefits from existing physical layer infrastructure, building on decades of PCI-SIG® innovation and industry familiarity, but reduces latency by streamlining communication between devices. PCIe supports multiple use cases and variable payload lengths across its channels. 

Each PCIe transaction carries overhead between host and endpoint to convey the payload length and other transactional details. CXL eliminates this extraneous overhead by using a fixed 528-bit flow control unit (flit): four 16-byte slots plus two cyclic redundancy check (CRC) bytes.
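
As a quick sanity check on that arithmetic, the short sketch below (illustrative only; the constant names are descriptive placeholders, not identifiers from the CXL specification) adds up the flit’s slots and CRC bytes:

```python
# Back-of-the-envelope check of the CXL 1.1/2.0 flit size.
# Constant names are descriptive placeholders, not terms from the CXL spec.

SLOT_BYTES = 16      # each flit carries four 16-byte slots
NUM_SLOTS = 4
CRC_BYTES = 2        # two CRC bytes protect the flit

flit_bytes = NUM_SLOTS * SLOT_BYTES + CRC_BYTES   # 66 bytes
flit_bits = flit_bytes * 8                        # 528 bits

print(f"flit size: {flit_bytes} bytes = {flit_bits} bits")  # flit size: 66 bytes = 528 bits
```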

CXL runs over the Flex Bus, which is physically compatible with PCIe, meaning CXL devices fit into PCIe slots. If either the CPU or the endpoint device does not support CXL, the link simply defaults to PCIe operation.
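
That fallback can be pictured as a simple decision made at link bring-up. The sketch below is a deliberately simplified, conceptual illustration; the real negotiation happens in hardware during PCIe link training via alternate protocol negotiation, not in software like this.

```python
# Simplified illustration of the CXL/PCIe fallback decision at link bring-up.
# The actual negotiation occurs in hardware during PCIe link training
# (alternate protocol negotiation); this function is only a conceptual sketch.

def negotiate_link(host_supports_cxl: bool, device_supports_cxl: bool) -> str:
    """Return the protocol the link will operate in."""
    if host_supports_cxl and device_supports_cxl:
        return "CXL"    # both ends agree, so CXL protocols run over the PCIe PHY
    return "PCIe"       # either end lacks CXL support, so the link falls back to plain PCIe

print(negotiate_link(True, True))    # CXL
print(negotiate_link(True, False))   # PCIe
```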

What’s New in CXL 3.0?

Since its introduction in 2019, CXL has shown steady progress toward its goal of enabling full computational fabrics and disaggregated computing. CXL 1.1 supports only one device/host relationship at a time.

CXL 2.0 introduced switching and the ability for up to 16 hosts to simultaneously access different portions of a device’s memory, making resource pooling more flexible by allowing CPUs to access other devices as needed.
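
Conceptually, a pooled CXL 2.0 memory device presents slices of its capacity to different hosts. The toy model below is my own simplification of that bookkeeping (it is not a real fabric-manager or driver interface); it only captures the two constraints mentioned above: a 16-host limit and a finite pool of capacity.

```python
# Toy model of CXL 2.0-style memory pooling: one device's capacity is carved
# into regions assigned to different hosts. Conceptual simplification only,
# not a real fabric-manager or driver API.

MAX_HOSTS = 16  # CXL 2.0 pooling supports up to 16 hosts per pooled device

class PooledMemoryDevice:
    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.allocations: dict[str, int] = {}   # host id -> GB assigned

    def assign(self, host: str, size_gb: int) -> None:
        if host not in self.allocations and len(self.allocations) >= MAX_HOSTS:
            raise RuntimeError("pooled device already serves 16 hosts")
        if sum(self.allocations.values()) + size_gb > self.capacity_gb:
            raise RuntimeError("not enough free capacity in the pool")
        self.allocations[host] = self.allocations.get(host, 0) + size_gb

pool = PooledMemoryDevice(capacity_gb=512)
pool.assign("host-0", 128)
pool.assign("host-1", 64)
print(pool.allocations)   # {'host-0': 128, 'host-1': 64}
```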

CXL 3.0 adds peer-to-peer memory access and multi-tiered switching, broadening the scope of and support for disaggregated computing. CXL 3.0 also matches PCIe 6.0 speeds of 64 GT/s per lane over PCIe 6.0 hardware while remaining backward compatible with previous CXL protocols and PCIe hardware.
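
To put 64 GT/s in perspective, the back-of-the-envelope calculation below estimates the raw bandwidth of a x16 link at that rate. It ignores flit, CRC, and protocol overhead, so treat the result as an upper bound rather than an achievable throughput figure.

```python
# Rough upper-bound bandwidth estimate for a x16 link at PCIe 6.0 / CXL 3.0 speeds.
# Ignores flit, CRC, and protocol overhead, so real usable throughput is lower.

GT_PER_SEC_PER_LANE = 64     # 64 GT/s per lane
BITS_PER_TRANSFER = 1        # one bit delivered per transfer per lane
LANES = 16

raw_gbps = GT_PER_SEC_PER_LANE * BITS_PER_TRANSFER * LANES   # 1024 Gb/s
raw_gigabytes_per_s = raw_gbps / 8                            # 128 GB/s per direction

print(f"raw x16 bandwidth: {raw_gigabytes_per_s:.0f} GB/s per direction")
```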

Perhaps most importantly, CXL 3.0 introduces fabric capabilities, freeing the standard from the traditional tree topology. Now, up to 4,096 nodes can communicate with each other, creating a complex web of resource-sharing processors. A select list of CXL 3.0 features can be found in Figure 3.

Figure 3. CXL features across generations. Image used courtesy of the CXL Consortium

What Does CXL Mean for Data Centers?

CXL’s development from a speedy PCIe alternative into a multi-tiered, highly flexible link network has pushed the standard further toward enabling composable, scalable computational fabrics. A fabric is a system of interconnected nodes that can engage and interact with one another to complete a job more quickly and efficiently, rather than remaining constrained by traditional tree-based architectures.

Data centers have trended in a similar direction, moving from monolithic single-server systems to networks of switched links that allow resources to be pooled. Some industries—such as autonomous vehicles, which generate terabytes of data from their many sensors and must make safety-critical, near-instantaneous decisions—are embracing edge computing to bring processing closer to the source.

Now that Industry 4.0, AI, machine learning, and other new technologies are placing an unprecedented load on data centers, everyone from chip designers to systems integrators has had to rethink how data gets transmitted, communicated, and processed.

Resource Pooling With CXL

The most important element that CXL brings to the data center is resource pooling. Allowing CPUs to access other specialized resources to complete complex computations is key to an efficient, decentralized design philosophy.

CXL 3.0 includes new features such as multi-level switching, multi-headed and fabric-attached devices, enhanced fabric management, and composable disaggregated infrastructure (Figure 4). Together, these capabilities position the standard to become the link, or thread, that weaves the fabric of the data center together.

While CXL is unlikely to completely replace Ethernet electrical and optical cables and transceivers, it provides unique value for communication between elements of the data center.

Figure 4. Example of a CXL 3.0-enabled fabric data center architecture. Image used courtesy of the CXL Consortium

Another newcomer to the interconnect standards game, Universal Chiplet Interconnect Express (UCIe), pushes composability down to the integrated circuit level. Bucking the trend of locked-down, proprietary systems-on-chip, UCIe and CXL both embrace modularity and flexibility.

But as technologies become more discrete and modular, validating that components from different manufacturers perform flawlessly together may become more of a challenge.

Design and Validation Challenges for CXL Products

Modularity means compliance with defined interoperability requirements. Each module in such a system must work seamlessly with any other module in that system, regardless of who designed or manufactured it. Validation and compliance testing against the standard become essential to ensure that each vendor’s product plays nicely with every other device.

Compliance testing brings challenges. Although CXL builds upon PCIe interconnects and electrical building blocks, even seasoned PCIe developers need to take care when designing and validating their CXL devices. 

One potential issue to watch out for relates to how CXL reduces latency. CXL transmissions make assumptions that PCIe does not, resulting in shorter flits and less overhead during communication between host and device. Because of this, a flit may omit information that developers would expect to find in a PCIe transaction, information that is often essential for debugging. A CXL retry, for example, carries less diagnostic detail than its PCIe counterpart.

Maintaining coherency among disparate caches has overhead costs from snoop operations and data copying. The CXL specification recommends a bias-based coherency model to alleviate the need for excessive snoop operations. However, the system may mask improper behavior around biasing.

While memory accesses still succeed and coherency is maintained, there can be unnecessary overhead if the system does not follow the biasing rules properly. Analyzing and detecting improper behavior around biasing can yield important insights to improve system performance and reduce latency. Because of these and other potential issues with CXL devices, specialized test software can help developers debug and validate CXL device performance.
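
To make the bias idea concrete, the toy model below illustrates the host-bias/device-bias distinction described above. It is my own simplification, not actual driver or hardware code: it simply counts the avoidable round trips a Type 2 device pays when it accesses its own attached memory while that memory region is left in host bias.

```python
# Toy illustration of bias-based coherency for a Type 2 device's local memory.
# In "device bias" the accelerator can use its own memory without consulting
# the host; in "host bias" every device access must first go through the host
# to resolve coherency, adding latency. Simplified for illustration only.

from dataclasses import dataclass

@dataclass
class MemoryRegion:
    bias: str = "host"          # "host" or "device"
    host_round_trips: int = 0   # extra coherency traffic caused by host bias

    def device_access(self) -> None:
        if self.bias == "host":
            # Device must ask the host to resolve coherency before using its
            # own attached memory: unnecessary overhead if the device is the
            # main user of this region during this phase of the workload.
            self.host_round_trips += 1

region = MemoryRegion(bias="host")
for _ in range(1000):
    region.device_access()
print(region.host_round_trips)   # 1000 avoidable round trips while in host bias

region.bias = "device"           # flipping the bias before a device-heavy phase avoids them
```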

CXL: Symbolic of Multiple Technology Trends

The CXL interface is symbolic of multiple trends in today’s tech world. CXL is a step toward disaggregation and modular design, but it equally represents the importance of collaborating to tackle large tasks. CXL enables multiple devices to work together on complex computations, freely sharing resources to tackle petabytes of data generated by data-intensive industries.

As an open standard, CXL is the product of the brightest minds in the communications, signal processing, and test and measurement industries working together to tackle society’s ever-growing data demand. Even though they may hail from competing companies, these engineers developed a standard that ensures their products will interoperate to boost data center capacity.

It may take a few more years and more generations of the CXL standard to see the full effect it will have on the data center industry, but it is safe to say CXL will play a significant role in the coming data revolution.
