Understanding AI Passive Components and Interconnects

This article, based on an Avnet blog, examines how Artificial Intelligence (AI) applications are shaping the requirements and development of passive components and interconnects.

Artificial Intelligence (AI) applications often involve large datasets. Workloads can require multiple distributed CPUs and GPUs communicating with each other in real time. This is the essence of high-performance computing (HPC) architectures.

Routing high-speed digital signals between processing elements introduces chip-to-board and board-to-board connectivity challenges. Communication protocols and physical standards have been developed to meet these high-speed requirements, typically with defined signal integrity criteria. Those standards also facilitate interoperability between suppliers.

Non-standard connectors are occasionally used, sometimes because of a particular form-factor requirement or other mechanical constraint. In these instances, their suitability for the application can be determined by comparing their specifications with those of industry standard parts.

AI data bandwidth

Bandwidth and impedance are key electrical characteristics when considering signal integrity. Pin count, the materials used, and mounting methods are important mechanical considerations that impact performance and reliability. As the power consumed in HPC systems increases, contact resistance becomes ever more important in the drive to improve data center power efficiency.

At the processor interface, solderless connectivity to a CPU takes the form of a land grid array (LGA) or pin grid array (PGA) package. Intel invented the LGA and uses it for almost all its CPUs. If the processor isn’t designed to be user replaceable it may use a ball grid array (BGA). A BGA uses solder balls to connect the component to the printed circuit board. This is the most common approach for GPUs, but it is also used for some CPUs.

The rate at which data can be transferred between memory and a processor remains a key factor in overall system performance. For systems using random access memory (RAM), the most recent development is the evolution from DDR4 to DDR5. The DDR4 standard supports bandwidths of up to 25.6 GB/s per channel. DDR5 takes this up to 38.4 GB/s.
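
As a rough check of those figures, peak channel bandwidth is simply the transfer rate multiplied by the 64-bit data bus width. The sketch below assumes the common top speed grades (DDR4-3200 and DDR5-4800) and is illustrative only.

```
def ddr_peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth in GB/s for one memory channel: MT/s x bytes per transfer."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) / 1e9

print(f"DDR4-3200: {ddr_peak_bandwidth_gbs(3200):.1f} GB/s")  # 25.6 GB/s
print(f"DDR5-4800: {ddr_peak_bandwidth_gbs(4800):.1f} GB/s")  # 38.4 GB/s
```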

The standard influences the design of chip interfaces. The latest LGA 4677 IC sockets provide a link bandwidth of up to 128 Gbps, typically with support for 8-channel DDR5 memory. The socket's 4,677 tightly spaced connection points can each carry up to 0.5 A – an indication of the power demands of today’s high-performance processors.

DDR5 dual inline memory module (DIMM) sockets now support data rates of up to 6.4 Gbps per pin, and the best mechanical designs save space while improving airflow around components on the printed circuit board.

Connecting AI beyond the board

For communication between internal system components and to the external world, the following protocols and connector types are amongst the most popular.

PCI Express: Most processor boards have several PCI Express (PCIe) card slots. Slot types are x1, x4, x8, and x16; the largest is typically used for high-speed connectivity to GPUs. The PCIe protocol standard allows for up to 32 bidirectional, low-latency, serial communications “lanes”.

Each lane is a set of differential pairs, one for transmitting and one for receiving data. The latest iteration of the standard, released in January 2022, is PCIe 6.0. It doubles the bandwidth of its predecessor, to up to 256 GB/s bidirectionally for an x16 link, by using PAM4 signaling at the same 32 GBaud symbol rate as PCIe 5.0.
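
A back-of-the-envelope way to see where those headline numbers come from: raw link bandwidth is the per-lane transfer rate multiplied by the lane count, divided by eight bits per byte. The sketch below ignores encoding and FLIT overheads, so it gives upper bounds only.

```
def pcie_raw_link_gbs(transfer_rate_gts: float, lanes: int, bidirectional: bool = True) -> float:
    """Raw PCIe link bandwidth in GB/s, ignoring protocol overheads."""
    one_direction = transfer_rate_gts * lanes / 8
    return one_direction * (2 if bidirectional else 1)

print(f"PCIe 5.0 x16: {pcie_raw_link_gbs(32, 16):.0f} GB/s")  # ~128 GB/s bidirectional
print(f"PCIe 6.0 x16: {pcie_raw_link_gbs(64, 16):.0f} GB/s")  # ~256 GB/s bidirectional
```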

PCIe 6.0 is still in its early adoption phase and hardware availability is currently limited, although new compatible connectors have been announced by some leading suppliers represented by Avnet.

InfiniBand: InfiniBand is a high-speed, low-latency protocol commonly found in HPC clusters. Its maximum link performance is 400 Gbps, and it can support many thousands of nodes in a subnet. InfiniBand can use a board form-factor connection and supports passive and active copper cabling, active optical cabling, and optical transceivers. InfiniBand is complementary to the Fibre Channel and Ethernet protocols, but the InfiniBand Trade Association claims that it “offers higher performance and better I/O efficiency” than either of these. Common connector types for high-speed applications are QSFP+, zQSFP+, microQSFP, and CXP.

Ethernet: Traditionally associated with standard networking, high-speed Gigabit Ethernet has become more common in HPC. The main connector types for these applications are CFP, CFP2, CFP4, and CFP8, in addition to those listed for InfiniBand.

CFP stands for C form-factor pluggable. CFP2 and CFP4 versions offer bandwidths of up to 28 Gbps per lane and support CFP-compliant optical transceivers for 40 Gbps and 100 Gbps Ethernet. CFP8 connectors feature sixteen 25 Gbps lanes to support up to 400 Gbps connectivity.
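
Aggregate bandwidth for these pluggable interfaces is simply the number of electrical lanes multiplied by the per-lane rate. The CFP8 figures below come from the text; the InfiniBand NDR entry is added for comparison.

```
links = {
    "CFP8 (400G Ethernet)": (16, 25),   # 16 lanes x 25 Gbps
    "InfiniBand NDR x4":    (4, 100),   # 4 lanes x 100 Gbps
}

for name, (lanes, gbps_per_lane) in links.items():
    print(f"{name}: {lanes * gbps_per_lane} Gbps")   # both reach 400 Gbps
```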

Fibre Channel: This mature protocol is specific to storage area networks (SANs) and is widely deployed in HPC environments. It supports both fiber and copper media, offering low latency, high bandwidth, and high throughput.

It currently supports up to 128 Gbps connectivity, known as 128 Gigabit Fibre Channel (128GFC), and there is an industry association roadmap to take the protocol up to 1 Terabit Fibre Channel (1TFC). Connectors used for Fibre Channel range from traditional LC types to zQSFP+ for the highest-bandwidth connections.

Serial Advanced Technology Attachment (SATA) and Serial Attached SCSI (SAS): These are high-speed data transfer protocols for connecting hard disks and solid-state storage devices in HPC clusters.

Both protocols have dedicated connector formats with internal and external variants. SAS is generally the preferred protocol for HPC, but it’s more expensive than SATA, so both are still in widespread use. SAS is the higher-speed option, offering interface connectivity at up to 12 Gbps, but the operating speed of the storage device is often the limiting factor for data transfer rates.
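
To illustrate that last point, sustained throughput is bounded by the slower of the interface and the drive behind it. The device figures in this sketch are illustrative assumptions, not vendor specifications.

```
def effective_throughput_gbps(interface_gbps: float, device_gbps: float) -> float:
    """Sustained transfer rate is limited by the slower of link and device."""
    return min(interface_gbps, device_gbps)

# SAS 12 Gbps link with a drive sustaining ~4 Gbps: the drive is the bottleneck.
print(effective_throughput_gbps(12.0, 4.0))   # 4.0
# SATA 6 Gbps link with a fast SSD: here the interface becomes the limit.
print(effective_throughput_gbps(6.0, 7.0))    # 6.0
```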

Passive components and powering AI processors

As processing speed and data transfer rates rise, so do the demands on passive components.

Powering AI processors in data centers means the ferrite-core inductors used for EMI filtering in decentralized power architectures may need to carry tens of amps. Low DC resistance and low core losses are essential. Recent innovations include single-turn, flat-wire ferrite inductors. With versions from 47 nH to 230 nH, they’re designed for use in point-of-load power converters – those located close to processors to minimize board resistive losses. These inductors are rated at up to 53 A and feature maximum DC resistance ratings of just 0.32 mΩ, minimizing losses and heat dissipation.
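
Using the ratings quoted above, the conduction (I²R) loss in such an inductor at full current works out to under a watt, which is why sub-milliohm DC resistance matters next to a processor drawing tens of amps. A minimal sketch:

```
def inductor_conduction_loss_w(current_a: float, dcr_mohm: float) -> float:
    """DC copper loss (I^2 * R) in watts, with resistance given in milliohms."""
    return current_a ** 2 * dcr_mohm * 1e-3

print(f"{inductor_conduction_loss_w(53, 0.32):.2f} W")  # ~0.90 W at 53 A and 0.32 mOhm
```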

High-performance processing demands high currents and power rails with good voltage regulation and a fast response to transients. For example, it’s important to avoid voltage drops caused by large load-current variations. Several capacitor technologies are deployed to achieve these design goals.

Often, designers need to look beyond capacitance and voltage ratings to consider how frequency-dependent characteristics affect performance.
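
One way to see those frequency-dependent effects is to model a real capacitor as a series combination of capacitance, ESR, and ESL and compute its impedance across frequency. The ESR and ESL values below are illustrative assumptions for a generic 1 µF MLCC, not data for any specific part.

```
import math

def capacitor_impedance_ohm(f_hz: float, c_f: float, esr_ohm: float, esl_h: float) -> float:
    """Impedance magnitude of a series R-L-C model of a real capacitor."""
    reactance = 2 * math.pi * f_hz * esl_h - 1 / (2 * math.pi * f_hz * c_f)
    return math.hypot(esr_ohm, reactance)

# Assumed values: 1 uF capacitance, 5 mOhm ESR, 0.5 nH ESL.
for f in (1e4, 1e6, 7e6, 1e8):
    print(f"{f:.0e} Hz: {capacitor_impedance_ohm(f, 1e-6, 0.005, 0.5e-9):.4f} ohm")
```

The impedance falls with frequency while the capacitance dominates, reaches a minimum near self-resonance where only the ESR remains, and then rises again as the ESL takes over.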

Where high capacitance values are needed, aluminum electrolytic capacitors have been the traditional choice, thanks to their high CV values in small packages and their low cost. However, both polymer (solid conductive polymer electrolyte) and hybrid (solid polymer plus liquid electrolyte) aluminum capacitors have become more popular for their lower equivalent series resistance (ESR) and longer operating life.

The high power consumed by data centers has pushed the voltage used in rack architectures up from 12 V to 48 V to improve power efficiency. 48 V-rated aluminum polymer capacitors designed for high ripple current capabilities (up to 26 A) are now available in values up to 1,100 µF. One manufacturer offers these in a rectangular shape, making them suitable for stacking into modules to achieve higher capacitances with good volumetric efficiency.
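
When these parts are stacked into modules, capacitances and ripple-current ratings add while the effective ESR drops with the number of parts in parallel. The sketch below uses the 1,100 µF / 26 A figures from the text; the stack size and per-part ESR are assumptions for illustration.

```
n_parts = 4                 # assumed number of stacked capacitors
c_each_uf = 1100            # capacitance per part (from the text)
ripple_each_a = 26          # ripple-current rating per part (from the text)
esr_each_mohm = 10          # assumed per-part ESR

print(f"Total capacitance: {n_parts * c_each_uf} uF")             # 4400 uF
print(f"Ripple-current capability: {n_parts * ripple_each_a} A")  # 104 A
print(f"Effective ESR: {esr_each_mohm / n_parts:.1f} mOhm")       # 2.5 mOhm
```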

Multilayer ceramic capacitors (MLCCs) are widely used in power supply filtering and decoupling, not least because of their low ESR and low equivalent series inductance (ESL). They’re available in a vast range of values spanning nine orders of magnitude.

Manufacturers have continuously improved the volumetric efficiency of MLCCs through materials and manufacturing process developments. One company recently announced a 1608M-size (1.6 mm x 0.8 mm) MLCC with a 1 µF/100 V rating for use in 48 V power supply lines in servers and data centers. It’s believed to offer the largest capacitance for a 100 V-rated MLCC in this package size, saving 67% in volume and 49% in surface area compared with 2012M packages, the previous smallest package for this CV rating.

Other recent developments include innovative packaging technology for bonding MLCCs together without using metal frames. The technology keeps ESR, ESL, and thermal resistance low by using a highly conductive bonding material to produce a single surface-mountable component comprising the required number of MLCCs. Ceramic capacitors with dielectric materials that exhibit only a small capacitance shift with voltage and a predictable, linear capacitance change with changes in ambient temperature are preferred for filtering and decoupling applications.

Conclusion

In summary, the need for high processor performance in AI systems places specific demands on the selection of passive and electromechanical components. These components must be chosen with a focus on high-speed data transfer, efficient power delivery, thermal management, reliability, signal integrity, and size constraints, as well as the specific requirements of the AI application, ensuring that the electronic system can meet the demands of AI workloads effectively and reliably.
