COM-HPC carrier board designs: preparation is half the battle
COM-HPC is the upcoming standard for modular high-end edge servers. It provides significantly faster interfaces than COM Express, and almost twice as many of them. As a result, carrier board design requirements are rising steeply. How can developers prepare for the new challenges?
At the end of 2019, the PICMG COM-HPC technical subcommittee approved the pinout for the new high-performance computer-on-module specification. The standard will soon be officially ratified, and the first modules will become available. System developers are already specifying their first carrier board designs and getting ready to lay out the first PCBs in preparation for the launch of their own solutions, ideally in parallel with the next embedded server processors from Intel and AMD. But the great density of high-speed interfaces on the connectors poses unprecedented challenges for carrier board developers, especially regarding signal compliance.
The two 400-pin connectors provide interfaces with extremely high data rates of up to 25 Gbps as well as PCIe Gen 4 and 5, with each new PCIe generation doubling the transfer rate to increase performance. While PCIe Gen 3 offers 8 gigatransfers per second (8 GT/s), this doubles to 16 GT/s for PCIe Gen 4 and then again to 32 GT/s for PCIe Gen 5. Recently published preliminary details of PCIe Gen 6, however, do not indicate a change of clock frequency. Instead, 2 bits rather than 1 bit will be transmitted per clock by using four-level pulse amplitude modulation (PAM4). Presumably, COM-HPC will also be able to support this PCIe Gen 6 technology leap, since it uses an optimised version of a connector specified for 56 Gbps PAM4.
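As a rough illustration of these figures, the following Python sketch derives the raw per-lane rate of each PCIe generation from its symbol rate and the number of bits carried per symbol. The function and variable names are illustrative assumptions, the Gen 6 entry simply applies the "2 bits per clock" description to the Gen 5 symbol rate, and line-encoding overhead is deliberately ignored.

```python
# Raw per-lane transfer rate = symbol rate x bits per symbol.
# NRZ signalling carries 1 bit per symbol, PAM4 carries 2.
# Encoding overhead (e.g. 128b/130b) is ignored in this sketch.

def raw_rate_gt_s(symbol_rate_gbaud, bits_per_symbol):
    """Return the raw per-lane rate in GT/s."""
    return symbol_rate_gbaud * bits_per_symbol

# (symbol rate in Gbaud, bits per symbol) per generation.
GENERATIONS = {
    "Gen 3": (8, 1),   # NRZ
    "Gen 4": (16, 1),  # NRZ
    "Gen 5": (32, 1),  # NRZ
    "Gen 6": (32, 2),  # PAM4: same symbol rate as Gen 5, 2 bits per symbol
}

rates = {gen: raw_rate_gt_s(*params) for gen, params in GENERATIONS.items()}

for gen, rate in rates.items():
    print(f"PCIe {gen}: {rate} GT/s per lane")

# Share of the Gen 6 rate that a Gen 3 design already covers.
gen3_share = rates["Gen 3"] / rates["Gen 6"]
print(f"Gen 3 as a share of Gen 6: {gen3_share:.1%}")
```

The last line reproduces the 1/8 ratio between PCIe Gen 3 and PCIe Gen 6 per-lane rates.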
Put in numbers, developers are prepared for just 12.5% of what they will have to handle in the future: the 8 GT/s of PCIe Gen 3 is 1/8 of the 64 GT/s that PCIe Gen 6 would bring as the maximum possible bandwidth of COM-HPC. That’s a gigantic learning curve in regard to PCIe alone. Sure, PCIe Gen 6 is still a distant concept and it will be years before the first series products hit the market. But the current leap to the next generation is already challenging enough: changing from PCIe 3 to 4 yields +100%, changing from USB 3.2 Gen 2 (the former USB 3.1 Gen 2 or SuperSpeed+, 10 Gbps) to USB 4.0 (40 Gbps) +300%, and from 10 GbE to 25 GbE +150% more bandwidth. So it’s easy to see why developers must get prepared. An important element here is early compliance testing to ensure that the final solution functions flawlessly not only during tests but also in the field. Even if the design observes proven RF, dimensioning and layout rules for optimum signal quality, only comprehensive compliance tests can reveal critical anomalies in the application.
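The percentage gains for each interface transition can be checked with a few lines of Python. This is a minimal sketch: the helper name `percent_gain` is an illustrative assumption, and the gain is defined as the increase relative to the old rate, so a fourfold rate (10 Gbps to 40 Gbps) is a gain of 300%.

```python
def percent_gain(old_rate, new_rate):
    """Return the increase from old_rate to new_rate in percent of old_rate."""
    return (new_rate - old_rate) / old_rate * 100

# (old rate, new rate) per transition, using the nominal rates from the text.
TRANSITIONS = {
    "PCIe Gen 3 -> Gen 4 (GT/s)": (8, 16),
    "USB 3.2 Gen 2 -> USB 4.0 (Gbps)": (10, 40),
    "10 GbE -> 25 GbE (Gbps)": (10, 25),
}

gains = {name: percent_gain(old, new) for name, (old, new) in TRANSITIONS.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.0f}%")
```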
Statistically distributed and sporadic errors
Systems that operate outside compliance, however marginally, are prone to unexpected outages. While such systems work fine much of the time, they may in practice fail in conjunction with external components. Such sporadic errors are very difficult to analyse. At the same time, they are highly critical, and experience has shown their consequences to be costly. In addition to EMC compliance tests, which ensure that set radiation levels are not exceeded, the transmitters and receivers of the high-speed communication interfaces must also meet defined signal quality standards.
Proof of compliance
Let’s take PCIe interfaces as an example. Here, compliance with the PCIe specification guarantees successful communication between the motherboard and any peripheral — provided both sides comply with the specification. If the values are outside but close to the set limits, communication between board and devices can still work. However, in real-use cases transmission errors may occur. And since a detected communication error triggers retransmission of the data packet, the achieved data transfer rate will drop. That a communication has successfully passed laboratory testing is no proof of compliance. This requires detailed design characterisation based on precise measurements.
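To get a feel for how retransmissions eat into the achieved data rate, consider the following deliberately simplified Python model. It assumes independent bit errors, a fixed packet size, and that every corrupted packet is resent exactly as one packet; the real PCIe replay mechanism is more involved. All names and the example numbers are illustrative assumptions, not measured values.

```python
import math

def packet_error_probability(bit_error_rate, packet_bits):
    """Probability that at least one bit of a packet is corrupted,
    assuming independent bit errors."""
    # Equivalent to 1 - (1 - BER)**n, but numerically stable for tiny BER.
    return -math.expm1(packet_bits * math.log1p(-bit_error_rate))

def effective_rate(raw_rate, packet_error_prob):
    """Useful data rate when each corrupted packet has to be resent.
    On average 1 / (1 - p) transmissions are needed per packet,
    so the useful rate is raw_rate * (1 - p)."""
    return raw_rate * (1.0 - packet_error_prob)

# Hypothetical example: 16 Gbps raw lane rate, 2048-bit packets.
RAW_RATE_GBPS = 16.0
PACKET_BITS = 2048

for ber in (1e-12, 1e-6, 1e-4):
    p = packet_error_probability(ber, PACKET_BITS)
    rate = effective_rate(RAW_RATE_GBPS, p)
    print(f"BER {ber:.0e}: packet error probability {p:.3e}, "
          f"effective rate {rate:.3f} Gbps")
```

Under these assumptions the loss is negligible at a very low bit error rate, but it grows quickly once a marginal link pushes the error rate up by a few orders of magnitude.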
For this purpose, computer-on-module specialists such as congatec have built their own test labs, equipped with expensive high-precision test equipment. The advantage of an in-house lab is obvious: the deeper you work in the safe zone, the higher the reliability and functionality in the long term. Knowledge of the precise safety margins facilitates quality assurance as well as quality improvement. The first step is to optimise products with the aid of simulation. However, verifying the optimisations requires extensive receiver (RX) and transmitter (TX) compliance measurements. To be able to futureproof high-speed interfaces such as those offered by COM-HPC, congatec installed a new test station back in 2018 for the characterisation of the next generation of transmitters and receivers — including PCIe Gen 5.0, USB 4.0 and beyond. The company further has access to signal integrity experts, enabling it to offer comprehensive services.
More heat dissipation
Thermal stress is another growing concern for ensuring compliance of these embedded designs, as thermal and electrical properties interact. Physical causal chains, which were of little consequence for the low-power products that were typical in the classic embedded computing market up to now, are becoming significant since the connectors of COM-HPC Server and COM-HPC Client modules are specified for up to 300 W and 200 W respectively. This increases the complexity of designing compliant modules and application-specific carrier boards even further.
Besides the CPU, heat dissipation must also be optimised for all semiconductor and power elements. A good thermal connection is usually also a good electrical connection, which can couple interference onto signal lines. Two types of thermal simulation are recommended for optimisation. The first is the purely thermal approach, which focuses exclusively on dissipating component heat. The second is electrothermal simulation, which also takes the current in the circuit board into account, especially in the power supply. Here, vias dissipate the heat to the large copper surfaces of a GND plane. The effect and placement of these vias in the design must therefore also be considered. However, heat in the circuit board also affects the resistance of adjacent signal lines, and this impedance shift then influences the signal quality. This makes signal integrity an issue whose effects ripple right through to the optimisation of the cooling concept.
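The influence of temperature on trace resistance can be estimated with the usual linear approximation for copper. The sketch below is a first-order illustration only: it covers DC series resistance, uses a temperature coefficient of about 0.00393 per kelvin referenced to 20 °C, and the function name and example values are assumptions. It does not replace an electrothermal simulation.

```python
# Linear temperature dependence of copper resistance:
#   R(T) = R_ref * (1 + ALPHA_CU * (T - T_REF_C))
ALPHA_CU = 0.00393   # per kelvin, approximate value for copper at 20 degC
T_REF_C = 20.0       # reference temperature in degC

def resistance_at(r_ref_ohm, temp_c):
    """Return the DC resistance of a copper trace at temp_c,
    given its resistance r_ref_ohm at the reference temperature."""
    return r_ref_ohm * (1.0 + ALPHA_CU * (temp_c - T_REF_C))

# Hypothetical trace with 1.0 ohm at 20 degC.
for temp in (20.0, 55.0, 85.0):
    r = resistance_at(1.0, temp)
    print(f"{temp:5.1f} degC: {r:.4f} ohm ({(r - 1.0) * 100:+.1f}%)")
```

Under this approximation, a trace that heats from 20 °C to 85 °C gains roughly a quarter in DC resistance, which is why local hot spots near signal lines deserve attention.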
The ultimate goal is to provide evidence that customers can achieve the desired compliance and to prove this with a measurement routine. To provide proof of compliance, the complete unit, consisting of the computer-on-module and matching carrier board, should ideally be tested. To ensure customers later pass their own compliance tests, congatec exhausts all technical possibilities on its side. For this purpose, a functional unit consisting of a COM-HPC module and a congatec reference carrier must first pass a pre-compliance test that is carried out in the company’s own dedicated test lab. Making such pre-compliance tests for transmit and receive the norm for its products is proof of the high quality standards of the computer-on-module specialist. congatec also offers the same compliance tests for customers’ COM-HPC based carrier board/module combinations. This saves customers from having to invest in their own or external test labs. It also makes it easy to find an expert who can help solve problems in case of carrier board non-compliance.