A classic way for engineers to solve a particularly vexing technical problem is to move things in a completely different direction, typically by “thinking outside the box.” Such is the case with the challenges facing the semiconductor industry. With the laws of physics closing in, the traditional brute-force means of maintaining Moore’s Law, shrinking the size of transistors, is quickly coming to an end. Whether things stall at the current 7nm (nanometer) size, drop down to 5nm, or at best, reach 4nm, today’s leading vendors are fast approaching a nearly insurmountable wall.
As a result, semiconductor companies are having to develop different ways to keep performance moving forward. One of the most compelling ideas, chiplets, isn’t a terribly new one, but it’s being deployed in interesting new ways. Chiplets are key IP blocks taken from a more complete chip design that are broken out on their own and then connected together with clever new packaging and interconnect technologies. Basically, it’s a new take on the multi-chip module (MCM), which combines various pieces of independent silicon in a single package to provide the kind of complete solution usually associated with an SoC (system on chip).
So, for example, a modern CPU typically includes the main compute engine, a memory controller for connecting to main system memory, an I/O hub for talking to other peripherals, and several other different elements. In the world of chiplets, some of these elements can be broken back out into separate parts (essentially reversing the integration trend that has fueled semiconductor advances for such a long time), optimized for their own best performance (and for their own best manufacturing node size), and then connected back together in Lego block-type fashion.
While that may seem a bit counter-intuitive compared to typical semiconductor industry trends, chiplet designs help address several issues that have arisen as a result of traditional advances. First, while integration of multiple components into a single chip arguably makes things simpler, the truth is that today’s chips have become both enormously complex and quite large as a result. Ensuring high-quality, defect-free manufacturing of these large, complex chips, especially while you’re trying to reduce transistor size at the same time, has proven to be an overwhelming challenge. That’s one of the key reasons why we’ve seen many major chip foundries delay or even cancel their moves to 10nm and 7nm production.
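The manufacturing argument can be made concrete with a rough sketch. The snippet below uses the simple Poisson yield model, in which the fraction of defect-free dies falls off exponentially with die area. The defect density, die area, and chiplet count are hypothetical values chosen only for illustration; they do not describe any real process or product.

```python
import math


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Expected fraction of defect-free dies under the Poisson yield model."""
    area_cm2 = area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    return math.exp(-area_cm2 * defects_per_cm2)


# Hypothetical numbers, for illustration only.
DEFECT_DENSITY = 0.2      # random defects per cm^2
MONOLITHIC_AREA = 600.0   # mm^2, one large monolithic die
CHIPLET_COUNT = 8         # same silicon area split into equal chiplets
CHIPLET_AREA = MONOLITHIC_AREA / CHIPLET_COUNT

monolithic_yield = poisson_yield(MONOLITHIC_AREA, DEFECT_DENSITY)
chiplet_yield = poisson_yield(CHIPLET_AREA, DEFECT_DENSITY)

# A defect anywhere on the monolithic die scraps the entire die, while a
# defect on one chiplet scraps only that chiplet's share of the silicon.
print(f"monolithic die yield: {monolithic_yield:.1%}")
print(f"per-chiplet yield:    {chiplet_yield:.1%}")
```

Under these assumed numbers, roughly 30% of the large dies come out defect-free, versus roughly 86% of the small ones. The sketch ignores packaging yield, test costs, and the extra die area that chip-to-chip interfaces consume, so it illustrates the direction of the effect rather than a real cost comparison.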
Second, it turns out not every type of chip element actually benefits from smaller sizes. The basic argument for shrinking transistors is to reduce costs, reduce power consumption, and improve performance. With elements like the analog circuitry in I/O components, however, there’s a point of diminishing returns where smaller transistors are actually more expensive and don’t deliver the performance benefits you might expect from smaller production geometries. As a result, it just doesn’t make sense to try to move current monolithic chip designs wholesale to these smaller sizes.
Finally, some of the more interesting advancements in the semiconductor world are now occurring in interconnect and packaging technologies. From the 3D stacking of components being used to increase the capacity of flash memory chips, to the high-speed interfaces being developed to enable faster on-chip and chip-to-chip communications, the need to keep all the critical components of a chip design at the same process level is simply going away. Instead, companies are focusing on creating clever new ways to interconnect IP blocks/components in order to achieve the performance enhancements they used to only be able to get through traditional Moore’s Law transistor shrinks.
AMD, for example, has made its Infinity Fabric interconnect technology a critical part of its Zen CPU designs, and at last week’s 7nm event, the company highlighted how it has extended the technology to its new data center-focused CPUs and GPUs as well. The next-generation Epyc server CPU, codenamed “Rome” and scheduled for release in 2019, leverages up to 8 separate Zen2-based CPU chiplets interconnected over the latest generation of Infinity Fabric to provide 64 cores in a single SoC. The result, AMD claims, is performance in a single-socket server that can beat Intel’s current best two-socket server CPU configuration.
In addition, AMD highlighted how its new 7nm data center-focused Radeon Instinct GPU designs can now also be connected over Infinity Fabric both for GPU-to-GPU connections as well as for faster CPU-to-GPU connections (similar to Nvidia’s existing NVLink protocol), which could prove to be very important for advanced workloads like AI training, supercomputing, and more.
Interestingly, AMD and Intel worked together earlier this year on a combined CPU/GPU part for high-powered PCs like the Dell XPS 15 and HP 15” Spectre X360. It leveraged a slightly different interconnect technology but allowed them to put an Intel CPU together with a discrete AMD Radeon GPU in a single package.
Semiconductor IP creator Arm has been enabling an architecture for chiplet-like mobile SoC designs with its CCI (Cache Coherent Interconnect) technology for several years now. In fact, companies like Apple and Qualcomm use that type of technology for their A-Series and Snapdragon series chips, respectively.
Intel, for its part, is also planning to leverage chiplet technology for future designs. Though specific details are still to come, the company has discussed not just CPU-to-CPU connections, but also being able to integrate high-speed links with other chip IP blocks, such as Nervana AI accelerators, FPGAs and more.
In fact, the whole future of semiconductor design could be revolutionized by standardized, high-speed interconnections among various different chip components (each of which may be produced with different transistor sizes). Imagine, for example, the possibility of more specialized accelerators being developed by small innovative semiconductor companies for a variety of different applications and then integrated into final system designs that incorporate the main CPUs or GPUs from larger players, like Intel, AMD, or Nvidia.
Unfortunately, right now, a single industry standard for chiplet interconnect doesn’t exist—in the near term we may see individual companies choose to license their specific implementations to specific partners—but there’s likely to be pressure to create that standard in the future. There are several tech standards for chip-related interconnect, including CCIX (Cache Coherent Interconnect for Accelerators), which builds on the PCIe 4.0 standard, and the system-level Gen-Z standard, but nothing that all the critical players in the semiconductor ecosystem have completely embraced. In addition, standards need to be developed as to how different chiplets can be pieced together and manufactured in a consistent way.
Exactly how the advancements in chiplets and associated technologies relate to the ability to maintain traditional Moore’s Law metrics isn’t entirely clear right now, but what is clear is that the semiconductor industry isn’t letting potential roadblocks stop it from making important new advances that will keep the tech industry evolving for some time to come.