This power technology is about to explode
Processors and data center architectures are changing to meet the higher power demands of servers running AI and large language models (LLMs).
At one time, servers ran on a few hundred watts of power. But that has changed dramatically over the last few decades as the amount of data to be processed has grown and users have demanded faster processing. NVIDIA's Grace Blackwell chips consume 5 to 6 kilowatts, roughly 10 times the total power consumed by servers in the past.
Power is voltage times current. "If I need 5 kilowatts, I can do it at a standard 120 volts," said Steven Woo, fellow and distinguished inventor at Rambus. "But I need 40 amps, which is a lot of current."
This is similar to the kind of wire you buy at the hardware store. "High-current wires come in different diameters, and they're very thick," Woo says. "It used to be that a server might be 1 or 2 kilowatts, and at 120 volts you'd only need to supply about 10 amps of current. Now, with much higher power requirements, if I keep the voltage at 120 volts I have to supply four times that or more, but the wires can't handle that much current. They melt."
If you can't increase the current, the other option is to increase the voltage. "Current times voltage equals 5 kilowatts," Woo notes. "Today's servers are at 48 volts, whereas they used to be at 12 volts. Now that NVIDIA is talking about 48 volts, they've quadrupled the voltage, which allows them to quadruple the power while keeping the current the same."
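A quick back-of-the-envelope calculation (a Python sketch; the 5-kilowatt figure comes from Woo's example) shows why a higher distribution voltage is attractive: the current required for a fixed power budget falls in proportion to the voltage.

```python
# Current required to deliver a fixed 5 kW at different distribution
# voltages, per P = V * I (so I = P / V).

POWER_W = 5_000  # server power budget from Woo's example, in watts

for voltage_v in (12, 48, 120):
    current_a = POWER_W / voltage_v
    print(f"{voltage_v:>4} V -> {current_a:6.1f} A")

# Output:
#   12 V -> 416.7 A
#   48 V -> 104.2 A
#  120 V ->  41.7 A  (Woo's "40 amps, which is a lot of current")
```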
This change is reflected in power supplies, noted Rod Dudzinski, market development manager for Siemens EDA's embedded board systems division. "We are seeing customers looking for different ways to deliver the power needed to run rackmount systems as they build out large data centers. Some data center companies are borrowing ideas and concepts from high-performance power modules and related power electronics to achieve this goal, from efficient power conversion to thermal efficiency to lifetime reliability. With power consumption in traditional data centers expected to increase by 50 percent by 2025, board-level power conversion efficiency and power density are primary considerations for system architects, and should be applied to reduce losses in every power distribution network (PDN) on every PCB in the system."
Similar changes are reflected in EDA. Lee Vick, vice president of strategic marketing at Movellus, said there are parallels between what is happening in data center power and what happened in EDA. "In the chip design space, we used to have a situation where chips were built through an EDA tool flow, but those tools were a series of separate tools - layout tools, timing tools, routing tools. Eventually, we had to move to a world where we integrated those tools, integrated flows, and integrated data to meet the performance demands of the modern world. Now, even EDA companies don't stop at design, because you have to manage the lifecycle of the chip from design through test and manufacturing, all the way to the field, where they inspect the device and capture telemetry data to feed back into the design process and improve testing. It's a complete lifecycle. It's a fully integrated vertical process (even if it's horizontal in timeframe), which is critical."
A similar trend applies to data center power. "It used to be that when you designed a chip, you'd have a power budget," Vick says. "Or, if you were an engineer assigned to design a module, you'd have a power budget for that particular module, and you wouldn't dare exceed it. But inputs and outputs were all you needed to care about. Things are different now. In the data center, we're seeing power demands extend well beyond subsystems or chips to the motherboard, rack, and data center level."
The ripple effect is important here, not just the need to minimize power consumption. "Everybody has to minimize power consumption," he said. "There are constraints, there are requirements, and there are changes, and you have to be able to react to them. The other key thing is that we've gone well beyond assumptions, beyond 'this is the future' hyperbole. At the recent DAC, we had a panel on managing kilowatt power budgets, with industry experts from IC design, EDA, IP, and system design - all people and organizations playing a role. This is not a problem that can be solved by IP providers, chip designers, or EDA companies alone; it takes everyone working together. Similarly, in data centers, the power distribution and heat dissipation improvements we must make only increase energy consumption at the macro level. And the sheer scale of modern data centers, with the large number of chips and compute components inside them, only exacerbates the situation."
According to Ashutosh Srivastava, principal application engineer at Ansys, this goes both ways. Chip design can drive a surge in power consumption, as the latest AI chips, including GPUs, consume more energy performing larger, faster computations. In some cases, each server consumes more than 2 kilowatts of power. "At the same time, chip architects are looking to design a chip that optimizes power consumption without compromising performance, because these chips cost more to run - not just the cost of power, but also the cooling infrastructure."
In addition, upstream power distribution in data centers is changing to accommodate greater power demands, including raising the distributed bus voltage in the rack from the old 12 V to 48 V. "By increasing the voltage by a factor of four, the current can be reduced by a factor of four, and the conduction losses by a factor of 16," Srivastava says. "Each converter in the rack has also been redesigned to improve efficiency. The power losses associated with supplying power directly to the chip can be optimized with high-efficiency converters. Stacking the power supply directly on top of the chip, for example, helps reduce this loss."
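The 16x figure follows from conduction loss scaling with the square of the current (I²R). A minimal sketch, assuming an arbitrary fixed path resistance, reproduces Srivastava's arithmetic:

```python
# Conduction loss in a fixed-resistance distribution path is I^2 * R.
# Quadrupling the bus voltage (12 V -> 48 V) at constant power cuts
# current by 4x and conduction loss by 16x.

POWER_W = 2_000             # per-server load (illustrative)
PATH_RESISTANCE_OHM = 0.01  # assumed cable/busbar resistance

for bus_voltage_v in (12, 48):
    current_a = POWER_W / bus_voltage_v
    loss_w = current_a**2 * PATH_RESISTANCE_OHM
    print(f"{bus_voltage_v:>2} V bus: {current_a:6.1f} A, {loss_w:6.1f} W conduction loss")

# 12 V bus:  166.7 A,  277.8 W conduction loss
# 48 V bus:   41.7 A,   17.4 W conduction loss  (a 16x reduction)
```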
Addressing the “last mile” of power supply
A number of power supply vendors have introduced technologies, including 48 V distribution and vertical power delivery, to reduce losses and improve transient response.
Vicor, for example, has introduced a Factorized Power Architecture (FPA) that replaces traditional multi-phase regulators to improve density and power system efficiency. The FPA breaks power conversion down into separate regulator and converter functions, which can be individually optimized to maximize performance. The voltage regulator module can be deployed anywhere on the motherboard, while the critical current output module, the current multiplier, can be optimized for density, efficiency, and low noise, and can be deployed very close to the processor. The current multiplier not only delivers high currents in excess of 1,000 A, but also provides a sharp 50x drop in PDN resistance. Vicor offers both lateral and vertical power delivery options, depending on the processor current.
Lateral Power Delivery (LPD): High-current delivery is achieved through modular current multiplier (MCM) modules placed on the motherboard or on the processor substrate adjacent to the processor. Locating the MCM on the substrate not only minimizes PDN losses, but also reduces the number of processor substrate BGA pins required for power. LPD is designed to support the power requirements and unique packaging of OCP Accelerator Module (OAM) cards and custom AI accelerator cards.
Vertical Power Delivery (VPD): For very high processor currents, VPD deploys the current multiplier module directly below the processor, which reduces PDN resistance by up to a factor of 10 compared to LPD. Another advantage of vertical power is that it frees up board area on the upper PCB for high-speed I/O and memory. VPD utilizes a current multiplier similar to Vicor's LPD solution, but integrates the high-frequency bypass capacitors, typically deployed below the processor, into a gearbox package that interfaces with the MCM. The gearbox also allows the pitch of the MCM's output pins to be adapted to the processor's power pins, and its output power pins are matched to the processor's or ASIC's power map to maximize performance.
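To see why these resistance reductions matter, consider the dissipation at kiloamp currents. The sketch below applies the ratios above (a 50x PDN resistance drop for the current multiplier, and up to another 10x for VPD over LPD) to an assumed, purely illustrative baseline PDN resistance:

```python
# PDN dissipation at 1,000 A for an assumed baseline resistance,
# scaled by the reduction factors cited above.
# The 200 uOhm baseline is illustrative, not a Vicor figure.

CURRENT_A = 1_000
BASELINE_OHM = 200e-6  # assumed baseline PDN resistance

scenarios = {
    "baseline PDN":                    BASELINE_OHM,
    "current multiplier (~50x lower)": BASELINE_OHM / 50,
    "VPD (up to 10x below LPD)":       BASELINE_OHM / 50 / 10,
}

for name, resistance_ohm in scenarios.items():
    loss_w = CURRENT_A**2 * resistance_ohm
    print(f"{name:<33} {resistance_ohm * 1e6:7.1f} uOhm -> {loss_w:6.1f} W")
```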
MPS also offers both lateral and vertical power delivery. The first stage of its lateral solution uses the 800 W MPC12109, which employs a high-performance LLC topology to fully realize soft switching and reaches 98% peak efficiency in an extremely small footprint. The second stage uses multiple MPC22167 modules in parallel for strong output capability; each module integrates two sets of DrMOS and inductors with top-side heat dissipation. With industry-leading low-voltage, high-current processes and high-performance digital constant-on-time (COT) controllers, MPS's overall solution not only delivers high power quality, but is also simple and flexible to design with.
The MPS vertical power delivery solution also uses a two-stage architecture, with a 10:1 LLC module converting the 48 V input down to 4.8 V, further leveraging MPS's low-voltage, high-current process. Meanwhile, advanced inductor technology compresses the overall height of the second-stage multiphase power module to under 5 mm, overcoming the difficulty of laying out a multiphase supply on the back of the main chip. Because the power transmission path of a vertical supply is only as long as the PCB is thick, the impact of path parasitics on power quality is greatly reduced, as are path power losses. This positions vertical power delivery to play a major role as AI develops.
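As a rough illustration of such a two-stage 48 V architecture, the sketch below chains the cited 98% peak efficiency of the fixed-ratio first stage with an assumed second-stage efficiency (the text does not give one; 90% is a placeholder):

```python
# Two-stage 48 V rail: a fixed-ratio 10:1 LLC first stage feeding a
# multiphase second stage that regulates down to the chip voltage.

INPUT_V = 48.0
LLC_RATIO = 10        # fixed 10:1 conversion -> 4.8 V intermediate bus
LLC_EFF = 0.98        # first-stage peak efficiency cited for the MPC12109
BUCK_EFF = 0.90       # second-stage efficiency (assumed for illustration)
LOAD_W = 1_000        # power delivered to the chip (illustrative)

intermediate_v = INPUT_V / LLC_RATIO
end_to_end_eff = LLC_EFF * BUCK_EFF
input_w = LOAD_W / end_to_end_eff

print(f"Intermediate bus: {intermediate_v:.1f} V")
print(f"End-to-end efficiency: {end_to_end_eff:.1%}")
print(f"Input power for a {LOAD_W} W load: {input_w:.0f} W")
```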
Infineon's TDM2254xD dual-phase power module also supports vertical power delivery, reducing PDN losses and increasing power density. Its package measures 10 x 9 x 8 mm plus 10 x 9 x 5 mm, with a peak current of 160 A and full-load efficiency 2% higher than comparable products.
New Data Center Considerations
Another important consideration in data center design is location. "Often, these data centers are located in urban areas, where competing with the power needs of the population can limit their capacity," Srivastava says. "As a result, some areas prohibit the construction of new data centers, and in an emergency a data center may need to shed electrical load so that power can go to other important parts of the community. This means either building energy-efficient computing hardware or finding alternative power sources. That has led to another trend, where large data centers are now considering building their own power plants to provide the power they need, especially from sustainable and reliable sources. This could take the form of traditional solar or wind power combined with energy storage, or even small modular nuclear reactors (SMRs), which are under development."
Power management in the data center is an evolving challenge, said Mark Fenton, director of product engineering at Cadence: “IT loads can fluctuate greatly throughout the day, influenced by the demands of various applications. The power of a cabinet is a complex set of changing variables - its current power usage, budgeted capacity for future projects, and maximum design constraints. In turn, power allocation and capacity can be shared across multiple data centers.”
In a co-location environment, for example, users are constantly adjusting their demands on shared systems with little knowledge of what IT has installed or is about to install. "New GPU workloads exhibit different power behaviors, often resulting in large and almost instantaneous power spikes," said Fenton. "These fluctuations pose a significant risk of failure to the data center power infrastructure, which is a major concern. To optimize efficiency and maximize available power, it is beneficial to utilize three-phase power, but it is also critical that the phases be balanced to prevent inefficiencies."
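A minimal sketch of the phase-balancing concern Fenton raises: compute per-phase loading and an imbalance figure an operator would try to drive down (the per-phase loads are invented for illustration):

```python
# Three-phase balance check: per-phase load vs. the average, expressed
# as a worst-case imbalance percentage. Loads are illustrative.

phase_loads_kw = {"A": 98.0, "B": 120.0, "C": 82.0}

average_kw = sum(phase_loads_kw.values()) / len(phase_loads_kw)
worst_dev_kw = max(abs(load - average_kw) for load in phase_loads_kw.values())
imbalance_pct = 100 * worst_dev_kw / average_kw

print(f"Average phase load: {average_kw:.1f} kW")
print(f"Worst-case imbalance: {imbalance_pct:.1f}% (lower is better)")
```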
Power Losses in Voltage Conversion
Voltage conversion in data centers involves multiple conversion and regulation stages, which can lead to significant power loss. "If my server is now at 48 volts, the problem is that the chip itself still needs to run at 12 or 5 or even 1 volt," Rambus' Woo said. "That means the voltage has to be stepped down. But every time you lower the voltage, you lose some power, so efficiency starts to drop. It takes power to switch voltage levels, so that's a big problem - converting between voltages consumes a lot of power."
This means that the data center infrastructure must convert building utility power to single-phase or three-phase power at the rack level. “The voltage may drop from 13.8 kV (medium voltage) to 480 V or 208 V (low voltage) and then to 240 V or 120 V,” Fenton says. “Efficiency tends to be higher at partial loads, and since most power supplies utilize 2N redundant supplies, a large portion of the system operates at these partial load conditions.”
Steve Chwirka, senior application engineer at Ansys, noted that the losses begin with the large transformers that step the utility supply down to 480 V AC. "This new lower AC voltage is distributed through many types of cables and PDUs (power distribution units), which are essentially very large bus bars. All of this leads to conduction losses in the system. There are also several power conversion stages that carry their own losses. These include the uninterruptible power supply (UPS), which powers the rack under fault conditions just long enough for the backup generator to kick in. The main conversion occurs at the rack, where the AC voltage is converted to high-voltage DC, which is then converted to a lower DC voltage by a power supply unit (PSU). This DC voltage still has to go through several levels of conversion before it reaches the chip."
The power loss differs at each level. Chwirka offered some estimates of the losses from utility input to chip. "Power transformers are very efficient machines, with losses of only 1% to 2%. The efficiency of a UPS system will vary depending on its design and load conditions. Online UPS systems, which provide the highest level of protection, typically have efficiencies between 90% and 95%, so they lose 5% to 10% of the power. PDUs also have some inherent losses, which can add about another 1% to 2%. Modern PSUs typically have an efficiency of 80% to 95%, which means 5% to 20% of the power may be lost in the conversion from AC to DC. An additional converter, sometimes called an intermediate bus converter (IBC), converts the rack's 48 V DC to 8 to 12 V DC, with efficiencies up to about 98%. Due to size constraints, the chip's final conversion to the lower voltages required is slightly less efficient than the IBC."
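Chaining these per-stage estimates gives a feel for end-to-end delivery efficiency. The sketch below uses the midpoint of each range Chwirka cites, plus an assumed figure for the final on-chip regulator (he says only that it is slightly less efficient than the IBC):

```python
# Cumulative utility-to-chip efficiency from Chwirka's estimates.
# Midpoints of his quoted ranges; the final-stage value is assumed.

stages = {
    "utility transformer":  0.985,  # 1-2% loss
    "UPS (online)":         0.925,  # 90-95% efficient
    "PDU":                  0.985,  # 1-2% loss
    "PSU (AC -> DC)":       0.875,  # 80-95% efficient
    "IBC (48 V -> 8-12 V)": 0.980,  # up to ~98%
    "final VR at the chip": 0.900,  # assumed, "slightly lower than the IBC"
}

cumulative = 1.0
for name, efficiency in stages.items():
    cumulative *= efficiency
    print(f"after {name:<22} cumulative efficiency {cumulative:.1%}")

# Under these assumptions, roughly 69% of utility power reaches the chip.
```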
What you need to know about power delivery
There are many factors to consider when designing a data center environment, and one of the most important is the infrastructure around high voltage. "If high voltages are coming into the system, you need to know how to reduce the voltage to the level you need," Woo notes. "There may be some external circuitry doing the voltage reduction, and there are on-chip methods for managing voltage over a small range. The most important thing is to really understand how much power your chip is going to consume, and where that power is coming from. This is usually a system-level issue. There are also questions about aging, because chips expand as they warm up. The different materials used to make a chip all expand at different rates, and thermal cycling (i.e., switching frequently between high and low temperatures) can lead to cracking and other reliability issues."
Architecture also has an impact. Ansys researcher Norman Chang explains that as 3D-ICs get larger, chip architects need to consider power systems that distribute power vertically to the chiplets, as in the Tesla D1 Dojo chip. "Architects also need to consider thermal distribution, because dozens of chiplets are placed in the 3D-IC through system technology co-optimization," he said. "The analog/mixed-signal designs in the 3D-IC need to be placed in locations that are less sensitive to the thermal/stress variations generated by peak computing workloads."
Eventually, the challenges of delivering power to data centers will fall to chip and system architects, said Movellus' Vick. "As a computer architect, I was very digitally and processor oriented. Then I started working for hard IP companies, and they would ask, 'How much ripple do you have on your power supply?' I'd say, 'I don't know. The power supply is right there. It's always clean, and you don't have to worry about it.' But factors like implementation and integration matter - how clean your power supply is and how you route it. One of the things we see at the architectural level is that when you have an analog portion of your integrated circuit, whether it's power conditioning, sensors, or clocks, you have to run analog voltages in a traditionally digital area, and that simple fact can seriously undermine your design. Suppose I have a large block of digital logic that consumes a lot of energy. I want to see what's happening on the power grid side, and whether there's a voltage droop. But that requires cramming an analog sensor into that digital logic, and that's hard to do."
Migrating from analog to digital design gives you more freedom to add instrumentation and understand what's going on. "This is an example of going beyond the functional scope of the module," Vick says. "Sure, it has a lot to do with implementation, so we're moving away from the esoteric to the real world, and real-world implementation is important. It's not just about whether I can design this thing, or get the best TOPS/W. Can I actually implement it in a real design? Can I handle noisy power supplies? Can I handle an unstable grid? The amount of margin and over-design required means I can no longer afford it, and today the power grid itself is subject to the same design constraints that logic encounters. It's traveling on that rough edge, and sometimes it will drift and struggle, and I have to think about that from a hardware and software perspective, rather than assuming an infinite amount of clean power."