While headlines are all about the rise of AI, the reality on the ground involves plenty of headaches.
The main issue is liquid cooling.
GPU systems run hot, with racks consuming tens of thousands of watts of power.
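As a rough back-of-envelope check (assuming a rack of four 8-GPU servers with roughly 700 W accelerators and a few kilowatts of per-server overhead, illustrative figures rather than any vendor's spec):

```python
# Back-of-envelope rack power estimate (illustrative assumptions, not vendor specs)
GPU_TDP_W = 700          # roughly an H100-class accelerator
GPUS_PER_SERVER = 8
SERVERS_PER_RACK = 4     # assumed density
OVERHEAD_W = 3_000       # assumed per-server CPUs, memory, NICs, fans

rack_power_w = SERVERS_PER_RACK * (GPUS_PER_SERVER * GPU_TDP_W + OVERHEAD_W)
print(f"Estimated rack power: {rack_power_w / 1000:.0f} kW")  # ~34 kW
```

Even under these conservative assumptions, a single rack lands in the tens of kilowatts, well beyond what traditional air cooling comfortably handles.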

Traditional air cooling is insufficient, which has led to widespread adoption of liquid cooling systems.
This shift has driven up the stock prices of companies like Vertiv, which deploy these systems.
Unfortunately, liquid cooling has also become the leading cause of failures in data centers.
There are also many challenges in configuring GPUs.
There’s simply a lot of new technology for professionals to master in a very short timeframe.
In the grand scheme of things, these are just speed bumps.
We expect hyperscalers to delay or slow their GPU rollouts while they work through these challenges.
More precisely, we expect to hear more about such delays, because they have already begun.
Consider how servers have traditionally been built. Let’s say Acme Semiconductor, a hypothetical chipmaker, wants to enter the data center market.
They spend a few hundred million dollars designing a processor. To prove the chip out, Acme also designs a server around it, builds a few dozen of these servers, and hands them out to its top sales prospects.
A hyperscaler then spends a few months testing the system.
That hyperscaler also wants Acme to build the test systems with its preferred ODM.
While the first hyperscaler is running their multi-month evaluation, a second customer expresses interest.
Of course, they want their own server configuration with their own preferred ODM.
Acme, needing the business, covers the cost of this design as well.
Acme approaches all the OEMs to see if any will design a catalog system to streamline the process.
The OEMs are all very friendly and interested in what Acme is doing.
"Great job, guys," they say, but they will only commit to a design once Acme secures more business.
Finally, a customer wants to buy in volume, a big win for Acme.
This time, because there’s real volume involved, the ODM agrees to do the design.
Of course, the new system has bugs. This is expected; bugs are everywhere.
Acme’s chip is the component least familiar to the ODM and the customer, so suspicion falls on it first.
Acme worked with the customer to iron out bugs during the evaluation cycle, but debugging a live deployment is a different matter.
Acme sends its field engineers to the super-remote data center to get hands-on with the system.
The three teams work through the bugs, finding more along the way.
The deployment drags on as the teams work through the issues.
At some point, something significant needs to be entirely replaced, adding more delays and costs.
But after months of work, the system finally enters production.
And if that doesn’t sound painful enough, we haven’t even mentioned the lawyers.
This is how servers have been built for years.
Then Nvidia entered the market, bringing their own server designs.
Not only that, but they brought designs for entire racks.
Nvidia has been designing systems for 25 years, dating back to their work on graphics cards.
To compete with Nvidia, AMD can either spend five years replicating Nvidia’s team or buy ZT.
In theory, ZT can help AMD eliminate almost all of the friction outlined above.