Since the disastrous launch of the RTX 50 series, NVIDIA has been unable to escape negative headlines: scalper bots snatch GPUs away from consumers before official sales even begin; power connectors continue to melt, with no fix in sight; marketing is becoming increasingly deceptive; GPUs leave the factory missing processing units; and the drivers, for which NVIDIA has always been praised, are currently falling apart. To top it all off, NVIDIA is becoming increasingly insistent that the media push a certain narrative when reporting on its hardware.
This OpenAI partnership really stands out, because the server world is dominated by Nvidia even more thoroughly than the consumer card market.
Yup. You want a server? Dell just plain doesn’t offer anything but Nvidia cards. You want to build your own? GPGPU compatibility layers like ZLUDA are brand new and not really supported by anyone. If you want to participate in the development community, you buy Nvidia and use CUDA.
Fortunately, even that tide is shifting.
I’ve been talking to Dell about it recently: they’ve just announced new servers (releasing later this year) which can have either Nvidia’s B300 or AMD’s MI355X GPUs. Available in a hilarious 19" 10RU air-cooled form factor (XE9685), or ORv3 3OU water-cooled (XE9685L).
It’s the first time they’ve offered a system using both CPU and GPU from AMD - previously they had some Intel CPU / AMD GPU options, and AMD CPU / Nvidia GPU, but never before AMD / AMD.
With AMD promising release-day support for PyTorch and other popular libraries, we’re also part-way there on software. I’m not going to pretend that needing CUDA isn’t still a massive hump in the road, but “everyone uses CUDA” <-> “everyone needs CUDA” is one hell of a chicken-and-egg problem, and it isn’t getting solved overnight.
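For what it’s worth, the PyTorch point is fairly concrete: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API that CUDA builds use, so device-agnostic code runs unchanged on either vendor. A minimal sketch, assuming a CUDA or ROCm build of PyTorch is installed (falls back to CPU otherwise):

```python
import torch

# On both an NVIDIA (CUDA build) and an AMD (ROCm build) machine, PyTorch
# reports the GPU through torch.cuda; fall back to CPU if neither is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(512, 512, device=device)
y = x @ x.T  # dispatches to cuBLAS on NVIDIA, hipBLAS/rocBLAS on AMD
print(y.shape)  # torch.Size([512, 512])
```

That’s why in-house code is the easier target for AMD than 3rd-party software: if you already write against the framework rather than raw CUDA kernels, the vendor swap is mostly a driver/install question.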
Realistically, facing that kind of uphill battle, AMD is just going to have to compete on price - they’re quoting a 40% performance-per-dollar improvement over Nvidia for these upcoming GPUs, so perhaps they are - and win hearts and minds with rock-solid driver/software support, so that people who do have the option (i.e. in-house code, not 3rd-party software) choose to write it with not-CUDA.
Of note, this is the 3rd generation of the MI3xx series (MI300, MI325, now MI350/355). I think it might be the first one to make the market splash that AMD has been hoping for.
I know Dell has been doing a lot of AMD CPUs recently, and those have definitely been beating Intel, so hopefully this continues. But I’ll believe it when I see it. These things rarely pan out in terms of price/performance and support.
AMD’s also apparently unifying their server and consumer GPU departments for RDNA5/UDNA, IIRC, which I’m really hoping helps with this too.
Yeah, I helped draw up the hardware requirements for two servers recently; an alternative to Nvidia wasn’t even on the table.
Actually…not true. Nvidia recently became bigger in the datacenter because their inference cards were being bought up, but AMD overtook Intel on CPUs with all major cloud platforms last year, and their Xilinx chips are slowly overtaking sales of regular CPUs for special-purpose processing. By the end of this year, I bet AMD will be the most deployed brand in datacenters globally. FPGA is the only path forward in the architecture world at this point for speed and efficiency in single-purpose processing. Nvidia doesn’t have a competing product.
We’re talking GPUs; I don’t know why you’re bringing FPGAs and CPUs into the mix.