Nvidia isn't as greedy as you think
No, they just misread the room
This past week, Nvidia announced the 4080 12GB (hereafter the 4080-12) for an eye-watering $899 and the 4080 16GB (4080-16) for $1199.
But wait: the 2080 launched at $699 and so did the 3080, so why are we suddenly jumping to $899 and $1199 for the 4080s, a 29% and 71% price increase respectively? This isn't even the 4080 Ti!
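The quick math behind those percentages, as a sanity check (launch MSRPs are the only inputs):

```python
# Quick check of the generational price jump, using launch MSRPs.
msrp_3080 = 699  # USD; the 2080 launched at the same price
for name, msrp in [("4080-12", 899), ("4080-16", 1199)]:
    print(f"{name}: +{(msrp / msrp_3080 - 1) * 100:.1f}% over $699")
# 4080-12: +28.6% over $699
# 4080-16: +71.5% over $699
```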
The instinctive conclusion is that Nvidia is being greedy. For some time before the announcement, I speculated that we could see prices go up: gamers kept buying GPUs during the mining rush, so Nvidia has seen what people are willing to spend when they have no other choice. I pointed to how the 20 series (Turing) raised prices pretty significantly even after the 2018 mining spike was over, and I expected this to be another case of something similar.
Upon further reflection, I don't think this is the case. I think it's possible that Nvidia simply could not get away with charging much less while maintaining reasonable profit margins.
Nvidia likes to use old, inexpensive process nodes
Broadly speaking, there are generational manufacturing nodes that processors are made on. While the specifics differ a bit from company to company, the generations are roughly consistent: the last few were around 14nm, 10nm, 7nm, and 5nm, with each generation bringing a significant improvement in transistor density. Higher transistor density generally means smaller processors, or more performance within the same size. Other nodes you may hear about, like 12nm, 8nm, or 6nm, are optimizations or customizations of those major generations that bring smaller improvements.
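To make that concrete, here's a toy calculation. The density figures below are round, illustrative numbers of my own, not vendor specs, but the shape of the effect is real:

```python
# Toy illustration: the same transistor budget shrinks as density climbs.
# Densities are rough, illustrative round numbers, NOT vendor specs.
transistors = 20e9  # a hypothetical 20-billion-transistor GPU

densities_mtr_per_mm2 = {
    "~14nm-class": 30,   # million transistors per mm² (illustrative)
    "~7nm-class": 65,
    "~5nm-class": 120,
}

for node, density in densities_mtr_per_mm2.items():
    area_mm2 = transistors / (density * 1e6)
    print(f"{node}: ~{area_mm2:.0f} mm²")
# ~14nm-class: ~667 mm²
# ~7nm-class: ~308 mm²
# ~5nm-class: ~167 mm²
```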
Again, broadly speaking, newer nodes tend to be more
expensive while older nodes come down in price over time.
The RTX 30 series (Ampere) was announced in September 2020. It used Samsung 8nm (an optimization of Samsung's 10nm), even though AMD had released the Radeon VII on TSMC 7nm nearly two years earlier, in February 2019, and the Apple A14 on TSMC 5nm shipped only two weeks later. Nvidia was very conservative with this node, which is likely why they could price the large GA102 at only $699.
The 20 series, launched in late 2018, was on TSMC 12nm, an optimization of 16nm (not quite as dense as Intel's 14nm, which launched in 2015). Apple released a 7nm mobile chip in the same month, and AMD released the 7nm Radeon VII only a few months later.
The 10 series, launched in 2016 with the 1080, used TSMC 16nm, which was actually the most advanced node for its time in this retrospective. Though notably behind Intel's 14nm class, which held process leadership at the time (Intel being stuck on that node for half a decade saw it age poorly), it was TSMC's best. However, the cost of a leading node was mitigated by the fact that the dies Nvidia made on it were much smaller. Where the later TU102 (2080 Ti) was a massive 754mm², the 1080 was only 314mm² in 2016, and 2017's 1080 Ti was 471mm². Small dies kept Nvidia's prices down then.
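Why do small dies matter so much to price? Wafers are fixed-size circles, so smaller dies mean many more chips per wafer, with edge losses compounding the penalty for big dies. Here's a back-of-the-envelope sketch using the standard first-order dies-per-wafer approximation (die sizes from above; defects and scribe lines ignored):

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """First-order approximation of usable dies on a round wafer,
    ignoring defect yield and scribe lines."""
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

for name, area in [("GP104 (1080)", 314),
                   ("GP102 (1080 Ti)", 471),
                   ("TU102 (2080 Ti)", 754)]:
    print(f"{name}, {area}mm²: ~{dies_per_wafer(area)} dies per 300mm wafer")
# GP104 (1080), 314mm²: ~187 dies per 300mm wafer
# GP102 (1080 Ti), 471mm²: ~119 dies per 300mm wafer
# TU102 (2080 Ti), 754mm²: ~69 dies per 300mm wafer
```

Nearly three 1080s come out of the wafer area of one 2080 Ti, before even accounting for yield, which also favours small dies.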
Ada stopped the inexpensive silicon trend
As we've discussed, for several generations Nvidia has intentionally made their processors either with small dies or on older, inexpensive nodes to keep costs down. That all changed with the Ada Lovelace GPUs, which use TSMC N4, a customization of TSMC N5 and at present the densest, most advanced, and notably most expensive node shipping, with the next-generation TSMC N3 delayed and not yet in products as of writing.
However, unlike the last time they went with a leading node, these dies aren't small. The AD102 (4090) is a massive die, about 608mm². So: large dies on an expensive node (and keep in mind TSMC has been raising prices lately on the strength of their market dominance), something somewhere has to give. The necessary result is a pretty expensive product.
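To put very rough numbers on "something has to give", here's the same dies-per-wafer sketch applied to GA102 vs AD102. The wafer prices are purely my placeholder guesses (real foundry pricing is confidential, and this ignores yield, which hurts big dies on new nodes the most), but the direction is clear:

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

# Wafer prices below are HYPOTHETICAL placeholders for illustration only.
scenarios = [
    ("GA102 (3080/3090), Samsung 8nm", 628, 6_000),
    ("AD102 (4090), TSMC N4", 608, 17_000),
]
for name, area_mm2, wafer_usd in scenarios:
    n = dies_per_wafer(area_mm2)
    print(f"{name}: ~{n} dies/wafer, ~${wafer_usd / n:.0f} per die")
# GA102 (3080/3090), Samsung 8nm: ~85 dies/wafer, ~$71 per die
# AD102 (4090), TSMC N4: ~89 dies/wafer, ~$191 per die
```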
Plus something (sort of) outside Nvidia’s control
It’s fun just crapping on Nvidia, but the fact that GDDR7
isn’t ready yet is actually a pretty massive problem for them. Normally, as GPUs
get more powerful, we get faster RAM to keep them fed. Fast RAM alone for the
sake of fast RAM doesn’t do a lot on its own, its job is to keep the cores busy
working on fresh data. The faster the cores, the more bandwidth you need. The
2080 was still about as fast as the 1080 Ti despite a narrower memory bus (256-bit vs 352-bit) because it used faster GDDR6 instead of the 1080 Ti's GDDR5X. The 30 series used expensive GDDR6X to keep things coming… while Lovelace uses GDDR6X as well. When you don't
have faster RAM to work with, the solution (as AMD found with the RDNA2 desktop
GPUs) is to add a bunch of cache to keep the cores fed. Cache is large and
expensive and causes the die size to balloon. This means far more transistors
for the given performance.
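As a sanity check on that bus-width point, peak bandwidth is just bus width times data rate, and the 2080's faster GDDR6 nearly closed the gap left by 96 fewer bus bits. A quick sketch using the published module speeds:

```python
# Peak memory bandwidth = (bus width in bits / 8) * data rate in GT/s.
def bandwidth_gb_s(bus_width_bits, data_rate_gtps):
    return bus_width_bits / 8 * data_rate_gtps

print(f"1080 Ti (352-bit, 11 Gbps GDDR5X): {bandwidth_gb_s(352, 11):.0f} GB/s")  # 484 GB/s
print(f"2080    (256-bit, 14 Gbps GDDR6):  {bandwidth_gb_s(256, 14):.0f} GB/s")  # 448 GB/s
```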
I say "sort of" outside Nvidia's control because Radeon has a solution to this problem: they've put the cache in adjacent dies to keep the GPU fed, allowing them to scale performance without bloating the size of the GPU die, as detailed here: https://www.angstronomics.com/p/amds-rdna-3-graphics. I guess Nvidia didn't think of this in time.
Ok, so Ada is expensive to manufacture… Why did Nvidia do this?
Well, the silicon shortage went on for quite a while. Basically, from immediately after Ampere was announced through the entire development phase of Ada, all market indicators suggested that demand was crazy high and that processors would sell for a lot. My initial hunch, that mining-era prices told Nvidia people would simply pay if they had to, was sort of right, but it missed an important factor: design. Nvidia didn't simply take the opportunity to charge high prices. They designed a very expensive chip that *had* to be sold at high prices, because they were banking on the market being happy to accept them.
Unfortunately, the end result is that since these products are so expensive to manufacture, there's little that can be done about the prices. Perhaps, if they don't sell well, Nvidia will be willing to shrink their margins to move the product, and in that case we could see prices come down, but for that we will have to vote with our wallets.
In the meantime, all we can do is hope that on November 3rd, AMD releases some very compelling and competitive products. While AMD is using a very similar process node to Nvidia's (TSMC N5), their dies are about half as large, because they've focused on getting as much performance as they can per transistor and have moved a lot of the things that don't scale well into separate dies. The top N31 die is rumoured to be around 300mm² while rivaling the 4090 in performance. In a little over a month, I suppose we'll find out how accurate this is!
Now make sure to go follow me on Twitter and all that good stuff.