Nvidia isn't as greedy as you think
No, they just misread the room
This past week, Nvidia announced the 4080 12GB (hereafter the 4080-12) for an eye-watering $899 and the 4080 16GB (4080-16) for $1199.
But wait: the 2080 launched at $699 and so did the 3080, so why are we suddenly jumping to $899 and $1199 for the 4080s, a 29% and 71% price increase respectively? This isn't even the 4080 Ti!
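The quick math behind those percentages, as a sanity check (launch MSRPs are the only inputs):

```python
# Quick check of the generational price jump, using launch MSRPs.
msrp_3080 = 699  # USD; the 2080 launched at the same price
for name, msrp in [("4080-12", 899), ("4080-16", 1199)]:
    print(f"{name}: +{(msrp / msrp_3080 - 1) * 100:.1f}% over $699")
# 4080-12: +28.6% over $699
# 4080-16: +71.5% over $699
```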
The instinctive conclusion is that Nvidia is being greedy. For some time before the announcement, I speculated that we could see prices go up: gamers kept buying GPUs during the mining rush, so Nvidia has seen what people are willing to spend when they have no other choice. I pointed to how the 20 series (Turing) raised prices pretty significantly even after the 2018 mining spike was over, and I expected this to be another case of something similar.
Upon further reflection, I don't think this is the case. I think it's possible that Nvidia simply could not get away with charging much less while maintaining reasonable profit margins.
Nvidia likes to use old, inexpensive process nodes
Broadly speaking, there are generational manufacturing nodes that processors are made on. While the specifics differ a bit from company to company, the generations are roughly consistent: the last few were around 14nm, 10nm, 7nm, and 5nm, with each generation bringing a significant improvement in transistor density. Higher transistor density generally means smaller processors, or more performance within the same size. Other nodes you may hear about, like 12nm, 8nm, or 6nm, are optimizations or customizations of those major generations that bring smaller improvements.
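To make that concrete, here's a toy calculation. The density figures below are round, illustrative numbers of my own, not vendor specs, but the shape of the effect is real:

```python
# Toy illustration: the same transistor budget shrinks as density climbs.
# Densities are rough, illustrative round numbers, NOT vendor specs.
transistors = 20e9  # a hypothetical 20-billion-transistor GPU

densities_mtr_per_mm2 = {
    "~14nm-class": 30,   # million transistors per mm² (illustrative)
    "~7nm-class": 65,
    "~5nm-class": 120,
}

for node, density in densities_mtr_per_mm2.items():
    area_mm2 = transistors / (density * 1e6)
    print(f"{node}: ~{area_mm2:.0f} mm²")
# ~14nm-class: ~667 mm²
# ~7nm-class: ~308 mm²
# ~5nm-class: ~167 mm²
```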
Again, broadly speaking, newer nodes tend to be more
expensive while older nodes come down in price over time.
The RTX 30 series (Ampere) was announced in September 2020. It used Samsung 8nm (an optimization of Samsung's 10nm), even though AMD had released the Radeon VII on TSMC 7nm nearly two years earlier, in February 2019, and the Apple A14 on TSMC 5nm shipped only two weeks later. Nvidia was very conservative with this node, which is likely why they could price the large GA102 at only $699.
The 20 series, launched in late 2018, was on TSMC 12nm, an optimization of 16nm (not quite as dense as Intel's 14nm, which launched in 2015). Apple released a 7nm mobile chip in the same month, and AMD released the 7nm Radeon VII only a few months later.
The 10 series, launched in 2016 with the 1080, used TSMC 16nm, which was actually the most advanced node for its time in this retrospective. Though notably behind Intel's 14nm class, which held process leadership at the time (Intel being stuck on that node for half a decade saw it age poorly), it was TSMC's best. However, the cost of a leading node was mitigated by the fact that the dies Nvidia made on it were much smaller. Where the later TU102 (2080 Ti) was a massive 754mm², the 1080 was only 314mm² in 2016, and 2017's 1080 Ti was 471mm². Small dies kept Nvidia's prices down then.
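Why do small dies matter so much to price? Wafers are fixed-size circles, so smaller dies mean many more chips per wafer, with edge losses compounding the penalty for big dies. Here's a back-of-the-envelope sketch using the standard first-order dies-per-wafer approximation (die sizes from above; defects and scribe lines ignored):

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """First-order approximation of usable dies on a round wafer,
    ignoring defect yield and scribe lines."""
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

for name, area in [("GP104 (1080)", 314),
                   ("GP102 (1080 Ti)", 471),
                   ("TU102 (2080 Ti)", 754)]:
    print(f"{name}, {area}mm²: ~{dies_per_wafer(area)} dies per 300mm wafer")
# GP104 (1080), 314mm²: ~187 dies per 300mm wafer
# GP102 (1080 Ti), 471mm²: ~119 dies per 300mm wafer
# TU102 (2080 Ti), 754mm²: ~69 dies per 300mm wafer
```

Nearly three 1080s come out of the wafer area of one 2080 Ti, before even accounting for yield, which also favours small dies.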
Ada stopped the inexpensive silicon trend
As we've discussed, for several generations Nvidia has intentionally made their processors either with small dies or on older, inexpensive nodes to keep costs down. That all changed with the Ada Lovelace GPUs, which use TSMC N4, a customization of TSMC N5 and at present the densest, most advanced, and notably most expensive node shipping, with the next-generation TSMC N3 delayed and not yet in products as of writing.
However, unlike the last time they went with a leading node, these dies aren't small. The AD102 (4090) is a massive die, about 608mm². So: large dies on an expensive node (and keep in mind TSMC has been raising prices lately on the strength of their market dominance), something somewhere has to give. The necessary result is a pretty expensive product.
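To put very rough numbers on "something has to give", here's the same dies-per-wafer sketch applied to GA102 vs AD102. The wafer prices are purely my placeholder guesses (real foundry pricing is confidential, and this ignores yield, which hurts big dies on new nodes the most), but the direction is clear:

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    radius = wafer_diameter_mm / 2
    return int(math.pi * radius**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

# Wafer prices below are HYPOTHETICAL placeholders for illustration only.
scenarios = [
    ("GA102 (3080/3090), Samsung 8nm", 628, 6_000),
    ("AD102 (4090), TSMC N4", 608, 17_000),
]
for name, area_mm2, wafer_usd in scenarios:
    n = dies_per_wafer(area_mm2)
    print(f"{name}: ~{n} dies/wafer, ~${wafer_usd / n:.0f} per die")
# GA102 (3080/3090), Samsung 8nm: ~85 dies/wafer, ~$71 per die
# AD102 (4090), TSMC N4: ~89 dies/wafer, ~$191 per die
```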
Plus something (sort of) outside Nvidia’s control
It’s fun just crapping on Nvidia, but the fact that GDDR7
isn’t ready yet is actually a pretty massive problem for them. Normally, as GPUs
get more powerful, we get faster RAM to keep them fed. Fast RAM alone for the
sake of fast RAM doesn’t do a lot on its own, its job is to keep the cores busy
working on fresh data. The faster the cores, the more bandwidth you need. The
2080 was still about as fast as the 1080 Ti despite a narrower memory bus (256-bit vs 352-bit) because it used faster GDDR6 instead of the 1080 Ti's GDDR5X. The 30 series used expensive GDDR6X to keep things coming… while Lovelace uses GDDR6X as well. When you don't
have faster RAM to work with, the solution (as AMD found with the RDNA2 desktop
GPUs) is to add a bunch of cache to keep the cores fed. Cache is large and
expensive and causes the die size to balloon. This means far more transistors
for the given performance.
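As a sanity check on that bus-width point, peak bandwidth is just bus width times data rate, and the 2080's faster GDDR6 nearly closed the gap left by 96 fewer bus bits. A quick sketch using the published module speeds:

```python
# Peak memory bandwidth = (bus width in bits / 8) * data rate in GT/s.
def bandwidth_gb_s(bus_width_bits, data_rate_gtps):
    return bus_width_bits / 8 * data_rate_gtps

print(f"1080 Ti (352-bit, 11 Gbps GDDR5X): {bandwidth_gb_s(352, 11):.0f} GB/s")  # 484 GB/s
print(f"2080    (256-bit, 14 Gbps GDDR6):  {bandwidth_gb_s(256, 14):.0f} GB/s")  # 448 GB/s
```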
I say "sort of" outside Nvidia's control because Radeon has a solution to this problem: they've put the cache in adjacent dies to keep the GPU fed, allowing them to scale performance without bloating the size of the GPU die, as detailed here: https://www.angstronomics.com/p/amds-rdna-3-graphics. I guess Nvidia didn't think of this in time.
Ok, so Ada is expensive to manufacture… Why did Nvidia do this?
Well, the silicon shortage went on for quite a while. Basically, from immediately after Ampere was announced through the entire development phase of Ada, all market indicators suggested that demand was crazy high and that processors would sell for a lot. My initial hunch, that mining-era prices told Nvidia people would simply pay if they had to, was sort of right, but it missed an important factor: design. Nvidia didn't simply take the opportunity to charge high prices. They designed a very expensive chip that *had* to be sold at high prices, because they were banking on the market being happy to accept them.
Unfortunately, the end result is that since these products are so expensive to manufacture, there's little that can be done about the prices. Perhaps, if they don't sell well, Nvidia will be willing to shrink their margins to move the product, and in that case we could see prices come down, but for that we will have to vote with our wallets.
In the meantime, all we can do is hope that on November 3rd, AMD releases some very compelling and competitive products. While AMD is using a very similar process node to Nvidia's (TSMC N5), their dies are about half as large, because they've focused on getting as much performance as they can per transistor and have moved a lot of the things that don't scale well into separate dies. The top N31 die is rumoured to be around 300mm² while rivaling the 4090 in performance. In a little over a month, I suppose we'll find out how accurate this is!
Now make sure to go follow me on Twitter and all that good stuff.