Because Apple isn't playing the same game as everyone else. They have the money and clout to buy out TSMC's bleeding-edge processes and leave everyone else with the scraps, and their silicon is only sold in machines with extremely fat margins that can easily absorb the BOM cost of making huge chips on the most expensive processes money can buy.
Bleeding-edge processes are what Intel specializes in. Unlike Apple, they don’t need TSMC. This should have been a huge advantage for Intel. Maybe that’s why Gelsinger got the boot.
> Bleeding-edge processes are what Intel specializes in. Unlike Apple, they don’t need TSMC.
Intel literally outsourced their Arrow Lake manufacturing to TSMC because they couldn't fabricate the parts themselves - their 20A (2nm) process node never reached a production-ready state, and was eventually cancelled about a month ago.
Intel is maybe a year or two behind TSMC right now. They might or might not catch up since it is a moving target, but I don't think there is anything TSMC is doing today that Intel won't be doing in the near future.
These days, Intel merely specializes in bleeding processes. They spent far too many years believing the unrealistic promises from their fab division, and in the past few years they've been suffering the consequences as the problems are too big to be covered up by the cost savings of vertical integration.
Intel's foundry side has been floundering so hard that they've resorted to using TSMC themselves in an attempt to keep up with AMD. Their recently launched CPUs are a mix of Intel-made and TSMC-made chiplets, but the latter accounts for most of the die area.
I'm not certain this is quite as damning as it sounds. My understanding is that the foundry business was intentionally walled off from the product business, and that the latter wasn't going to be treated as a privileged customer.
No, in fact, it sounds even more damning, because the client side was able to pick whatever was best on the market, and it wasn't Intel. The client side could learn and customize their designs to use another company's processes (an extremely hard thing to do, by the way) faster than Intel Foundry could even get their pants on in the morning.
Intel Foundry screwed up so badly that Nokia's server division was almost shut down as a result. (Imagine being so bad at your job that your clients nearly go out of business.) If Intel's client side had chosen to use Foundry, there just wouldn't have been any chips to sell.
I/O transistor scaling died a while ago (the logic and pads that drive off-chip signals stopped shrinking with new process nodes), which is part of what prompted AMD to go with a chiplet architecture. Being on a more advanced process does not make implementing a 512-bit memory bus any easier for Apple. If anything, it makes it more expensive for Apple than it would be for Intel, since roughly the same interface area has to be bought on a pricier wafer.
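To make that cost argument concrete, here's a minimal back-of-the-envelope sketch in Python. The PHY area and per-mm² prices are invented placeholders, not real figures; the only point it illustrates is that if the interface area doesn't shrink, the same wide bus costs more on a pricier node.

```python
# Purely illustrative numbers, not real die areas or wafer prices.
PHY_AREA_MM2_PER_64BIT = 5.0  # assume the DRAM PHY + pads barely shrink between nodes

def interface_cost(bus_width_bits: int, cost_per_mm2: float) -> float:
    """Rough cost of the memory interface alone for a given bus width."""
    phy_area = (bus_width_bits / 64) * PHY_AREA_MM2_PER_64BIT
    return phy_area * cost_per_mm2

# Hypothetical economics: the leading-edge node costs ~1.5x more per mm^2.
mature_node = interface_cost(512, cost_per_mm2=0.10)
leading_node = interface_cost(512, cost_per_mm2=0.15)

print(f"512-bit interface on a mature node:    ~${mature_node:.2f} per die")
print(f"512-bit interface on the leading edge: ~${leading_node:.2f} per die")
# Same silicon area, higher wafer price: the wide bus gets relatively
# more expensive the closer you are to the bleeding edge.
```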
Everyone else wants configurable RAM that scales both down (to 16GB) and up (to 2TB), to cover smaller laptops and bigger servers.
GPUs with soldered-on RAM have 500GB/sec bandwidths, far in excess of Apple's chips. So the 8GB or 16GB offered by Nvidia or AMD is just far superior for video game graphics (where textures are the priority).
> GPUs with soldered-on RAM have 500GB/sec bandwidths, far in excess of Apple's chips.
Apple is doing 800GB/sec on the M2 Ultra and should reach about 1TB/sec with the M4 Ultra, but that's still lagging behind GPUs. The 4090 was already at the 1TB/sec mark two years ago, the 5090 is supposedly aiming for 1.5TB/sec, and the H200 is doing 5TB/sec.
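As a sanity check on those figures, peak bandwidth is just bus width times per-pin data rate. A minimal sketch, using approximate public specs quoted from memory (treat the exact widths and rates as assumptions):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, gbit_per_pin: float) -> float:
    """Peak theoretical bandwidth in GB/s: (width / 8 bits per byte) * per-pin rate."""
    return bus_width_bits / 8 * gbit_per_pin

configs = {
    "B580-class GDDR6 (192-bit @ 19 Gb/s)":    (192, 19.0),
    "RTX 4090 GDDR6X (384-bit @ 21 Gb/s)":     (384, 21.0),
    "M2 Ultra LPDDR5 (1024-bit @ 6.4 Gb/s)":   (1024, 6.4),
    "H200-class HBM3e (6144-bit @ ~6.4 Gb/s)": (6144, 6.4),
}

for name, (width, rate) in configs.items():
    print(f"{name}: ~{peak_bandwidth_gb_s(width, rate):.0f} GB/s")
# Roughly 456, 1008, 819, and 4915 GB/s respectively: bus width matters
# as much as the memory technology, which is why very wide HBM
# interfaces dwarf everything hanging off a conventional bus.
```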
HBM is kind of not fair lol. But a 4096-bit bus is going to have more bandwidth than any competitor.
It's pretty expensive though.
The 500GB/sec number is for a more ordinary GPU like the B580 Battlemage in the $250ish price range. Obviously the $2000ish 4090 will be better, but I don't expect the typical consumer to be using those.
But an on-package memory bus has some of the advantages of HBM, just to a lesser extent, so it's arguably comparable as an "intermediate stage" between RAM chips and HBM. Distances are shorter (so voltage drop and capacitance are lower, and the bus can be driven at lower power). Routing is more complex, but that can be worked around with more layers, which increases cost, though over a significantly smaller area than DIMMs require. And the DIMM connectors themselves can hurt performance: reflections from poor contacts, optional termination adding complexity, and the expectation of mix-and-match across DIMM vendors and products likely reducing how much the interface can be fine-tuned.
There's pretty much an inverse relationship between flexibility and performance: DIMMs > soldered RAM > on-package RAM > die interconnects.
The question is why Intel GPUs, which already have soldered memory, aren't sold with more of it. The market here isn't something that can beat enterprise GPUs at training; it's something that can beat desktop CPUs at inference, with enough VRAM to fit large models at an affordable price.
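To put a number on "enough VRAM to fit large models", here's a minimal sizing sketch; the parameter counts and the 20% overhead factor for KV cache and activations are assumptions for illustration only:

```python
def vram_needed_gb(params_billions: float, bytes_per_param: float,
                   overhead: float = 1.2) -> float:
    """Approximate VRAM to hold the weights, plus ~20% overhead (assumed)."""
    return params_billions * bytes_per_param * overhead

for params in (8, 70):                            # e.g. 8B and 70B parameter models
    for bits, label in ((16, "fp16"), (4, "4-bit quantized")):
        need = vram_needed_gb(params, bits / 8)
        print(f"{params}B model, {label}: ~{need:.0f} GB of VRAM")
# A 70B model needs ~42 GB even at 4-bit, far beyond the 8-16 GB on
# typical consumer GPUs but easy to reach with more soldered memory.
```

That capacity gap, not raw compute, is the niche being described here.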