Meta, hell-bent on catching as much as rivals within the generative AI area, is spending billions by itself AI efforts. A portion of these billions goes towards recruiting AI researchers. However an excellent bigger chunk is being spent growing {hardware}, particularly chips to run and practice Meta’s AI fashions.
Meta unveiled the most recent fruit of its chip dev efforts right this moment, conspicuously a day after Intel introduced its newest AI accelerator {hardware}. Known as the “next-gen” Meta Coaching and Inference Accelerator (MTIA), the successor to final yr’s MTIA v1, the chip runs fashions together with for rating and recommending show advertisements on Meta’s properties (e.g. Fb).
In comparison with MTIA v1, which was constructed on a 7nm course of, the next-gen MTIA is 5nm. (In chip manufacturing, “course of” refers back to the dimension of the smallest element that may be constructed on the chip.) The subsequent-gen MTIA is a bodily bigger design, filled with extra processing cores than its predecessor. And whereas it consumes extra energy — 90W versus 25W — it additionally boasts extra inside reminiscence (128MB versus 64MB) and runs at the next common clock velocity (1.35GHz up from 800MHz).
Meta says the next-gen MTIA is at the moment stay in 16 of its knowledge middle areas and delivering as much as 3x general higher efficiency in comparison with MTIA v1. If that “3x” declare sounds a bit obscure, you’re not unsuitable — we thought so too. However Meta would solely volunteer that the determine got here from testing the efficiency of “4 key fashions” throughout each chips.
“As a result of we management the entire stack, we will obtain higher effectivity in comparison with commercially accessible GPUs,” Meta writes in a weblog submit shared with Trendster.
Meta’s {hardware} showcase — which comes a mere 24 hours after a press briefing on the corporate’s varied ongoing generative AI initiatives — is uncommon for a number of causes.
One, Meta reveals within the weblog submit that it’s not utilizing the next-gen MTIA for generative AI coaching workloads for the time being, though the corporate claims it has “a number of packages underway” exploring this. Two, Meta admits that the next-gen MTIA received’t exchange GPUs for operating or coaching fashions — however as a substitute will complement them.
Studying between the traces, Meta is transferring slowly — maybe extra slowly than it’d like.
Meta’s AI groups are nearly actually underneath stress to chop prices. The corporate’s set to spend an estimated $18 billion by the tip of 2024 on GPUs for coaching and operating generative AI fashions, and — with coaching prices for cutting-edge generative fashions ranging within the tens of tens of millions of {dollars} — in-house {hardware} presents a lovely various.
And whereas Meta’s {hardware} drags, rivals are pulling forward, a lot to the consternation of Meta’s management, I’d suspect.
Google this week made its fifth-generation customized chip for coaching AI fashions, TPU v5p, usually accessible to Google Cloud clients, and revealed its first devoted chip for operating fashions, Axion. Amazon has a number of customized AI chip households underneath its belt. And Microsoft final yr jumped into the fray with the Azure Maia AI Accelerator and the Azure Cobalt 100 CPU.
Within the weblog submit, Meta says it took fewer than 9 months to “go from first silicon to manufacturing fashions” of the next-gen MTIA, which to be truthful is shorter than the standard window between Google TPUs. However Meta has a whole lot of catching as much as do if it hopes to attain a measure of independence from third-party GPUs — and match its stiff competitors.