Shortly after Amazon CEO Andy Jassy announced AWS's groundbreaking $50 billion deal with OpenAI, Amazon invited me on a private tour of the chip development lab at the heart of the deal, at (mostly*) its own expense.
Industry watchers are eyeing Amazon's Trainium chip, created at that facility, for its implications for lower-cost AI inference and, potentially, a dent in Nvidia's near monopoly.
Curious, I agreed to go.
My tour guides for the day were the lab's director, Kristopher King (pictured below right), and director of engineering Mark Carroll (below left), along with the team's PR person who organized the visit, Doron Aronson (pictured with yours truly later in the story).
AWS has been Anthropic's primary cloud platform since the AI lab's early days, a relationship significant enough to survive both Anthropic's later addition of Microsoft as a cloud partner and Amazon's growing partnership with OpenAI.
The OpenAI deal makes AWS the exclusive provider of the model maker's new AI agent builder, Frontier, which could become an important part of OpenAI's business if agents become as big as Silicon Valley thinks they will. We'll see if that exclusivity stands exactly as announced. The Financial Times reported this week that Microsoft may believe OpenAI's deal with Amazon violates Microsoft's own deal with OpenAI, which specifically gives Redmond access to all of OpenAI's models and tech.
What makes AWS so appealing to OpenAI? As part of this deal, the cloud giant has agreed to supply OpenAI with 2 gigawatts of Trainium computing capacity. That's a big commitment, given that Anthropic and Amazon's own Bedrock service are already consuming Trainium chips faster than Amazon can produce them.
There are 1.4 million Trainium chips deployed across all three generations, and Anthropic's Claude runs on over 1 million of the Trainium2 chips deployed, the company said.
It's worth noting that while Trainium was originally geared toward faster, cheaper model training (a bigger priority a few years ago), it's now tuned and used for inference as well. Inference, the process of actually running an AI model to generate responses, is currently the industry's biggest performance bottleneck.
Case in point: Trainium2 handles the majority of inference traffic on Amazon's Bedrock service, which helps Amazon's many enterprise customers build AI applications and lets those apps use multiple models.
"Our customer base is just expanding as fast as we can get capacity out there," King said. "Bedrock could be as big as EC2 someday," he added, referring to AWS's behemoth compute cloud service.
Trainium vs. Nvidia
Beyond offering an alternative to Nvidia's backlogged, hard-to-acquire GPUs, Amazon says its new chips, running on its new specialty Trn3 UltraServers, cost up to 50% less to run for comparable performance than classic cloud servers.
Alongside Trainium3, launched in December, this AWS team also built new Neuron switches, and Carroll says the combination is transformative.
"What that gives us is something huge," Carroll said. The switches allow every Trainium3 chip to talk to every other chip in a mesh configuration, reducing latency. "That's why Trainium3 is breaking all kinds of records," particularly in "cost per power," he said.
When trillions of tokens a day are involved, such improvements add up.
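To see how quickly that adds up, here is a back-of-the-envelope sketch. Every number in it is an illustrative assumption (the token volume, the baseline price per million tokens), not an Amazon or AWS figure; only the 50% reduction comes from AWS's own claim above.

```python
# Illustrative cost math: how a per-token cost edge compounds at industry scale.
# All inputs are assumptions for the sake of the sketch, not real AWS pricing.

TOKENS_PER_DAY = 2e12          # assumed volume: "trillions of tokens a day"
BASELINE_PER_M_TOKENS = 0.40   # hypothetical GPU-cloud cost: $0.40 per million tokens
TRAINIUM_DISCOUNT = 0.50       # AWS's claim: up to 50% less for comparable performance

baseline_daily = TOKENS_PER_DAY / 1e6 * BASELINE_PER_M_TOKENS
trainium_daily = baseline_daily * (1 - TRAINIUM_DISCOUNT)
annual_savings = (baseline_daily - trainium_daily) * 365

print(f"Baseline GPU cost: ${baseline_daily:,.0f}/day")
print(f"Trainium cost:     ${trainium_daily:,.0f}/day")
print(f"Annual savings:    ${annual_savings:,.0f}")
```

Even with these made-up inputs, halving the per-token cost at a volume of 2 trillion tokens a day is a nine-figure annual difference, which is why "cost per power" is the metric the team keeps quoting.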
In fact, Amazon's chip team was lauded by Apple in 2024. In a rare moment of openness for the secretive company, Apple's director of AI publicly described how it used another of the team's chips: Graviton, a low-power, Arm-based server CPU and the first breakout chip this team designed. Apple also praised Inferentia, a chip designed specifically for inference, and gave a nod to Trainium, which was new at the time.
These chips represent the classic Amazon playbook: See what people want to buy, then build an in-house alternative that competes on price.
The catch for chips, historically, has been switching costs. Applications written for Nvidia's chips must be re-architected to work with others, a time-consuming process that discourages developers from switching.
But the AWS chip team proudly told me that Trainium now supports PyTorch, a popular open source framework for building AI models. That includes many of the models hosted on Hugging Face, a huge library where developers share open source models.
The transition, Carroll told me, requires "basically a one-line change, then recompile, then run on Trainium." In other words, Amazon is looking to chip away at Nvidia's market dominance wherever possible.
AWS also announced a partnership this month with Cerebras Systems, integrating that company's inference chip into servers running Trainium for what Amazon promises will be superpowered, low-latency AI performance.
But Amazon's ambitions go beyond the chips themselves. It also designs the server that hosts the chips. Besides the networking components, this team has designed "Nitro," a hardware-software combination that provides virtualization technology (which allows many instances of software to run separately on the same server); new state-of-the-art liquid-cooling technology; and the server sleds (pictured below) that host this gear.
All of that is in service of controlling cost and performance.
Working 24/7 on the "bring-up"
Amazon's custom chip-designing unit was born when the cloud giant bought Israeli chip designer Annapurna Labs in January 2015 for about $350 million, so this team has now spent more than 10 years designing chips for AWS. The unit has retained its Annapurna roots and name; its logo is everywhere in the office.
The chip lab is located in a shiny, chrome-windowed building in Austin's upscale Domain district, a walkable area filled with shops and restaurants that's sometimes called Austin's Silicon Valley.
The offices have your classic tech company vibe: desks in cubicles, gathering spots, and conference rooms. But tucked away at the back of a high floor in the building is the actual lab, with sweeping views of the city.
The shelving-filled lab, about the size of two large conference rooms, is a loud industrial space thanks to the fans on the equipment. It looks like a cross between a high school shop class and a Hollywood set for a high-end lab, except the engineers are dressed in jeans, not white lab coats.
Note that this isn't where the chips are manufactured, so no white hazmat suits were necessary. The Trainium3 is a state-of-the-art 3-nanometer chip produced by TSMC, arguably the leader in 3-nanometer manufacturing, with other chips produced by Marvell.
But this is the room where the magic of the "bring-up" happens.
"A silicon bring-up is when you get the chip for the first time, and it's like a big overnight party. You stay here, like a lock-in," King explains. After 18 months of work, the chip is powered on for the first time to verify it works as designed. The team even filmed some of the Trainium3 bring-up and posted it on YouTube.
Spoiler alert: It's never problem-free.
For Trainium3, the prototype chip was initially air-cooled, like earlier versions. The current chip is liquid-cooled, which offers energy advantages and was quite an engineering feat.
During the bring-up, the dimensions for how the chip attached to the air-cooling heat sink were off, so the chip couldn't be powered on.
Unfazed, the team "immediately got a grinder and just started grinding off the metal," King said. Because they didn't want the noise disrupting the bring-up's pizza-party atmosphere, they snuck off and did the grinding in a conference room.
Staying up all night and fixing problems "is what silicon bring-up is all about," King said.
The lab even has a welding station, where hardware lab engineer and master welder Isaac Guevara demonstrated welding tiny integrated circuit components through a microscope. It's such insanely difficult work that senior leader Carroll openly admitted he couldn't do it, to the guffaws of Guevara and the rest of the engineers in the room.
The lab also contains both custom-made and commercial tools for testing and analyzing issues with chips. Here's signal engineer Arvind Srinivasan demonstrating how the lab tests each tiny component on the chip:
Sleds are the star of the lab
But the star of the lab is an entire row showcasing each generation of the "sleds" the team designed.
Sleds are the trays that house the Trainium AI chips, Graviton CPU chips, and supporting boards and components. Stack them together on a rack with the networking component, also custom-designed by this team, and you get the systems at the heart of Anthropic Claude's success.
Here's the sled that was shown off during the AWS re:Invent conference in December:
Confirmed by Anthropic and OpenAI
I expected my guides to crow about the OpenAI deal during the tour. But they didn't.
The reticence may have been related to the aforementioned legal haze that could hang over the deal. But the sense I got was that these boots-on-the-ground engineers (who are currently designing the next version, Trainium4) haven't had much chance to work with OpenAI yet. Their day-to-day work has so far been focused on Anthropic's and Amazon's needs.
Currently, the biggest chunk of Trainium2 chips is deployed in Project Rainier, one of the world's largest AI compute clusters, which went live in late 2025 with 500,000 chips. It's used by Anthropic.
But there was a wall monitor in the main office displaying a quote about how OpenAI will be using Trainium. The pride was there, if subtle.
In addition to this lab, the team also has its own private data center for quality and testing purposes. A short drive away, it doesn't run customer workloads, so it's housed at a co-location facility rather than an AWS data center.
Security is tight: There are strict protocols to enter the building and to access Amazon's area inside.
The data center's cooling system is so loud that earplugs are mandatory, and the air is thick with the acrid smell of heated metal. It's not a pleasant place for the average person to hang out.
At this data center, there are rows and rows of servers filled with sleds that integrate all of Amazon's newest custom chips: Graviton CPUs, liquid-cooled Trainium3, Amazon Nitro, all happily computing away. The liquid runs in a closed system, meaning it's reused, which should also help reduce the environmental impact, the engineers said.
Here's what a current Trn3 UltraServer looks like: Several sleds sit on top and bottom, with the Neuron switches in the middle. Hardware development engineer David Martinez-Darrow is seen here performing maintenance on a sled:
While attention on the team has always been high, the scrutiny has really ratcheted up as of late.
Amazon CEO Andy Jassy keeps a close eye on this lab, publicly bragging about its products like a proud dad. In December, he said Trainium was already a multibillion-dollar business for AWS and called it one of the pieces of AWS tech he's most excited about. He also gave the chip a shout-out when announcing the OpenAI agreement.
The team feels the pressure, too. Engineers will work 24/7 for three to four weeks around each bring-up event to fix any issues so the chips can be mass-produced and put into data centers.
"It's very important that we get as fast as possible to prove that it's actually going to work," Carroll said. "So far, we've been doing very well."
*Disclosure: Amazon provided airfare and covered the cost of one night at a local hotel. Honoring its Leadership Principle of Frugality, this was a back-of-the-plane middle seat and a modest room. TechCrunch picked up the other related travel costs, like Ubers and baggage fees. (Yes, I checked a bag for an overnight trip. I'm high maintenance that way.)





