This chip startup just raised $135M on a bet that AI’s biggest bottleneck isn’t compute — it’s memory

Every time you ask ChatGPT a question, your request triggers a data relay race. Data leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back. That entire journey repeats for every single word the AI generates.

The bottleneck is structural: every single request has to be routed through some of the most expensive and power-hungry chips in the industry. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve. The four-year-old startup has designed a chip that places compute capabilities much closer to DRAM (the fast, short-term memory chips that store data a processor is actively using), allowing routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.

If it works at scale, the implications for AI infrastructure costs could be significant, which largely explains investor enthusiasm around the company. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million.

XCENA CEO Jin Kim co-founded the startup in 2022 alongside CTO Dohun Kim and CPO Harry Juhyun Kim, all veterans of Samsung and SK Hynix, the memory giants that supply chips powering Nvidia’s GPUs. “CPUs and GPUs have both gotten smarter over the decades. Memory never did. XCENA wants to change that,” Jin Kim said in an interview with Trendster. “The recent rise in memory prices and related stocks points to a broader shift in AI infrastructure toward memory-centric architectures,” he added. (This month, the three companies that dominate the global memory chip market, Samsung, SK Hynix, and Micron, each crossed a trillion-dollar valuation for the first time.)

XCENA is betting its business on the thesis that “inference isn’t just a compute problem; it’s increasingly a memory scaling problem,” said Kim.

XCENA’s chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory, and processes data before it ever needs to leave the memory module. It brings compute to the data, not the other way around. The company claims that what used to require 10 servers could potentially run on just one.

“While GPUs excel at matrix multiplication, the heavy math behind AI model training, much of the surrounding data orchestration, including preprocessing, KV cache management (the system that stores prior conversation context so a model doesn’t have to reprocess it), and data caching, still runs on CPUs. Our chip handles these tasks directly within the memory module itself,” Kim said.
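To make the KV cache idea concrete, here is a rough sketch of the trade-off it represents: the cache saves repeated computation, but it grows with every token of context, which is why it turns into a memory problem. The model dimensions below (32 layers, 8 key/value heads, a head size of 128, 16-bit values) are illustrative assumptions rather than figures from XCENA or any specific model, and the function names are hypothetical.

```python
def token_passes_without_cache(prompt_len: int, new_tokens: int) -> int:
    """Tokens processed if the whole sequence is re-read at every generation step."""
    return sum(prompt_len + step for step in range(new_tokens))


def token_passes_with_cache(prompt_len: int, new_tokens: int) -> int:
    """Tokens processed if prior context is cached: the prompt once, then one token per step."""
    return prompt_len + (new_tokens - 1)


def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: int) -> int:
    """Memory held by the cache: one key and one value vector per token, per layer, per head."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


# Assumed workload: a 4-token prompt followed by 3 generated tokens.
print(token_passes_without_cache(4, 3), "token passes without a cache")
print(token_passes_with_cache(4, 3), "token passes with a cache")

# Assumed model shape (hypothetical, for illustration only).
size = kv_cache_bytes(tokens=8192, layers=32, kv_heads=8, head_dim=128, bytes_per_value=2)
print(size / 2**30, "GiB of cache for one 8,192-token conversation")
```

Under these assumptions a single long conversation ties up about a gibibyte of memory, and a server handling many conversations at once needs that much for each of them.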

Demand for memory solutions has surged since the second half of last year, and the company believes the timing is working in its favor.

Conversations with several global memory vendors are in early stages, though Kim declined to name them. The company’s ideal customers are hyperscalers spending tens of billions a year on AI infrastructure, where even a small gain in memory efficiency can mean hundreds of millions in savings.
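The arithmetic behind that claim is easy to check. The sketch below assumes, as a simplification, that spending falls in direct proportion to the efficiency gain. The $30 billion budget and the percentage gains are hypothetical inputs, not figures from XCENA or any hyperscaler.

```python
def annual_savings_usd(annual_spend_usd: int, gain_basis_points: int) -> int:
    """Savings if spend falls in proportion to the gain (1 basis point = 0.01%)."""
    return annual_spend_usd * gain_basis_points // 10_000


# Hypothetical inputs for illustration only.
spend = 30_000_000_000       # $30 billion a year on AI infrastructure
for bps in (50, 100, 200):   # gains of 0.5%, 1% and 2%
    saved = annual_savings_usd(spend, bps)
    print(f"{bps / 100:.1f}% gain -> ${saved:,} saved per year")
```

On those assumptions, a 1% gain on a $30 billion budget works out to $300 million a year.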

The MX1 is still a prototype. Mass production chips are scheduled to roll off Samsung’s foundry lines by the end of 2026, with the company expecting to generate revenue starting in 2027.

While neural processing unit (NPU) makers are competing to challenge Nvidia for training workloads, XCENA is targeting the memory-intensive layer that sits beneath it all.

XCENA’s closest rivals include Astera Labs and Marvell, both Nasdaq-listed companies working on next-generation memory connectivity. Marvell is a large, established player already operating in the same space, Kim said, adding that the differentiator comes down to intellectual property. “We have thousands of cores,” Kim said. Based on public specifications, Marvell’s approach relies on a handful of general-purpose cores by comparison.

These cores are built on RISC-V, an open-standard instruction set architecture, and optimized specifically for data processing, with each core deliberately kept small and efficient. Beyond the cores themselves, XCENA designs its own internal memory hierarchy, interconnect bus, and DRAM controller. That is an unusual level of vertical integration: most chip companies, including larger rivals, typically outsource these components.

Seoul-based VC firms Atinum and IMM Investment co-led the Series B round, which also included Corstone Asia and existing investors SBI Investment and Mirae Asset Capital. The company, which has more than 90 employees across offices in Pangyo, a tech hub outside Seoul, and Sunnyvale, is also in conversations with international investors about additional funding.
