Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a startup that solves the AI inference bottleneck problem in a clever way. The round was led by Menlo Ventures.

The company, Gimlet Labs, has created what it claims is the first and only “multi-silicon inference cloud,” which is software that allows an AI workload to run simultaneously across various types of hardware. It can split an AI app’s work across both traditional CPUs and AI-tuned GPUs, as well as high-memory systems.

“We basically run across whatever different hardware that’s available,” Asgar told Trendster.

A single agent may chain together multiple steps, and each “requires different hardware: Inference is compute-bound; decode is memory-bound; and tool calls are network-bound,” writes lead investor Tim Tully of Menlo in a blog post about the funding.
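
Tully's breakdown can be illustrated with a minimal sketch of routing each step of an agent to a hardware pool based on its bottleneck. This is not Gimlet's software or API: the `Step` type, the `route` function, and the mapping from bottleneck to pool are all hypothetical, chosen only to mirror the CPU, GPU, and high-memory categories mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class Bottleneck(Enum):
    COMPUTE = "compute-bound"
    MEMORY = "memory-bound"
    NETWORK = "network-bound"


# Assumption for illustration only: which pool suits which bottleneck.
# The article does not describe how Gimlet actually makes this choice.
PREFERRED_POOL = {
    Bottleneck.COMPUTE: "gpu",
    Bottleneck.MEMORY: "high-memory",
    Bottleneck.NETWORK: "cpu",
}


@dataclass(frozen=True)
class Step:
    name: str
    bottleneck: Bottleneck


def route(steps):
    """Return (step name, hardware pool) pairs, one per step, in order."""
    return [(step.name, PREFERRED_POOL[step.bottleneck]) for step in steps]


# The three step types named in the quote above.
agent = [
    Step("inference", Bottleneck.COMPUTE),
    Step("decode", Bottleneck.MEMORY),
    Step("tool call", Bottleneck.NETWORK),
]

plan = route(agent)
for name, pool in plan:
    print(f"{name} -> {pool}")
```

A real orchestrator would also have to weigh availability, data movement, and cost; the fixed lookup table here only shows the idea of matching each step to the hardware that fits its bottleneck.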

No chip yet does it all, but as new hardware gets rolled out and aging GPUs get redeployed, “the multi-silicon fleet is ready; it’s just missing the software layer to make it work.” That’s what Tully believes Gimlet Labs offers.

If the current deploy-more-compute trend continues, McKinsey estimates data center spending will tally nearly $7 trillion by 2030. Asgar says that apps are only using the existing hardware already deployed “somewhere between 15 to 30 percent” of the time.

“Another way to think about this: you’re wasting hundreds of billions of dollars because you’re just leaving idle resources,” he said. “Our goal was basically to try to figure out how you can get AI workloads to be 10x more efficient than ever, today.”
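
The arithmetic behind that claim is simple: hardware that is busy 15 to 30 percent of the time sits idle the other 70 to 85 percent. The sketch below works through it. The function names and the spend figure are hypothetical placeholders, not numbers from Asgar or McKinsey.

```python
def idle_share(utilization):
    """Fraction of time the hardware sits idle, given its utilization (0 to 1)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return 1.0 - utilization


def idle_value(spend, utilization):
    """Portion of a hardware spend that corresponds to idle time."""
    return spend * idle_share(utilization)


# Placeholder spend, assumed purely for illustration.
HYPOTHETICAL_SPEND_USD = 1_000_000_000

# The two ends of the utilization range Asgar cites.
for utilization in (0.15, 0.30):
    idle = idle_value(HYPOTHETICAL_SPEND_USD, utilization)
    print(f"utilization {utilization:.0%}: idle share {idle_share(utilization):.0%}, "
          f"idle spend ${idle:,.0f}")
```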

So he and his cofounders, Michelle Nguyen, Omid Azizi, and Natalie Serrino, set about building orchestration software that slices up agentic workloads so that they can be spread simultaneously across all kinds of hardware.

Gimlet Labs claims it reliably speeds up AI inference by 3x to 10x for the same cost and power. Gimlet says it can even slice the underlying model so that it runs across different architectures, using the best chip for each portion of the model.

The company has already partnered with chip makers NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix.

Gimlet’s product, delivered either as software or through an API to its own Gimlet Cloud, isn’t for the rank-and-file AI app developer. It’s for the biggest AI model labs and data centers.

The company publicly launched in October with, it said, eight-figure revenues out of the gate (so at least $10 million). Asgar said that his customer base has more than doubled in the last four months and now includes a major model maker and an extremely large cloud computing company, although he declined to name them.

The cofounders had previously worked together at Pixie, a startup that created an open source observability tool for Kubernetes. Pixie was acquired by New Relic in 2020, just two months after it launched with a $9 million Series A led by Benchmark. (Pixie’s tech is now part of the open source org that oversees Kubernetes.)

After Asgar randomly ran into Tully about a year ago and also received angel investments from Stanford professors, VCs started calling. After launch, a term sheet landed on Asgar’s desk. When VCs heard Asgar was looking at offers, “we got a pretty big swarm of funding,” and the round was quickly oversubscribed, he said.

With the previous seed, the startup has now raised a total of $92 million, including from a slew of angels like Sequoia’s Bill Coughran, Stanford Professor Nick McKeown, former VMware CEO Raghu Raghuram and Intel CEO Lip-Bu Tan. The company currently employs 30 people.

Other investors include Factory, which led the seed, Eclipse Ventures, Prosperity7 and Triatomic.