Transforming AI: d-Matrix’s Pivotal Moments in Pursuit of Gen AI Inference At Scale

d-Matrix founders: Sudeep Bhoja — CTO (left); Sid Sheth — President & CEO (right)

You’ve probably heard the adage, “hardware is hard.” That’s definitely true, and when you’re making chips for the rapidly changing world of AI, things get hard early on. The AI chip space was already crowded when we started (100 start-ups), and d-Matrix recently celebrated five years of anticipating the market need while working as quickly as possible to meet it. But as with any successful start-up, there were pivots along the way. I want to share our journey at d-Matrix over the past five years, and how we’ve gone from being told, “this sounds crazy, you are late, and why would you want to leave what you have,” to being told, “you have prescient vision,” as the generative AI market has boomed. By the way, neither of those descriptions really fits us.

Year Zero (2019): “Taking the leap”

When we left Inphi to start d-Matrix, it’s not like we were in our 20s or 30s; we were well into our 40s. We had young teenage kids in various stages of schooling, and starting from scratch seemed like a foolhardy thing to do.

However, we had the distinct advantage of seeing the AI wave in its infancy, through the rapid adoption of Inphi’s datacenter interconnect products in distributed AI compute applications across all the large cloud service providers. In speaking with these customers, it became clear that AI adoption would only accelerate over time and, specifically, that the AI inference opportunity would dwarf AI training.

I come from a family of generational “Gujarati Bania” entrepreneurs: my grandfathers, father and uncles were all entrepreneurs, and ‘working for someone’ was generally frowned upon. Seeing this massive opportunity in front of us triggered something in my DNA, and we took the leap outside our comfort zone.

Sid Sheth in his first office at d-Matrix, 3 months after formation in 2019

Year One (2020): “Two weeks from running out of cash”

We started building, and then 2020 arrived. It brought a global pandemic, Sudeep had recently lost his father, the team was working from home for the first time, #BlackLivesMatter protests were taking place around the country and, to top it all off, California had its worst fire season, which led to mass evacuations close to home. There was a feeling of unease. In the middle of it all, our former company was acquired for $10B while d-Matrix had a “near-death experience” in which we almost ran out of money. But we focused on what we could control: our team grew to 10 people, and we successfully built and packaged our first chip in 12 months, complete with a software stack that could run live workloads.

Sudeep was adamant that we have the software working in tandem with the hardware from the beginning, and we’ve kept that focus on software and hardware throughout our growth. We initially used a charge-based analog in-memory compute architecture with a single-slope analog-to-digital converter, but we quickly concluded that it was challenging to fit converters on each bitline and that the effort was not worth the reward. We had to pivot.

There was a lot of uncertainty on the horizon. But we kept digging relentlessly into the technology-market fit and product-market fit problems. This is where we ran into Transformer-based workloads and the idea of using a digital approach to in-memory compute, with SRAM bit cells combined with adders in a custom circuit fabric, to make inference more efficient. So we quickly pivoted to an all-digital solution that could scale without sacrificing accuracy, predictability and programmability, all of which are a must for the at-scale datacenter inference market. It was a pretty crazy time!
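For readers who want a feel for what “SRAM bit cells combined with adders” can mean, here is a minimal software sketch of a generic bit-serial digital in-memory dot product. It is an illustration under stated assumptions, not a description of d-Matrix’s actual circuits: the function names (`adder_tree`, `digital_imc_dot`), the 8-bit unsigned weights and the bit-serial scheme are all hypothetical choices made for this example.

```python
def adder_tree(values):
    """Sum a list pairwise, level by level, like a tree of two-input adders."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-sized levels with a zero input
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def digital_imc_dot(weights, activations, weight_bits=8):
    """Bit-serial dot product (hypothetical model, not a real chip design).

    Each weight is treated as `weight_bits` stored bits. For every bit
    position, a stored bit either passes or blocks its activation
    (a 1-bit multiply), an adder tree sums the column, and the partial
    sum is shifted into place and accumulated.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    if any(w < 0 or w >= (1 << weight_bits) for w in weights):
        raise ValueError("weights must be unsigned and fit in weight_bits")

    total = 0
    for bit in range(weight_bits):
        gated = [a if (w >> bit) & 1 else 0
                 for w, a in zip(weights, activations)]
        total += adder_tree(gated) << bit
    return total


if __name__ == "__main__":
    weights = [3, 5, 7]
    activations = [2, 4, 6]
    # Matches the ordinary dot product 3*2 + 5*4 + 7*6 exactly.
    print(digital_imc_dot(weights, activations))  # 68
```

Because every step in this sketch is integer logic, the result is bit-exact and repeatable, which is the kind of accuracy and predictability that an analog approach with per-bitline converters makes harder to guarantee.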

We knew we needed at least an additional $2 million to get silicon manufactured for a customer demo, but we were struggling to get investors to give us that extra cash. After much struggle, and coming within two weeks of running out of cash, we had an a-ha moment: “What kind of chip company are we building with an additional $2 million? Why don’t we go ask for $40 million and show a big vision to match the size of the opportunity we are going after?” That was one early lesson: don’t sell yourself short when you have the opportunity to make a bigger impact. Sometimes it’s easier to get $40 million than $2 million when you are serving a large market opportunity, and you need to signal that to investors. We pivoted our plans, became bold and decided to swing for the fences.

d-Matrix first POC chip — Nighthawk

Year Two (2021): “Wait, I think we are onto something”

Our bold new approach to fundraising, together with our focus on Transformers and digital in-memory compute technology for inference, yielded results quickly. While our first Nighthawk chip (SRAM analog in-memory compute) wasn’t a home run, it showcased the team’s grit, experience and ability to execute through adversity. All this led to a $40M Series A from Playground Global and Microsoft M12, and we realized we might be onto something. We felt like we were finally catching some wind.

d-Matrix first Nighthawk demo for investors in 2021 at their Cupertino labs

Year Three (2022): “Pre-ChatGPT”

Now that we had set a course and made a bet, it was time to start building a product. Our first product was built for the non-generative Transformers (BERT, T5) that were all the rage in early 2022. However, GPT-3 had been launched, and there was a lot of chatter about the “generative” capabilities of these new models. After numerous conversations with our customers, we realized that we needed to pivot yet again: our product architecture had to accommodate the generative abilities of Transformers. By now, the team had become familiar with pivots and had been tested multiple times. So we did what we do best, reacting quickly and pivoting to a new generative AI inference architecture. This is what became Corsair, our first chiplet-based PCIe accelerator for generative AI inference. Then ChatGPT took the world by storm that November. Our world, like everyone else’s, was about to change.

d-Matrix Jayhawk I and Jayhawk II chips

Year Four (2023): “Holy smokes moment”

The launch of ChatGPT was a “holy smokes” moment for d-Matrix, to put it politely. We realized that not only were we on to something, we were on to something really, really big. It was time to accelerate our plans and raise more capital to double down on our bet. But as the world tried to figure out what the post-pandemic “new normal” was, VC funding dried up across the board as the Fed hiked rates, and almost 3,000 startups failed in a single year, in what many called a mass extinction event for startups.

Amid these headwinds, we went out and raised $110 million for our Series B round. By this point, at least, we were looking less crazy and more prescient. It only took us four years! But we had come through the trial by fire, and it seemed we now had the capital to scale our dreams and ambitions.

Sid Sheth (President & CEO) holding up their Nighthawk and Jayhawk chips

Year Five and Beyond (2024-): “The world comes to you”

d-Matrix turned five last month, and the team’s relentless faith and grit have paid off. Corsair promises to be the transformative platform for generative AI inference in the datacenter, featuring 2GB of high-density SRAM, 8 chiplets, a blazing 10 PetaOPS of performance and 256GB of DDR, all on a single card, along with a full-stack software suite built using open source frameworks, an MLIR compiler stack and libraries. We are slowly but surely getting there. There is still a long way to go, but you’ve just got to keep your head down, listen, stay humble, stay paranoid and execute, and hopefully the world will come to you over time.

I want to take a moment to thank all of our d-Matrix employees. Beyond our Silicon Valley headquarters, we now have design centers in Bangalore, Sydney, Seattle and Toronto to capitalize on fast-growing talent hubs. With all this progress, we’ve reached an inflection point where talk turns to action for deploying AI. The world has seen what training can create; now we need superior inference solutions to bring the benefits of gen AI to companies across the business spectrum. With our launch on the horizon, we’re more excited than ever to drive greater efficiency and value, shaping a more sustainable future for AI.

To join this industry-defining journey, check out our open roles at d-matrix.ai/careers. Learn more about our platform here.