Enables 60,000 tokens/second at 1 ms/token for Llama3 8B in a single server and 30,000 tokens/second at 2 ms/token for Llama3 70B in a single rack
SC24 – ATLANTA, GA – November 19, 2024 – d-Matrix today unveiled Corsair™, an entirely new computing paradigm designed from the ground up for the next era of AI inference in modern datacenters. Corsair leverages d-Matrix’s innovative Digital In-Memory Compute (DIMC) architecture, an industry first, to accelerate AI inference workloads with industry-leading real-time performance, energy efficiency, and cost savings compared to GPUs and other alternatives.
The emergence of reasoning agents and interactive video generation represents the next level of AI capabilities. These leverage more inference computing power to enable models to “think” more and produce higher quality outputs. Corsair is the ideal inference compute solution with which enterprises can unlock new levels of automation and intelligence without compromising on performance, cost or power.
“We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time,” said Sid Sheth, cofounder and CEO of d-Matrix. “The first-of-its-kind Corsair compute platform brings blazing fast token generation for high interactivity applications with multiple users, making Gen AI commercially viable.”
Analyst firm Gartner predicts a 160% increase in data center energy consumption over the next two years, driven by AI and GenAI. As a result, Gartner estimates that 40% of existing AI data centers will be operationally constrained by power availability by 2027 (1). At that scale, deploying AI models could quickly become cost-prohibitive.
d-Matrix Industry Firsts and Breakthroughs
d-Matrix combines several world-first innovations in silicon, software, chiplet packaging and interconnect fabrics to accelerate AI inference.
Generative inference is inherently memory bound. d-Matrix breaks through this memory bandwidth barrier with a novel DIMC architecture that tightly integrates memory and compute. Scaling is achieved using DMX Link™ for high-speed, energy-efficient die-to-die connectivity across chiplets in a package, and DMX Bridge™ for connecting packages across two cards. d-Matrix is among the first in the industry to natively support block floating point numerical formats, now an OCP standard called Microscaling (MX), for greater inference efficiency. These innovations are integrated under the hood by d-Matrix’s Aviator software stack, which gives AI developers a familiar user experience and tooling.
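For readers unfamiliar with block floating point, the idea can be illustrated with a minimal sketch: a block of values shares a single power-of-two scale, and each value is stored as a small integer mantissa. The code below is a simplified illustration only. It is not d-Matrix’s implementation and does not follow the OCP MX specification’s exact block sizes or element encodings; the function names and the 8-bit mantissa width are assumptions chosen for the example.

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to integer mantissas that share one
    power-of-two exponent (hypothetical helper, simplified scheme)."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    qmax = 2 ** (mantissa_bits - 1) - 1  # 127 for 8-bit signed mantissas
    # Smallest exponent whose scale keeps the largest value within range.
    shared_exp = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** shared_exp
    # Round to nearest and clamp defensively against floating-point edge cases.
    mantissas = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]

exp, mants = bfp_quantize([0.5, -1.0, 0.25, 0.75])
print(exp, mants)                  # -6 [32, -64, 16, 48]
print(bfp_dequantize(exp, mants))  # [0.5, -1.0, 0.25, 0.75]
```

Because the exponent is stored once per block rather than once per value, each element needs only a narrow integer, which reduces the number of bytes that must be moved per weight. The trade-off is that small values in a block with one large outlier lose precision, since they all share the outlier's scale.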
Corsair comes in an industry-standard PCIe Gen5 full-height, full-length card form factor, with pairs of cards connected via DMX Bridge cards. Each Corsair card is powered by DIMC compute cores with 2,400 TFLOPS of 8-bit peak compute, 2 GB of integrated Performance Memory, and up to 256 GB of off-chip Capacity Memory. The DIMC architecture delivers ultra-high memory bandwidth of 150 TB/s, significantly higher than HBM. Compared to GPUs and other alternatives, Corsair delivers up to 10x faster interactive speed, 3x better performance per total cost of ownership (TCO), and 3x greater energy efficiency*.
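The headline throughput and latency figures can be related with simple arithmetic. The sketch below assumes that the tokens/second numbers are aggregate throughput across all users and that the ms/token numbers are per-user token latency; the release does not state this explicitly, and the function name is hypothetical.

```python
def implied_concurrency(tokens_per_second, ms_per_token):
    """Number of simultaneous streams implied by an aggregate throughput
    and a per-stream token latency (assumed interpretation)."""
    per_stream_rate = 1000.0 / ms_per_token  # tokens/second for one stream
    return tokens_per_second / per_stream_rate

print(implied_concurrency(60_000, 1.0))  # Llama3 8B, single server -> 60.0
print(implied_concurrency(30_000, 2.0))  # Llama3 70B, single rack  -> 60.0
```

Under that reading, both configurations correspond to roughly 60 concurrent streams, each receiving tokens at the quoted per-token latency.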
“d-Matrix is at the forefront of a monumental shift in Gen AI as the first company to fully address the pain points of AI in the enterprise,” said Michael Stewart, managing partner of M12, Microsoft’s Venture Fund. “Built by a world-class team and introducing category-defining breakthroughs, d-Matrix’s compute platform radically changes the ability for enterprises to access infrastructure for AI operations and enable them to incrementally scale out operations without the energy constraints and latency concerns that have held AI back from enterprise adoption. d-Matrix is democratizing access to the hardware needed to power AI in standard form factor to make Gen AI finally attainable for everyone.”
Availability of d-Matrix Corsair inference solutions
Corsair is sampling to early-access customers and will be broadly available in Q2 2025. d-Matrix is collaborating with OEMs and system integrators to bring Corsair-based solutions to market.
“We are excited to collaborate with d-Matrix on their Corsair ultra-high bandwidth in-memory compute solution, which is purpose-built for generative AI, and accelerate the adoption of sustainable AI computing,” said Vik Malyala, Senior Vice President for Technology and AI, Supermicro. “Our high-performance end-to-end liquid- and air-cooled systems incorporating Corsair are ideal for next-level AI compute.”
“Combining d-Matrix’s Corsair PCIe card with GigaIO SuperNODE’s industry-leading scale-up architecture creates a transformative solution for enterprises deploying next-generation AI inference at scale,” said Alan Benjamin, CEO at GigaIO. “Our single-node server supports 64 or more Corsairs, delivering massive processing power and low-latency communication between cards. The Corsair SuperNODE eliminates complex multi-node configurations and simplifies deployment, enabling enterprises to quickly adapt to evolving AI workloads while significantly improving their TCO and operational efficiency.”
“By integrating d-Matrix Corsair, Liqid enables unmatched capability, flexibility, and efficiency, overcoming traditional limitations to deliver exceptional inference performance. In the rapidly advancing AI landscape, we enable customers to meet stringent inference demands with Corsair’s ultra-low latency solution,” said Sumit Puri, Co-Founder at Liqid.
d-Matrix is headquartered in Santa Clara, California with offices in Bengaluru, Toronto, Sydney and Belgrade.
To learn more about Corsair, please visit d-matrix.ai.
About d-Matrix
d-Matrix advances compute for modern AI workloads and breaks through key barriers that have stymied chips for AI, including performance, efficiency, and scalability. d-Matrix is a pioneer in Digital In-Memory Computing (DIMC) solutions that solve limitations on generative AI inference acceleration. d-Matrix creates flexible solutions for inference at scale using innovative circuits and packaging techniques, a chiplet-based DIMC architecture, innovative interconnect fabric, and hardware-software codesign. Founded in 2019, the company is backed by top investors and strategic partners including Playground Global, M12 (Microsoft Venture Fund), Temasek, Triatomic Capital, Industry Ventures, Nautilus Venture Partners, Mirae Asset and Entrada Ventures.
Visit d-matrix.ai for more information and follow d-Matrix on LinkedIn for the latest updates.
References
(1) Gartner, “Emerging Tech: Power Shortages Will Restrict GenAI Growth and Implementation,” October 2024.
* Performance, cost and power estimates are preliminary and subject to change. Results may vary.