November 19, 2024
Karl Freund, Founder and Principal Analyst, Cambrian-AI Research LLC:
Startup launches “Corsair” AI platform with Digital In-Memory Computing; using on-chip SRAM, it can produce 30,000 tokens/second at 2 ms/token latency for Llama3 70B in a single rack.
d-Matrix uses a hybrid approach to memory that appears to deliver excellent results, using SRAM as “Performance Memory” and a larger DRAM store as “Capacity Memory”. The Performance Memory serves online operations that need low latency for interactivity, while the Capacity Memory serves offline work.
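To put the headline figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that 30,000 tokens/second is the aggregate rack throughput and that 2 ms/token is the per-stream inter-token latency; the announcement does not spell out either definition, so both are assumptions, and the function names are hypothetical.

```python
def per_stream_tokens_per_s(latency_ms_per_token: float) -> float:
    """Tokens per second one stream receives at a given inter-token latency."""
    return 1000.0 / latency_ms_per_token


def implied_concurrent_streams(aggregate_tokens_per_s: float,
                               latency_ms_per_token: float) -> float:
    """Number of simultaneous streams implied by aggregate throughput.

    Assumption: every stream runs at the same latency and the aggregate
    figure is simply the sum of the per-stream rates.
    """
    return aggregate_tokens_per_s / per_stream_tokens_per_s(latency_ms_per_token)


if __name__ == "__main__":
    rate = per_stream_tokens_per_s(2.0)
    streams = implied_concurrent_streams(30_000.0, 2.0)
    print(f"per-stream rate: {rate:.0f} tokens/s")
    print(f"implied concurrent streams: {streams:.0f}")
```

Under those assumptions, each stream would see 500 tokens/second and the rack figure would correspond to about 60 concurrent streams. If the vendor measures latency or throughput differently, the numbers change accordingly.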