Spintronics and the Memory Architecture AI Actually Needs

[Figure: Magnetic tunnel junction memory cell cross-section diagram]

The AI memory problem doesn't get the attention it deserves. GPU compute performance has been scaling roughly 2x every two years. Memory bandwidth has improved, but more slowly. The gap between what processors can compute and what memory can supply is a fundamental architectural bottleneck — and it's getting worse as model sizes grow. Training a large language model isn't primarily a compute problem at the largest scales; it's a memory bandwidth and energy problem.

The conventional response has been to add more DRAM, organize it in stacked configurations (HBM — high bandwidth memory), and push the memory closer to the compute die using 2.5D integration with silicon interposers. HBM3 provides roughly 900 GB/s per stack. For the largest models training on thousands of chips, that's still not enough. And HBM is expensive — it uses custom DRAM processes, advanced packaging, and consumes significant power in the refresh and data movement cycles that are intrinsic to volatile DRAM architecture.
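As a back-of-envelope illustration, not a benchmark, the sketch below estimates how long it takes just to read every weight once from HBM. The model size, precision, and stack count are hypothetical; the per-stack bandwidth is the round figure quoted above.

```python
def weight_sweep_seconds(params_billion, bytes_per_param, stacks,
                         gb_per_s_per_stack=900.0):
    """Time to stream all weights once, assuming perfect bandwidth utilization."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    bandwidth_bytes_per_s = stacks * gb_per_s_per_stack * 1e9
    return model_bytes / bandwidth_bytes_per_s

# Hypothetical example: 70B parameters at 2 bytes each, 6 HBM stacks.
sweep = weight_sweep_seconds(70, 2, 6)
print(f"one pass over the weights: {sweep * 1e3:.1f} ms")
```

Real systems fall short of this bound, since it ignores activations, optimizer state, and imperfect bandwidth utilization.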

Why In-Memory Computing Changes the Arithmetic

The standard compute paradigm moves data from memory to the compute unit, processes it, and moves results back. Every byte of model weights traverses the memory bus multiple times per forward pass. At scale, this movement dominates energy consumption. Neural network inference on a transformer model moves roughly 10-15 bits of data per multiply-accumulate operation in the compute unit — the arithmetic is nearly free compared to the data movement.
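To make the "arithmetic is nearly free" point concrete, here is a minimal sketch of the energy split per multiply-accumulate. The bits-per-MAC value sits inside the 10-15 range above; the two energy figures are placeholder assumptions chosen only to show the shape of the calculation, not measurements of any chip.

```python
def movement_share(bits_per_mac, pj_per_bit_moved, pj_per_mac):
    """Fraction of per-MAC energy spent moving data rather than computing."""
    movement_pj = bits_per_mac * pj_per_bit_moved
    return movement_pj / (movement_pj + pj_per_mac)

# Assumed values: 12 bits moved per MAC, 5 pJ per bit moved, 0.5 pJ per MAC.
share = movement_share(bits_per_mac=12, pj_per_bit_moved=5.0, pj_per_mac=0.5)
print(f"data movement share of energy: {share:.1%}")
```

Plugging in your own figures for a given memory interface and process node changes the exact share, but the conclusion holds whenever the per-bit movement cost is comparable to or larger than the per-MAC cost.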

In-memory computing inverts this. Instead of moving weights to compute, you perform the multiply-accumulate operations inside the memory array itself. The result is a dramatic reduction in data movement energy. Early demonstrations of resistive memory (ReRAM) and phase change memory (PCM) based in-memory computing show energy efficiency improvements of 10-100x for matrix-vector multiply operations compared to digital compute with conventional memory.
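One common way to picture analog in-memory multiply-accumulate is a resistive crossbar: weights are stored as cell conductances, inputs are applied as row voltages, and each column wire sums the resulting currents. The sketch below is an idealized model (no wire resistance, noise, or ADC quantization); the function name and values are ours, for illustration only.

```python
def crossbar_mvm(conductances, voltages):
    """Idealized crossbar: column current I_j = sum_i G[i][j] * V[i].

    conductances[i][j] is the cell conductance (siemens) at row i, column j.
    voltages[i] is the voltage applied to row i.
    """
    n_cols = len(conductances[0])
    return [
        sum(row[j] * v for row, v in zip(conductances, voltages))
        for j in range(n_cols)
    ]

# Hypothetical 2x2 array: Ohm's law per cell, Kirchhoff's current law per column.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.1, 0.2]
currents = crossbar_mvm(G, V)
print(currents)
```

The whole matrix-vector product happens in one read step, which is where the data movement saving comes from: the weights never leave the array.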

The challenge with ReRAM and PCM is endurance and variability. These devices degrade after a finite number of write cycles (typically 10^6 to 10^8), and their resistance states have stochastic variation that requires analog-to-digital conversion overhead to read reliably. For inference — where you write the weights once and read them many times — these are manageable constraints. For training — which requires continuous weight updates — they're more problematic.
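A quick way to see why endurance matters more for training than for inference is to divide the cycle budget by the write rate. The sketch below does only that; the write rates are hypothetical, and it ignores wear leveling, error correction, and gradual degradation.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_until_wearout(endurance_cycles, writes_per_second):
    """Naive lifetime of one cell written at a constant rate."""
    return endurance_cycles / writes_per_second / SECONDS_PER_YEAR

# Assumed training case: each weight updated 1,000 times per second.
training_years = years_until_wearout(1e8, 1e3)
# Assumed inference case: weights rewritten once per day.
inference_years = years_until_wearout(1e6, 1 / 86400)

print(f"training:  {training_years * 365:.1f} days")
print(f"inference: {inference_years:.0f} years")
```

Even at the optimistic end of the endurance range, the assumed training workload exhausts a cell in about a day, while the pessimistic end comfortably outlives an inference deployment.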

STT-MRAM: The Magnetic Alternative

Spin-transfer torque magnetoresistive RAM (STT-MRAM) operates on a different physical principle. Instead of trapping charge or inducing a phase change, it stores information in the magnetic orientation of a thin film. A magnetic tunnel junction (MTJ) — the core device — consists of two magnetic layers separated by a thin insulating barrier. The electrical resistance of the junction depends on whether the two magnetic layers are aligned parallel or antiparallel. Switching between the two states requires passing a spin-polarized current through the junction, which transfers angular momentum and flips the free layer's magnetization.
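The read side of this can be sketched in a few lines: the tunnel magnetoresistance (TMR) ratio measures how far apart the two resistance states are, and a read compares the measured resistance to a reference between them. The resistance values and the convention that antiparallel encodes a 1 are assumptions for illustration.

```python
def tmr_ratio(r_parallel, r_antiparallel):
    """TMR = (R_AP - R_P) / R_P, the relative gap between the two states."""
    return (r_antiparallel - r_parallel) / r_parallel

def read_bit(r_measured, r_parallel, r_antiparallel):
    """Compare against a midpoint reference; antiparallel is read as 1 (assumed)."""
    r_reference = (r_parallel + r_antiparallel) / 2
    return 1 if r_measured > r_reference else 0

# Hypothetical junction: 5 kOhm parallel, 10 kOhm antiparallel.
R_P, R_AP = 5_000.0, 10_000.0
print(f"TMR: {tmr_ratio(R_P, R_AP):.0%}")
print(read_bit(9_600.0, R_P, R_AP), read_bit(5_300.0, R_P, R_AP))
```

A larger TMR ratio leaves more margin for device-to-device variation before a read lands on the wrong side of the reference.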

The advantages over flash, PCM, and SRAM are compelling for certain applications:

Endurance: STT-MRAM devices have demonstrated endurance exceeding 10^12 write cycles — orders of magnitude better than flash or PCM
Speed: Read and write operations in the nanosecond to tens-of-nanoseconds range, far faster than flash and approaching SRAM for reads
Non-volatility: No power required to maintain stored state — unlike SRAM, which requires continuous power
Scalability: MTJ dimensions have been demonstrated below 10nm without significant degradation in thermal stability
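The thermal stability mentioned in the last item is usually expressed as a factor Δ, the energy barrier between the two magnetic states divided by the thermal energy k_B·T. A standard first-order estimate of single-bit retention is the Néel-Arrhenius relation, retention ≈ τ0·exp(Δ). The sketch below assumes the conventional attempt time τ0 of about 1 ns and ignores array-level statistics.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600

def retention_years(delta, tau0_seconds=1e-9):
    """Neel-Arrhenius estimate: mean time before a thermal flip of one bit."""
    return tau0_seconds * math.exp(delta) / SECONDS_PER_YEAR

for delta in (40, 60):
    print(f"delta = {delta}: about {retention_years(delta):.3g} years")
```

Because the dependence is exponential, a modest loss of Δ as devices shrink translates into a very large loss of retention, which is why holding thermal stability at small dimensions matters.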

The tradeoff is write energy. Switching an MTJ requires a critical current density that, while lower than in earlier generations, still makes a write cost more energy than an SRAM write. For applications where you write infrequently and read frequently (the weight matrix in a deployed neural network, the lookup tables in edge inference chips), this tradeoff is favorable. You're essentially paying more per write in exchange for non-volatility and the elimination of standby power: the leakage of SRAM and the refresh of DRAM.
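The tradeoff can be framed as a break-even: how long must a bit sit idle between writes before the standby energy a volatile cell would have burned exceeds the extra energy the MTJ write cost? The per-bit figures below are placeholder assumptions, not device data.

```python
def breakeven_idle_seconds(extra_write_energy_j, standby_power_w):
    """Idle time per write beyond which the non-volatile cell comes out ahead."""
    return extra_write_energy_j / standby_power_w

# Assumed: an MTJ write costs 1 pJ more than the volatile alternative,
# which would otherwise draw 10 pW per bit while holding the data.
t = breakeven_idle_seconds(extra_write_energy_j=1e-12, standby_power_w=10e-12)
print(f"break-even idle time: {t:.2f} s per write")
```

Under these assumed numbers, any bit rewritten less often than about ten times per second favors the non-volatile cell, which is comfortably the case for deployed weights.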

The Neuromorphic Argument

The neuromorphic computing community's interest in spintronics goes beyond conventional memory replacement. MTJ devices can exhibit analog behavior — particularly in the stochastic switching regime near the thermal stability boundary — that makes them interesting as physical implementations of probabilistic synaptic weights. A network of stochastically switching MTJs can, under the right circuit conditions, implement something analogous to synaptic plasticity rules from neuroscience. This is the deep research thread that connects spintronics to neuromorphic computing architecture.
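A simplified way to see the stochastic regime is the thermally activated switching model, in which a sub-critical current lowers the effective barrier and the switching probability for a pulse of length t is 1 - exp(-t/τ), with τ = τ0·exp(Δ·(1 - I/Ic0)). The sketch below uses that textbook-style model with assumed values for Δ and τ0; it is not a description of any particular device or of NovaSpin's work.

```python
import math
import random

def switch_probability(current_ratio, pulse_ns, delta=40.0, tau0_ns=1.0):
    """Thermally activated model, intended for current_ratio = I/Ic0 below 1."""
    tau_ns = tau0_ns * math.exp(delta * (1.0 - current_ratio))
    return 1.0 - math.exp(-pulse_ns / tau_ns)

def sample_switches(current_ratio, pulse_ns, n, seed=0):
    """Draw n independent switching outcomes (1 = switched) for identical pulses."""
    rng = random.Random(seed)
    p = switch_probability(current_ratio, pulse_ns)
    return [1 if rng.random() < p else 0 for _ in range(n)]

p = switch_probability(current_ratio=0.9, pulse_ns=50.0)
outcomes = sample_switches(0.9, 50.0, n=10_000)
print(f"model probability: {p:.3f}")
print(f"sampled fraction:  {sum(outcomes) / len(outcomes):.3f}")
```

The pulse amplitude acts as a tunable knob on the probability of a flip, which is the property that makes the device usable as a hardware random variable in probabilistic networks.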

We're less excited about the neuromorphic angle as a near-term commercial proposition: the programming models for stochastic analog networks are still maturing, and the application-specific integrated circuit (ASIC) design complexity for analog in-memory inference is significant. The nearer-term opportunity is STT-MRAM as embedded non-volatile memory in edge AI chips, replacing the flash-based weight storage that current microcontrollers use with a faster, more energy-efficient, and more write-tolerant solution.

NovaSpin's Position

NovaSpin, our seed investment, headquartered in Zurich with a US entity, is working on MTJ memory specifically for neuromorphic chip integration. Their differentiation is in the MTJ material stack: a perpendicular magnetic anisotropy (PMA) design using L1₀-FePt as the pinned layer, which achieves high thermal stability at smaller device dimensions than conventional CoFeB-based stacks. Smaller devices mean higher density and lower switching currents, both of which improve the energy efficiency argument.
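The link between anisotropy and device size can be sketched with the single-domain estimate Δ = K_eff·V / (k_B·T): for a fixed target Δ, a higher effective anisotropy K_eff permits a smaller magnetic volume V. The code below solves for the minimum diameter of a cylindrical layer. Both K_eff values and the layer thickness are illustrative placeholders, not measured properties of NovaSpin's or anyone else's stack, and the model ignores the interfacial origin of anisotropy in CoFeB as well as non-uniform reversal.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23

def min_diameter_nm(k_eff_j_per_m3, thickness_nm, delta_target=60.0,
                    temperature_k=300.0):
    """Smallest cylinder diameter giving delta_target under a macrospin model."""
    barrier_j = delta_target * BOLTZMANN_J_PER_K * temperature_k
    volume_m3 = barrier_j / k_eff_j_per_m3
    area_m2 = volume_m3 / (thickness_nm * 1e-9)
    return math.sqrt(4.0 * area_m2 / math.pi) * 1e9

# Placeholder anisotropies: a high-anisotropy stack vs. a lower-anisotropy one.
d_high = min_diameter_nm(k_eff_j_per_m3=5e6, thickness_nm=2.0)
d_low = min_diameter_nm(k_eff_j_per_m3=3e5, thickness_nm=2.0)
print(f"high K_eff: {d_high:.1f} nm, low K_eff: {d_low:.1f} nm")
```

The minimum diameter scales as one over the square root of K_eff, so an order-of-magnitude gain in anisotropy buys roughly a threefold reduction in diameter at the same thermal stability.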

The manufacturing integration is the hard problem they're solving. Getting a PMA MTJ process to integrate with a standard CMOS back-end-of-line flow — staying within the thermal budget, achieving acceptable yields on the MgO tunnel barrier, and building the right read/write circuit architecture — requires deep collaboration between materials scientists, device physicists, and circuit designers. The team spans all three disciplines, which is unusual. That's why we invested at seed.

Whether STT-MRAM captures a significant share of the AI accelerator memory market in this decade depends partly on whether the large memory manufacturers (Samsung, SK hynix, Micron) accelerate their own MRAM development programs beyond the embedded applications where they currently sell it. The competitive dynamics are interesting: the startups have better PMA material stacks, but the large players have the manufacturing scale. This typically resolves via acquisition or licensing, not direct competition.

Working on spintronic devices, magnetic memory, or neuromorphic computing architectures? We want to hear about it.

March 21, 2026