max-engine, modular, dgx-spark, infrastructure
Running Gemma-4-31B on DGX Spark with MAX Engine — What Broke and How We Fixed It
Getting Modular's MAX Engine to serve a 31B-parameter model on NVIDIA's smallest Grace Blackwell system.
Research notes, infrastructure war stories, and updates from the workshop.
3,000 interpreted and verified sparse autoencoder features for every layer of Google's Gemma-4-31B.