Ollama: Gemma 4 MLX Support and Mixed-Precision Improvements

The Ollama team merged MLX backend improvements covering mixed-precision quantization and capability detection. Gemma 4 also gains support in the MLX runtime, currently text-only.

2026-04-13T00:00:00Z

Duration: PT2M11S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama

Transcript excerpt

This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.

Good morning, this is your Ollama development briefing for April 13th, 2026.

Daniel Hiltgen merged pull request 15409, delivering mixed-precision quantization and capability detection improvements for the MLX backend. This substantial change adds 368 lines across seven files, introducing audio encoder detection, improved vision support using vision_config instead of architecture names, and…
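To illustrate the idea of detecting capabilities from configuration blocks rather than architecture names, here is a minimal Python sketch. It is not Ollama's code: the function name, the capability labels, and the `audio_config` key are assumptions made for illustration. Only `vision_config` is named in the briefing.

```python
def detect_capabilities(config: dict) -> list:
    """Infer capabilities from which sub-configs are present.

    Hypothetical helper: checks for config blocks instead of matching
    on architecture names, so new model families need no special case.
    """
    caps = ["completion"]  # illustrative label for plain text generation

    # The briefing mentions vision_config as the signal for vision support.
    if isinstance(config.get("vision_config"), dict):
        caps.append("vision")

    # Assumed key: the briefing mentions audio encoder detection but does
    # not say which field is inspected.
    if isinstance(config.get("audio_config"), dict):
        caps.append("audio")

    return caps


if __name__ == "__main__":
    cfg = {"architectures": ["SomeModel"], "vision_config": {"hidden_size": 8}}
    print(detect_capabilities(cfg))
```

Keying on the presence of a config block means a model with an unfamiliar architecture name is still recognized as long as it ships the relevant sub-config.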

Additional commits implement Gemma 4 model support for MLX, currently limited to a text-only runtime. The work includes two performance optimizations: sliding-window prefill masks are memoized and reused across layers, and the softmax over selected experts in router forwarding has been optimized.
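The two optimizations mentioned above can be sketched in plain Python. This is a rough illustration under stated assumptions, not the MLX implementation: the function names, the mask convention, and the choice to pick the top-k experts before normalizing are all hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def sliding_window_mask(seq_len: int, window: int) -> tuple:
    """Causal sliding-window mask: query i may attend to key j
    when i - window < j <= i. Cached so that every layer with the
    same (seq_len, window) reuses one mask instead of rebuilding it."""
    return tuple(
        tuple(i - window < j <= i for j in range(seq_len))
        for i in range(seq_len)
    )


def route(logits: list, k: int) -> list:
    """Pick the k highest-scoring experts, then apply softmax over
    only those k logits (assumed ordering of operations)."""
    top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:k]
    peak = max(logits[e] for e in top)  # subtract max for numerical stability
    exps = [math.exp(logits[e] - peak) for e in top]
    total = sum(exps)
    return [(e, x / total) for e, x in zip(top, exps)]


if __name__ == "__main__":
    # Two "layers" asking for the same mask get the same cached object.
    layer_a = sliding_window_mask(4, 2)
    layer_b = sliding_window_mask(4, 2)
    print(layer_a is layer_b)
    print(route([1.0, 3.0, 2.0, 0.0], k=2))
```

Normalizing over k selected logits rather than the full expert list keeps the softmax cost proportional to k, and the cached mask avoids per-layer recomputation during prefill.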

The team fixed a Metal compiler error that affected Gemma 4 on some systems, caused by an uninitialized variable. There was notable activity around Gemma 4 prompt rendering, with multiple commits and reverts as developers refined the handling of different model sizes, particularly distinguishing behavior between…

Devon Rifkin contributed improvements to restore proper e2b-style nothink prompts, addressing regression issues and separating chat templates for different model sizes. The team also merged updates…

Ad…

Nearby episodes from Ollama

MLX Sampler Improvements 2026-04-18T00:00:00Z
Windows WSL Integration Simplified 2026-04-17T00:00:00Z
Gemma4 Enhancements and Copilot CLI Integration 2026-04-16T00:00:00Z
Hermes Agent Integration and Gemma4 Improvements 2026-04-15T00:00:00Z
Weekly Recap - Model Integration and Tooling Enhancements 2026-04-13T00:00:00Z
ROCm 7.2.1 Performance Update 2026-04-12T00:00:00Z
Gemma4 Parser Improvements 2026-04-11T00:00:00Z
Model Updates and Tool Call Fixes 2026-04-10T00:00:00Z