Ollama: Gemma 4 MLX Support and Mixed-Precision Improvements
The Ollama team merged MLX backend improvements covering mixed-precision quantization and capability detection. Additional commits bring Gemma 4 support to the MLX runtime, currently text-only.
Duration: PT2M11S
Episode overview
This episode is a short developer briefing from Ollama.
It explains recent repository work in plain language.
- Show: Ollama
- Published: 2026-04-13T00:00:00Z
- Audio duration: PT2M11S
Transcript excerpt
This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.
Good morning, this is your Ollama development briefing for April 13th, 2026.
Daniel Hiltgen merged pull request 15409, delivering mixed-precision quantization and capability detection improvements for the MLX backend. This substantial change adds 368 lines across seven files, introducing audio encoder detection, improved vision support using vision_config instead of architecture names, and…
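As a rough illustration of the detection idea described above, the sketch below infers capabilities from the presence of config sections rather than from architecture names. It is not Ollama's code: the function name `detect_capabilities`, the capability labels, and the `audio_config` key are assumptions made for this example. Only `vision_config` is named in the briefing.

```python
def detect_capabilities(config: dict) -> set:
    """Infer model capabilities from a parsed model config (hypothetical helper).

    The check looks at which config sections exist instead of matching
    the architecture name against a hard-coded list.
    """
    capabilities = {"completion"}  # assumed baseline label for text generation

    # Vision: keyed on the presence of a vision_config section.
    if isinstance(config.get("vision_config"), dict):
        capabilities.add("vision")

    # Audio: "audio_config" is an assumed key name, used only for illustration.
    if isinstance(config.get("audio_config"), dict):
        capabilities.add("audio")

    return capabilities


# A config with an unfamiliar architecture name is still detected as
# vision-capable because the vision_config section is present.
example = {"architectures": ["SomeNewModel"], "vision_config": {"hidden_size": 8}}
print(sorted(detect_capabilities(example)))
```

The appeal of this approach is that a new model family needs no code change as long as its config carries the expected section.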
In additional commits, we see the implementation of Gemma 4 model support for MLX with text-only runtime capability. This includes two performance optimizations: memoized sliding-window prefill masks across layers and optimized softmax operations over selected experts in router forwarding.
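The two optimizations mentioned above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the MLX implementation: the names `sliding_window_mask` and `route` are hypothetical, the mask is a tuple of booleans rather than a tensor, and the router is assumed to normalize weights over only the chosen experts.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def sliding_window_mask(seq_len: int, window: int) -> tuple:
    """Causal sliding-window mask, built once per (seq_len, window).

    Query position i may attend to key position j when 0 <= i - j < window.
    Every layer that uses the same window receives the cached object
    instead of rebuilding the mask during prefill.
    """
    return tuple(
        tuple(0 <= i - j < window for j in range(seq_len))
        for i in range(seq_len)
    )


def route(logits: list, k: int) -> tuple:
    """Pick the top-k experts, then apply softmax over those k logits only.

    This avoids exponentiating every expert's logit. It gives the same
    weights as a full softmax followed by renormalizing over the top-k.
    """
    top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:k]
    peak = max(logits[e] for e in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[e] - peak) for e in top]
    total = sum(exps)
    return top, [x / total for x in exps]


# Two layers asking for the same mask share a single cached object.
layer_a = sliding_window_mask(4, 2)
layer_b = sliding_window_mask(4, 2)
print(layer_a is layer_b)

experts, weights = route([1.0, 3.0, 2.0, 0.0], k=2)
print(experts, weights)
```

Both changes remove repeated work: the mask depends only on sequence length and window size, so it need not be recomputed per layer, and the unselected experts' weights are discarded anyway.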
The team addressed a Metal compiler error affecting Gemma 4 on some systems, fixing an uninitialized variable issue. There was notable activity around Gemma 4 prompt rendering, with multiple commits and reverts as developers refined the handling of different model sizes, particularly distinguishing behavior between…
Devon Rifkin contributed improvements to restore proper e2b-style nothink prompts, addressing regression issues and separating chat templates for different model sizes. The team also merged updates…
Ad…
Nearby episodes from Ollama
- MLX Sampler Improvements
- Windows WSL Integration Simplified
- Gemma4 Enhancements and Copilot CLI Integration
- Hermes Agent Integration and Gemma4 Improvements
- Weekly Recap - Model Integration and Tooling Enhancements
- ROCm 7.2.1 Performance Update
- Gemma4 Parser Improvements
- Model Updates and Tool Call Fixes