Ollama: Speculative Decoding and Codex App Updates

The Ollama team merged five pull requests: one adds DFlash speculative decoding to the MLX runner to improve performance, and four refine the Codex app, including its restart mechanisms and documentation.

2026-05-15T10:01:04Z

Duration: PT0S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama

Transcript excerpt

Good morning, this is your Ollama developer briefing for May 15th, 2026.

Patrick Devine merged a significant performance enhancement, adding DFlash speculative decoding to the MLX runner. This 1,900-line addition introduces block diffusion speculative decoding with support for Qwen 3.6 models, both mixture-of-experts and dense variants. The implementation includes draft model recurrent…

Parth Sareen contributed four merged pull requests centered on the Codex app. The first improved restart reliability with more robust restart mechanisms while keeping the existing safeguards. Sareen also updated UI copy across the launch commands and registry components, and made substantial…

The remaining commits mirror these merged pull requests; none introduces standalone changes.

What's next: The team appears focused on stabilizing the Codex app for launch while continuing MLX runner optimizations. Performance testing of the new speculative decoding implementation will likely be a priority.

That's your Ollama update for today. Back tomorrow with more developer news.

Nearby episodes from Ollama

Startup Performance Optimization 2026-05-20T10:01:13Z
Codex Integration Enhancement 2026-05-19T10:00:56Z
Weekly Recap - MLX Performance & Codex Integration 2026-05-17T10:00:53Z
Release Build Optimization 2026-05-16T10:01:05Z
MLX Sampler Overhaul and Codex Integration 2026-05-14T10:00:56Z
Vision Model Integration Enhancement 2026-05-13T10:00:58Z
MLX Threading and Claude Image Fixes 2026-05-12T10:00:54Z
Model Transfer Optimization and Test Reliability 2026-05-09T10:00:51Z