Ollama: MLX Runner Gets Rock Solid

Jesse Gross delivered a comprehensive overhaul of the MLX runner with two major pull requests and supporting commits focused on memory management and reliability. The changes include proper memory reporting through `ollama ps`, context limit enforcement similar to cloud services, and critical panic fixes that make the MLX runner much more stable for production use.

2026-02-28T11:03:09Z

Duration: PT4M7S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama

Transcript excerpt

This excerpt is abridged. Listen to the episode or use the RSS feed for the full update.

Hey there, fellow developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to dive into today. Grab your favorite beverage because we're talking about some seriously solid improvements that are going to make your MLX experience so much better.

So February 28th was quite the day in the Ollama repository, and I have to give a huge shoutout to Jesse Gross who has been absolutely crushing it with MLX runner improvements. We've got two merged pull requests and a handful of supporting commits that tell a really compelling story about taking software from "works…

Let's start with the big story here. The first major pull request is titled "MLX runner memory fixes," and friends, this is exactly the kind of work that makes me excited about software engineering. You know how frustrating it can be when you're running models and you have no idea what's actually happening with your…

Jesse tackled three core problems that were making the MLX runner feel a bit unpredictable. First up, memory reporting. Before this change, when you ran `ollama ps` to check what was going on, you'd get these static estimates that only included model…

The…
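For listeners who want to look at the memory numbers behind `ollama ps` themselves, here is a minimal sketch. It assumes a JSON document shaped like the response of Ollama's `GET /api/ps` endpoint, with a `models` list whose entries carry `name`, `size`, and `size_vram` in bytes. The helper name `summarize_running_models` and the sample payload are made up for illustration, and the episode does not confirm exactly which values the MLX runner now reports in these fields.

```python
import json


def summarize_running_models(payload: str) -> list[str]:
    """Summarize memory use from JSON shaped like a GET /api/ps response.

    Assumption: each entry in "models" has "name", "size" (total bytes)
    and "size_vram" (bytes resident on the GPU).
    """
    lines = []
    for model in json.loads(payload).get("models", []):
        total = model.get("size", 0)
        vram = model.get("size_vram", 0)
        # Guard against a zero total so an empty entry cannot divide by zero.
        gpu_pct = round(100 * vram / total) if total else 0
        lines.append(
            f"{model.get('name', '?')}: {total / 2**30:.1f} GiB total, "
            f"{gpu_pct}% on GPU"
        )
    return lines


# Hypothetical sample payload; a real one would come from a running server.
SAMPLE = json.dumps(
    {
        "models": [
            {
                "name": "example-model:latest",
                "size": 6 * 2**30,
                "size_vram": 6 * 2**30,
            }
        ]
    }
)

if __name__ == "__main__":
    for line in summarize_running_models(SAMPLE):
        print(line)
```

Against a live server, the same function could be fed the body of a request to `/api/ps` on the local Ollama port instead of the sample string.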

Nearby episodes from Ollama

Smarter Constraints and Qwen3.5 Boost 2026-03-05T11:04:48Z
Cloud Integration Drama and AI Model Expansion 2026-03-04T11:10:53Z
Smarter Sampling and Crash Prevention 2026-03-02T11:01:42Z
Building Bridges for Better Model Compatibility 2026-03-01T11:01:49Z
Tool Calling Gets Smarter 2026-02-27T11:02:49Z
Cleaner Shutdowns and Faster Startups 2026-02-26T11:03:23Z
Qwen 3.5 Architecture Lands with Safety Upgrades 2026-02-25T11:01:49Z
Memory Management Revolution 2026-02-24T11:06:08Z