Ollama: Smarter Sampling and Crash Prevention

Jeffrey Morgan merged two key improvements today: a substantial enhancement to the sampling system that adds repeat-based sampling, and a fix that prevents crashes in the Qwen3Next model's DeltaNet when split offloading is used. The team collaborated with community contributor Yossi Ovadia on the crash fix.

2026-03-02T11:01:42Z

Duration: PT3M49S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama

Transcript excerpt

This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.

Hey there, amazing developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to dig into today, March 2nd, 2026. Grab your favorite beverage because we're talking about some really thoughtful improvements that are going to make your AI experiences…

Let's jump right into the big story of the day - Jeffrey Morgan just merged a fantastic enhancement to Ollama's sampling system. This is one of those changes that might sound technical at first, but it's actually pretty exciting when you think about what it means for your models.

So what's repeat-based sampling? Think of it like giving your AI a better memory about what it just said. You know how sometimes when you're talking, you might catch yourself repeating a word or phrase, and you naturally course-correct? That's essentially what this new sampling system does for language models. It…
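For readers who want to see the idea in code, here is a minimal sketch of one common form of repeat-aware sampling, the repetition penalty. It is not taken from the merged change; the function names and the `penalty` and `last_n` parameters are assumptions chosen for illustration. The sketch looks at a window of recently generated token IDs and lowers the score of each one before the next token is picked.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1, last_n=64):
    """Return a copy of `logits` with recently seen tokens made less likely.

    Illustrative only: names and defaults are assumptions, not Ollama's code.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    # In this sketch, last_n <= 0 disables the penalty entirely.
    window = recent_tokens[-last_n:] if last_n > 0 else []
    for token_id in set(window):
        if 0 <= token_id < len(adjusted):
            score = adjusted[token_id]
            # With penalty > 1, dividing a positive score or multiplying
            # a negative one both push the score down.
            adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


def greedy_pick(logits):
    """Pick the index of the highest score (greedy decoding)."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [2.0, 1.9, -1.0, 0.5]   # made-up scores for a 4-token vocabulary
history = [0, 0, 2]              # token IDs already generated
adjusted = apply_repeat_penalty(logits, history, penalty=1.5)
```

Without the penalty, greedy decoding would pick token 0 again. With it, token 0's score drops below token 1's, so the model moves on to something it has not just said.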

The implementation here is really solid too. Jeffrey added 193 lines of new code across 8 files, touching everything from the core API types to the documentation, and even adding comprehensive tests. I love seeing changes that come with proper test coverage - that's exactly the kind of…

Wha…

Nearby episodes from Ollama

Cloud Models Get Smarter & Build Performance Boost 2026-03-07T11:18:50Z
Cloud Integrations Get Some Love 2026-03-06T11:04:22Z
Smarter Constraints and Qwen3.5 Boost 2026-03-05T11:04:48Z
Cloud Integration Drama and AI Model Expansion 2026-03-04T11:10:53Z
Building Bridges for Better Model Compatibility 2026-03-01T11:01:49Z
MLX Runner Gets Rock Solid 2026-02-28T11:03:09Z
Tool Calling Gets Smarter 2026-02-27T11:02:49Z
Cleaner Shutdowns and Faster Startups 2026-02-26T11:03:23Z