Ollama: Tokenizer Love and Better Model Support

Today we're diving into some fantastic tokenizer improvements that make Ollama even more versatile! Daniel Hiltgen delivered two key enhancements - adding SentencePiece-style BPE support for better model compatibility, and fixing a tokenizer configuration bug in the MLX pipeline. Plus, Parth Sareen updated the Pi integration docs to help more developers get started.

2026-04-01T10:00:33Z

Duration: PT3M52S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama
Published: 2026-04-01T10:00:33Z
Audio duration: PT3M52S

Transcript excerpt

This is a short excerpt of the transcript. Listen to the episode or use the RSS feed for the full update.

Hey there, fellow developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to chat about today. Grab your favorite beverage because we're diving into some really cool tokenizer improvements that are going to make your AI models work even better.

So picture this - you know how different AI models sometimes have their own quirky ways of handling text? Well, the Ollama team has been hard at work making sure we can support even more of these models seamlessly. And today's changes are a perfect example of that dedication to compatibility and correctness.

Let's start with the star of the show - Daniel Hiltgen just merged a fantastic enhancement that adds SentencePiece-style BPE support to our tokenizer. Now, if you're thinking "wait, what's that?" - don't worry, I've got you covered. Byte Pair Encoding, or BPE, is basically how we break down text into smaller pieces…
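As a rough illustration of the Byte Pair Encoding idea described above, here is a minimal Python sketch that applies a ranked list of merge rules to one word. It is not Ollama's tokenizer code: the function name `bpe_encode` and the `TOY_MERGES` table are hypothetical, and real tokenizers learn tens of thousands of merges from training data.

```python
def bpe_encode(word, merges):
    """Split a word into characters, then repeatedly apply the
    highest-priority merge rule until no rule matches (toy BPE)."""
    # Earlier entries in `merges` have higher priority (lower rank).
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no adjacent pair has a merge rule
        merged = []
        i = 0
        while i < len(symbols):
            # Merge every occurrence of the chosen pair, left to right.
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge table, for illustration only.
TOY_MERGES = [("l", "o"), ("lo", "w"), ("e", "r")]

print(bpe_encode("lower", TOY_MERGES))  # ['low', 'er']
```

With this toy table, "lower" first merges "l" and "o", then "lo" and "w", then "e" and "r". That leaves two pieces, because no rule joins "low" and "er".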

The cool thing about this update is that some models use a special Unicode character - U+2581 - to represent spaces. It's like a secret code for spaces that certain models prefer. Daniel's implementation adds a new option called WithSentencePieceNormalizer…
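To make the space-marker convention concrete, here is a small Python sketch of SentencePiece-style normalization: spaces become U+2581 ("▁") before tokenization and are turned back into spaces after decoding. The helper names `sp_normalize` and `sp_denormalize` are hypothetical. The excerpt names the `WithSentencePieceNormalizer` option but does not show its signature or exact behavior, so this sketch only illustrates the general convention.

```python
SPACE_MARKER = "\u2581"  # "▁", Unicode LOWER ONE EIGHTH BLOCK


def sp_normalize(text):
    """Replace each ASCII space with the marker before tokenizing."""
    return text.replace(" ", SPACE_MARKER)


def sp_denormalize(text):
    """Turn markers back into spaces after decoding tokens to text."""
    return text.replace(SPACE_MARKER, " ")


normalized = sp_normalize("hello big world")
print(normalized)                  # hello▁big▁world
print(sp_denormalize(normalized))  # hello big world
```

Because the marker is an ordinary character in the vocabulary, a token such as "▁world" carries its leading space with it. That lets decoding rebuild the original spacing without a separate whitespace rule.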

The…

…

Nearby episodes from Ollama

Weekly Recap - Gemma4 Integration & Audio Support 2026-04-06T00:00:00Z
Performance Lessons and Gemma4 Refinements 2026-04-04T10:00:33Z
Gemma4 Arrives with Audio Magic 2026-04-03T10:00:29Z
Modernizing Codex Configuration 2026-04-02T10:00:33Z
Legacy Compatibility and Developer Experience Wins 2026-03-30T10:00:58Z
Smoothing the Launch Experience 2026-03-29T10:00:54Z
Fixing the Inconsistencies That Matter 2026-03-28T10:11:50Z
Smart Caching and Better User Experience 2026-03-27T10:11:09Z