Ollama: Nemotron Architecture Lands with Unified Cache Vision

Jeffrey Morgan merged a large pull request adding Nemotron architecture support to Ollama: over 3,000 lines of new code across 22 files. The change introduces a unified recurrent cache system intended as a foundation for supporting further architectures such as Qwen3.5 and LFM.

Published: 2026-02-23T11:03:10Z

Audio duration: PT3M49S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama

Transcript excerpt

This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.

Hey there, code friends! Welcome back to another episode of the Ollama podcast. I'm so excited to be here with you on this beautiful February 23rd morning, and wow, do we have some incredible progress to dive into today. Grab your favorite beverage because we're talking about some seriously impressive architectural…

So picture this - you're building a house, and instead of just adding another room, you decide to completely reimagine the foundation to support not just your current needs, but three different house styles you want to build in the future. That's exactly what Jeffrey Morgan accomplished with yesterday's massive pull…

This isn't just any ordinary feature add, folks. We're talking about over three thousand lines of new code spread across twenty-two files. But here's what makes this really exciting - Jeffrey didn't just bolt on Nemotron support. He took a step back and said, "You know what? Let's build something that's going to…

Let's talk about what actually landed in the codebase. We've got brand new converter files specifically for Nemotron, complete with comprehensive test suites - because good developers always write tests, right? There's a whole new kvcache package…

Nearby episodes from Ollama

Tool Calling Gets Smarter 2026-02-27T11:02:49Z
Cleaner Shutdowns and Faster Startups 2026-02-26T11:03:23Z
Qwen 3.5 Architecture Lands with Safety Upgrades 2026-02-25T11:01:49Z
Memory Management Revolution 2026-02-24T11:06:08Z
Fixing the WSL Plugin Problem 2026-02-22T11:04:30Z
Smarter UIs and Smoother Onboarding 2026-02-21T11:02:07Z
Tokenizer Consolidation & MLX Library Improvements 2026-02-20T11:03:06Z
Rolling Back and Rolling Forward 2026-02-19T11:03:01Z