Ollama: MLX Runner Gets Major Model Upgrades
Patrick Devine delivered two significant PRs that extend MLX Runner support to the Gemma3 and Llama3 architectures and streamline the quantization code. A macOS download link fix in the README rounded out a solid day of both feature development and user experience improvements.
Duration: PT3M57S
Transcript
Hey there, fantastic developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to dive into today. If you're just tuning in, this is where we catch up on all the latest happenings in the Ollama codebase - think of it as your daily dose of what's cooking in the world of local AI models.
So grab your coffee, tea, or whatever fuel keeps your code flowing, because February 16th brought us some really substantial improvements that I think you're going to love.
Let's jump right into the main event - and honestly, today feels like Patrick Devine had a bit of a coding marathon! We got not one, but two major pull requests that are going to make MLX Runner significantly more powerful.
First up is PR 14276, which adds Gemma3 support to the MLX Runner. Now, this isn't just a simple "hey, let's add another model" kind of change. We're talking about nearly 900 lines of additions across six files, including a brand new quantization module and a complete Gemma3 implementation. What I love about this PR is that Patrick didn't just add the feature - he took the opportunity to simplify and clean up the quantization code for loading weights. That's the kind of thoughtful development that makes codebases healthier over time.
The really cool part? This change touched everything from the imports and pipeline logic to creating an entirely new models directory for Gemma3. Plus, there was some cleanup work on the existing GLM4 MOE Lite implementation. It's like getting a new feature and a refactor rolled into one.
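For anyone who'd like a picture of what weight quantization means in general, here's a minimal sketch of group-wise affine quantization in plain Python. To be clear, this is not the code from PR 14276: the function names, the group size of 4, and the 4-bit width are all assumptions picked purely for illustration.

```python
# Illustrative only: a generic group-wise affine quantizer, NOT the PR's code.
# All names and parameters here are hypothetical.

def quantize_group(values, bits=4):
    """Map a group of floats to integer codes plus a scale and an offset."""
    levels = (1 << bits) - 1          # 15 distinct steps for 4 bits
    lo, hi = min(values), max(values)
    # Guard against a constant group, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def quantize(weights, group_size=4, bits=4):
    """Quantize a flat list of weights one group at a time."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Rebuild approximate floats from (codes, scale, offset) triples."""
    restored = []
    for codes, scale, offset in groups:
        restored.extend(c * scale + offset for c in codes)
    return restored


weights = [0.12, -0.40, 0.33, 0.05, 1.50, 1.20, 0.90, 1.35]
packed = quantize(weights)
restored = dequantize(packed)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The idea is that each group stores small integer codes plus one scale and one offset, so the rounding error per weight stays within half a step of that group's scale. Real loaders also pack the codes into bytes, which this sketch leaves out.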
But wait, there's more! Hot on the heels of that first PR, we got PR 14277, which builds directly on that foundation to add Llama3 architecture support. This one was a bit more focused - 324 lines across two files - but it shows how good architectural decisions pay off. Because Patrick had already done the heavy lifting with the MLX Runner infrastructure in the first PR, adding Llama3 became much more straightforward.
Now, I know some of you might be thinking, "Okay, but what does this actually mean for me?" Here's the thing - having both Gemma3 and Llama3 support in MLX Runner means you're getting access to some of the most capable open-weight language models out there, optimized for Apple Silicon. If you're developing on a Mac, this is huge for your local AI workflow.
We also got a nice little quality-of-life improvement from Saumil Shah with PR 14271. Sometimes the smallest changes make the biggest difference for users - in this case, it was updating a macOS download link in the README. It's just one character changed, but it means users won't hit a broken link when they're trying to get started with Ollama. Never underestimate the power of good documentation!
What I find really encouraging about today's activity is the collaborative nature of it all. You can see how the PRs build on each other, how the infrastructure work enables new features, and how the community is paying attention to both the big architectural improvements and the small user experience details.
For today's focus: if you're working with Ollama, this is a great time to explore what these new model architectures can do for your projects. Try out Gemma3 or Llama3 if they fit your use case, and see how the performance feels with the updated MLX Runner. And hey, if you notice any documentation that could use some love, like the link Saumil fixed, don't hesitate to contribute - even small fixes make a real difference.
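If you'd like to try a model from code, here's a minimal sketch. It assumes an Ollama server running on its default local address and a model you've already pulled. The `gemma3` tag and the helper names are placeholders, and nothing here is specific to MLX Runner: it just calls the standard generate endpoint.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call (needs a running server and a pulled model):
# print(generate("gemma3", "Say hello in one short sentence."))
```

Swapping the model tag is all it takes to compare two models on the same prompt. Wrapping the `generate` call with `time.perf_counter()` gives a rough feel for speed on your machine.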
That's a wrap on today's episode! The Ollama project continues to expand its model support while keeping the codebase clean and the user experience smooth. Keep coding, keep learning, and I'll catch you tomorrow for another round of what's new in the world of Ollama. Until then, happy developing!