Ollama

Ollama: Cloud Models Get Smarter & Build Performance Boost

Today we're diving into a busy day with 6 merged PRs and 7 commits that brought some major improvements to Ollama! The team tackled cloud model handling, fixed XML parsing issues with GLM models, and made Docker builds way more efficient. Special shoutouts to the collaborative effort on cloud model stubs and Bruce MacDonald's clever fix for GLM tool call parsing.

Duration: 3 minutes 51 seconds

https://podlog.io/listen/ollama-3aed006f/episode/ollama-cloud-models-get-smarter-build-performance-boost-ec65b197

Transcript

Hey there, amazing developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have a packed show for you today! Grab your favorite beverage because we're diving into some really exciting changes that dropped on March 6th and 7th.

So picture this: you're working with cloud models, and you're tired of having to pull stub files every single time. Well, the team heard you loud and clear! Jeffrey Morgan landed a pretty significant change that eliminates the need to pull stubs for cloud models. Now, I know what you're thinking - "didn't we hear about this before?" And you're absolutely right! This is actually a "reapply" of a previous change, which tells us the team really believes in this improvement. Sometimes the best features need a couple of attempts to get just right, and that's totally normal in software development.

But here's where it gets really interesting - Bruce MacDonald tackled one of those wonderfully specific real-world problems that make you go "oh, that's so clever!" The GLM models were being a bit... let's say quirky... with their XML formatting. They were emitting malformed closing tags in tool calls, which was causing the XML parser to throw a fit. Bruce wrote a sanitizer that repairs these malformed closing tags before they hit the parser. It's like having a really good editor who fixes typos before the publisher sees them!
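To make the sanitizer idea concrete, here is a minimal sketch in Python. It is not Bruce's implementation: the episode doesn't say exactly how the tags were malformed, so this assumes the problem is a closing tag that is missing its final `>`. The tag names (`tool_call`, `arg_key`, `arg_value`) and the function name `sanitize_closing_tags` are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a "malformed closing tag" is one missing its final '>',
# e.g. "</arg_value" immediately followed by another tag or end of input.
_UNTERMINATED_CLOSE = re.compile(r"</([A-Za-z_][\w.\-]*)(?=\s*(?:<|$))")


def sanitize_closing_tags(text: str) -> str:
    """Append the missing '>' to closing tags that lack one.

    Well-formed closing tags are left alone, because the lookahead only
    matches when the tag name is followed by '<' or the end of the input.
    """
    return _UNTERMINATED_CLOSE.sub(r"</\1>", text)


# Hypothetical model output with one broken closing tag.
broken = "<tool_call><arg_key>city</arg_key><arg_value>Paris</arg_value</tool_call>"
fixed = sanitize_closing_tags(broken)

# The repaired text now parses with a strict XML parser.
root = ET.fromstring(fixed)
```

The point of doing this as a pre-pass is that the strict parser stays strict: the cleanup happens on the raw text first, and anything the sanitizer doesn't recognize still fails loudly.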

Now, if you're running Ollama in Docker - and let's be honest, many of you are - Daniel Hiltgen just made your builds way smarter. The old parallel settings were, in his words, "naive," and I love that honesty! The new approach uses Ninja for all Docker CMake builds with much more intelligent load balancing. Instead of either overwhelming your system or leaving CPU cores sitting idle, it now aims for full utilization without oversaturation. It's like upgrading from a sledgehammer approach to surgical precision.
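If you want a feel for what "full utilization without oversaturation" can mean, here is a rough Python sketch. It is not the logic from Daniel's change: it simply assumes several builds run side by side and splits the available cores evenly between them. The helper names `jobs_per_build` and `cmake_build_command` are hypothetical. The only real tool option used is `cmake --build <dir> --parallel <jobs>`.

```python
import os


def jobs_per_build(total_cores: int, concurrent_builds: int) -> int:
    """Split a core budget evenly across builds that run at the same time.

    Every build gets at least one job, so a small machine still makes progress.
    """
    if concurrent_builds < 1:
        raise ValueError("concurrent_builds must be at least 1")
    return max(1, total_cores // concurrent_builds)


def cmake_build_command(build_dir: str, jobs: int) -> list:
    """Build the argument list for a CMake build capped at `jobs` parallel jobs."""
    return ["cmake", "--build", build_dir, "--parallel", str(jobs)]


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # Assumption for illustration: four build stages running concurrently.
    jobs = jobs_per_build(cores, concurrent_builds=4)
    print(" ".join(cmake_build_command("build", jobs)))
```

A fixed, oversized job count on every concurrent build is the "sledgehammer" case: four builds each asking for all sixteen cores means sixty-four jobs fighting over sixteen cores. Ninja also accepts a load-average limit with `-l`, which is another way to keep a machine from being oversaturated.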

Speaking of Daniel's contributions, he also fixed localhost handling in the create command. Sometimes it's these seemingly small fixes that make the biggest difference in daily workflow. And Devon Rifkin made cloud proxy stream disconnections much more graceful - no more scary error messages when clients disconnect normally. These kinds of polish improvements show a maturing codebase that really cares about user experience.
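The graceful-disconnect idea can be sketched in a few lines of Python. This is an illustration, not Devon's code: it assumes a proxy that forwards chunks to a client through a `write` callback, and that a client going away shows up as `BrokenPipeError` or `ConnectionResetError`. The names `relay_stream` and `CLIENT_GONE` are hypothetical.

```python
import logging

logger = logging.getLogger("proxy_sketch")

# Assumption: these exceptions mean "the client hung up", which is normal.
CLIENT_GONE = (BrokenPipeError, ConnectionResetError)


def relay_stream(chunks, write) -> int:
    """Forward chunks to the client and return how many were delivered.

    A client disconnect ends the relay quietly with a debug-level log line.
    Any other exception still propagates as a real error.
    """
    sent = 0
    for chunk in chunks:
        try:
            write(chunk)
        except CLIENT_GONE:
            logger.debug("client disconnected after %d chunks", sent)
            break
        sent += 1
    return sent
```

The design choice is to separate "the other side left" from "something broke": the first is expected in any streaming API, so it gets a quiet log line instead of a scary error message.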

Michael Yang rounded things out with some documentation formatting - and yes, documentation improvements absolutely deserve celebration! Clean, well-formatted docs make everyone's life easier.

What I love about today's changes is how they represent different aspects of software craftsmanship. We've got performance optimizations, parser robustness, user experience improvements, and documentation polish. It's like watching a well-orchestrated symphony where each instrument plays its part.

The cloud model stub elimination is particularly exciting because it shows the team is thinking about workflow efficiency. Every time you don't have to wait for an unnecessary download, that's time you get back to focus on building something amazing.

For today's focus, if you're using Ollama with cloud models, definitely test out this new stub-free workflow - you should notice things moving faster. If you're building with Docker, the improved parallelism should speed up your builds, especially if you were seeing inconsistent performance before. And if you're working with GLM models and tool calls, you should see fewer mysterious XML parsing errors.

The collaborative nature of these changes really stands out too. We saw co-authored commits and multiple people contributing to getting the cloud model changes just right. That's the kind of teamwork that makes open source beautiful.

That's a wrap on today's episode! Keep coding, keep building, and remember - every bug fix and optimization brings us all closer to more powerful and reliable AI tools. Until next time, happy coding!