Ollama: Memory Management and Multimodal Parsing Fixes

This update covers a significant push to fix memory reporting bugs and improve file handling: three memory-related fixes address double-counting and unified-memory support, alongside multimodal parsing improvements and new architecture support.

Duration: PT2M9S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

  • Show: Ollama
  • Published: 2026-06-14T13:00:49Z

Transcript excerpt

This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.

Good morning, it's June 14th, 2026.

The Ollama team focused heavily on memory management reliability yesterday, with three separate pull requests tackling different aspects of how the system handles GPU and system memory accounting.

The most impactful fix addresses a long-standing issue where the process status command was double-counting memory-mapped weights on partially offloaded models. Pull request 16709 from discobot fixes the root cause in how llama.cpp's model loader calculates buffer sizes, which was causing false CPU-GPU splits and…
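To make the double-counting effect concrete, here is a small arithmetic sketch. It is not the code from the pull request: the sizes, the `split_percent` helper, and the assumed mechanism (the whole memory-mapped file being counted on the CPU side while the offloaded part is also counted on the GPU side) are illustrative assumptions only.

```python
GIB = 1024**3


def split_percent(cpu_bytes: int, gpu_bytes: int) -> tuple[int, int]:
    """Return the (CPU %, GPU %) split for the given byte counts (hypothetical helper)."""
    total = cpu_bytes + gpu_bytes
    cpu_pct = round(100 * cpu_bytes / total)
    return cpu_pct, 100 - cpu_pct


# Hypothetical model: 10 GiB of weights, 8 GiB of them offloaded to the GPU.
weights_total = 10 * GIB
weights_on_gpu = 8 * GIB

# Assumed buggy accounting: the full mmap'd file is charged to the CPU,
# and the offloaded portion is charged again to the GPU.
double_counted = split_percent(weights_total, weights_on_gpu)

# Accounting where each byte of weights is charged exactly once.
counted_once = split_percent(weights_total - weights_on_gpu, weights_on_gpu)

print(double_counted)  # (56, 44)
print(counted_once)    # (20, 80)
```

Under these assumed numbers, the double-counted report makes the model look mostly CPU-resident even though 80% of its weights sit on the GPU.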

Related to memory handling, pull request 16720 from KI3P improves scheduler behavior on unified-memory APUs like AMD's new Strix Halo chips. The scheduler was incorrectly clamping GPU memory budgets to system memory limits, even though these APUs carve GPU memory directly from system RAM. This fix should…
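The clamping problem described above can be sketched in a few lines. This is a minimal illustration, not Ollama's scheduler: the `gpu_budget` function, its parameters, and the example sizes are hypothetical, and the real fix may detect and handle unified memory differently.

```python
GIB = 1024**3


def gpu_budget(gpu_free: int, system_free: int, unified_carve_out: bool) -> int:
    """Return how many bytes a scheduler may plan to place on the GPU.

    Hypothetical helper: `unified_carve_out` marks an APU whose GPU memory
    has already been carved out of system RAM.
    """
    if unified_carve_out:
        # The carve-out is no longer part of what the OS reports as system
        # memory, so clamping to system memory would shrink the budget wrongly.
        return gpu_free
    # Assumed behaviour described as the bug when applied to such APUs:
    # clamp the GPU budget to the system memory limit.
    return min(gpu_free, system_free)


# Hypothetical APU: 96 GiB carved out for the GPU, 32 GiB left as system RAM.
clamped = gpu_budget(96 * GIB, 32 * GIB, unified_carve_out=False)
unclamped = gpu_budget(96 * GIB, 32 * GIB, unified_carve_out=True)

print(clamped // GIB)    # 32
print(unclamped // GIB)  # 96
```

With the clamp applied, two thirds of the assumed GPU carve-out would go unused by the scheduler.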

The team also addressed several multimodal workflow pain points. Two pull requests from xy200303 tackle file path parsing issues that were breaking drag-and-drop image functionality in the CLI. One preserves nested file paths when directory names look like file extensions, while another handles quoted paths and…
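The two path-parsing cases mentioned above can be illustrated with a short sketch. It is not the CLI's actual parser: the function names, the extension set, and the naive regex used for contrast are assumptions chosen for illustration.

```python
import re
import shlex
from pathlib import PurePosixPath

# Assumed set of image extensions, for illustration only.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def naive_image_paths(line: str) -> list[str]:
    """Stop at the first image-like extension, which truncates nested paths."""
    return re.findall(r"\S+?\.(?:png|jpe?g|webp)", line)


def image_paths(line: str) -> list[str]:
    """Split like a shell (honouring quotes), then test each whole token's suffix."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quote, e.g. an apostrophe in the prompt: fall back to
        # plain whitespace splitting.
        tokens = line.split()
    return [t for t in tokens if PurePosixPath(t).suffix.lower() in IMAGE_EXTS]


prompt = 'compare "/tmp/my pics/a b.png" and /tmp/photos.png/cat.jpg'

print(naive_image_paths("describe /tmp/photos.png/cat.jpg"))
# ['/tmp/photos.png', '/cat.jpg']
print(image_paths(prompt))
# ['/tmp/my pics/a b.png', '/tmp/photos.png/cat.jpg']
```

Checking the suffix of the complete token, rather than the first extension-like substring, is what keeps a directory such as `photos.png` from cutting the path short in this sketch.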

On the architecture front, pull…

Nearby episodes from Ollama

  1. Gemma 4 Support and Platform Improvements
  2. Weekly Recap - MLX Performance & Path Handling
  3. GPU Offloading and Tool Call Fixes
  4. Performance Optimizations and Model Handling Improvements
  5. Infrastructure Updates and Platform Fixes
  6. Multimodal Fixes and Developer Experience Updates
  7. Cache Architecture Overhaul and Data Race Fixes
  8. Developer Tools and Cross-Platform Reliability