Ollama: Agent Harness Lands, Hardware Support Gets a Cleanup

The agent harness and its terminal interface moved through review this cycle, while a cluster of hardware and compatibility fixes tightened GPU support across CUDA, ROCm, and Jetson devices.

2026-07-03T14:05:17Z

Duration: PT2M24S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama
Published: 2026-07-03T14:05:17Z
Audio duration: PT2M24S

Transcript excerpt

This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.

Good day. It's July 3rd, 2026, and here's what moved in Ollama.

The big story is agents. Parth Sareen's core harness, PR sixteen-nine-six-three, merged with chat loop, tool execution, approval handling, and compaction logic — the foundation for running agents in Ollama. Right behind it, PR seventeen-oh-seventeen adds the terminal interface on top: model selection, approval…

Second theme: hardware support is getting rationalized. Daniel Hiltgen shipped three related fixes. PR sixteen-nine-nine-four re-enables Flash Attention on older CUDA compute-capability six GPUs, now that upstream Pascal kernel fixes let those build natively again — closing two long-standing issues. PR…

Third, a pair of correctness fixes worth flagging: PR sixteen-nine-nine-nine switches compatibility tensor reads to a UTF-8-safe file open, fixing model loads on Windows paths with Unicode characters. And PR seventeen-oh-oh-nine addresses a capability-reporting mismatch — the tags endpoint was missing the tools…

What's next: watch for the agent TUI to finish review, and keep an eye on PR seventeen-oh-twenty, which fixes a tensor-duplication bug corrupting Qwen 3.5 and 3.6 generation during MLX import — worth…

That's…

Nearby episodes from Ollama

Gemma 4 Support and Platform Improvements 2026-06-15T13:01:10Z
Weekly Recap - MLX Performance & Path Handling 2026-06-15T09:08:58Z
Memory Management and Multimodal Parsing Fixes 2026-06-14T13:00:49Z
GPU Offloading and Tool Call Fixes 2026-06-13T13:01:39Z
Performance Optimizations and Model Handling Improvements 2026-06-12T13:01:28Z
Infrastructure Updates and Platform Fixes 2026-06-11T13:00:54Z
Multimodal Fixes and Developer Experience Updates 2026-06-10T13:00:43Z
Cache Architecture Overhaul and Data Race Fixes 2026-06-09T13:02:33Z