Ollama: Agent Harness Lands, Hardware Support Gets a Cleanup
The agent harness and its terminal interface moved through review this cycle, while a cluster of hardware and compatibility fixes tightened GPU support across CUDA, ROCm, and Jetson devices.
Duration: PT2M24S
Episode overview
This episode is a short developer briefing from Ollama.
It explains recent repository work in plain language.
- Show: Ollama
- Published: 2026-07-03T14:05:17Z
Transcript excerpt
This page carries only an excerpt of the transcript. Listen to the episode or use the RSS feed for the full update.
Good day. It's July 3rd, 2026, and here's what moved in Ollama.
The big story is agents. Parth Sareen's core harness, PR #16963, has merged, bringing the chat loop, tool execution, approval handling, and compaction logic that form the foundation for running agents in Ollama. Right behind it, PR #17017 adds the terminal interface on top: model selection, approval…
Second theme: hardware support is getting rationalized. Daniel Hiltgen shipped three related fixes. PR #16994 re-enables Flash Attention on older CUDA compute capability 6.x GPUs, now that upstream Pascal kernel fixes let those build natively again, which closes two long-standing issues. PR…
Third, a pair of correctness fixes worth flagging. PR #16999 switches compatibility tensor reads to a UTF-8-safe file open, fixing model loads from Windows paths that contain Unicode characters. And PR #17009 addresses a capability-reporting mismatch: the tags endpoint was missing the tools…
What's next: watch for the agent TUI to finish review, and keep an eye on PR #17020, which fixes a tensor-duplication bug that corrupts Qwen 3.5 and 3.6 generation during MLX import, worth…
Nearby episodes from Ollama
- Gemma 4 Support and Platform Improvements
- Weekly Recap - MLX Performance & Path Handling
- Memory Management and Multimodal Parsing Fixes
- GPU Offloading and Tool Call Fixes
- Performance Optimizations and Model Handling Improvements
- Infrastructure Updates and Platform Fixes
- Multimodal Fixes and Developer Experience Updates
- Cache Architecture Overhaul and Data Race Fixes