Redis: Bug Squashing and Test Hardening Day
Today we dive into four solid commits from the Redis team focused on fixing critical bugs and making tests more robust. Joan Fontanals tackled a tricky memory safety issue in HSETNX, while Vitah Lin delivered three important fixes addressing test failures under Valgrind, TCP deadlocks, and debug assertion problems.
Duration: PT3M59S
https://podlog.io/listen/redis-84394f5e/episode/redis-bug-squashing-and-test-hardening-day-bb78f5c7
Transcript
Hey there, fellow developers! Welcome back to another episode of the Redis podcast. I'm your host, and wow, do I have some satisfying fixes to share with you today - March 26th, 2026. You know those days where you roll up your sleeves and just tackle the gnarly bugs head-on? That's exactly what happened in the Redis codebase yesterday, and honestly, it's the kind of work that makes me appreciate how thoughtful this community is.
So we didn't have any merged pull requests today, but we've got four commits that are absolutely worth talking about. These aren't flashy new features, but they're the kind of solid engineering work that keeps Redis running smoothly in production.
Let's start with what I think is the most interesting fix from Joan Fontanals. They solved a really subtle memory safety issue in the HSETNX command. Here's the story: when you're using HSETNX, the code was sending keyspace notifications before it was completely done working with the key-value object. Now, this sounds innocent enough, right? But here's where it gets tricky: if you have a module that writes to key metadata during that notification, the original object can get reallocated. Then boom, the command is still holding a pointer to the old object, which is now dangling, and you're potentially crashing.
Joan actually discovered this while integrating key metadata support in a module they were working on. I love how real-world usage often reveals these edge cases that are nearly impossible to catch otherwise. The fix was elegant - just flip the order so notifications happen after we're done with the object. Simple in concept, but it probably took some serious debugging to figure out what was going wrong.
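To make that ordering issue concrete, here is a minimal Python sketch. It is a toy model, not the Redis implementation: Keyspace, HashObject, reallocate, and both hsetnx_* functions are hypothetical names invented for illustration, and a "freed" flag stands in for the dangling pointer you would get in C.

```python
class UseAfterFree(Exception):
    """Raised when code touches an object that was already replaced."""


class HashObject:
    """Toy hash value; 'freed' simulates memory that has been released."""

    def __init__(self, fields=None):
        self.fields = dict(fields or {})
        self.freed = False

    def _check(self):
        if self.freed:
            raise UseAfterFree("stale reference to a reallocated object")

    def set(self, field, value):
        self._check()
        self.fields[field] = value

    def length(self):
        self._check()
        return len(self.fields)


class Keyspace:
    """Hypothetical keyspace with notification listeners."""

    def __init__(self):
        self.objects = {}
        self.listeners = []

    def lookup_or_create(self, key):
        return self.objects.setdefault(key, HashObject())

    def reallocate(self, key):
        # Models a module writing key metadata: the object moves,
        # and the old copy becomes invalid.
        old = self.objects[key]
        self.objects[key] = HashObject(old.fields)
        old.freed = True

    def notify(self, key):
        for listener in self.listeners:
            listener(self, key)


def hsetnx_notify_early(ks, key, field, value):
    """Problematic order: notify while still holding the object."""
    obj = ks.lookup_or_create(key)
    if field in obj.fields:
        return 0
    obj.set(field, value)
    ks.notify(key)   # a listener may reallocate the object here
    obj.length()     # stale reference: raises UseAfterFree
    return 1


def hsetnx_notify_last(ks, key, field, value):
    """Safer order: finish all work on the object, then notify."""
    obj = ks.lookup_or_create(key)
    if field in obj.fields:
        return 0
    obj.set(field, value)
    obj.length()     # all work on obj happens first
    ks.notify(key)   # nothing touches obj after this point
    return 1
```

With a listener registered that calls reallocate, the first function fails on its last access to the object, while the second completes because it never touches the object once the notification has fired.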
Now, Vitah Lin was absolutely on fire yesterday with three separate fixes. First up, they tackled test failures under Valgrind. A recent change had added slowlog statistics to the INFO commandstats output, which sounds great, but it was causing tests to fail because the test patterns expected exact matches and suddenly there were extra fields showing up. Vitah's solution? Add trailing wildcards to the test patterns so they're more tolerant of these optional fields. It's one of those "why didn't we do this before" moments.
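Here is a rough sketch of the trailing-wildcard idea, using Python's fnmatch as a stand-in for the glob-style matching in the real test suite. The exact patterns in the Redis tests are not shown in this episode, so the pattern below is an assumption, and the slowlog_count field name is hypothetical, made up purely to show an extra trailing field.

```python
from fnmatch import fnmatchcase

# Pattern that only matches when the line ends right after failed_calls.
STRICT_PATTERN = (
    "cmdstat_set:calls=1,usec=*,usec_per_call=*,"
    "rejected_calls=0,failed_calls=0"
)

# Same pattern with a trailing wildcard, so optional fields are tolerated.
TOLERANT_PATTERN = STRICT_PATTERN + "*"

plain_line = (
    "cmdstat_set:calls=1,usec=12,usec_per_call=12.00,"
    "rejected_calls=0,failed_calls=0"
)

# Hypothetical extra field; the real slowlog field names may differ.
extended_line = plain_line + ",slowlog_count=1"


def matches(pattern, line):
    """Return True when the whole line matches the glob pattern."""
    return fnmatchcase(line, pattern)
```

The strict pattern accepts plain_line but rejects extended_line, while the tolerant pattern accepts both. The trade-off is that a trailing wildcard also stops checking anything that comes after the last named field.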
The second fix from Vitah addressed a pattern that can bite you in production too: a potential TCP deadlock, in this case in the active defrag test for streams. The test was trying to send 100,000 commands all at once before reading any replies. If you've ever worked with TCP, you can probably see where this is going. When the buffers fill up, the server can't write any more replies, the client can't write any more commands, and everything just... stops. Vitah fixed it by batching the writes and reads, 1000 iterations at a time. Much more civilized, and it mirrors what they were already doing elsewhere in the codebase.
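The batching pattern can be sketched like this in Python. This is a simulation under stated assumptions, not the actual test code: FakeConnection is a hypothetical stand-in whose max_pending limit plays the role of finite TCP buffers, and it raises an error where a real socket would simply block.

```python
class FakeConnection:
    """Toy pipelined connection with room for a limited number of replies."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = 0
        self.peak = 0

    def send(self, command):
        if self.pending >= self.max_pending:
            # A real socket would block here; with nobody reading, forever.
            raise RuntimeError("buffers full: write would block")
        self.pending += 1
        self.peak = max(self.peak, self.pending)

    def read_reply(self):
        if self.pending == 0:
            raise RuntimeError("no reply pending")
        self.pending -= 1
        return "OK"


def run_unbatched(conn, total):
    """Send everything first, then read: overflows the buffers."""
    for i in range(total):
        conn.send(f"command {i}")
    return [conn.read_reply() for _ in range(total)]


def run_batched(conn, total, batch_size):
    """Alternate between writing one batch and draining its replies."""
    replies = 0
    for start in range(0, total, batch_size):
        count = min(batch_size, total - start)
        for i in range(start, start + count):
            conn.send(f"command {i}")
        for _ in range(count):
            conn.read_reply()
            replies += 1
    return replies
```

With a connection that can hold 1000 outstanding replies, run_batched(conn, 100000, 1000) finishes and never has more than 1000 replies in flight, whereas run_unbatched hits the limit on the 1001st send.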
The third fix is particularly interesting for anyone working with debug builds. Under DEBUG_ASSERT_KEYSPACE builds, every single command triggers a full key scan. That's incredibly thorough for catching bugs, but it was causing atomic slot migrations (ASM) to stall because each command had so much overhead. Vitah's fix was surgical: skip the debug assertions during ASM import and background trimming, where you really need things to move quickly.
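As a rough illustration of why skipping the check on bulk paths matters, here is a toy Python model. Everything in it is hypothetical: the Server class, the bulk_path flag, and the scan counter are invented for this sketch and do not reflect how Redis structures the real check.

```python
class Server:
    """Toy server where a debug flag triggers a full scan per command."""

    def __init__(self, debug_assert_keyspace):
        self.debug_assert_keyspace = debug_assert_keyspace
        self.keys = {}
        self.full_scans = 0

    def _scan_keyspace(self):
        # Cost grows with the number of keys, and it runs per command.
        self.full_scans += 1
        for key, value in self.keys.items():
            assert value is not None, f"corrupt entry for {key!r}"

    def execute(self, key, value, bulk_path=False):
        """Apply one write; bulk_path marks import or trimming work."""
        self.keys[key] = value
        if self.debug_assert_keyspace and not bulk_path:
            self._scan_keyspace()
```

Running N commands with the scan enabled costs on the order of N times the keyspace size, which is why a bulk import of many keys can crawl. Marking those commands as bulk keeps the thorough check for regular traffic while letting the import proceed at normal speed.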
What I love about all these fixes is how they show the depth of testing and real-world usage that Redis gets. These aren't theoretical problems - they're issues that people actually ran into while using Redis in different configurations and scenarios.
Today's focus? If you're maintaining any kind of infrastructure code, take a page from this playbook. Pay attention to the order of operations when you're dealing with callbacks and notifications. Test your code under different build configurations. And always, always think about what happens when buffers fill up or when you're processing large volumes of data.
That's a wrap for today! Keep building amazing things, and remember: some of the most important code you'll write is the fix that prevents a crash at 3 AM. Until next time, happy coding!