A little while ago, my colleague Sebastian started complaining about OOMs caused by Evolution taking up tens of gigabytes of memory. We discussed using sysprof to debug it, but it was too busy a time for Sebastian to set aside a few hours to do that.
Funnily enough, the most efficient fix at the time was to buy more RAM, since rust-analyzer was also causing OOM issues.
A few weeks went by. Restarting Evolution had become a daily ritual for Sebastian.
Then, on a whim, I decided investigating this might be a good test for an LLM.
I updated my Evolution git repo, built it, and started up Claude Code in the source root. This was the only prompt I supplied:
Find memory leaks in Evolution, current sourcedir. Particularly leaks that could accumulate over several hours. A colleague has a leak that slowly accumulates memory usage to several GB over the course of a day, requiring a restart of Evolution. That is the main focus, but we can fix other leaks in the process.
I wish I were lying, but that was all Claude Code needed to find the problem: Evolution just needed to call malloc_trim(0) from time to time.
I refused to believe it at first. I was only convinced when we saw the memory drop after running this against the live process:
gdb -p $(pidof evolution) -batch -ex "call malloc_trim(0)" -ex detach
This seems absurd! Doesn't glibc reclaim freed memory from time to time?
Yes, it does. When enough free memory accumulates at the top of the heap (more than M_TRIM_THRESHOLD, 128KB by default), free() shrinks the heap with sbrk(). However, sbrk() can only give back free memory at the top of the heap, since all it does is move the program break downward. malloc_trim(0) calls sbrk() and then also calls madvise(..., MADV_DONTNEED) on the whole free pages in the rest of the heap, which allows the kernel to reclaim them.
So if you have 10GB of freed memory sitting below 4 bytes allocated at the top of the heap, your RSS stays above 10GB, even if you're only using a few hundred megabytes. That lasts until you call malloc_trim(0).
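Here is a minimal sketch of that scenario at a much smaller scale, assuming Linux with glibc. The helper names resident_bytes() and pin_heap_then_trim() are made up for this illustration, and reading /proc/self/statm is just one convenient way to look at RSS. It allocates many small blocks, frees all but the last one so the top of the heap stays pinned, and measures RSS on either side of malloc_trim(0).

```c
#include <assert.h>
#include <malloc.h>   /* malloc_trim() is a glibc extension */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resident set size in bytes, read from /proc/self/statm (Linux only).
 * Returns -1 if the file cannot be read or parsed. */
static long resident_bytes(void)
{
    long total_pages = 0, resident_pages = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    int fields = fscanf(f, "%ld %ld", &total_pages, &resident_pages);
    fclose(f);
    if (fields != 2)
        return -1;
    return resident_pages * sysconf(_SC_PAGESIZE);
}

/* Hypothetical demo helper: allocate `count` blocks of `block_size` bytes,
 * free all but the last one (the most recent allocation, which sits at the
 * top of the heap), then call malloc_trim(0).
 * Stores the RSS measured just before and just after the trim.
 * Returns malloc_trim()'s result (1 if memory was released), or -1 on
 * allocation failure. */
static int pin_heap_then_trim(size_t count, size_t block_size,
                              long *rss_before, long *rss_after)
{
    assert(count > 0 && block_size > 0);

    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;

    size_t allocated = 0;
    for (; allocated < count; allocated++) {
        blocks[allocated] = malloc(block_size);
        if (blocks[allocated] == NULL)
            break;
        /* Touch the block so its pages are actually resident. */
        memset(blocks[allocated], 0xAB, block_size);
    }
    if (allocated < count) {
        for (size_t i = 0; i < allocated; i++)
            free(blocks[i]);
        free(blocks);
        return -1;
    }

    /* Free everything except the last block, which keeps the top pinned. */
    for (size_t i = 0; i + 1 < count; i++)
        free(blocks[i]);

    *rss_before = resident_bytes();
    int released = malloc_trim(0);
    *rss_after = resident_bytes();

    free(blocks[count - 1]);
    free(blocks);
    return released;
}
```

Calling pin_heap_then_trim(50000, 1024, &before, &after) and printing the two values should show RSS staying high after the frees and dropping only after the trim. The exact numbers depend on the glibc version and on whatever else the process has allocated.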
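Note that you can only get into this situation if you have hundreds of thousands of small allocs/deallocs happening repeatedly. If your alloc is larger than glibc's mmap threshold (128KB by default), mmap() is used for the allocation, the memory goes straight back to the kernel when it is freed, and none of this applies.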
Coincidentally, GLib's use of GSlice for GObject allocations was masking this issue in the past, but GSlice has been a no-op for some time now (for good reasons). Ideally, Evolution should not be using GObject for such ephemeral objects.
Lesson learned: if you have memory usage issues and you suspect fragmentation, try malloc_trim(0) before you go thinking about fancy allocators.
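For the "from time to time" part, here is one way a long-running program could rate-limit the call. This is a sketch, not the change that went into Evolution: the trim_limiter struct and trim_if_due() are hypothetical names, and the interval is an arbitrary choice left to the caller.

```c
#include <assert.h>
#include <malloc.h>    /* malloc_trim() is a glibc extension */
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical state for rate-limiting malloc_trim(0). */
typedef struct {
    time_t interval_seconds;  /* minimum time between two trims */
    time_t last_trim;         /* when the last trim happened */
    bool   has_trimmed;       /* false until the first trim */
} trim_limiter;

/* Call malloc_trim(0) unless the previous trim was less than
 * interval_seconds ago. Returns true if a trim was attempted. */
static bool trim_if_due(trim_limiter *limiter, time_t now)
{
    assert(limiter != NULL);

    if (limiter->has_trimmed &&
        now - limiter->last_trim < limiter->interval_seconds)
        return false;

    malloc_trim(0);
    limiter->last_trim = now;
    limiter->has_trimmed = true;
    return true;
}
```

A GLib application could call something like trim_if_due(&limiter, time(NULL)) from a periodic timer (g_timeout_add_seconds() would be a natural fit) or after finishing a batch of work that is known to free many small objects.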