I notice that the paper doesn't claim to eliminate all reasoning about undefined behavior for optimizations. For example:
    int f() {
        int arr[3], i = 0;
        arr[3] = 5;
        return i;
    }
Optimizing this to "return 0" is relying on UB, because it's assuming that i wasn't laid out directly after arr in the stack frame. I believe this is what the paper calls "non-guardable UB".
I don't agree with the claim in the paper that their semantics offers a "flat memory model". A flat memory model would rule out the optimization above. Rather, the memory model still has the notion of object bounds; it's just simplified in some ways.
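To make that concrete, here is a hand-written analogue (my own sketch, not anything from the paper) where the "flat" layout is made explicit as a single allocation; in this version the store is well-defined and folding the return to 0 would simply be a miscompile:

    int f_flat(void) {
        int frame[4];          /* one flat allocation standing in for the stack frame */
        int *arr = frame;      /* arr occupies frame[0..2] */
        int *ip  = frame + 3;  /* "i" laid out directly after arr */
        *ip = 0;
        arr[3] = 5;            /* in bounds of frame, so defined, and it hits *ip */
        return *ip;            /* must return 5, not 0 */
    }

The original f() can only be folded to "return 0" because arr and i are separate objects with their own bounds, which is exactly the structure a genuinely flat model wouldn't have.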
One peculiar thing about the benchmark results is that disabling individual UB seems to fairly consistently reduce performance without LTO, but improve it with LTO. I could see how the UB may be less useful with LTO, but it's not obvious to me why reducing UB would actually help LTO. As far as I can tell, the paper does not attempt to explain this effect.
Another interesting thing is that there is clearly synergy between different UB. For the LTO results, disabling each individual UB seems to be either neutral or an improvement, but if you disable all of them at once, then you get a significant regression.
Regarding e.g. the 13% perf regression in simdjson from disabling UB, the paper says:
> A simpler alternative is to compile the program with LTO. We confirmed that LLVM’s inter-procedural analyses can propagate both alignment and dereferenceability information for this function, which allows the LTO build to recover the performance loss.
"can" is doing a lot of heavy-lifting here. Guaranteeing expected optimizations "will" be applied are hard-enough, without leaving it entirely to an easily-derailed indirect side-effect.
Perfect, this is right up my alley - honestly I keep wondering if teams avoid optimizations like LTO just because the build pain sucks, or if there's some deeper trust issue around letting the toolchain be clever. Do you think people would put up with slow builds if it bought way more speed for the final product?
> The results show that, in the cases we evaluated, the performance gains from exploiting UB are minimal. Furthermore, in the cases where performance regresses, it can often be recovered by either small to moderate changes to the compiler or by using link-time optimizations.
_THANK YOU._