I notice that the paper doesn't claim to eliminate all reasoning about undefined behavior for optimizations. For example:
    int f() {
        int arr[3], i = 0;
        arr[3] = 5;
        return i;
    }
Optimizing this to "return 0" is relying on UB, because it's assuming that i wasn't laid out directly after arr in the stack frame. I believe this is what the paper calls "non-guardable UB".
I don't agree with the claim in the paper that their semantics offers a "flat memory model". A flat memory model would rule out the optimization above. Rather, the memory model still has the notion of object bounds; it's just simplified in some ways.
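To make that concrete, here is a hand-written analogue (my own sketch, not anything from the paper) where the "flat" layout is made explicit as a single allocation; in this version the store is well-defined and folding the return to 0 would simply be a miscompile:

    int f_flat(void) {
        int frame[4];          /* one flat allocation standing in for the stack frame */
        int *arr = frame;      /* arr occupies frame[0..2] */
        int *ip  = frame + 3;  /* "i" laid out directly after arr */
        *ip = 0;
        arr[3] = 5;            /* in bounds of frame, so defined, and it hits *ip */
        return *ip;            /* must return 5, not 0 */
    }

The original f() can only be folded to "return 0" because arr and i are separate objects with their own bounds, which is exactly the structure a genuinely flat model wouldn't have.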
One peculiar thing about the benchmark results is that disabling individual UB seems to fairly consistently reduce performance without LTO, but improve it with LTO. I could see how the UB may be less useful with LTO, but it's not obvious to me why reducing UB would actually help LTO. As far as I can tell, the paper does not attempt to explain this effect.
Another interesting thing is that there is clearly synergy between different UB. For the LTO results, disabling each individual UB seems to be either neutral or an improvement, but if you disable all of them at once, then you get a significant regression.
Regarding e.g. the 13% perf regression in simdjson from disabling UB, the paper says:
> A simpler alternative is to compile the program with LTO. We confirmed that LLVM’s inter-procedural analyses can propagate both alignment and dereferenceability information for this function, which allows the LTO build to recover the performance loss.
"can" is doing a lot of heavy-lifting here. Guaranteeing expected optimizations "will" be applied are hard-enough, without leaving it entirely to an easily-derailed indirect side-effect.
Perfect, this is right up my alley - honestly I keep wondering if teams avoid optimizations like LTO just because the build pain sucks, or if there's some deeper trust issue around letting the toolchain be clever. Do you think people would put up with slow builds if it bought way more speed for the final product?
> The results show that, in the cases we evaluated, the performance gains from exploiting UB are minimal. Furthermore, in the cases where performance regresses, it can often be recovered by either small to moderate changes to the compiler or by using link-time optimizations.
_THANK YOU._