Push Ifs Up and Fors Down

(matklad.github.io)

564 points by goranmoomin 17 May 2025 | 197 comments

Comments

Waterluvian 17 May 2025

My weird mental model: You have a tree of possible states/program flow. Conditions prune the tree. Prune the tree as early as possible so that you have to do work on fewer branches.

Don’t meticulously evaluate and potentially prune every single branch, only to find you have to prune the whole limb anyways.

Or even weirder: conditionals are about figuring out what work doesn’t need to be done. Loops are the “work.”

Ultimately I want my functions to be about one thing: walking the program tree or doing work.

andyg_blog 17 May 2025

A more general rule is to push ifs close to the source of input: https://gieseanw.wordpress.com/2024/06/24/dont-push-ifs-up-p...

It's really about finding the entry points into your program from the outside (including data you fetch from another service), and then massaging that input in such a way that you make as many guarantees as possible (preferably encoded into your types) by the time it reaches any core logic, especially the resource-heavy parts.
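A minimal Rust sketch of the idea above, checking at the entry point and carrying the guarantee in a type. `Percentage`, `parse`, and `remaining` are hypothetical names chosen for illustration, not anything from the linked post:

```rust
/// Hypothetical validated type: holding a `Percentage` means the
/// range check has already happened.
#[derive(Debug, PartialEq)]
struct Percentage(u8);

impl Percentage {
    /// The only `if` lives here, at the edge where raw input enters.
    fn parse(raw: &str) -> Result<Percentage, String> {
        let n: u8 = raw
            .trim()
            .parse()
            .map_err(|_| format!("not a number in 0..=255: {raw:?}"))?;
        if n > 100 {
            return Err(format!("out of range: {n}"));
        }
        Ok(Percentage(n))
    }
}

/// Core logic takes the validated type, so it needs no further checks
/// and the subtraction cannot underflow.
fn remaining(p: &Percentage) -> u8 {
    100 - p.0
}
```

Callers that already hold a `Percentage` never repeat the range check; only code that touches raw strings has to deal with the `Result`.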

kazinator 17 May 2025

> If there’s an if condition inside a function, consider if it could be moved to the caller instead

This idle conjecture is too rife with counterexamples.

- If the function is called from 37 places, should they all repeat the if statement?

- What if the function is getaddrinfo, or EnterCriticalSection; do we push an if out to the users of the API?

I think that we can only think about this transformation for internal functions which are called from at most two places, and only if the decision is out of their scope of concern.

Another idea is to make the function perform only the if statement, which calls two other helper functions.

If the caller needs to write a loop where the decision is to be hoisted out of the loop, the caller can use the lower-level "decoded-condition helpers". Callers which would only have a single if, not in or around a loop, can use the convenience function which hides the if. But we have to keep in mind that we are doing this for optimization. Optimization often conflicts with good program organization! Maybe it is not good design for the caller to know about the condition; we only opened it up so that we could hoist the condition outside of the caller's loop.
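The arrangement described above (a convenience function that contains only the if, plus lower-level helpers that loop-writing callers can use directly) might look like this in Rust. All names here are made up for the sketch:

```rust
/// Lower-level helpers, one per decoded case (hypothetical names).
fn scale_up(x: i64) -> i64 {
    x * 2
}

fn scale_down(x: i64) -> i64 {
    x / 2
}

/// Convenience function: contains only the `if` and dispatches to a helper.
/// Callers with a single call site can use this and never see the helpers.
fn scale(x: i64, up: bool) -> i64 {
    if up { scale_up(x) } else { scale_down(x) }
}

/// A caller with a loop decides once, then runs a loop with no branch in it.
fn scale_all(xs: &mut [i64], up: bool) {
    if up {
        for x in xs.iter_mut() {
            *x = scale_up(*x);
        }
    } else {
        for x in xs.iter_mut() {
            *x = scale_down(*x);
        }
    }
}
```

The cost is the one named above: `scale_all` now knows about the condition, which may or may not be its business.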

These dilemmas show up in OOP, where the "if" decision that is in the callee is the method dispatch: selecting which method is called.

Techniques to get method dispatch out of loops can also go against the grain of the design. There are some patterns for it.

E.g. we wouldn't want to fill a canvas object with a raster image by looping over the image and calling canvas.putpixel(x, y, color). We'd have some method for blitting an image into a canvas (or a rectangular region thereof).

layer8 17 May 2025

The example listed as “dissolving enum refactor” is essentially polymorphism, i.e. you could replace the match by a polymorphic method invocation on the enum. Its purpose is to decouple the point where a case distinction is established (the initial if) from the point where it is acted upon (the invocation of foo/bar). The case distinction is carried by the object (enum value in this case) or closure and need not be reiterated at the point of invocation (if the match were replaced by polymorphic dispatch). That means that if the case distinction changes, only the point where it is established needs to be changed, not the points where the distinct actions based on it are triggered.

This is a trade-off: It can be beneficial to see the individual cases to be considered at the points where the actions are triggered, at the cost of having an additional code-level dependency on the list of individual cases.
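One way to picture the polymorphic variant described above is with a Rust trait object. This is only a sketch; `Action`, `Foo`, `Bar`, `choose`, and `act` are hypothetical stand-ins for the article's enum and its foo/bar functions:

```rust
/// The case distinction is carried by the value itself.
trait Action {
    fn run(&self) -> String;
}

struct Foo(i32);
struct Bar(String);

impl Action for Foo {
    fn run(&self) -> String {
        format!("foo({})", self.0)
    }
}

impl Action for Bar {
    fn run(&self) -> String {
        format!("bar({})", self.0)
    }
}

/// The only place where the distinction is established.
fn choose(condition: bool) -> Box<dyn Action> {
    if condition {
        Box::new(Foo(1))
    } else {
        Box::new(Bar("y".to_string()))
    }
}

/// The point of invocation does not list the cases again.
fn act(action: &dyn Action) -> String {
    action.run()
}
```

Adding a third case touches `choose` and a new `impl`, but not `act`. With a `match` instead, `act` would list every case, which is the visibility side of the trade-off.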

password4321 17 May 2025

Code complexity scanners⁰ eventually force pushing ifs down. The article recommends the opposite:

> By pushing ifs up, you often end up centralizing control flow in a single function, which has a complex branching logic, but all the actual work is delegated to straight line subroutines.

⁰ https://docs.sonarsource.com/sonarqube-server/latest/user-gu...

shawnz 17 May 2025

Sometimes I like to put the conditional logic in the callee because it prevents the caller from doing things in the wrong order by accident.

Like for example, if you want to make an idempotent operation, you might first check if the thing has been done already and if not, then do it.

If you push that conditional out to the caller, now every caller of your function has to individually make sure they call it in the right way to get a guarantee of idempotency and you can't abstract that guarantee for them. How do you deal with that kind of thing when applying this philosophy?
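For concreteness, here is a small Rust sketch of the idempotent case, with the "already done?" check kept inside the callee. `Mailer` and `send_once` are invented for the example:

```rust
use std::collections::HashSet;

/// Hypothetical mailer that remembers which invoice ids were already sent.
struct Mailer {
    sent: HashSet<u32>,
    deliveries: u32,
}

impl Mailer {
    fn new() -> Self {
        Mailer { sent: HashSet::new(), deliveries: 0 }
    }

    /// Idempotent: the check and the action cannot be separated or
    /// reordered by a caller. Returns true only when a delivery happened.
    fn send_once(&mut self, invoice_id: u32) -> bool {
        // `insert` returns false when the id was already present.
        if !self.sent.insert(invoice_id) {
            return false;
        }
        self.deliveries += 1;
        true
    }
}
```

If the check were moved to callers, every call site would need its own `contains` test before sending, and nothing would stop one of them from forgetting it.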

Another example might be if you want to execute a sequence of checks before doing an operation within a database transaction. How do you apply this philosophy while keeping the checks within the transaction boundary?

krick 18 May 2025

These are extremely opinionated, and shouldn't be treated as a rule of thumb. As somebody else said, there isn't a rule of thumb here at all, but if I was to make up one, I would probably tell you the opposite:

- You have to push ifs down, because of DRY.

- If performance allows, you should consider pushing fors up, because then you have the power of using filter/map/reduce and function compositions to choose what actions you want to apply to which objects, essentially vectorizing the code.
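A rough Rust illustration of the second bullet, where the loop sits at the top and filter/map/reduce pick which items get which operation. The function name is hypothetical:

```rust
/// Sum of the squares of the even numbers in `xs`.
/// The condition (`filter`) and the work (`map`, `sum`) are composed
/// per element instead of being split into separate functions.
fn total_of_even_squares(xs: &[i64]) -> i64 {
    xs.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| x * x)
        .sum()
}
```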

rco8786 17 May 2025

I'm not sure I buy the idea that this is a "good" rule to follow. Sometimes maybe? But it's so contextually dependent that I have a hard time drawing any conclusions about it.

Feels a lot like "i before e except after c" where there's so many exceptions to the rule that it may as well not exist.

dcre 17 May 2025

I took a version of this away from Sandi Metz’s 99 Bottles of OOP. It’s not really my style overall, but the point about moving logic forks up the call stack was very well taken when I was working on a codebase where we had added a ton of flags that got passed down through many layers.

https://sandimetz.com/99bottles

Kuyawa 17 May 2025

Push everything down for better code readability

  printInvoice(invoice, options) // is much better than

  if(printerReady){
    if(printerHasInk){
      if(printerHasPaper){
        if(invoiceFormatIsPortrait){
  :

The same can be said of loops

  printInvoices(invoices) // much better than

  for(invoice of invoices){
    printInvoice(invoice)
  }

At the end, while code readability is extremely important, encapsulation is much more important, so mix both accordingly.

janosch_123 17 May 2025

Ifs to the top as guard statements.

Add asserts to the end of the function too.

Loops can live in the middle, take as much I/O and compute out of the loop as you can :)

sparkie 17 May 2025

In some cases you want to do the opposite - to utilize SIMD.

With AVX-512 for example, trivial branching can be replaced with branchless code using the vector mask registers k0-k7, so an if inside a for is better than the for inside the if, which may have to iterate over a sequence of values twice.

To give a basic example, consider a loop like:

    for (int i = 0; i < length ; i++) {
        if (values[i] % 2 == 1)
            values[i] += 1;
        else
            values[i] -= 2;
    }

We can convert this to one which operates on 16 ints per loop iteration, with the loop body containing no branches, where each int is only read and written to memory once (assuming length % 16 == 0).

    __mmask16 consequents;
    __mmask16 alternatives;
    __m512i results;
    __m512i ones = _mm512_set1_epi32(1);
    __m512i twos = _mm512_set1_epi32(2);
    for (int i = 0; i < length ; i += 16) {
        /* aligned load: values must be 64-byte aligned */
        results = _mm512_load_epi32(&values[i]);
        /* _mm512_rem_epi32 is an SVML library routine, not a single instruction */
        consequents = _mm512_cmpeq_epi32_mask(_mm512_rem_epi32(results, twos), ones);
        results = _mm512_mask_add_epi32(results, consequents, results, ones);
        alternatives = _knot_mask16(consequents);
        results = _mm512_mask_sub_epi32(results, alternatives, results, twos);
        _mm512_store_epi32(&values[i], results);
    }

Ideally, the compiler will auto-vectorize the first example and produce something equivalent to the second in the compiled object.
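As a scalar counterpart, the same loop can be written in Rust with the if as an expression that selects a delta, so the loop body is a plain select plus add. Whether a given compiler and target actually vectorize it is not guaranteed; `adjust` is a hypothetical name for this sketch:

```rust
/// Adds 1 to values whose remainder mod 2 is exactly 1, subtracts 2 otherwise.
/// Rust's `%` truncates toward zero like C's, so negative odd values
/// have remainder -1 and take the "else" path, as in the C loop above.
fn adjust(values: &mut [i32]) {
    for v in values.iter_mut() {
        let delta = if *v % 2 == 1 { 1 } else { -2 };
        // wrapping_add avoids an overflow panic in debug builds
        *v = v.wrapping_add(delta);
    }
}
```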

gnabgib 17 May 2025

(2023) Discussion at the time (662 points, 295 comments) https://news.ycombinator.com/item?id=38282950

daxfohl 17 May 2025

I like this a lot. At first, putting ifs inside the fors makes things more concise. But it seems like there's always an edge case or requirement change that eventually requires an if outside the for too. Now you've got ifs on both sides of the for, and you've got to look in multiple places to see what's happening. Or worse, subsequent changes will require updating both places.

So yeah, I agree, pulling conditions up can often be better for long-term maintenance, even if initially it seems like it creates redundancy.

neRok 17 May 2025

I agree that the first example in the article is "bad"...

  fn frobnicate(walrus: Option<Walrus>)

but the rest makes no sense to me!

  // GOOD
  frobnicate_batch(walruses)
  // BAD
  for walrus in walruses {
    frobnicate(walrus)
  }

It doesn't follow through with the "GOOD" example though...

  fn frobnicate_batch(walruses) {
    for walrus in walruses { frobnicate(walrus) }
  }

What did that achieve?

And the next example...

  // GOOD
  if condition {
    for walrus in walruses { walrus.frobnicate() }
  } else {
    for walrus in walruses { walrus.transmogrify() }
  }
  // BAD
  for walrus in walruses {
    if condition { walrus.frobnicate() }
    else { walrus.transmogrify() }
  }

What good is that when...

  walruses = get_5_closest_walruses()
  // "GOOD"
    if walruses.has_hungry() { feed_them_all() }
    else { dont_feed_any() }
  // "BAD"
    for walrus in walruses {
       if walrus.is_hungry() { feed() }
       else { dont_feed() }
    }

imcritic 17 May 2025

This article doesn't explain the benefits of the suggested approach well enough.

And the last example looks like poor advice and contradicts previous advice: there's rarely a global condition that is enough to check once at the top: the condition usually is inside the walrus. And why do for walrus in pack {walrus.throbnicate()} instead of making throbnicate a function accepting the whole pack?

jmull 17 May 2025

I really don't think there is any general rule of thumb here.

You've really got to have certain contexts before thinking you ought to be pushing ifs up.

I mean generally, you should consider pushing an if up. But you should also consider pushing it down, and leaving it where it is. That is, you're thinking about whether you have a good structure for your code as you write it... aka programming.

I suppose you might say, push common/general/high-level things up, and push implementation details and low-level details down. It seems almost too obvious to say, but I guess it doesn't hurt to back up a little once in a while and think more broadly about your general approach. I guess the author is feeling that ifs are usually about a higher-level concern and loops about a lower-level concern? Maybe that's true? I just don't think it matters, though, because why wouldn't you think about any given if in terms of whether it specifically ought to move up or down?

Aeyxen 18 May 2025

Many variants of this debate play out in real-world systems: data pipelines, game engines, and large-scale web infra. The only universal law is that local code clarity must never be optimized at the expense of global throughput or maintainability. Pushing ifs up absolutely unlocks performance when you're dealing with a hot loop—early bailouts mean less work per iteration, and in my experience, that's often the difference between a scalable system and a bottleneck. But the real win is batch processing (pushing fors down): it's the only way you get cache locality, vectorization, and real-world performance on modern hardware. No amount of OOP purity or DRY dogma can change the physics of memory bandwidth or the nature of branch misprediction.

manmal 17 May 2025

It’s a bit niche for HN, but SwiftUI rendering works way better when following this. In a ForEach, you really shouldn’t have any branching, or you’ll pay quite catastrophic performance penalties. I found out the hard way when rendering a massive chart with Swift Charts. All branching must be pushed upwards.

quantadev 17 May 2025

There's always a trade-off between performance and clarity in the code.

If a certain function has many preconditions it needs to check before running, but needs to potentially run from various places in the code, then moving the precondition checks outside the method results in faster code but destroys readability and breaks the DRY principle.

In cases where this kind of tension (DRY vs. non-DRY) exists I've sometimes named methods like 'maybeDoThing' (emphasis on 'maybe' prefix) indicating I'm calling the method, but that all the precondition checks are inside the function itself rather than duplicating logic all across the code, everywhere the method 'maybe' needs to run.

stuaxo 18 May 2025

Within a function, I'm a fan of early bail out.

While this goes against the usual advice of having the positive branch first, if the positive branch is sufficiently large you avoid having most of the function indented.
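A short Rust sketch of the early bail-out shape: guards first, then the main path at the function's top indentation level. The function, its messages, and its pricing constants are all made up for illustration:

```rust
/// Hypothetical pricing rule: flat base fee plus a per-kilogram rate.
fn shipping_cost(weight_kg: f64, destination: &str) -> Result<f64, &'static str> {
    // Guards: bail out early on anything the main path cannot handle.
    if weight_kg <= 0.0 {
        return Err("weight must be positive");
    }
    if destination.is_empty() {
        return Err("destination required");
    }

    // Main path, not nested inside any `if`.
    let base = 5.0;
    let per_kg = 1.5;
    Ok(base + per_kg * weight_kg)
}
```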

ashf023 19 May 2025

IMO pushing ifs up makes sense when the condition is enforceable by the type system, e.g. in the Option<Walrus> vs Walrus example, and when doing so makes the function's purpose more clear, or gives callers more flexibility. I think it's generally intuitive when writing code - can I use the type system to guard this rather than checking in my function, and should this function decide what happens in the unexpected cases?
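The Option<Walrus> vs Walrus case can be sketched like this. `frobnicate` here returns a string only so the example has something observable, and `handle` is a hypothetical caller:

```rust
struct Walrus {
    name: String,
}

/// Takes a `&Walrus`, so the signature itself rules out the "no walrus" case.
fn frobnicate(walrus: &Walrus) -> String {
    format!("frobnicated {}", walrus.name)
}

/// The caller owns the `Option` and decides what the missing case means.
fn handle(maybe_walrus: Option<Walrus>) -> String {
    match maybe_walrus {
        Some(walrus) => frobnicate(&walrus),
        None => String::from("no walrus"),
    }
}
```

A different caller could map `None` to an error or a default walrus without `frobnicate` changing at all, which is the flexibility mentioned above.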

Pushing fors down is usually not that relevant in Rust if all it achieves is inlining. The compiler can do that for you, or you can force it to, while improving reusability. It can make sense if there's more optimization potential with a loop e.g. lifting some logic outside the loop, and the compiler doesn't catch that. I also would avoid doing this in a way that doesn't work well with iterators, e.g. taking and returning a Vec.

Mystery-Machine 17 May 2025

Terrible advice. It's the exact opposite of "Tell, don't ask".

The performance cost of an if-statement and a for-loop is negligible. That's not the bottleneck of your app. If you're building something that needs to be highly performant, sure. But that's not the majority.

https://martinfowler.com/bliki/TellDontAsk.html

xg15 17 May 2025

Doesn't the second rule already imply some counterexamples for the first?

When I work with batches of data, I often end up with functions like this:

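  function process_batch(batch) {
    const stuff = setUpNeededHelpers(batch);
    const results = [];
    // for...of iterates over the items; for...in would iterate over the indices
    for (const item of batch) {
      const result = process_item(item, stuff);
      results.push(result);
    }
    return results;
  }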

Where "stuff" might be various objects, such as counters, lists or dictionaries to track aggregated state, opened IO connections, etc etc.

So the setUpNeededHelpers() section, while not extremely expensive, can have nontrivial cost.

I usually add a clause like

  if (batch.length == 0) {
    return [];
  }

at the beginning of the function to avoid this initialization cost if the batch is empty anyway.

Also, sometimes the initialization requires to access one element from the batch, e.g. to read metadata. Therefore the check also ensures there is at least one element available.

Wouldn't this violate the rule?
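For reference, the same shape in Rust, with the empty-batch guard placed before the setup step. `Helpers` and `set_up_helpers` are hypothetical; the setup reads the first element to mirror the "read metadata" case described above:

```rust
/// Hypothetical shared state built once per batch.
struct Helpers {
    offset: i64,
}

/// Reads the first element, so it must only be called on a non-empty batch.
fn set_up_helpers(batch: &[i64]) -> Helpers {
    Helpers { offset: batch[0] }
}

fn process_batch(batch: &[i64]) -> Vec<i64> {
    // Guard: skips the setup cost and protects the `batch[0]` access.
    if batch.is_empty() {
        return Vec::new();
    }
    let helpers = set_up_helpers(batch);
    batch.iter().map(|item| item - helpers.offset).collect()
}
```

The guard sits at the top of the function that owns the loop, so the if is still outside the for.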

salamanderman 17 May 2025

Moving preconditions up depends on what the definition of precondition is. There's some open source code I've done a deep dive in (Open cascade) and at some point they had an algorithm that assumed the precondition that the input was sorted, and that precondition was pushed up. Later they swapped out the algorithm for one that performs significantly better on randomized input and can perform very poorly on certain sorted input. Since the precondition was pushed up, though, it seems they didn't know how the input was transformed between the initial entrance function and the final inner function. Edit - if the precondition is something that can be translated into a Type then absolutely move the precondition up and let the compiler enforce it.

carom 17 May 2025

I strongly disagree with this ifs take. I want to validate data where it is used. I do not trust the caller (myself) to go read some comment about the assumptions on input data a function expects. I also don't want to duplicate that check in every caller.

deepsun 18 May 2025

In my field (server programming) readability trumps them all.

Nested g() and h() can be much better if they are even just 1% easier to understand. No one cares about a few extra CPU cycles, because we don't write system or database code.

jasonjmcghee 17 May 2025

My take on the if-statements example wasn't actually so much about if statements.

And this was obfuscated by author's use of global variables everywhere.

The key change was reducing functions' dependencies on outer parameters. Which is great.

jonstewart 17 May 2025

One reason to move conditionals out of loops is that it makes it easier for the compiler to vectorize and otherwise optimize the loop.

With conditionals, it's also useful to express them as ternary assignment when possible. This makes it more likely the optimizer will generate a conditional move instead of a branch. When the condition is not sufficiently predictable, a conditional move is far faster due to branch misprediction. Sometimes it's not always faster in the moment, but it can still alleviate pressure on the branch prediction cache.
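Rust has no ternary operator, but an `if` used as an expression plays the same role. A small sketch follows; `count_above` is a made-up name, and whether the optimizer emits a conditional move, a vectorized compare, or a branch depends on the compiler and target:

```rust
/// Counts the elements strictly greater than `threshold`.
fn count_above(xs: &[i32], threshold: i32) -> usize {
    let mut count = 0usize;
    for &x in xs {
        // Expression form: both outcomes are values, so there is no
        // control flow that must be a jump.
        count += if x > threshold { 1 } else { 0 };
    }
    count
}
```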

slt2021 17 May 2025

Ifs = control flow

Fors = data flow / compute kernel

it makes sense to keep control flow and data flow separated for greater efficiency, so that you independently evolve either of flows while still maintaining consistent logic

esafak 17 May 2025

I agree, except for this example, where the author effectively (after a substitution) prefers the former:

    fn f() -> E {
      if condition {
        E::Foo(x)
      } else {
        E::Bar(y)
      }
    }
    
    fn g(e: E) {
      match e {
        E::Foo(x) => foo(x),
        E::Bar(y) => bar(y)
      }
    }

The latter is not only more readable, but it is safer, because a match statement can ensure all possibilities are covered.

bluejekyll 17 May 2025

I really like this advice, but aren’t these two examples the same, but yet different advice?

  // Good?
  for walrus in walruses { walrus.frobnicate() }

Is essentially equivalent to

  // BAD
  for walrus in walruses { frobnicate(walrus) }

And this is good,

  // GOOD
  frobnicate_batch(walruses)

So should the first one really be something more like

  // impl FrobicateAll for &[Walrus]
  walruses.frobicate_all()

neilv 17 May 2025

A compiler that can prove that the condition-within-loop is constant for the duration of the looping can lift up that condition branching and emit two loops.

But I like to help the compiler with this kind of optimization, by just doing it in the code. Let the compiler focus on optimizations that I can't.
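Doing that transformation by hand might look like the following Rust sketch, with the in-loop form next to the hoisted form. Both function names and the per-element operations are hypothetical:

```rust
/// Condition tested on every iteration, even though it never changes.
fn in_loop(xs: &[i32], negate: bool) -> Vec<i32> {
    let mut out = Vec::with_capacity(xs.len());
    for &x in xs {
        if negate {
            out.push(-x);
        } else {
            out.push(x + 1);
        }
    }
    out
}

/// Condition tested once; each arm is its own branch-free loop.
fn hoisted(xs: &[i32], negate: bool) -> Vec<i32> {
    if negate {
        xs.iter().map(|&x| -x).collect()
    } else {
        xs.iter().map(|&x| x + 1).collect()
    }
}
```

The two are interchangeable only because `negate` cannot change while the loop runs.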

lblume 17 May 2025

In some cases the difference between if and for is not as clear-cut. A for loop over an option? Likely rather to be considered as an if. What about length-limited arrays, where the iteration mainly occurs as a way to control whether execution occurs at all?

wiradikusuma 17 May 2025

I just wrote some code with this "dilemma" a minute ago. But I was worried that callers would forget to include the "if", so I put it inside the method and renamed the method from "doSomething" to "maybeDoSomething".

renewiltord 17 May 2025

Yep, as a general heuristic pretty good. It avoids problems like n+1 queries and not using SIMD. And the if thing often makes it easier to reason about code. There are exceptions but I have had this same rule and it’s served me well.
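To make the n+1 point concrete, here is a toy Rust sketch with an in-memory stand-in for a database that counts round trips. `Db`, `get_one`, and `get_many` are invented for the example and do not model any real client library:

```rust
use std::collections::HashMap;

/// Hypothetical in-memory "database" that counts round trips.
struct Db {
    names: HashMap<u32, String>,
    round_trips: u32,
}

impl Db {
    /// One round trip per id: calling this in a loop is the n+1 pattern.
    fn get_one(&mut self, id: u32) -> Option<String> {
        self.round_trips += 1;
        self.names.get(&id).cloned()
    }

    /// The loop is pushed down into the callee: one round trip per batch.
    fn get_many(&mut self, ids: &[u32]) -> Vec<Option<String>> {
        self.round_trips += 1;
        ids.iter().map(|id| self.names.get(id).cloned()).collect()
    }
}
```

Both calls return the same data. The difference is that `get_many` sees the whole batch and can pay the per-call overhead once.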

stevage 17 May 2025

The author's main concern seems to be optimising performance critical code.

Terr_ 18 May 2025

For optimization, sure, but there are also cases where you care more about a maintainable expression of business rules, or the mental model used by subject experts.

ramesh31 18 May 2025

Becoming disciplined about early returns was lifechanging for me. It will blow your mind how much pointless code you were writing before.

boltzmann_ 17 May 2025

You notice this quickly after working on codebases where efficiency is important. Filter pushdown is one of the first database optimizations.

billmcneale 18 May 2025

These are some pretty arbitrary rules without much justification, quite reminiscent of the Clean Code fad.

uptownfunk 17 May 2025

Wow this is great where can I find this type of advice that relates to how to structure your code essentially.

throwawaymaths 17 May 2025

i would agree with the push ifs up except if you're doing options parsing. having a clean line of flow that effectively updates a struct with a bunch of "maybe" functions is much better if you're consistent with it.

anywhere else, push ifs up.

hamandcheese 17 May 2025

Isn't this just inversion of control? AKA the I in SOLID?

hk1337 18 May 2025

Stop using Else

99% of the time you can write better code without it.

hello_computer 17 May 2025

This thread is a microcosm of Babel.

zahlman 17 May 2025

> If you have complex control flow, better to fit it on a screen in a single function, rather than spread throughout the file.

This part in particular seems like an aesthetic judgment, and I disagree. I find it more natural to follow a flowchart than to stare at one.

> A related pattern here is what I call “dissolving enum” refactor.... There are two branching instructions here and, by pulling them up, it becomes apparent that it is the exact same condition, triplicated (the third time reified as a data structure):

The problem here isn't the code organization, but the premature abstraction. When you write the enum it should be because "reifying the condition as a data structure" is an intentional, purposeful act. Something that empowers you to, for example, evaluate the condition now and defer the response to the next event tick in a GUI.
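A sketch of what a purposeful reification could look like in Rust: the condition is evaluated now, stored as a value, and acted on later. `Decision`, `decide`, and `apply` are hypothetical, and the log stands in for whatever the deferred response would really do:

```rust
/// The outcome of the condition, kept as data.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Decision {
    Save,
    Discard,
}

/// Evaluate the condition now.
fn decide(dirty: bool) -> Decision {
    if dirty { Decision::Save } else { Decision::Discard }
}

/// Respond later, e.g. on the next event tick.
fn apply(decision: Decision, log: &mut Vec<&'static str>) {
    match decision {
        Decision::Save => log.push("saved"),
        Decision::Discard => log.push("discarded"),
    }
}
```

Here the enum earns its keep because decisions can sit in a queue between `decide` and `apply`. If both happened in the same expression, a plain if would do.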

> The primary benefit here is performance. Plenty of performance, in extreme cases.

Only if so many other things go right. Last I checked, simply wanting walruses to behave polymorphically already ruins your day, even if you've chosen a sufficiently low-level programming language.

A lot of the time, the "bad" code is the implementation of the function called in the "good" code. That makes said function easier to understand, by properly separating responsibilities (defining frobnication and iterating over walruses). Abstracting the inner loop to a function also makes it sane to express the iteration as a list comprehension without people complaining about how you have these nested list comprehensions spread over multiple lines, and why can't you just code imperatively like the normal programmers, etc.

> The two pieces of advice about fors and ifs even compose!

1. The abstraction needed to make the example comprehensible already ruins the illusion of `frobnicate_batch`.

2. If you're working in an environment where this can get you a meaningful performance benefit and `condition` is indeed a loop invariant (such that the transformation is correct), you are surely working in an environment where the compiler can just hoist that loop invariant.

3. The "good" version becomes longer and noisier because we must repeat the loop syntax.

> jQuery was quite successful back in the day, and it operates on collections of elements.

That's because of how it allowed you to create those collections (and provided iterators for them). It abstracted away the complex logic of iterating over the entire DOM tree to select nodes, so that you could focus on iterating linearly over the selected nodes. And that design implicitly, conceptually separated those steps. Even if it didn't actually build a separate container of the selected nodes, you could reason about what you were doing as if it did.

nfw2 17 May 2025

The performance gap of running a for loop inside or outside a function call is negligible in most real usage.

The premise that you can define best patterns like this, removed from context with toy words like frobnicate, is flawed. You should abstract your code in such a way that the operations contained are clearly intuited by the names and parameters of the abstraction boundaries. Managing cognitive load >>> nickel-and-diming performance in most cases.