Pipelining might be my favorite programming language feature

(herecomesthemoon.net)

Comments

invalidator 21 April 2025
The author keeps calling it "pipelining", but I think the right term is "method chaining".

Compare with a simple pipeline in bash:

  grep needle < haystack.txt | sed 's/foo/bar/g' | xargs wc -l
Each of those components executes in parallel, with the intermediate results streaming between them. You get a similar effect with coroutines.

Compare Ruby:

  data = File.readlines("haystack.txt")
    .map(&:strip)
    .grep(/needle/)
    .map { |i| i.gsub('foo', 'bar') }
    .map { |i| File.readlines(i).count }
In that case, each step runs to completion, materializing a complete array before the next step begins. Nothing actually gets pipelined.
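For what it's worth, a lazily evaluated chain sits somewhere in between: in Rust, for instance, iterator adapters pull one element at a time through the whole chain, so no intermediate array is materialized. A rough sketch of the same haystack idea (assuming the file exists):

    fn main() {
        // Each line flows through every step in turn; the adapters are
        // lazy, so no intermediate collection is built between steps.
        let count = std::fs::read_to_string("haystack.txt")
            .unwrap()
            .lines()
            .map(str::trim)
            .filter(|l| l.contains("needle"))
            .map(|l| l.replace("foo", "bar"))
            .count();
        println!("{count}");
    }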

The chained style is clean and readable, but I don't tend to write it any more, because it's harder to debug. More often these days, I write things like this:

  data = File.readlines("haystack.txt")
  data = data.map(&:strip)
  data = data.grep(/needle/)
  data = data.map { |i| i.gsub('foo', 'bar') }
  data = data.map { |i| File.readlines(i).count }
It's ugly, but you know what? I can set a breakpoint anywhere and inspect the intermediate states without having to edit the script in prod. Sometimes ugly and boring is better.
bnchrch 21 April 2025
I'm personally someone who advocates for languages to keep their feature set small and shoot to achieve a finished feature set quickly.

However.

I would be lying if I didn't secretly wish that all languages adopted the `|>` syntax from Elixir.

```
params
|> Map.get("user")
|> create_user()
|> notify_admin()
```

Straw 21 April 2025
Lisp macros allow a general solution to this that doesn't just handle chained collection operators but allows you to decide the order in which you write any chain of calls.

For example, we can write: (foo (bar (baz x))) as (-> x baz bar foo)

If there are additional arguments, we can accommodate those too: (sin (* x pi)) as (-> x (* pi) sin)

Here the expression so far gets inserted as the first argument of each form. If you want it inserted as the last argument, you can use ->> instead:

(filter positive? (map sin x)) as (->> x (map sin) (filter positive?))

You can also get full control of where to place the previous expression using as->.

Full details at https://clojure.org/guides/threading_macros

duped 21 April 2025
A pipeline operator is just partial application with less power. You should be able to bind any number of arguments to any places in order to create a new function and "pipe" its output(s) to any other number of functions.

One day, we'll (re)discover that partial application is actually incredibly useful for writing programs and (non-Haskell) languages will start with it as the primitive for composing programs instead of finding out that it would be nice later, and bolting on a restricted subset of the feature.
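A rough Rust sketch of the point, with made-up stages: binding arguments up front yields unary functions that drop straight into a chain, and the pipe falls out as a special case.

    fn main() {
        let scale = |k: i64| move |x: i64| x * k;  // bind k now, take x later
        let offset = |b: i64| move |x: i64| x + b;
        let out: Vec<i64> = (1..=3)
            .map(scale(10))  // partially applied: only x remains
            .map(offset(1))
            .collect();
        assert_eq!(out, vec![11, 21, 31]);
    }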

SimonDorfman 21 April 2025
The tidyverse folks in R have been using that for a while: https://magrittr.tidyverse.org/reference/pipe.html
amai 21 April 2025
Pipelining looks nice until you have to debug it. And exception handling is also very difficult, because it means adding forks into your pipelines. Pipelines are only good for programming the happy path.
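A small Rust illustration of that fork, with a toy parsing stage: the happy path stays a clean chain, but the moment one step can fail, the unhappy path needs its own branch.

    fn main() {
        let data = vec!["1", "2", "oops", "4"];
        let parsed: Result<Vec<i32>, _> = data.iter()
            .map(|s| s.parse::<i32>())  // each element now yields a Result
            .collect();                 // the first Err aborts the pipeline
        match parsed {
            Ok(v) => println!("sum = {}", v.iter().sum::<i32>()),
            Err(e) => eprintln!("unhappy path: {e}"),  // the fork
        }
    }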
kordlessagain 21 April 2025
While the author claims "semantics beat syntax every day of the week," the entire article focuses on syntax preferences rather than semantic differences.

Pipelining can become hard to debug when chains get very long. The author doesn't address how hard it can be to identify which step in a long chain caused an error.

They do make fun of Python, however, but they don't say much about why they don't like it, other than showing a low-res photo of a rock with a pipe routed around it.

Ambiguity about what constitutes "pipelining" is the real issue here. The definition keeps shifting throughout the article. Is it method chaining? Operator overloading? First-class functions? The author uses examples that function very differently.

epolanski 21 April 2025
I personally like how effect-ts allows you to write both pipelines or imperative code to express the very same things.

Building pipelines:

https://effect.website/docs/getting-started/building-pipelin...

Using generators:

https://effect.website/docs/getting-started/using-generators...

Having both options is great (at the beginning Effect had only pipe-based pipelines). After years of writing Effect, I'm convinced that most of the time you'd rather write and read imperative code than pipelines, which definitely have their place in codebases.

In fact most of the community, at large, converged on imperative-style generators over pipelines. Having onboarded many devs, and having seen many long-time pipeliners converge on classical imperative control flow, seems to confirm that both debugging and maintenance are easier.

osigurdson 21 April 2025
C# has had "Pipelining" (aka Linq) for 17 years. I do miss this kind of stuff in Go a little.
singularity2001 21 April 2025
I tried to convince the Julia authors to make a.b(c) synonymous with b(a, c), as in Nim (for similar reasons as in the article). They didn't like it.
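Rust's method syntax is essentially that sugar already, visible through its fully qualified call form; a tiny sketch with a made-up type:

    struct Widget { id: u32 }

    impl Widget {
        fn scaled(&self, k: u32) -> u32 { self.id * k }
    }

    fn main() {
        let w = Widget { id: 7 };
        // a.b(c) is sugar for b(a, c):
        assert_eq!(w.scaled(3), Widget::scaled(&w, 3));
    }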
0xf00ff00f 21 April 2025
First example doesn't look bad in C++23:

    auto get_ids(std::span<const Widget> data)
    {
        using namespace std::views;  // filter, transform
        return data
            | filter(&Widget::alive)
            | transform(&Widget::id)
            | std::ranges::to<std::vector>();
    }
cutler 21 April 2025
Clojure has pipeline functions -> and ->> without resorting to OO dot syntax.
vitus 21 April 2025
I think the biggest win for pipelining in SQL is the fact that we no longer have to explain that SQL execution order has nothing to do with query order, and we no longer have to pretend that we're mimicking natural language. (That last point stops being the case when you go beyond "SELECT foo FROM table WHERE bar LIMIT 10".)

No longer do we have to explain that expressions are evaluated in the order of FROM -> JOIN -> ON -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY -> LIMIT (and yes, I know I'm missing several other steps). We can simply express how our data flows from one statement to the next.

(I'm also stating this as someone who has yet to play around with the pipelining syntax, but honestly anything is better than the status quo.)

rocqua 22 April 2025
The right-to-left order of function application really doesn't work well with English reading left to right. I found this especially clear with the 'composition operator' on functions, where f ∘ g has to mean f _after_ g because you really want:

    (f ∘ g)(x) = f(g(x))
Based on this, I think a reverse Polish type of notation would be a lot better. Though perhaps it is a lot nicer to think of "the sine of an angle" than "angle sine-ed".

Not that it matters much, the switching costs are immense. Getting people able to teach it would be impossible, and collaboration with people taught in the other system would be horrible. I am doubtful I could make the switch, even if I wanted.
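A sketch of what reading-order composition could look like, here as a hypothetical `then` combinator in Rust, so the source order matches how the data flows:

    // `then` composes in reading order: apply f first, then g.
    fn then<A, B, C>(f: impl Fn(A) -> B, g: impl Fn(B) -> C) -> impl Fn(A) -> C {
        move |x| g(f(x))
    }

    fn main() {
        // "double the angle, then take the sine" reads the way it runs
        let f = then(|x: f64| x * 2.0, f64::sin);
        assert_eq!(f(0.25), (0.5_f64).sin());
    }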

mrkeen 21 April 2025

  data.iter()
      .filter(|w| w.alive)
      .map(|w| w.id)
      .collect()

  collect(map(filter(iter(data), |w| w.alive), |w| w.id))
The second approach is open for extension - it allows you to write new functions on old datatypes.

> Quick challenge for the curious Rustacean, can you explain why we cannot rewrite the above code like this, even if we import all of the symbols?

Probably for lack of

> weird operators like <$>, <*>, $, or >>=

davemp 22 April 2025
Computer scientists continue to pick terrible names. Pipelining is already an overloaded concept that implies some type of operation-level parallelism. Picking names like this does everyone in the field a disservice. Calling it something like “composition chain” would be much clearer with respect to existing literature in the field. Maybe I’m being nitpicky, but sometimes it feels like the Tower of Babel parable talking to folks who use different ecosystems.
andyferris 22 April 2025
Pipelining is great! Though sometimes you want to put the value in the first argument of a function, or a different location, or else call a method... it can be nice to simply refer to the value directly with `_` or `%` or `$` or something.

In fact, I always thought it would be a good idea for all statement blocks (in any given programming language) to allow an implicit reference to the value of the previous statement. The pipeline operation would essentially be the existing semicolons (in a C-like language) and there would be a new symbol or keyword used to represent the previous value.

For example, the MATLAB REPL allows for referring to the previous value as `ans` and the Julia REPL has inherited the same functionality. You can copy-paste this into the Julia REPL today:

    [1, 2, 3];
    map(x -> x * 2, ans);
    @show ans;
    filter(x -> x > 2, ans);
    @show ans;
    sum(ans)
You can't use this in Julia outside the REPL, and I don't think `ans` is a particularly good keyword for this, but I honestly think the concept is good enough. The same thing in JavaScript using `$` as an example:

    {
      [1, 2, 3];
      $.map(x => x * 2);
      (console.log($), $);
      $.filter(x => x > 2);
      (console.log($), $);
      $.reduce((acc, next) => acc + next, 0)
    }
I feel it would work best with expression-based languages having blocks that return their final value (like Rust) since you can do all sorts of nesting and so-on.
okayishdefaults 21 April 2025
Surprised that the term "tacit programming" wasn't mentioned once in the article.

Point-free style and pipelining were meant for each other. https://en.m.wikipedia.org/wiki/Tacit_programming

weinzierl 21 April 2025
I suffer from (what I call) bracket claustrophobia. Whenever brackets get nested too deeply, it makes me uncomfortable. But I fully realize that there are people who are the complete opposite. Lisp programmers are apparently as claustrophilic as cats and spelunkers.
snthpy 23 April 2025
Nice post and I very much agree!

In fact I tried to make some similar points in my CMU "SQL or Death" Seminar Series talk on PRQL (https://db.cs.cmu.edu/events/sql-death-prql-pipelined-relati...) in that I would love to see PRQL (or something like it) become a universal DSL for data pipelines. Ideally this wouldn't even have to go through some query engine and could just do some (byte)codegen for your target language.

P.S. Since you mentioned the Google Pipe Syntax HYTRADBOI 2025 talk, I just want to throw out that I also have a 10 min version for the impatient: https://www.hytradboi.com/2025/deafce13-67ac-40fd-ac4b-175d5... That's just a PRQL overview though. The Universal Data Pipeline DSL ideas and comparison to LINQ, F#, ... are only in the CMU talk. I also go a bit into imperative vs declarative and point out that since "pipelining" is just function composition it should really be "functional" rather than imperative or declarative (which also came up in this thread).

RHSeeger 21 April 2025
I feel like, at least in some cases, the article is going out of its way to make the "undesired" look worse than it needs to be. Comparing

    fn get_ids(data: Vec<Widget>) -> Vec<Id> {
        collect(map(filter(map(iter(data), |w| w.toWingding()), |w| w.alive), |w| w.id))
    }
to

    fn get_ids(data: Vec<Widget>) -> Vec<Id> {
        data.iter()
            .map(|w| w.toWingding())
            .filter(|w| w.alive)
            .map(|w| w.id)
            .collect()
    }
The first one would read more easily (and, since it was called out, diff better) when formatted as

    fn get_ids(data: Vec<Widget>) -> Vec<Id> {
        collect(
            map(
                filter(
                    map(iter(data), |w| w.toWingding()), |w| w.alive), |w| w.id))
    }
Admittedly, the chaining is still better. But a fair number of the article's complaints are about the lack of newlines being used; not about chaining itself.
bjourne 22 April 2025
In concatenative languages with an implicit stack (Factor) that expression would read:

    iter [ alive? ] filter [ id>> ] map collect
The beauty of this is that everything can be evaluated strictly left-to-right. Every single symbol. "Pipelines" in other languages are never fully left-to-right evaluated. For example, ".filter(|w| w.alive)" in the author's example requires one to switch from postfix to infix evaluation to evaluate the filter application.

The major advantage is that handling multiple streams is natural. Suppose you want to compute the dot product of two files where each line contains a float:

    fileA fileB [ lines [ str>float ] map ] bi@ [ mul ] 2map 0 [ + ] reduce
relaxing 21 April 2025
These articles never explain what’s wrong with calling each function separately and storing each return value in an intermediate variable.

Being able to inspect the results of each step right at the point you’ve written it is pretty convenient. It’s readable. And the compiler will optimize it out.

EnPissant 22 April 2025
I don't know. I find this:

    fn get_ids(data: Vec<Widget>) -> Vec<Id> {
        let mut result = Vec::new();
    
        for widget in &data {
            if widget.alive {
                result.push(widget.id);
            }
        }
    
        result
    }
more readable than this:

    fn get_ids(data: Vec<Widget>) -> Vec<Id> {
        data.iter()
            .filter(|w| w.alive)
            .map(|w| w.id)
            .collect()
    }
and I also dislike Rust requiring you to write "mut" for mutable local variables. It's mostly just busywork and dogma.
hliyan 21 April 2025
I always wondered how programming would be if we hadn't designed the assignment operator to be consistent with mathematics, and instead had it go LHS -> RHS, i.e. you perform the operation and then decide its destination, much like Unix pipes.
dapperdrake 21 April 2025
Pipelining in software is covered by Richard C. Waters (1989a, 1989b). I wrangled this library to work with JavaScript. Incredibly effective. Much faster at writing and composing code. And this code executes much faster.

https://dspace.mit.edu/handle/1721.1/6035

https://dspace.mit.edu/handle/1721.1/6031

https://dapperdrake.neocities.org/faster-loops-javascript.ht...

neuroelectron 21 April 2025
I really like the website layout. I'm guessing that they're optimizing for Kindle or other e-paper readers.
otsukare 21 April 2025
I wish more languages would aim for infix functions (like Haskell and Kotlin), rather than specifically the pipe operator.
huyegn 21 April 2025
I liked the pipelining syntax so much from pyspark and linq that I ended up implementing my own mini linq-like library for python to use in local development. It's mainly used in quick data processing scripts that I run locally. The syntax just makes everything much nicer to work with.

https://datapad.readthedocs.io/en/latest/quickstart.html#ove...

layer8 21 April 2025
The one thing that I don’t like about pipelining (whether using a pipe operator or method chaining), is that assigning the result to a variable goes in the wrong direction, so to speak. There should be an equivalent of the shell’s `>` for piping into a variable as the final step. Of course, if the variable is being declared at the same time, whatever the concrete syntax is would still require some getting used to, being “backwards” compared to regular assignment/initialization.
1899-12-30 21 April 2025
You can somewhat achieve a pipelined like system in sql by breaking down your steps into multiple CTEs. YMMV on the performance though.
wavemode 21 April 2025
> At this point you might wonder if Haskell has some sort of pipelining operator, and yes, it turns out that one was added in 2014! That’s pretty late considering that Haskell exists since 1990.

The tone of this (and the entire Haskell section of the article, tbh) is rather strange. Operators aren't special syntax and they aren't "added" to the language. Operators are just functions that by default use infix position. (In fact, any function can be called in infix position. And operators can be called in prefix position.)

The commit in question added & to the prelude. But if you wanted & (or any other character) to represent pipelining you have always been able to define that yourself.

Some people find this horrifying, which is a perfectly valid opinion (though in practice, when working in Haskell it isn't much of a big deal if you aren't foolish with it). But at least get the facts correct.

shae 21 April 2025
If Python object methods returned `self` by default instead of `None` you could do this in Python too!

This is my biggest complaint about Python.
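The same "return self" convention is what makes builder-style chaining work elsewhere; a minimal Rust sketch with a made-up type:

    #[derive(Default, Debug)]
    struct Query { table: String, limit: usize }

    impl Query {
        // each method hands the receiver back, so the calls chain
        fn table(mut self, t: &str) -> Self { self.table = t.into(); self }
        fn limit(mut self, n: usize) -> Self { self.limit = n; self }
    }

    fn main() {
        let q = Query::default().table("users").limit(10);
        println!("{q:?}");
    }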

mexicocitinluez 21 April 2025
LINQ is easily one of C#'s best features.
pxc 21 April 2025
Maybe it's because I love the Unix shell environment so much, but I also really love this style. I try to make good use of it in every language I write code in, and I think it helps make my control flow very simple. With lots of pipelines, and few conditionals or loops, everything becomes very easy to follow.
jmyeet 22 April 2025
Hack (Facebook's PHP fork) has this feature. It's called pipes [1]:

    $x = vec[2,1,3]
      |> Vec\map($$, $a ==> $a * $a) // $$ with value vec[2,1,3]
      |> Vec\sort($$); // $$ with value vec[4,1,9]
It is a nice feature. I do worry about error reporting with any feature that combines multiple statements into a single statement, which is essentially what this does. In Java, there was always an issue with NullPointerExceptions being thrown, and if you chain several things together you're never sure which one was null.

[1]: https://docs.hhvm.com/hack/expressions-and-operators/pipe

rokob 22 April 2025
I learned the term for this as a fluent interface. Pipelining is in my mind something quite different.
flakiness 21 April 2025
After seeing LangChain abusing the "|" operator overload for a pipeline-like DSL, I followed suit at work and I loved it. It's especially good when you use it in a notebook environment where you literally build the pipeline incrementally through the REPL.
jiggawatts 22 April 2025
PowerShell has the best pipeline capability of any language I have ever seen.

For comparison, UNIX pipes support only trivial byte streams from output to input.

PowerShell allows typed object streams where the properties of the object are automatically wired up to named parameters of the commands on the pipeline.

Outputs at any stage can not only be wired directly to the next stage but also captured into named variables for use later in the pipeline.

Every command in the pipeline also gets begin/end/cancel handlers automatically invoked so you can set up accumulators, authentication, or whatever.

UNIX scripting advocates don’t know what they’re missing out on…

_heimdall 22 April 2025
Is pipelining the right term here? I've always used the term "transducer" to describe this kind of process, I picked it up from an episode of FunFunFunction if I'm not mistaken.
TrianguloY 21 April 2025
Kotlin sort of has it with let (and run)

    a().let{ b(it) }.let{ c(it) }
taeric 21 April 2025
A thing I really like about pipelines in shell scripts, is all of the buffering and threading implied by them. Semantically, you can see what command is producing output, and what command is consuming it. With some idea of how the CPU will be split by them.

This is far different from the pattern described in the article, though. Small shame they have come to share the same name. I can see how both fit the metaphor, so I can't really complain. The "pass a single parameter along" approach is far less attractive to me, though.
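You can make that implied buffering and threading explicit; a rough Rust sketch with a bounded channel standing in for the pipe buffer:

    use std::sync::mpsc::sync_channel;
    use std::thread;

    fn main() {
        let (tx, rx) = sync_channel(64); // the "pipe buffer"
        let producer = thread::spawn(move || {
            for i in 0..1000_i64 {
                tx.send(i).unwrap(); // blocks when the buffer is full
            }
        });
        // the consumer runs concurrently on this thread
        let total: i64 = rx.into_iter().filter(|i| i % 2 == 0).sum();
        producer.join().unwrap();
        println!("{total}");
    }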

zelphirkalt 21 April 2025
To one-up this: of course it is even better if your language allows you to implement proper pipelining, with implicit argument passing, by yourself. Then the language does not need to provide it or assign meaning to particular symbols for pipelining. You can decide for yourself what symbols are used and what you find intuitive.

Pipelining can guide one to write a bit cleaner code, viewing steps of computation as such, and not as modifications of global state. It forces one to make each step return a result, write proper functions. I like proper pipelining a lot.
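In Rust, for example, the whole feature fits in a user-level extension trait; a sketch (the `Pipe` name is my own):

    // A DIY pipe: any value can be piped into any unary function.
    trait Pipe: Sized {
        fn pipe<B>(self, f: impl FnOnce(Self) -> B) -> B {
            f(self)
        }
    }
    impl<T> Pipe for T {}

    fn main() {
        let n = "  42  "
            .pipe(str::trim)
            .pipe(|s| s.parse::<i32>().unwrap())
            .pipe(|n| n * 2);
        assert_eq!(n, 84);
    }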

chewbacha 21 April 2025
Is this pipelining or the builder pattern?
stuaxo 21 April 2025
It's part of why jQuery was so great, and the Django ORM.
dpc_01234 21 April 2025
I think there's a language syntax to be invented that would make everything suffix/pipeline-based. Stack based languages are kind of there, but I don't think exactly the same thing.

BTW. For people complaining about debug-ability of it: https://doc.rust-lang.org/std/iter/trait.Iterator.html#metho... etc.

true_blue 21 April 2025
That new Rhombus language that was featured here recently has an interesting feature where you can use `_` in a function call to act as a "placeholder" for an argument. Essentially it's an easy way to partially apply a function. This works very well with piping because it allows you to pipe into any argument of a function (including optional arguments iirc) rather than just the first like many pipe implementations have. It seems really cool!
immibis 22 April 2025
We had this - it was called variables. You could do:

    x = iter(data);
    y = filter(x, w=>w.isAlive);
    z = map(y, w=>w.id);
    return collect(z);

It doesn't need new syntax, but to implement this with the existing syntax you do have to figure out what the intermediate objects are, but you also have that problem with "pipelining" unless it compiles the whole chain into a single thing a la Linq.
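Rust's iterator chains are an instance of the latter: each adapter wraps the previous one in a new type, and nothing executes until the chain is driven. A sketch:

    fn main() {
        let data = vec![1_u32, 2, 3];
        // `chain` is one nested type, roughly Map<Filter<Iter<u32>, _>, _>
        let chain = data.iter().filter(|w| **w > 1).map(|w| w * 2);
        let out: Vec<u32> = chain.collect(); // nothing runs until here
        assert_eq!(out, vec![4, 6]);
    }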

raggi 21 April 2025
> (This is not real Rust code. Quick challenge for the curious Rustacean, can you explain why we cannot rewrite the above code like this, even if we import all of the symbols?)

Um, you can:

        #![feature(import_trait_associated_functions)]
        use Iterator::{collect, map, filter};
        
        fn get_ids2(data: Vec<usize>) -> Vec<usize> {
            collect(map(filter(<[_]>::iter(&data), |v| ...), |v| ...))
        }
and you can because it's lazy, which is also the same reason you can write it the other way in Rust. I think the author was getting at an ownership trap, but that trap is avoided the same way for both arrangements; the call order is the same in both. If the calls were actually a pipeline (if collect didn't exist and didn't need to be called), then other considerations would show up.
XorNot 21 April 2025
Every example of why this is meant to be good is contrived.

You have a create_user function that doesn't error? Has no branches based on type of error?

We're having arguments over the best way break these over multiple lines?

Like.. why not just store intermediate results in variables? Where our branch logic can just be written inline? And then the flow of data can be very simply determined by reading top to bottom?

kissgyorgy 22 April 2025
In Nix, you can do something like this:

    gitRef = with lib;
      pipe .git/HEAD [
        readFile
        trim
        (splitString ":")
        last
        trim
        (ref: ./.git/${ref})
        readFile
        trim
      ];
Super clean and cool!
amelius 21 April 2025
Am I the only one who thinks yuck?

Instead of writing: a().b().c().d(), it's much nicer to write: d(c(b(a()))), or perhaps (d ∘ c ∘ b ∘ a)().

middayc 22 April 2025
Ryelang has a little different take on this ... op-words and pipe-words: https://ryelang.org/meet_rye/specifics/opwords/
jesse__ 22 April 2025
I've always wondered why more languages don't do this. It just makes sense
moralestapia 22 April 2025
(Un)surprisingly, the author ignores this is almost already a thing in JS. What a terrible oversight.

Anyway, JS wins again, give it a try if you haven't, it's one of the best languages out there.

drchickensalad 21 April 2025
I miss F#
kuon 21 April 2025
That's also why I enjoy elixir a lot.

The |> operator is really cool.

ZYbCRq22HbJ2y7 21 April 2025
It's nice sugar, but pretty much any modern widely used language supports "pipelining", just not of the SML flavor.
bluSCALE4 21 April 2025
Same. The sad part is that pipelining seems to be something AI is really good at so I'm finding myself writing less of it.
Weryj 22 April 2025
LINQ was my gateway drug into functional programming, Pipelining is so beautiful.
jongjong 21 April 2025
Pipelining is great. Currying is horrible. Though currying superficially looks similar to pipelining.

One difference is that currying returns an incomplete result (another function) which must be called again at a later time. On the other hand, pipelining usually returns raw values. Currying returns functions until the last step. The main philosophical failure of currying is that it treats logic/functions as if they were state which should be passed around. This is bad. Components should be responsible for their own state and should just talk to each other to pass plain information. State moves, logic doesn't move. A module shouldn't have awareness of what tools/logic other modules need to do their jobs. This completely breaks the separation of concerns principle.
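Roughly, in Rust, with toy steps: a curried step hands back another function, while a pipeline step hands back a plain value.

    fn main() {
        // curried: each call returns a function until the last step
        let add = |a: i32| move |b: i32| a + b;
        let add2 = add(2);  // still a function, not a result
        assert_eq!(add2(3), 5);

        // pipelined: each step returns a plain value
        let out: i32 = [1, 2, 3].iter().map(|x| x + 2).sum();
        assert_eq!(out, 12);
    }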

When you call a plumber to fix your drain, do you need to provide them with a toolbox? Do you even need to know what's inside their toolbox? The plumber knows what tools they need. You just show them what the problem is. Passing functions to another module is like giving a plumber a toolbox which you put together by guessing what tools they might need. You're not a plumber, why should you decide what tools the plumber needs?

Currying encourages spaghetti code which is difficult to follow when functions are passed between different modules to complete the currying. In practice, if one can design code which gathers all the info it needs before calling the function once, this leads to much cleaner and much more readable code.

bcoates 22 April 2025
Why is the SQL syntax so unnecessarily convoluted? SQL is already an operator language, just an overly constrained one due to historical baggage. If you're going to allow new syntax at all, you can just do

  from customer
  left join orders on c_custkey = o_custkey and o_comment not like '%unusual%'
  group by c_custkey
  alias count(o_orderkey) as count_of_orders
  group by count_of_orders
  alias count(*) as count_of_customers
  order by count_of_customers desc
  select count_of_customers, count_of_orders;
  
I'm using 'alias' here as a strawman keyword for what the slide deck calls a free-standing 'as' operator because you can't reuse that keyword, it makes the grammar a mess.

The aliases aren't really necessary, you could just write the last line as 'select count(count(*)) ncust, count(*) nord' if you aren't afraid of nested aggregations, and if you are you'll never understand window functions, soo...

The |> syntax adds visual noise without expressive power, and the novelty 'aggregate'/'call' operators are weird special-case syntax for something that isn't that complex in the first place.

The implicit projection is unnecessary too, for the same reason any decent SQL linter will flag an ambiguous 'select *'

jaymbo 21 April 2025
This is why I love Scala so much
joeevans1000 21 April 2025
Clojure threading, of course.
wslh 21 April 2025
I also like a syntax that includes pipelining parallelization, for example:

    A
    .B
    .C
      || D
      || E
tantalor 21 April 2025
> allows you to omit a single argument from your parameter list, by instead passing the previous value

I have no idea what this is trying to say, or what it has to do with the rest of the article.

guerrilla 21 April 2025
This is just super basic functional programming. Seems like we're taking the long way around...
HackerThemAll 22 April 2025
I like how they are either unaware of F#'s neat pipelines or deliberately "forgot" to mention them.
tpoacher 21 April 2025
pipelines are great IF you can debug them as easily as temp variable assignments

... looking at you R and tidyverse hell.

blindseer 21 April 2025
This article is great, and really distills why the ergonomics of Rust is so great and why languages like Julia are so awful in practice.