Fast Allocations in Ruby 3.5 (railsatscale.com)
255 points by tekknolagi | 22 May 2025 | 62 comments

Comments

> I’ve been interested in speeding up allocations for quite some time. We know that calling a C function from Ruby incurs some overhead, and that the overhead depends on the type of parameters we pass.
> it seemed quite natural to use the triple-dot forwarding syntax (...).
> Unfortunately I found that using ... was quite expensive
> This lead me to implement an optimization for ... .
That’s some excellent yak shaving. And speeding up ... in any language is good news even if allocation is not faster.
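For anyone unfamiliar, here is a minimal sketch of the triple-dot forwarding the quoted article is describing (the Point class and build method are made up for illustration):

    class Point
      def initialize(x, y)
        @x, @y = x, y
      end
    end

    # `...` forwards all positional, keyword, and block
    # arguments to the callee in one token.
    def build(...)
      Point.new(...)
    end

    p build(1, 2)  # => #<Point:0x... @x=1, @y=2>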
> It’s very rare for code to allocate exactly the same type of object many times in a row, so the class of the instance local variable will change quite frequently.
That’s dangerous thinking, because constructor calls will follow a bimodal distribution.
A graph of calls or objects will contain either a large number of unique objects, layers of alternating objects, or a lot of one type of object. Any map function, for instance, will tend to return a bunch of the same object. When the median and the mean diverge like this, your thinking about perf gets muddy. An inline cache will make bulk allocations in list comprehensions faster; it won’t make creating DAGs faster. One is better than none.
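To make the two shapes concrete, here is a small Ruby sketch (the classes are invented for illustration, and the comments assume a single-entry per-call-site cache on the allocated class, as the article describes):

    Point  = Struct.new(:x, :y)
    Leaf   = Struct.new(:value)
    Branch = Struct.new(:value)

    # Monomorphic allocation site: `new` always sees Point here,
    # so a per-site inline cache stays hot for the whole loop.
    points = (1..1_000).map { |i| Point.new(i, i) }

    # Polymorphic allocation site: the receiver alternates between
    # two classes, so the same single-entry cache would miss on
    # roughly every other call.
    nodes = (1..1_000).map { |i| (i.even? ? Leaf : Branch).new(i) }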
It seems to me like all languages are converging towards something like WASM. I wonder if in 20 years we will see WASM become the de facto platform that all apps can compile to and all operating systems can run near-natively, with only a thin layer like WASI but more convenient.
I know I may be jumping the gun a little here, but I wonder what percentage speedup we could expect on typical Rails applications, especially with Active Record.
Will YJIT features like Fast Allocations also be brought to ZJIT?
https://railsatscale.com/2025-05-14-merge-zjit/