Go Optimization Guide

(goperf.dev)

Comments

nopurpose 31 March 2025
Every perf guide recommends minimizing allocations to reduce GC time, but if you look at a pprof profile of a Go app, the GC mark phase is what takes time, not GC sweep. GC mark always starts from known live roots (goroutine stacks, globals, etc.) and traverses references from there, colouring every pointer. To minimize GC time it is best to avoid _long-lived_ allocations. Short-lived allocations, those which the GC mark phase will never reach, have an almost negligible effect on GC times.

Allocations of any kind cause GC to trigger earlier, but in real apps it is almost hopeless to avoid GC, except for very carefully written programs with no dependencies, and if GC happens, then reducing GC mark times gives a bigger bang for the buck.
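
A rough way to see the difference, as a sketch rather than a proper benchmark: the `node` type and the `buildList` and `timeGC` helpers are made up for illustration, `runtime.GC()` times a whole forced cycle (not just the mark phase), and the numbers will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// node is a hypothetical pointer-carrying type, so the GC has references to follow.
type node struct {
	next *node
	val  int
}

// buildList allocates n linked nodes; the returned head keeps all of them reachable.
func buildList(n int) *node {
	var head *node
	for i := 0; i < n; i++ {
		head = &node{next: head, val: i}
	}
	return head
}

// timeGC forces a full collection and reports how long the call took.
func timeGC() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

func main() {
	// Short-lived: every list is unreachable by the time the GC runs.
	for i := 0; i < 10; i++ {
		_ = buildList(100_000)
	}
	fmt.Println("forced GC, only dead objects:", timeGC())

	// Long-lived: one million nodes stay reachable, so mark has to walk them.
	live := buildList(1_000_000)
	fmt.Println("forced GC, 1M live nodes:    ", timeGC())
	runtime.KeepAlive(live)
}
```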

stouset 1 April 2025
Checking out the first example—object pools—I was initially blown away that this is not only possible but it produces no warnings of any kind:

    pool := sync.Pool{
        New: func() any { return 42 },
    }

    a := pool.Get()

    pool.Put("hello")
    pool.Put(struct{}{})

    b := pool.Get()
    c := pool.Get()
    d := pool.Get()

    fmt.Println(a, b, c, d)

Of course, the answer is that this API existed before generics so it just takes and returns `any` (née `interface{}`). It just feels as though golang might be strongly typed in principle, but in practice there are APIs left and right that escape out of the type system and lose all of the actual benefits of having it in the first place.

Is a type system all that helpful if you have to keep turning it off any time you want to do something even slightly interesting?

Also I can't help but notice that there's no API to reset values to some initialized default. Shouldn't there be some sort of (perhaps optional) `Clear` callback that resets values back to a sane default, rather than forcing every caller to remember to do so themselves?

kevmo314 31 March 2025
Zero-copy is totally underrated. Like the site alludes to, Go's interfaces make it reasonably accessible to write zero-copy code but it still needs some careful crafting. The payoff is great though, I've often been surprised by how much time is spent allocating and shuffling memory around.
roundup 31 March 2025
Additionally...

- https://go101.org/optimizations/101.html

- https://github.com/uber-go/guide

I wish this content existed as a model context protocol (MCP) tool to connect to my IDE along w/ local LLM.

After 6 months of switching between different language projects, it's challenging to remember all the important things.

donatj 1 April 2025
Unpopular opinion maybe, but sync.Pool is so sharp, dangerous and leaky that I'd avoid using it unless it's your absolute last option. And even then, maybe consider a second server first.
jrockway 31 March 2025
GOMEMLIMIT has saved me a number of times. In containerized production, it's nice, because sometimes jobs are ephemeral and don't even do enough allocations to hit the memory limit, so you don't spend any time in GC. But it's saved me the most times in CI where golangci-lint or govulncheck can't complete without running out of memory on a kind-of-large CI machine. Set GOMEMLIMIT and it eventually completes. (I switched to nogo, though, so at least golangci-lint isn't a problem anymore.)
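
For anyone who hasn't used it: the limit is soft and can be set through the environment or from code (Go 1.19+). A minimal sketch, where the 512 MiB value and the `setLimit` and `currentLimit` helpers are arbitrary choices for illustration:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setLimit sets the runtime's soft memory limit in bytes and returns the previous one.
// Roughly the in-process equivalent of starting the program with GOMEMLIMIT=512MiB.
func setLimit(bytes int64) int64 {
	return debug.SetMemoryLimit(bytes)
}

// currentLimit reads the limit without changing it (a negative input only queries).
func currentLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	prev := setLimit(512 << 20) // 512 MiB, an arbitrary example value
	fmt.Println("previous limit:", prev)
	fmt.Println("current limit: ", currentLimit())
}
```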
parhamn 31 March 2025
Noticed the object pooling doc, had me wondering: are there any plans to make packages like `sync` generic?
dennis-tra 1 April 2025
Can someone explain to me why the compiler can’t do struct-field-alignment? This feels like something that can easily be automated.
__turbobrew__ 2 April 2025
Calling mmap “zero copy” is generous. I guess we gloss over the whole page fault thing, or the fact that performance is heavily dependent on how much memory pressure the process is under.

This is the same n00b trap that derailed the llama.cpp project last year because people don’t understand how memory maps and paging work, and the tradeoffs.

neillyons 1 April 2025
Curious to know what people are building where you need to optimise like this? eg Struct Field Alignment https://goperf.dev/01-common-patterns/fields-alignment/#avoi...
EdwardDiego 1 April 2025
Huh, this surprises me about Golang, didn't realise it was so similar to C with struct alignment. https://goperf.dev/01-common-patterns/fields-alignment/#why-...
inadequatespace 2 April 2025
Why doesn’t the compiler pack structs for you if it’s as easy as shuffling around based on type?
jensneuse 31 March 2025
You can often fool yourself by using sync.Pool. pprof looks great because no allocs in benchmarks but memory usage goes through the roof. It's important to measure real world benefits, if any, and not just synthetic benchmarks.
nikolayasdf123 1 April 2025
nicely organised. I feel like this could grow into a community-driven, current state-of-the-art collection of optimisation tips for Go. just need to let people edit/comment their input easily (preferably in-place). I see there is a github repo, but my bet is people would not actively add their input/suggestions/research there, it is hidden too far from the content/website itself
kunley 1 April 2025
"Although the struct Data contains a [1024]int array, which is 4 KB (assuming int is 4 bytes on the architecture used)"

Huh, what?

I mean, who uses 32b architecture by default?

_345 1 April 2025
Anyone know of a resource like this but for Python 3?
nikolayasdf123 1 April 2025
nice article. good to see statements backed up by Benchmarks right there
ljm 31 March 2025
You're not really writing 'Go' anymore when you're optimising it, it's defeating the point of the language as a simple but powerful interface over networked services.