Improving Steam Client Stability on Linux

(ttimo.typepad.com)

Comments

fweimer 11 November 2024
We've got patches under review: https://inbox.sourceware.org/libc-alpha/cover.1722193092.git... (triggered by https://issues.redhat.com/browse/RHEL-42410, a graphics stack stability issue that wasn't as visible in RHEL 9 for some reason)

At least the first one (the getenv thread safety fix) will hopefully make it into glibc 2.41 and it should be quite safe to backport. It turns out that setenv is easier to handle because glibc already never frees environment strings. It's concurrent unsetenv that is rather tricky. Without some snapshot approach, getenv would return null pointers instead of environment variable values that are actually set. I don't want to introduce locking into getenv because getenv without setenv has been async-signal-safe for so long that it would likely break applications.

The environ handling fixes are a bit more controversial because vfork+execve make it complicated to avoid memory leaks, but these further fixes are less important to the stability of the graphics stack.

electromech 11 November 2024
Thank you! I deeply appreciate that Steam works so well on Linux these days. I don't take for granted the hard work happening behind the scenes to make that a reality for us.

vlovich123 12 November 2024
Isn’t the best practice to read all environment variables on boot and never use setenv? The only place where setenv would matter is when spawning new processes, where you should probably create a new environ cloned from the current one and update the new values in that copy. Using getenv/setenv as an IPC messaging mechanism seems to be an opportunity for lots of issues, aside from it historically not being thread-safe on Linux and having all sorts of potential memory leaks hiding (which is what the post ignores when it says that it’s thread-safe on macOS).
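
A minimal sketch of the "read everything at startup" idea from the comment above, assuming a single-threaded startup phase. The struct, its fields, and the `APP_DEBUG` variable are hypothetical names chosen for illustration; they do not come from the article or from Steam.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings struct; the fields are illustrative only. */
struct app_config {
    char home[256];
    char lang[64];
    int  debug;
};

/* Copy one variable into a fixed buffer, using `fallback` when it is unset.
 * snprintf truncates silently if the value does not fit. */
static void copy_env(char *dst, size_t dst_len,
                     const char *name, const char *fallback)
{
    const char *value = getenv(name);
    snprintf(dst, dst_len, "%s", value != NULL ? value : fallback);
}

/* Intended to be called once, early in main(), before any thread starts.
 * Afterwards the program reads `cfg` and never calls getenv again. */
void app_config_load(struct app_config *cfg)
{
    copy_env(cfg->home, sizeof cfg->home, "HOME", "/");
    copy_env(cfg->lang, sizeof cfg->lang, "LANG", "C");

    const char *debug = getenv("APP_DEBUG"); /* hypothetical variable name */
    cfg->debug = debug != NULL && strcmp(debug, "1") == 0;
}
```

Because the values are copied, later changes to the environment do not affect `cfg`. This only helps code you control: a library that calls getenv internally still reads the live environment.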

bhawks 12 November 2024
FWIW the decision to leak memory on Mac actually goes back ~26 years to FreeBSD - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=5604 which OS X inherited. I would not be surprised if Windows setenv has BSD roots due to licensing.

26 years ago people knew this API was broken but didn't fix it due to inertia of breaking buggy programs further.

There really shouldn't be a need to change your own process's envvars. For subprocesses just use the proper exec function. For anything else there should be a clear API to call rather than changing a global variable and hoping some code far away from yours rereads it and handles things correctly.
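
As a rough sketch of "just use the proper exec function": build a private copy of the environment with the extra variable and hand it to the spawn call, leaving the parent's own `environ` untouched. `env_with`, `env_free`, and `spawn_with_env` are hypothetical helpers written for this illustration; `posix_spawnp` is the standard POSIX call.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Free an environment array produced by env_with(). */
void env_free(char **env)
{
    if (env == NULL)
        return;
    for (size_t i = 0; env[i] != NULL; i++)
        free(env[i]);
    free(env);
}

static char *dup_str(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Return a new NULL-terminated array: a copy of `base` with `name` set to
 * `value` (replaced if present, appended otherwise). Returns NULL on
 * allocation failure. The process-wide environ is never modified. */
char **env_with(char *const base[], const char *name, const char *value)
{
    size_t n = 0, name_len = strlen(name);
    while (base[n] != NULL)
        n++;

    /* Every existing entry, one possible new entry, and the terminator. */
    char **out = calloc(n + 2, sizeof *out);
    size_t entry_len = name_len + strlen(value) + 2; /* '=' and '\0' */
    char *entry = malloc(entry_len);
    if (out == NULL || entry == NULL) {
        free(out);
        free(entry);
        return NULL;
    }
    snprintf(entry, entry_len, "%s=%s", name, value);

    size_t k = 0;
    int placed = 0;
    for (size_t i = 0; i < n; i++) {
        int match = strncmp(base[i], name, name_len) == 0
                    && base[i][name_len] == '=';
        if (match && placed)
            continue;                 /* drop duplicate definitions */
        if (match) {
            out[k++] = entry;         /* replace in place */
            placed = 1;
            continue;
        }
        out[k] = dup_str(base[i]);
        if (out[k] == NULL) {
            if (!placed)
                free(entry);
            env_free(out);
            return NULL;
        }
        k++;
    }
    if (!placed)
        out[k++] = entry;
    out[k] = NULL;
    return out;
}

/* Start `file` (looked up via PATH) with an explicit environment.
 * Returns 0 on success or an errno value, as posix_spawnp does. */
int spawn_with_env(const char *file, char *const argv[],
                   char *const envp[], pid_t *pid)
{
    return posix_spawnp(pid, file, NULL, NULL, argv, envp);
}
```

A caller would typically pass `environ` (declared as `extern char **environ;`) as `base`, spawn the child with the returned array, wait for it with `waitpid`, and then call `env_free`. No other thread ever sees a modified global environment.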

o11c 11 November 2024
The real question is: is there any case where a program calls `setenv` in one thread and actually wants it to take effect in other already-existing threads?

That said, GLIBC is pretty good at documenting all the dangerous functions, so it is possible to add locking/copying yourself.
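
One way to "add locking/copying yourself" might look like the sketch below. The wrapper names are hypothetical, and the approach assumes that every environment access in the program goes through these wrappers; code in third-party libraries that calls getenv or setenv directly is not protected by this lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One process-wide lock for all environment access made via the wrappers. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Copy the value of `name` into `buf` while holding the lock, so the caller
 * never keeps a pointer into the live environment.
 * Returns 1 if copied, 0 if the variable is unset, -1 if `buf` is too small. */
int locked_getenv_copy(const char *name, char *buf, size_t buf_len)
{
    int result = 0;

    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL) {
        size_t len = strlen(value);
        if (len < buf_len) {
            memcpy(buf, value, len + 1); /* include the terminating '\0' */
            result = 1;
        } else {
            result = -1;
        }
    }
    pthread_mutex_unlock(&env_lock);
    return result;
}

/* setenv under the same lock; returns what setenv returns. */
int locked_setenv(const char *name, const char *value)
{
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

Copying under the lock is what makes this usable: returning the raw getenv pointer after unlocking would reintroduce the race. The wrappers are also not async-signal-safe, since they take a mutex.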

josephcsible 12 November 2024
> One of my colleagues rightly dubbed setenv "the worst Linux API".

Is setenv really a Linux API, since it's neither defined by Linux (it's in POSIX) nor implemented by the Linux kernel (it's entirely in userspace)?

INTPenis 12 November 2024
The mere existence of Steam is astounding to someone who grew up playing nethack and chess on Linux.

But the Steam client is really strange. Sometimes it works for months, and suddenly a game won't start, or something doesn't work, and I have to do weird stuff to get it working like purging all files or reinstalling. It doesn't make sense, it's like the Steam client rots.

accrual 12 November 2024
This is really cool insight into both the Steam client and Linux programming. I understand why there may not be detailed release notes every release, but wow "Fixed some miscellaneous common crashes" is an understatement when you know about this work!

bhaney 11 November 2024
> If this can be addressed in glibc, it may involve a tradeoff on features, maybe an opt-in mechanism with a slight departure from the "impossible" POSIX spec. That's something we may pursue in the long term if we can propose something sensible.

Yes please

WhyNotHugo 12 November 2024
I'm really curious why they're using setenv(3) so much. The main usage that I can think of is setting an environment variable before calling something like exec(3). That doesn't seem to be the case here.

The article mentions that they use execvpe for spawning child processes. So what usages of setenv(3) would remain?

thrdbndndn 12 November 2024
> We removed the majority of setenv calls. It was mostly used when spawning processes

Could someone elaborate on this for a non-developer? Why would you use `setenv` (which I assume is functionally similar to `export key=value`, but correct me if I'm wrong) (extensively) for spawning processes?

jeffbee 11 November 2024
Maybe among the best decisions Java ever made was hiding setenv. You simply cannot set env vars in Java.

arendtio 13 November 2024
Always cool to rediscover people via HN. This post reminded me of the work ttimo did for the Quake 3 engine more than two decades ago. I remember it because I read so many comments written by him (like 15 years ago):

https://github.com/search?q=repo%3Aioquake%2Fioq3+ttimo&type...

apatheticonion 12 November 2024
Would love an open source Steam client

Pannoniae 11 November 2024
To raise awareness: there's been a bug with the Linux Steam client which has been persistent for a long time.

TL;DR: if you have Steam running for more than a ~day or so, you will run out of window handles so you won't be able to open any new graphical application/window until you restart Steam.

Using Steam Chat appears to make the issue worse (it happens earlier).

This has been documented under https://github.com/ValveSoftware/steam-for-linux/issues/9094 but for some reason that issue has been closed.

I personally just restart Steam every day but if someone else encounters this issue and doesn't know why their windows are not opening, this is why :)

I am using KDE/Wayland but I've observed this under X11 too.

russnes 12 November 2024
I love the Steam client on Linux these days, especially since the compatibility for non-Steam games is so great. I've been using it to play WoW Classic while I have covid.

DanielHB 12 November 2024
I can't wait to ditch windows for my tower PC.

matheusmoreira 12 November 2024
I wish they'd make it more virtualization friendly. I don't want to run untrustworthy proprietary software on my main system. Common sandboxing mechanisms are insufficient since Steam and its games need access to the entire device tree anyway. Nothing short of a real virtual machine would do it for me. It would also make compatibility painless since I could just install the Linux distribution they support.

I shopped around for computer parts with complete IOMMU support just so I could map the discrete GPU to the virtual machine and achieve near native performance... Only to discover they are exceedingly hostile to users who do this VFIO stuff.

Just yet another reminder not to "buy" games on these platforms, I guess.

nineteen999 12 November 2024
The stability is fine for me, but the rendering performance in the steam client when the mouse is in the window is abysmal.

AzzyHN 13 November 2024
All of this stuff goes way over my head. I'm on Pop!_OS and am happy to report "it just works" (tm), though it ignores scaling entirely.

Dwedit 12 November 2024
On Windows, "Environment" is stored in the Win32 Thread Information Block/Thread Environment Block (TIB/TEB), so it's thread-local rather than process-global.

Aeolun 12 November 2024
My only issue with Steam at this point is that it’ll just randomly complain it has no connection, no matter which content server I set it to connect to.

If I spam the ‘retry’ button it’ll eventually work, but it’s a massive PITA.

snvzz 11 November 2024
Considering glibc's effort, I have to wonder what the other libc do, and whether they already implement something like this.