At least the first one (the getenv thread safety fix) will hopefully make it into glibc 2.41 and it should be quite safe to backport. It turns out that setenv is easier to handle because glibc already never frees environment strings. It's concurrent unsetenv that is rather tricky. Without some snapshot approach, getenv would return null pointers instead of environment variable values that are actually set. I don't want to introduce locking into getenv because getenv without setenv has been async-signal-safe for so long that it would likely break applications.
The environ handling fixes are a bit more controversial because vfork+execve make it complicated to avoid memory leaks, but these further fixes are less important to the stability of the graphics stack.
Thank you! I deeply appreciate that Steam works so well on Linux these days. I don't take for granted the hard work happening behind the scenes to make that a reality for us.
Isn't the best practice to read all environment variables at startup and never use setenv? The only place where setenv would matter is when spawning new processes, and there you should probably create a new environ cloned from the current one and update it with the new values. Using getenv/setenv as an IPC messaging mechanism seems to be an opportunity for lots of issues, aside from it historically not being thread-safe on Linux and having all sorts of potential memory leaks hiding (which is what the post ignores when it says that it's thread-safe on macOS).
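A minimal Python sketch of that pattern, for illustration only: take a read-only snapshot of the environment once at startup, then build a modified copy for each child process instead of changing the parent's environment. The names `STARTUP_ENV`, `child_env_with`, `run_child`, and `DEMO_GREETING` are made up for this example.

```python
import os
import subprocess
import sys
from types import MappingProxyType

# Read-only snapshot taken once at startup. Later code reads from here
# instead of querying or modifying the live process environment.
STARTUP_ENV = MappingProxyType(dict(os.environ))


def child_env_with(overrides):
    """Return a new dict: the startup snapshot plus the given overrides."""
    env = dict(STARTUP_ENV)
    env.update(overrides)
    return env


def run_child(overrides):
    """Spawn a child that prints DEMO_GREETING from its own environment."""
    code = "import os; print(os.environ.get('DEMO_GREETING', ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env_with(overrides),  # explicit environment for the child
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_child({"DEMO_GREETING": "hello"}))
```

The parent's own environment is never written to here, so no other thread can observe a half-updated state; the child simply receives the copy it was given.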
FWIW the decision to leak memory on Mac actually goes back ~26 years to FreeBSD - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=5604 which OSX inherited. I would not be surprised that Windows setenv has BSD roots due to licensing.
26 years ago people knew this API was broken but didn't fix it due to inertia of breaking buggy programs further.
There really shouldn't be a need to change your own process's envvars. For subprocesses just use the proper exec function. For anything else there should be a clear API to call rather than changing a global variable and hoping some code far away from yours rereads it and handles things correctly.
The real question is: is there any case where a program calls `setenv` in one thread and actually wants it to take effect in other already-existing threads?
That said, GLIBC is pretty good at documenting all the dangerous functions, so it is possible to add locking/copying yourself.
The mere existence of Steam is astounding to someone who grew up playing nethack and chess on Linux.
But the Steam client is really strange. Sometimes it works for months, and suddenly a game won't start, or something doesn't work, and I have to do weird stuff to get it working like purging all files or reinstalling. It doesn't make sense, it's like the Steam client rots.
This is really cool insight into both the Steam client and Linux programming. I understand why there may not be detailed release notes every release, but wow "Fixed some miscellaneous common crashes" is an understatement when you know about this work!
> If this can be addressed in glibc, it may involve a tradeoff on features, maybe an opt-in mechanism with a slight departure from the "impossible" POSIX spec. That's something we may pursue in the long term if we can propose something sensible.
I'm really curious why they're using setenv(3) so much. The main usage that I can think of is setting an environment variable before calling something like exec(3). That doesn't seem to be the case here.
The article mentions that they use execvpe for spawning child processes. So what usages of setenv(3) would remain?
> We removed the majority of setenv calls. It was mostly used when spawning processes
Could someone elaborate on this for a non-developer? Why would you use `setenv` (which I assume is functionally similar to `export key=value`, but correct me if I'm wrong) extensively for spawning processes?
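For illustration, a small Python sketch of why this pattern shows up: a child process inherits a copy of its parent's environment by default, so setting a variable in the parent just before spawning is the shortest way to hand it to the child, much like `export key=value` in a shell. The catch is that the parent's own environment stays changed afterwards. `DEMO_MODE`, `spawn_and_read`, and `set_then_spawn` are made-up names for this example.

```python
import os
import subprocess
import sys

# Code run by the child: report what it sees for DEMO_MODE.
CHILD_CODE = "import os; print(os.environ.get('DEMO_MODE', 'unset'))"


def spawn_and_read():
    """Spawn a child with the default (inherited) environment."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def set_then_spawn(value):
    """The setenv-style approach: change our own environment, then spawn."""
    os.environ["DEMO_MODE"] = value  # process-wide change, visible to all threads
    return spawn_and_read()
```

After `set_then_spawn("fast")` returns, `DEMO_MODE` is still set in the parent. That lingering process-wide change is what becomes a problem once other threads read the environment at the same time.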
Always cool to rediscover people via HN. This post reminded me of the work ttimo did for the Quake 3 engine more than two decades ago. I remember it because I read so many comments written by him (like 15 years ago):
To raise awareness: there's been a bug with the Linux Steam client which has been persistent for a long time.
TL;DR: if you have Steam running for more than a ~day or so, you will run out of window handles so you won't be able to open any new graphical application/window until you restart Steam.
Using Steam Chat appears to make the issue worse (it happens earlier).
I love the Steam client on Linux these days. The compatibility for non-Steam games is especially great, and I've been using it to play WoW Classic while I have covid.
I wish they'd make it more virtualization friendly. I don't want to run untrustworthy proprietary software on my main system. Common sandboxing mechanisms are insufficient since Steam and its games need access to the entire device tree anyway. Nothing short of a real virtual machine would do it for me. Will also make compatibility painless since I can just install the Linux distribution they support.
I shopped around for computer parts with complete IOMMU support just so I could map the discrete GPU to the virtual machine and achieve near native performance... Only to discover they are exceedingly hostile to users who do this VFIO stuff.
Just yet another reminder not to "buy" games on these platforms, I guess.
On Windows, "Environment" is stored in the Win32 Thread Information Block/Thread Environment Block (TIB/TEB), so it's thread-local rather than process-global.
My only issue with Steam at this point is that it’ll just randomly complain it has no connection, no matter which content server I set it to connect to.
If I spam the ‘retry’ button it’ll eventually work, but it’s a massive PITA.
Improving Steam Client Stability on Linux
(ttimo.typepad.com) | 470 points by Venn1 | 11 November 2024 | 144 comments
Comments
Is setenv really a Linux API, since it's neither defined by Linux (it's in POSIX) nor implemented by the Linux kernel (it's entirely in userspace)?
Yes please
https://github.com/search?q=repo%3Aioquake%2Fioq3+ttimo&type...
The window-handle exhaustion described above has been documented under https://github.com/ValveSoftware/steam-for-linux/issues/9094, but for some reason that issue has been closed.
I personally just restart Steam every day but if someone else encounters this issue and doesn't know why their windows are not opening, this is why :)
I am using KDE/Wayland but I've observed this under X11 too.