When good pseudorandom numbers go bad

(blog.djnavarro.net)

Comments

somat 9 hours ago
This is a perpetual problem in computer science: people want a hash function, decide that the random function very nearly does what they want, use it as a hash function, and are then dumbfounded to find there is no hard specification for how it works internally. The random function makes no guarantee of identical results across versions, systems, or time. Hell, I would even be harsh enough to say that any reproducibility in random is an accident of implementation. Random should always be considered the non-deterministic function; hash is the deterministic one.
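To make the hash-versus-random distinction concrete, here is a minimal Python sketch that derives a repeatable value in [0, 1) from a key using SHA-256 instead of a seeded generator. The function name `hash_to_unit_interval` is made up for illustration, and the choice of SHA-256 is an assumption; the point is only that the digest is fixed by a published specification rather than by one library's internals.

```python
import hashlib
import struct


def hash_to_unit_interval(key: str) -> float:
    """Map a string key to a float in [0, 1) using SHA-256 (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as a big-endian unsigned integer.
    (n,) = struct.unpack(">Q", digest[:8])
    # Keep the top 53 bits so the quotient is exactly representable as a double.
    return (n >> 11) / float(1 << 53)


# The same key always maps to the same value; different keys give different values.
sample_a = hash_to_unit_interval("participant-001")
sample_b = hash_to_unit_interval("participant-002")
```

Unlike a seeded generator, this does not depend on how many values were drawn before it or on which generator algorithm the runtime ships.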
AlotOfReading 19 hours ago
Most people don't really care about numerical stability or correctness. What they usually want is reproducibility, but they go down a rabbit hole with those other topics as a way to get it, at least in part because everyone thinks reproducibility is too slow.

It was too slow 20 years ago, but that's not the case today. The vast majority of hardware today implements IEEE 754 reproducibly if you're willing to stick to a few basic principles:

1. same inputs

2. same operations, in the same order

3. no "special functions", denormals, or NaNs.

If you accept these restrictions (and realistically you weren't handling NaNs or denormals properly anyway), you can get practical reproducibility on modern hardware for minimal (or no) performance cost if your toolchain cooperates. Sadly, toolchains don't prioritize this because it's easy to get wrong across the scope of a modern language and users don't know that it's possible.
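Principle 2 matters because floating-point addition is not associative. A small Python sketch, assuming ordinary IEEE 754 doubles (the helper name `ordered_sum` is made up for illustration):

```python
def ordered_sum(xs):
    """Sum strictly left to right so the operation order is fixed."""
    total = 0.0
    for x in xs:
        total += x
    return total


values = [0.1, 0.2, 0.3]

# Same inputs, different grouping of the same additions.
left_first = (values[0] + values[1]) + values[2]
right_first = values[0] + (values[1] + values[2])

# The two groupings round differently, so the results are not equal.
groupings_differ = left_first != right_first

# A fixed left-to-right loop reproduces the first grouping exactly.
matches_left = ordered_sum(values) == left_first
```

This is why a compiler or library that reorders a reduction (for vectorization or threading, say) can change the last bits of a result even though every individual operation is correctly rounded.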

mjcohen 18 hours ago
I found this an enjoyable read. I also have Wilkinson, both the text and the Algol book, which I used many years ago to write a Fortran eigenvalue/eigenvector routine. It worked very nicely. It was done in VAX Fortran and showed me that turning subscript checking on added 30% to the run time.
coolcase 15 hours ago
I don't grok this, but if you had to describe it in a nutshell, is this because of a race condition? Differences in HW? Do floating point ops have some randomness built in?
jam0wal 9 hours ago
So, you want to use the random function but want a constant output. It's simpler to just use a constant array than to impose your own corner-case interpretation on random.