XOR

(chiark.greenend.org.uk)

Comments

an_ko 18 February 2025
My favourite cursed XOR trick that I think wasn't mentioned is XOR doubly-linked lists. https://en.m.wikipedia.org/wiki/XOR_linked_list

Instead of each node storing the next- and previous-pointers separately, store a single pointer which is the XOR of the two. Which is obviously an invalid pointer. But when iterating, XOR the previous node's pointer with the combined pointer to get the next node's pointer, and so on. You can iterate this way in both directions. Feels illegal. :)

zero_k 18 February 2025
But you forgot! It's also a 3-wise independent linear hashing function! Which means it can be used for probabilistically approximately uniform sampling and counting of solutions to boolean functions. This is super-duper useful. We use it to build counters that give probabilistic, but proven, counts. I explained the idea here in more understandable terms [1].

Basically, it halves the solution space approximately correctly each time. So you keep on adding them, until you have say, 10 solutions. Then you multiply the 10 with 2^k, where k is the number of XORs you added. That's it! So cool, no? And it's super-scalable, because it haves it each time, so you'll get to, say, 10 pretty quick!

Some research papers are here [2,3]. I work on this, the tools are here [4,5]. In the last model counting competition, it dominated all other competitors, when combined with an exact counter, slides of the competition here [6].

[1] https://www.msoos.org/2018/12/how-approximate-model-counting... [2] https://arxiv.org/abs/1306.5726 [3] https://www.cs.toronto.edu/~meel/Papers/cav20-sgm.pdf [4] https://github.com/meelgroup/approxmc [5] https://github.com/meelgroup/unigen [6] https://mccompetition.org/assets/files/2024/MC2024_awards.pd...

jjice 18 February 2025
One of my favorite XOR stories is from Bryan Cantrill (of Oxide, Joyent, and Sun) in this presentation [0] and this video [1].

To avoid clicking a link: When he was at Sun, he was chatting with a coworker (Roger Faulkner) about the lack of logical XOR in C. Faulkner said it was because you couldn't short circuit it, and Brian thought that was wild. Then Roger emailed Dennis Ritchie to ask and he confirmed it was what Faulkner had said.

That stories gets me every time! First of all, it's very funny and well delivered by Cantrill, but it's also just so incredible that they could ask the man himself.

[0] https://speakerdeck.com/bcantrill/oral-tradition-in-software...

[1] https://www.youtube.com/watch?v=4PaWFYm0kEw

ipython 18 February 2025
Ha. TIL that if you XOR the car emoji with 0x20 (ie. you make it "lower case"), you get the "no pedestrian" emoji. It probably won't encode below but this was copied and pasted from tfa. It seems too coincidental to be an accident - anyone who has insight to know whether this was done on purpose?

If you take this too far, you might get strange ideas, like the lower-case version of the car emoji being a ‘no pedestrians’ sign:

    >>> chr(ord('') ^ 32)
    ''
adrian_b 19 February 2025
I strongly dislike the ubiquitous use of the name XOR a.k.a. "exclusive or" for this logical function, because almost always what is meant is "sum modulo 2" a.k.a. "parity", and not "exclusive or".

"Sum modulo 2" a.k.a. "parity" and "exclusive or" are 2 distinct logical functions, which happen to coincide only for the case of 2 input operands (because there exists only a single odd number less than or equal to 2).

For 3 or more input operands, what most people call XOR is actually parity, i.e. the logical function whose value is "1" when an odd number of input operands are "1".

For 3 or more input operands, "exclusive or" is the logical function whose value is "1" only if there exists only a single input operand whose value is "1", while all the other input operands are "0".

For computer hardware, parity is a much more important logical function than "exclusive or" (mainly because addition modulo 2 is used as a building block for implementing addition modulo greater numbers).

On the other hand, in mathematics "exclusive or" is a much more important logical function than parity.

For example, the quantifiers that express that some predicate is true for some elements of a set, for all elements of a set, or for a unique element of a set (like saying that an equation has solutions, or it has no solutions, or it has a unique solution), are based on the logical functions "or", "and" and "exclusive or".

In natural language, "or" always means either "inclusive or" or "exclusive or". It never means parity, i.e. what many programmers call "exclusive or" a.k.a. XOR.

While in programming the "exclusive or" logical function is a function whose computation is seldom needed, it is very frequently used for describing the behavior of a program, e.g. when saying that in a select/case/switch compound statement of some programming language either the 1st statement or the 2nd statement or the 3rd statement or ..., is executed, or when describing the types that the current value of a union/sum typed variable can have.

grahamlee 19 February 2025
My handy real-world analogy for XOR is the light over a staircase in a home. There's a switch at the bottom, and another switch at the top, and both control the same light. Initially, they're both in the off position. You set the bottom switch, and the light turns on. You climb the stairs, set the top switch, and the light turns off although both switches are now in the "on" position. As long as one switch is in the "on" position and one switch in the "off" position, the light is on; otherwise, it's off.
sigbottle 18 February 2025
There's also the [kademlia distributed hash table](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia...). The high level idea is that each node gets a random bit in the range [0, 2^m), and we define the distance to be XOR. We want to find a distributed algorithm to quickly get some information from X to Y, without knowing the whole network.

You can just look at the math and prove that it works, but my pet favorite visual interpretation of the algorithm is this:

Suppose you have a start node X, and it wants to find node k. Define the "X-distance tree" to be a binary tree, with leaf indices 0, 1, 2... But you put labels of X^leaf_index in each node, to represent the distance from a certain label to X. E.g. dist(x, x) = x^x = 0, so the label "X" (the original node) is put at the leftmost leaf (0).

The interval [2^i, 2^(i+1)) is some subtree of the X-distance tree. Let's say you know k's distance belongs in that interval, so you query some node Y in there as an approximate neighbor.

No matter what node Y you pick, the resulting prefix in the Y-distance tree will always be some permutation of the [2^i ,2^(i+1)) subtree we picked from the X-distance tree!

More specifically, my claim is that at the sets labels_of_leaves_of_X( [2^i, 2^(i+1)) ) == labels_of_leaves_of_Y( [0, 2^i) ). (recall that we index by distance, but the labels might be different).

There are far more rigorous comparisons to other DHT's like Chord, both mathematically and empircally. But to me, this visual intuition tells me what the "symmetry" of Kademlia means - sort of everybody's in their own local neighborhood, their own subtrees.

Whereas Chord, even if you implement it bi-directionally (which costs 2x memory! and seems even more treacherous to implement...), can't get this level of 'isolation' in some sense. The neighborhood sliding window of size S is always shifting: per bit, there's always 2^m distinct neighborhoods. Yea, even though most neighborhoods "look the same", it's just not pretty.

Kademlia has 1 + 2 + 4 ... + 2^m-1 neighborhoods, and it's all neat and tidy.

OuterVale 19 February 2025
If anyone is wondering, this is the same Simon Tatham as in Simon Tatham's Portable Puzzle Collection. If you're unfamiliar and ever find yourself offline and bored, they're worth checking out.

I know I burnt many hours in high school playing them.

https://www.chiark.greenend.org.uk/~sgtatham/puzzles/

acjohnson55 19 February 2025
When I learned Z80 assembly to program my TI-83, every byte of machine code counted, because the whole calculator only had 24k of storage. To initialize the main accumulator register, `a`, to zero, you would do `XOR a` instead of `LD a, 0`. For the math instructions, `a` is an automatic operand, so `XOR a` XORs `a` with itself, and the entire instruction is only 1 byte. To explicitly load 0 into `a` requires the literal 0 to be represented in the opcode, so `LD a, 0` is a 2 byte instruction.
inasio 18 February 2025
A lot of custom optimization solvers (e.g. Ising Machines) these days benchmark on XOR problems. It's a bit useless in practice because solving a bunch of XOR clauses can be done in polynomial time with Gaussian elimination, yet the solvers all show exponential scaling, but it works as a good way to gauge performance.

A second interesting implementation concerns the McEliece system. It's a public key cryptocipher from the 70s that is enjoying a renaissance these days due to being quantum-resistant. The decoding attack involves finding a solution to a set of XOR equations (again, polynomial), but where the Hamming distance is equal to some number (given, part of the public key).

mdnahas 20 February 2025
I emailed the author to correct the section on XOR swap:

Once upon a time, when I was an aggressive young programmer, I tried to show off by replacing this swap:

void swap(int a, int b) { int c = a; a = b; b = c; }

with xor swap:

void swap(int a, int b) { a ^= b; b ^= a; a ^= b; }

and the program broke!

They’re not identical. The program was calling swap on array elements:

    swap(&(array[i]), &(array[j]))
and occasionally i=j. That is, swap was being called with a==b. And that’s the difference: when you try to swap a location with itself. In that case, XOR as you know zeroed out the value in a. But that was also b. The other xors left the zero there. So, the array element was set to zero.

The original swap called on a single location did not modify the value. So, my XOR-swap show off introduced a huge bug!

antirez 18 February 2025
Among binary quantized vectors, similarity is just popcount(XOR of the two vectors) / num_bits. Can be linearly translated to cosine similarity / distance range just multiplying and centering.
niccl 18 February 2025
XOR is used in one of the LEET fizzbuzz answers: https://tech.marksblogg.com/fastest-fizz-buzz.html
sfink 19 February 2025
One tiny missing bit: drawing with XOR wasn't just to be able to easily recover the previous state. If you're on a black and white display and are dragging around a rectangular outline, then dragging it over a filled portion of the screen makes it invisible. Even if it's only partly over a filled image, you can't see the whole thing. With XOR, the filled portions will invert, so you can always see the extent of the whole rectangle.

Admittedly, if your screen is showing static (random noise), then you're still screwed.

Findecanor 19 February 2025
The article mentions the Amiga... Despite there supposedly having been prior art, a company had got a patent [1] on drawing using XOR, which was then used to successfully sue Commodore.

The costs caused by this lawsuit have been claimed as having contributed to Commodore's downfall.

1. https://patents.google.com/patent/US4197590A

nicknash 19 February 2025
How about this:

What is the most efficient algorithm to generate an N x N array of the integers 0,1,2, … such that each entry is the smallest such integer not appearing either above in the same column, or to the left in the same row?

Posting it in this thread is a bit of a spoiler :)

https://nicknash.me/2012/10/26/happy-halloween/

wvh 19 February 2025
Even just hearing the monicker "xor" brings me back to the nineties, analysing viruses that sometimes "encrypted" – or at least obfuscated – themselves using a `xor` operation, or zeroing out a register in assembly code. A lot of nostalgic feelings and personal history for such a small logical operator...
aap_ 19 February 2025
XOR is also the basic element-wise multiplication in clifford and cayley-dickson algebras. the way that they differ and which gives them their individual character is all in the sign calculation.
iamleppert 19 February 2025
I always remember it as the hetero operator.
daitangio 19 February 2025
On 8bit computers (like Z80, 8086 etc)

  XOR REG,REG 
is often faster than

  REG <-0 
operation.
dominicrose 19 February 2025
In everyday life when we use the word 'or' it often means 'xor', right?
NortySpock 19 February 2025
As a data engineer, who is regularly fighting

- "these two databases have different SQL dialects"

- "did we miss a few rows due to poor transaction-isolation when trying to query recently changed rows on the upstream database"

- "is there some checksum of a region of cells that accepts any arrangement of rows and columns that doesn't require me to think about ordering?"

- "is there a subtle data format conversion problem silently mangling my data?"

...I've toying with trying to find a way to serialize everything consistently into something that can be XOR'd, then compare the output of XOR for two tables in two different databases that should be identical, without having to do some giant order-by comparison. Ideally, if necessary, this could be done by eye with two query windows pointing at two different databases...

Basically, Datafold's datadiff, but in a way that could plausibly be home-rolled for on-premise applications (especially quick spot-checks), and not be a total maintenance nightmare...

https://github.com/datafold/data-diff

Don't have anything working yet, but it just seems like one could at least xor a bunch of integers and get something useful... Somehow.

notdian 19 February 2025
"1 takes if not taken" and "0 just copies"
greenavocado 18 February 2025
Easy way to remember it: Never both together (NBT)
localghost3000 18 February 2025
Every now and again one of these come up in a PR and I hard reject it every time. Its clever. Its neat. Its elegant. And it takes a giant ass white paper to explain it to people. Please please please don't use stuff like this when other people have to read your code.
n0id34 19 February 2025
If anyone's inner CSS eye is twitching at the centered figures and captions but not centered text, open DevTools (F12 most of the time) and fix it with the following CSS (click the + on the right in Chrome/Brave where the CSS is to add new rules)

  div.flexcontainer {
    justify-content: unset;
  }

  figcaption {
    text-align: left;
  }
The figures and captions should be in line with the text now.