HN Offline: That XOR Trick (2020)

That XOR Trick (2020)

hundredwatt | 320 points | 16day ago | florian.github.io

danbruc|13day ago

For calculating the XOR of 1 to n there is a closed form solution, so no need to XOR them together in a loop.

  (n & ((n & 1) - 1)) + ((n ^ (n >> 1)) & 1)

Or a much more readable version

  [ n, 1, n + 1, 0 ][n % 4]

which makes it clear that this function cycles through a pattern of length four.

Why this works can be seen if we start with some n that is divisible by four, i.e. it has the two least significant bits clear, and then keep XORing it with its successors. We start with xxxxxx00 which is our n. Then we XOR it with n + 1 which is xxxxxx01 and that clears all the x's and leaves us with 00000001. Now we XOR it with n + 2 which is xxxxxx10 and that yields xxxxxx11 which is n + 3. The cycle finishes when we now XOR it it with n + 3 which yields 00000000. So we get n, 1, n + 3, 0 and then the cycle repeats as we are back at zero and at n + 4 which is again divisible by four.

sdenton4|13day ago

Nice!

My offhand solution not using xor is to subtract from the sum of 1 to n, which has a closed form solution. The closed form roughly halves the execution time, as we only have to iterate over the range once.

Good to know there's a similar speedup available on the xor path...

tomtomtom777|13day ago

Fascinating. It can see it work but I still can't really wrap my head around where the magic cycle length of 4 comes from.

danbruc|13day ago

Combining two consecutive integers starting with an even one yields one.

  xxxxxxx0 ^ xxxxxxx1 = 00000001

If we start at a number divisible by four and do this twice, we get one twice.

  xxxxxx00 ^ xxxxxx01 = 00000001
  xxxxxx10 ^ xxxxxx11 = 00000001

And combining the two of course yields zero and we are right back at the start.

betasilly|13day ago

Another interesting fact is that each time you make the xor of four consecutive numbers, beginning with an even number, the result is zero. Example in J.

  xor =: (16 + 2b0110) b.
  f =: 3 : 'xor/ y + i. 4'
  f"0 ] 2 * 1 + i. 100

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Summing a hundred millions: +/ f"0 ] 2 * i 100000000 gives zero (it takes a few seconds). So it seems the stated property holds for every even n.

danbruc|13day ago

Yes, because XORing two consecutive integers only differing in the least significant bit yields one.

  xxxxxxx0 ^ xxxxxxx1 = 00000001

Doing this twice with four consecutive numbers then also cancels the remaining one. That also means that you do not have to use consecutive numbers, you can use two arbitrary pairs

  2m ^ 2m + 1 ^ 2n ^ 2n + 1

and for example

  16 ^ 17 ^ 42 ^ 43

should be zero.

NickPollard|13day ago

There are essentially two bits of information in the 'state' of this iterated algorithm: a) Are all the non-lowest bits zero, or are they the value of the latest N b) the value of the lowest bit

So the cycle of (N, 1, N+3, 0) corresponds to (A) and (B) being: (0,0), (0,1), (1,1), (1, 0) - i.e. the 4 possible combinations of these states.

HappyPanacea|13day ago

If we generalize the problem to base k (they are k-1 duplicate of each number except the missing number, find missing one using base k-wise addition) then we can see the cycle is the smallest number such the base k-wise addition from 1 to the number is zero and it is power of k will form a cycle. I'm not sure if all such numbers are power of k if they exists or if there is an upper bound on them. For example in base 4 there appears to be no such cycle.

HappyPanacea|13day ago

I made an arithmetical mistake in base 4, so I was wrong. I also wrote they are instead of there are.

I think the following is true: For even k the cycle is k^2 long and for odd k is k long. Why? because units' place of generalized xor from 1 to k-1 is (k^2-k)/2 and therefore zero mod k if k is odd, if k is even then if we repeat it twice we get zero. For the second digit, k times the same digit will always give zero. Thus for odd k we have a zero when n is divisible by k and for even k we have a zero when n is divisible by 2k and the smallest power of k divisible by 2k is k^2 so it must be the cycle length.

Thorrez|13day ago

In your array-based equation, you say n+1, but in your explanation you say n+3. Is that a mistake?

danbruc|13day ago

No, that is correct, those two n represent slightly different things. n + 3 is the value after n XOR n + 1 XOR n + 2, so the n in the array index expression is n + 2 from the explaination and n + 3 results from (n + 2) + 1. I thought about how I could make this less confusing but it just became more confusing in my mind, so just used n in both cases.

Thorrez|13day ago

I'm now seeing that they're different. However, this sounds a bit off to me:

>n + 3 is the value after n XOR n + 1 XOR n + 2, so the n in the array index expression is n + 2 from the explaination and n + 3 results from (n + 2) + 1.

The reason I think it's off is that array index expression the start of the sum is 1, but in the explanation the start of the sum is n. So I don't think it's as simple as the ending being different by 2.

danbruc|13day ago

In the array expression the array values and the index depend on n and both vary, in the explanation the n is fixed. Let us do an example starting at say 8.

  n =  8   [  8, 1,  9, 0 ][ 8 % 4] =  8 = n
  n =  9   [  9, 1, 10, 0 ][ 9 % 4] =  1 
  n = 10   [ 10, 1, 11, 0 ][10 % 4] = 11 = n + 1
  n = 11   [ 11, 1, 12, 0 ][11 % 4] =  0

  n =  8   n                         = 8               =  8 = n
           n ^ n + 1                 = 8 ^ 9           =  1
           n ^ n + 1 ^ n + 2         = 8 ^ 9 ^ 10      = 11 = n + 3
           n ^ n + 1 ^ n + 2 ^ n + 3 = 8 ^ 9 ^ 10 ^ 11 =  0

So in the explanation we get 11 = n + 3 still with reference to the starting value n = 8, in the array expression on the other hand we have moved on to n = 10 when we pull 11 = n + 1 out of the array.

danbruc|12day ago

Amendment to the parent comment. Note that the arrays contain values like 9 and 10 that are not the XOR of 1 to n for any n but they will also never be accessed because when they appear in the array, then the index used will point to a different element.

Also because we end up back at zero when ever n % 4 == 3, there is some flexibility about the starting point. I wrote 1 to n because that is what the article used, but it would be mathematically cleaner to start at zero which actually changes nothing because XORing with zero does nothing. And we do not have to start at zero at all, we can start at any number divisible by four and less than n because the running XOR sum will become zero just before each multiple of four. So XORing together 0...n, 1...n, 4...n, 8...n or generally 4k...n will give the same result. The explanation part looked at one cycle starting at 4k and ending at 4k + 3 with the running XOR sum being back at zero. Maybe this would have been the less confusing explanation, just using 4k instead of using n again with the constraint that it is divisible by four.

Thorrez|12day ago

It makes sense to me now. Thanks for taking the time to explain it!

skullt|13day ago

There's a bit of a trick in that solution: n is assumed to have the lower two bits clear so for an arbitrary n the array would really be:

[(n & ~3), 1, (n & ~3) + 3, 0][n % 4]

where the (n & ~3) makes sure those lower 2 bits are cleared. But note that we only ever can look at the first element when n % 4 == 0. In that case, (n & ~3) == n already. And further, we only ever can look at the third element when n % 4 == 2. In that case (n & ~3) == n - 2, so (n & ~3) + 3 == n + 1. Hence the array can be simplified to the one given in the other comment.

teiferer|13day ago

[dead]

antirez|13day ago

About one month ago I applied XOR in a similar (but a bit more complicated way) to Redis Vector Sets implementation, in the context of sanity check of loading a vset value from the RDB file. I believe the way it works is quite interesting and kinda extends the applicability of the trick in the post.

The problem is that in vector sets, the HNSW graph has the invariant that each node has bidirectional links to a set of N nodes. If A links to B, then B links to A. This is unlike most other HNSW implementations. In mine, it is required that links are reciprocal, otherwise you get a crash.

Now, combine this with another fact: for speed concerns, Redis vector sets are not serialized as

    element -> vector

And then reloaded and added back to the HNSW. This would be slow. Instead, what I do, is to serialize the graph itself. Each node with its unique ID and all the links. But when I load the graph back, I must be sure it is "sane" and will not crash my systems. And reciprocal links are one of the things to check. Checking that all the links are reciprocal could be done with an hash table (as in the post problem), but that would be slower and memory consuming, so how do we use XOR instead? Each time I see a link A -> B, I normalize it swapping A and B in case A>B. So if links are reciprocal I'll see A->B A->B two times, if I use a register to accumulate the two IDs and XOR them, at the end, if the register is NOT null I got issues: some link may not be reciprocal.

However, in this specific case, there is a problem: collisions. The register may be 0 even if there are non reciprocal links in case they are fancy, that is, the non-reciprocal links are a few and they happen to XOR to 0. So, to fix this part, I use a strong (and large) hash function that will make the collision extremely unlikely.

It is nice now to see this post, since I was not aware of this algorithm when I used it a few weeks ago. Sure, at this point I'm old enough that never pretend I invented something, so I was sure this was already used in the past, but well, in case it was not used for reciprocal links testing, this is a new interview questions you may want to use for advanced candidates.

hundredwatt|13day ago

A neat trick to make the accumulator both collision-resistant and self-diagnosing.

  For every normalized link id x:
      y = (x << k) | h(x)   # append a k-bit hash to the id
      acc ^= y

If acc is zero, all links are reciprocal (same guarantee as before).

If acc is non-zero, split it back into (x', h'):

* Re-compute h(x').

* If it equals h', exactly one link is unpaired and x' tells you which one (or an astronomically unlikely collision). Otherwise there are >= 2 problems.

This has collision-resistance like the parent comment and adds the ability to pinpoint a single offending link without a second pass or a hash table.