Bit-String Orbits Under Rotate-XOR

For any string of n binary bits we can arrange the bits in a circle, and define the operation "rotate-xor" by Exclusively-ORing each bit with the corresponding bit of the circle rotated one place clockwise. In other words, each bit is XOR'ed with it's predecessor in the loop. Robert Sawyer called my attention to this operation, and pointed out that the "flows" induced by iterating this operation on the finite space of 2ⁿ bit-strings are quite interesting. The structures resemble hydrocarbon molecules, because they tend to cluster into closed loops, with inward-flowing branches attached to the vertices of the eventual cycles. It would be interesting to know if some combinations of atoms with the right valencies might tend to join according to these XOR structures.

For n = 3 there is just one cycle of period 3, and each vertex of this cycle is fed by one other string. Also, we have the convergent path from 7 to the fixed point 0. These two structures are shown below.

With n = 4 there are no non-trivial cycles, since every number flows to the fixed point 0, as shown below.

The set of 5-bit strings has a single cycle of length 15, and each vertex is fed by one other number. Also, the number 31 flows to the fixed point 0, as illustrated below.

With n = 6 we have two copies of the hexagonal structure shown below on the left. The other copy (not shown) has all the numbers equal to double the numbers in this figure modulo 63 (= 2⁶−1). We also have one triangular structure, as well as a single structure going to the fixed point 0. Each vertex of these structures is fed by a Y-shaped set of three numbers.

The set of 7-bit strings consists of nine septagonal structures as shown in the figure below. Seven of these contain numbers that are given by multiplying the numbers in the structure on the left below by 2^k modulo 127 (= 2⁷−1), for k = 0, 1,..., 6. The other two "articulated septagons" are self-generated, i.e., multiplying the numbers by 2^k (mod 127) gives rotated versions of themselves.

In general if n equals a power of 2 the resulting shift-xor structure is a single binary tree flowing to the fixed point 0.

These "shift-XOR" structures are formally examples of what are called "dynamic systems" (iteration patterns) produced by a particular unary operation on the elements of a set. The cycles are the "attractors" in these systems. Of course, as dynamic systems go, these shift-XOR structures are not too complex, because the underlying field (polynomials in Z₂[x] of degree n) is finite and necessarily leads to strictly periodic sequences. Nevertheless, the resulting structures are quite interesting. The wrap-around aspect of the shift-XOR operation makes the results non-trivial.

Sawyer asked for the number N(n) of distinct cycle-lengths as a function of n, noting that he had empirically determined that

Also, he wondered what is the length L(n) of the longest cycle as a function of n? He had found empirically that

I think the answer to the first question follows immediately from the answer to the second question. If L(n) = 2^t m where m is odd, then it appears that the number of distinct cycles lengths is N(n) = t + 2 if m > 1, and N(n) = 1 if m = 1. This is based on the observation that if there is a cycle of length 2k > 1 then there is also a cycle of length k, and conversely if there is a cycle of length k < L(n) then there is a cycle of length 2k. For example, the max cycle length for n = 20 is 60, and there are also cycles of length 30 and 15 (and of course 1). Thus L(20) = 2²15 and so t = 2 and N(20) = t + 2 = 4.

The second question is more difficult. It seems that if n = 2^k m where m is odd, then L(n) = 2^kL(m) if m>1, and L(n) = 1 if m = 1. Also, it appears that L(n) is always divisible by n, so we only need to consider the values of L(n)/n. The values of L(n)/n for the first few odd integers n are listed below

Notice that each L(n)/n is of the form 2^j − 1, with the exception of n = 23. However, in that case we have L(n) = 2047, which is of this form. Also, in several other cases L(n) is of this form. We also note that when n is of the form 2^j + 1 it appears L(n)/n is always n−2, as shown by the cases n = 3, 5, 9, 17. Also, when n is of the form 2^j − 1 it appears L(n)/n is always 1, as shown by the cases n = 3, 7, 15, 31. In general it seems that

for some non-negative integer j(n) and positive integer d(n). The values of these parameters for the first few odd n are shown below:

Another way of approaching this is to examine the sequence of rotate-XOR values of k for k = 0, 1, 2, ..., 2ⁿ − 1. If we let A(k) denote the rotate-XOR of k (for some fixed number of bits n) and let M = 2ⁿ − 1, then clearly we have A(k) = A(M−k) for all k. In other words, the sequence is palindromic. Also, for k from 0 to 2ⁿ⁻¹ (which gives the rest of the sequence via the palindromic property) the values of A(k) are always the sequence

and so on. (This is for a left-rotation, whereas for a right-rotation these are the values of A(2n), and the odd terms are another sequence beginning with 2ⁿ⁻¹ + 1 and somewhat similar differences.) If we let D(k) = A(k) − A(k−1) denote the first differences of this sequence, the values of D(k) for k = 1, 2,... are

and so on. These differences are given by

In general the positive differences for j greater than 1 are

and the negative differences are

So, to determine the value of A(N) for some given integer N we need only count the number of integers of these various forms less than or equal to N. For example, with N=18 we know the numbers of the form 4k+1 less than or equal to 18 are

Therefore we have

Obviously the number of integers of the form mk+n less than or equal to N is just 1 + {(N-n)/m} where {x} signifies the greatest non-negative integer not exceeding x. Combining this with the preceding expressions for D(k) gives a summation for A(n).

Notice that the first differences would be more consistent if the lowest-order magnitudes were swapped, i.e., if the difference for steps numbered 4k+1 was 1 and the difference for steps numbered 4k+3 was -3. This more consistent set of first-differences gives a sequence a(k) that is closely related to A(k), as shown below

We see that A(k) = a(k) + k if k is even, and A(k) = a(k) + (k+1) if k is odd. Also, we can express a(N) as the sum of the "consistent" first differences

where s_k equals +1 or -1 accordingly as the odd part of k is congruent to 1 or 3 (mod 4), and t_k equals the power of 2 dividing k. Thus, for even integers N we have

and for odd integers N we have

From more general considerations we can see that A(x) equals k then A(x') also equals k, where xʹ is the bit-wise complement of x. For example, considering six-bit strings, we see that both the numbers

flow to 101011 (43) under the rotate-XOR operation. Furthermore, this pair of numbers is unique, i.e., there can be no other numbers y such that A(y) = 43. To see this, we can directly determine the complementary pair (x,xʹ), if they exist, that flow to any given number as indicated below

Here we have arbitrarily taken the least significant bit of x as zero (which we can certainly do by exchanging x and xʹ), and we can infer the least significant bit of the left-rotation of x, denoted by L(x), by the reciprocity of the exclusive OR. Also, we know the next bit of L(x) is the left-shifted least bit of x, so we can bring that 0 down to L(x), and use it to infer the next bit of x, as follows

Now we can bring this new bit of x down and to the left one place in L(x), and use it to infer the next bit of x

and so on. When we have completed this process we have the result

We must then verify that the most significant bit of x equals the least significant bit of L(x). (If it didn’t, then we would have shown that 43 has no predecessor.) Of course if we have started with a 1 as the least bit of x, we would have generated the complement xʹ = 011001 (25).

Likewise for any k we can determine the unique pair of integers that flow to k, if any such exist. (The number zero flows to itself, and it's complement 2ⁿ − 1 also flows to zero.) Exactly half of all the numbers (for any fixed binary length) have "predecessors" in this sense. The others do not yield a consistent result to the above procedure when we try to wrap the last bit around.

In general, for n-bit strings, if we set n = m 2^t for some odd integer m, then every vertex of a cycle (as well as the number 0, which is periodic with period 1) is fed by a binary tree with 2^t layers. Thus, if n is odd, as in the cases n = 3, 5, or 7 illustrated above, each vertex is fed by a single "external" number with no predecessors. On the other hand, with n = 6 = 3(2¹) each vertex is fed by a binary tree with 2 levels. Also, with n = 4 = 2², each vertex (which is just zero in this case) is fed by a 4-layer binary tree.

Also, if d divides n, then the structures of n-bit numbers include copies of the d-bit structures with each number multiplied by (2ⁿ − 1)/(2^d − 1) (mod 2ⁿ − 1), plus the appropriate size of binary tree into each vertex. For example, with n = 3 we had the simple triangular cycle (3,5,6) with each vertex fed by a single number. Since 6 is divisible by 3, we know that there must be a triangular structure with 6-bit strings, with the numbers multiplied by (2⁶ − 1)/(2³ − 1) = 63/7 = 9 modulo 63. Indeed as we saw previously the 6-bit strings include a triangular cycle with the numbers (27,45,54), with each vertex fed by a two-level binary tree (because 6 is divisible by 2).

With these understandings we can infer the structures for larger values of n, based solely on the numbers of cycles of various lengths. Here is a brief table

The number of vertices is just the sum of products of cycles and lengths, and if we multiply each of these by 2^E where E = 2^t is the even part of n we get the total number of n-bit strings, which of course is 2ⁿ. For example, each vertex of an n = 12 structure is the base of a binary tree of 4 levels (because 4 is the highest power of 2 dividing 12), so each vertex implies 2⁴ = 16 numbers. Thus in total we have

We can characterize each cycle by its smallest term, and then easily generate the remaining terms, noting that the left-rotation of x is given by 2x modulo 2ⁿ − 1 (unless x = 2ⁿ − 1, in which case the left-rotation is simply 0).

Return to MathPages Main Menu