Sudoku Symmetries

Sudoku Symmetries

The popular puzzle known as Sudoku is based (in its standard form) on a 3x3 array of square blocks, each divided into a 3x3 array of square cells, as shown below.

In a completed Sudoku puzzle, each cell contains one of nine symbols, usually the digits 1 to 9 (although any nine symbols could be used, because the numerical properties of the digits are of no significance for the puzzle). The symbols must be distributed in such a way that each row, each column, and each block contains all nine symbols. To create an example of a completed Sudoku grid, we could start by simply listing the digits 1 to 9 by rows in the upper left hand block, as follows.

Next, by “rotating” the rows, we can fill the other two blocks of the top row of blocks, and each entire row of cells will then contain each of the nine symbols, as shown below.

Similarly, we can replicate this row of blocks, rotating the columns in each block, to produce two more rows of blocks, and each entire column of cells will then contain each of the nine symbols, as will each entire row and each block, as depicted below.

Now, given one such grid, satisfying all the requirements of a completed Sudoku solution, we can obviously create many others by performing certain transformations. For example, we can apply any of the 9! = 362880 possible permutations of the nine symbols, and the resulting grid will still satisfy all the Sudoku conditions. Likewise we can apply any of the 3! = 6 possible permutations of the block rows, and any of the 3! = 6 possible permutations of the block columns. In addition, we can apply any permutation of the three entire rows in any block row, which represents (3!)³ = 216 possible distinct transformations. Similarly we can apply any permutation of the three entire columns in any block column, which represents another (3!)³ = 216 possible distinct transformations. We can also rotate the entire grid.

The transformations described above are usually regarded as the trivial symmetries of a Sudoku grid, and hence any two grids that can be transformed into each other by some combination of these operations are considered to be (in a sense) “the same” grid. The number of such grids is so large that we might wonder if all Sudoku grids are equivalent (in this sense) to the one shown above, but this is clearly not the case, as can be seen by considering the grid shown below.

This happens to be the solution to a Sudoku puzzle printed in USA Today on 27 Aug 2009. In one block, the symbol 7 is directly to the right of 6, whereas in another block, 7 is directly above 6. In the previous grid that we developed, any two symbols that are in the same row (or the same column) of one block are in the same row (or the same column) of every block, and this obviously remains true after any of the simple symmetry transformations described above. Therefore, the USA Today grid shown above cannot be produced by applying any of those simple transformations to our baseline grid.

However, there is an additional symmetry operation, beyond those described above, that does enable us to transform our original Sudoku grid in such a way that two symbols in the same row of one block are in the same column of another block. Any column of one block can be swapped with a column of another block in the same row of blocks if the two columns contain the same three symbols. Likewise any row of one block can be swapped with a row of another block in the same column of blocks if the two rows contain the same three symbols. To illustrate these operations, consider the two grids shown below.

The left hand grid is our original baseline, and the right hand grid was formed by swapping the two yellow regions, and swapping the two cyan regions. Notice that in the left hand grid the symbols 8 and 9 are always in the same row (as each other) in each block, whereas this is not the case in the right hand grid. Furthermore, with another operation we can place 8 and 9 in the same column of one of the blocks, as indicated below.

This grid is the same as the one on the right above, except that we have swapped the two yellow regions. Notice that the symbols 8 and 9 are now in the same column of the left-most block of the center block row, even though these two symbols are still in the same row of several other blocks. This demonstrates that the additional symmetry transformations enable us to produce a much larger class of Sudoku grids from our original baseline grid. These additional transformations differ from the basic symmetries described previously in that the basic symmetries are purely dependent on positions, regardless of content, whereas the additional transformations depend on the content of the cells. It would be interesting to know if every possible Sudoku grid is related to every other by some sequence of these transformations. This would amount to a different kind of puzzle, similar to a Rubik’s cube, in which we begin with a given (complete) Sudoku grid, and the objective is to apply the allowed transformations to bring the grid into some standard arrangement.

As mentioned previously, the numerical values of the symbols are of no significance, so we could just as well use any nine distinct symbols. However, assigning numerical values to the symbols does enable us to express the Sudoku conditions algebraically. To illustrate, consider the simplified 4 x 4 grid shown below.

Here we require each row, column, and block to contain each of the numbers 1, 2, 3, and 4. Thus (for example), the variables in the first row must satisfy the conditions

We have twelve sets of equations of this form, one for each row, column, and block. Each set can be solved for each of its four variables, leading to the fact that each variable is a root of the polynomial

Now, suppose we are given the numerical values of A, H, K, and N, as shown below.

Needless to say, we can trivially infer the values of the other variables directly by the usual method of solving Sudoku puzzles, i.e., we note that F cannot be 1, 2, or 3, so it must be 4. Likewise C, I, and P are immediately identifiable as 3, 2, and 1 respectively. Inserting these values, the contents of the remaining are then uniquely determined.

However, we can also approach the problem algebraically. The original twelve sets of governing equations are now reduced by inserting the given numerical values and simplifying. Now, notice that initially there is some ambiguity as to the contents of cells B and G, but we can determine the contents of those cells – without first explicitly determining F, C, P, or I – by considering the algebraic conditions. The variables in the upper left block must satisfy B+E+F = 9 and BEF = 24, and the variables in the second row must satisfy E+F+G = 8 and EFG = 12. Subtracting the two sums and dividing the two products, this implies B – G = 1 and B/G = 2, from which it follows that B = 2 and G = 1. Similarly we can determine the contents of cells M and L by noting that the variables in the lower left block must satisfy I+J+M = 7 and IJM = 8, and the variables in the third row must satisfy I+J+L = 6 and IJL = 6. Subtracting the two sums and dividing the two products, this implies M – L = 1 and M/L = 4/3, from which it follows that M = 4 and L = 3.

Substituting the values of B,G,M, and L into the system equations, we get a further reduction, and we are left with equations such as

The equations on the left imply that J(5-J) = 4, and the equations on the right imply that J(3-J) = 2. Dividing the first of these by the second gives

from which it follows that J = 1, and therefore F = 4 and I = 2. We also have the system equation E+F = 7, so we have E = 3.

Similarly we have the system equations

The equations on the left imply that C(7-C) = 12, and the equations on the right imply that C(5-C) = 6. Dividing the first of these by the second gives

from which it follows that C = 3, and therefore D = 4 and O = 2. We also have the system equation O+P = 3, so we have P = 1. This completes the solution.

Clearly, when applied to such a trivial example, the algebraic approach involves more labor than the direct solution by the usual process of logical elimination. Some of this labor could be reduced by a more judicious choice of values. For example, instead of using the numbers 1,2,3,4, we could have used 0,1,2,3, and the zero value would result in simplification of the system equations, since every term involving that symbol would vanish. Another alternative would be to choose a set of values such as (+1, i, -1, -i), in which the content of each cell would be a root of x⁴ – 1 = 0, and the individual governing equations would be

Similarly for a full 9x9 Sudoku puzzle we could use the nine roots of x⁹ – 1 = 0. Admittedly, even with the optimum choice of variables, the algebraic approach still involves considerable effort, but this approach does have one potential advantage when dealing with particularly difficult Sudoku puzzles. Many puzzles can be solved directly by the local process of elimination, but some extremely difficult puzzles cannot be solved in this “local” way. After applying all the eliminations, we may find that there is still ambiguity in the contents of some of the cells. At this point, some solvers resort to “trial and error”, i.e., they make a guess as to the contents of an unresolved cell, and then work out the implications for the remaining cells. If the guess is wrong, an invalid grid will be produced, and the solver must return to the earlier state and make a different guess. In such situations, the algebraic approach could conceivably provide a linear system of equations in several variables that could be solved simultaneously to yield the unique solution directly, assuming there is a unique solution. We saw a simple case of this in the example above, where we inferred that B and G satisfy the equations B – G = 1 and B – 2G = 0, which we solved simultaneously to give the values of B and G. In less trivial puzzles, we might infer a larger linear system of n variables in n unknowns, which could be solved directly. Alternatively, if no such system of linear equations exists for a given Sudoku puzzle, this could be used to prove that there is not a unique solution.

Another interesting way of looking at Sudoku puzzles is by splitting them into bit planes. This (again) is best illustrated with a 2 x 2 format, and we can simplify by letting the 0, 1, 2, and 3 denote the four symbols to be placed on the cells. In binary, these symbols are 00, 01, 10, and 11. We can then conceptually split the puzzle into two separate puzzles, one consisting of the least significant binary bits, and one consisting of the most significant bits. For example, consider the puzzle below, with the contents of four cells given.

This can be split into the least and most significant binary bit planes as shown below.

On this individual planes, each row, column, and block must contain two “0” symbols and two “1” symbols. The actual contents of the known cells are ambiguous, but the number of symbols has also been reduced. One admittedly naïve strategy for seeking a solution is to strive in each cell to equalize the number of 0’s and 1’s in the interacting cells (i.e., the cells in the same row, column, or block). If, for example, the original puzzle contains an empty cell that interacts with cells containing exactly one instance of three of the four symbols, then this strategy will clearly lead to the correct choice. For example, cell F in the original puzzle interacts with 0, 1, and 2, so it must be 3, whereas in the LSB plane it interacts with two 0’s and a 1, so our strategy leads us to place 1 in that cell, which is consistent with a value of 3 in the original puzzle. Interestingly, for this particular puzzle, the strategy, when it leads to a choice, gives the correct choice for each individual cell. For example, cell B in the LSB plane interacts with two 0’s, so our strategy suggests that we should place 1 in this cell, which is indeed the correct choice. The same cell in the MSB plane interacts with one 0 and one 1, so it doesn’t lead to a choice. After we have filled in all the cells that give a choice using this strategy, we can go back and re-consider the remaining cells, which now are likely to lead to a choice.

We have the option on each pass to consider each cell individually, in the context of only the original given information, or else to consider the cells in some sequence, in the context of all the choices that have been made so far. In our simple example, it suffices to consider (on the first pass) each cell in the context of the original given information. This leads to the choices shown below.

We can now fill the remaining cells to give a complete bit-plane that satisfies the requirements, and the same process can be applied to the MSB plane, giving the results shown below.

Combining these two bit-planes, we happen to get a consistent solution to the original puzzle, although our procedure doesn’t seem to guarantee this. We could have two bit-planes, each of which satisfies the requirements individually (i.e., having two 0’s and two 1’s in each row, column, and block), and yet their combination need not satisfy the requirements. For example, combining two LSB planes together would not give a satisfactory overall solution. On the other hand, it’s conceivable that 2 x 2 binary puzzles are so simple that they unavoidably give consistent bit-planes.

Incidentally, we can give an explicit expression the contents of a standard 2 x 2 grid satisfying the Sudoku conditions algebraically by means of simple modulo arithmetic. Consider the grid shown in the figure below.

We specify each cell by its coordinates (x,y), and let x_j denote the jth binary digit of x. (For example, x₀ is the least significant bit of x.) With this notation, the two binary digits of the contents of the cell with coordinates x,y are

These expressions just represent the process described previously rotating the rows or columns of a single prototype block to generate the other blocks. Of course, we could also take the number c as the index of an array containing an arbitrary permutation of the four possible cell contents. Notice that, in addition to the three Sudoku partitions (rows, columns, and blocks), there is a fourth partition, whose parts are the corners of any 3x3 square.

We can also apply modulo arithmetic to ordinary 3 x 3 Sudoku puzzles, reducing the cell contents modulo some suitable number, but this doesn’t lead to a fully symmetrical set of symbols and reduced bit-planes. To construct a puzzle de-composable into three bit-planes, we could consider an 8 x 8 grid, bit it isn’t clear how to define “blocks” of eight cells on such a grid. One possibility would be a checker board arrangement, in which each of the four quadrants represents two interlaced “blocks” as shown below.

Here the blue cells represent one “block”, and the yellow cells represent another. This pattern is repeated in all four quadrants, giving a total of eight “blocks”. We require each row, column, and “block” to contain each of the eight symbols. Taking the symbols 0 through 7, we could then decompose this into three binary bit-planes. We could also define our blocks to be wrapped diagonals, although we would then have to choose one of the two directions. (On how many different subsets can we impose the requirement for each symbol to appear exactly once in each subset?)

On the other hand, if we insist on actual coherent solid blocks, the next analogous case after the 2 x 2 = 2² grid would be the 3 x 3 x 3 = 3³ grid, i.e., a three-dimensional grid of size 27 x 27 x 27. This grid is sub-divided into solid blocks of size 3 x 3 x 3. Thus each row, column, “shaft”, and block contains 27 cells, which must contain exactly one of each of 27 symbols. Each cell is therefore subject to four constraints instead of just three. Taking the symbols 0 through 26 as our symbols, we could then decompose this into three trinary-planes, in which each cell contains either 0, 1, or 2. It is certainly possible to construct grids that satisfy these requirements, by extrapolating the shift pattern used to generate the two-dimensional arrays previously. Begin with the block in one corner of the grid, and assign the symbols to those cells in any order. Then the entire line in any of the three directions can be filled by shifting (with rotation) over the two directions perpendicular to the line. To express this construction algebraically, we can generalize our earlier coordinate definition to three dimensions (x,y,z), and let x_j denote the jth digit of x in the base 3. To give an explicit example of a matrix that satisfies the four Sudoku conditions, we set the three digits contents of each cell with coordinates x,y,z to

If we discard the least significant digits of x, y, and z, the remaining digits represent coordinates of the 9 x 9 x 9 array of blocks. The least significant digits represent the coordinates of the cells within the block. To prove that assigning the contents according to these equations satisfy the Sudoku conditions, note that a particular block is characterized by fixed values of x₁, x₂, y₁, y₂, x₁, and x₂, whereas the 27 cells within that block are identified by letting x₀, y₀, z₀ run through all possible combinations of values. Since the expression for each c_j contains exactly one of these three digits, each added to a constant. If the additive constants (comprised of the more significant digits) were all zero, we would obviously general all 27 cell contents. Now if we apply constant offsets (mod 3) to the digits of each of these 27 values, we arrive again at the same 27 values, but in some different order.

The same reasoning applies to any subset consisting of 27 cells that are specified by choosing one free digit from each of the expressions for the digits of c. Thus the columns are specified by fixing x and y, and allowing the digits of z to vary. Likewise we can fix the digits of x and z, and allow the digits of y to vary, or we can fix the digits of y and z, and allow the digits of x to vary. We can also define other partitions (in addition to the four basic Sudoku partitions) into subsets that must contain all 27 symbols. For example, we could fix the two least significant digits of each coordinate and allow the most significant digits to vary. This represents the subset consisting of the same single cell in each of the 27 blocks in a specified super-block (i.e., a 3 x 3 x 3 array of blocks). Recall that for the special 4 x 4 grid that we constructed using modulo 2 arithmetic we could identify four coherent partitions, basically by fixing all but one of the two coordinate digits in the expression for each of the two content digits. Likewise for the 27 x 27 x 27 grid we can identify 27 coherent partitions by fixing all but one of the three coordinate digits in the expression for each of the three content digits.

Return to MathPages Menu