Sudoku Solving in R

I have decided to try and wrap my arms around R, the statistical programming environment that everyone seems to be talking about these days.  It’s been around since the ’70s, so probably people have been talking about it for decades, but it just hasn’t made its way into my daily life until now.

I am not a statistician, nor do I have any background whatsoever in statistics.  As a philosophy major, my foray into business classes was limited – in fact, I can’t think of a single class I took in the college of business.  I have faked it somewhat in my career to date, forging formulas in Excel that would probably have been more elegant, efficient, and easier with one introductory class to statistics.  So, I recognize a need to become more versed in the art.  What better way than to dive into such a vital programming environment, especially considering my love of programming itself?

One of my favorite ways to learn a new language is to program a Sudoku solver using it.  I have used Excel VBA, PHP, and C++ to do this (those scripts have long been lost to time), and now have done it with R (which script I will post here so that it is never lost).  When writing a Sudoku solver, you are forced to understand the syntax for variables, arrays, functions, loops, and conditionals, so once done, you have a firm footing to begin learning the more complex aspects of a language.

Please note, I wrote this while learning R.  Probably, I will look upon this script in a year’s time and wonder why I did some of these things, noting inefficiencies and poor constructs.  But, for now, it works, and that is all that matters to me.

To start, R speaks mainly in terms of “vectors”.  This is foreign to me, as I always thought of a vector in the mathematical term, something with both direction and magnitude.  In my experience, what R calls “vectors”, I would call “arrays”.

sud <- c(3, 7, 0, 0, 5, 8, 0, 0, 0, 0, 0, 0, 3, 0, 0, 7, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 6, 0, 3, 0, 0, 0, 7, 0, 2, 0, 1, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 7, 6, 0, 0, 0, 0, 4, 0, 4, 0, 0, 0, 2, 0, 8, 0, 0, 0, 0, 5, 6, 0, 0, 0, 0, 0, 0, 1, 7, 0, 0, 9, 0, 0)

This is how you define vectors, and means the variable sud contains a vector of these 81 numbers.  These 81 numbers are the Sudoku puzzle itself, where blanks are represented as 0s and the numbers are transposed by column, starting at the top left cell, moving down to the bottom left cell, then moving across the puzzle.  I went by column instead of by row because R handles matrices these way – columns, then rows.  This will come in handy later.

locked_cells <- sud > 0

When I realized R was capable of this kind of simplicity, I started to fall in love with it. This single line of code creates a new vector of name locked_cells and populates it with a boolean for each value in sud based on whether it is greater than 0 or not. Since I replaced blanks with 0s when transposing the puzzle, this indicates that a value is a given, hence it is “locked”.

sqrs <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 4, 4, 4, 5, 5, 5, 6, 6, 6, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 7, 7, 7, 8, 8, 8, 9, 9, 9, 7, 7, 7, 8, 8, 8, 9, 9, 9)

rows <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9)

cols <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9)

This is where I wish I had a firmer base in mathematics.  It is probably a simple matter to, using the index of the cell of the puzzle we are solving for, calculate the square, row, and column we are in using some kind of algorithm.  Since I am a just a lowly liberal arts major, I have to identify these vectors manually.  So, as we are solving for various cells – say, the 15th cell, which would be the second column, sixth row – we can use the 15th value in sqrs (3) to identify which square we are in (squares, like cells, are also ordered top-to-bottom then left-to-right), the 15th value in rows (6) to identify which row we are in, and the 15th value in cols (2) to identify which column we are in.

Since we will know which square, row, and column a given cell is in, we can validate its value based on whether or not it is repeated by other members of that square, row, or column using this function:

test_val <- function(inp, cells) {
  result <- c(0)
  for(i in 1:9) {
    # iterate through each cell in the list
    # identify if the value of the current cell is already used
    if(inp != cells[i]) {
      if(!is.na(sud[cells[i]])) {
        if(sud[inp] == sud[cells[i]]) {
          result <- c(1)
          break
        }
      }
    }
  }
  result
}

Given an index within the Sudoku puzzle (between 1 and 81) and the indices of the cells we need to check for validation, this function will simply see if the value of the index cell is repeated within any of the other cells.

Note, quickly, the difference between [] and ().  When referencing a value within a vector, we use [].  When passing variables to a function, we use ().  I had many bugs while writing this where I used one in place of the other accidentally.

Since we need to check for duplication with rows, columns, and squares, we need a wrapper function as so:

check_cell <- function(inp) {
  result <- c(0) # default to no errors

  # if the value is 0, it is invalid automatically
  if(sud[inp]==0) {return(1)}

  # if the value is 10+, it is invalid automatically
  if (sud[inp] >= 10) {return(1)}

  # first, get square
  q <- sqrs[inp] # this is the square we are in
  cells <- which(sqrs %in% c(q)) # this is all the cells of the sqaure
  result <- test_val(inp,cells)
  if(result == 1) {return(1)}

  # next, get row
  q <- rows[inp]
  cells <- which(rows %in% c(q))
  result <- test_val(inp, cells)
  if(result == 1) {return(1)}

  # next, get column
  q <- cols[inp]
  cells <- which(cols %in% c(q))
  result <- test_val(inp, cells)
  if(result == 1) {return(1)}

  return(0)
}

The line cells <- which(sqrs %in% c(q)) (and the subsequent lines referencing rows and columns) is another one of those beautifully simple commands that made me start to fall in love with this language.  R has a built-in search function called which() that returns a vector of indices matching the condition in a given vector.  So, this line returns the indicies of sqrs that are of the same value as the vector we previously identified as holding the value of the current square.  Thus, we can pass this new vector (cells) to the test_val function.

Then we come to the main part of the script, which iterates through each of the 81 cells, adding 1 to the current cell (if it isn’t locked) and checking for validity before moving on to the next cell.  If the current cell is invalid, it adds one again and checks for validity (and so on).  If the value it reaches is 10 (invalid because we know solutions can only be between 1 and 9), it sets the value back to 0 and moves backwards one step to the most recently-used unlocked cell and adds one to it (then checks for validity, and so on).  This is brute force at its best, and is wildly inefficient, but always gets the job done.  I like brute force methods for those very reasons.  They may not be pretty, but they are infallible (as long as the puzzle given is possible to solve).

s <- 1

repeat {
  if(!locked_cells[s]) {
    sud[s] <- sud[s] + 1
    while(check_cell(s) > 0) {
      sud[s] <- sud[s] + 1
      if(sud[s] >= 10) {
        sud[s] <- 0
        s <- s - 1
        if(s==0) {
          break
        } else {
          while(locked_cells[s]) {
            s <- s - 1
            if(s==0) {break}
          }
        }
        sud[s] <- sud[s] + 1
      }
    }
  }
  if(s == 81 || s < 0) {
  break
}

  s <- s + 1
}

At this point, the puzzle is solved.  The vector sud has been modified and all 0s have been replaced with values that fit the puzzle.  In order to display it for humans to read easily, we need to translate it into a matrix.  Since I started with a matrix transposed to a vector, all I need to do is tell R to transpose it to a matrix:

sud_matrix <- matrix(sud,nrow=9,ncol=9)

Then, R allows a variable to be printed to the console just by mentioning it.

sud_matrix

If you have any questions, please feel free to leave a comment below.  Like 90% of my posts, this was left here mainly so I wouldn’t lose it.  But, I couldn’t help myself from waxing poetic about the language itself.  It truly is a lovely language, and I am excited to learn more about it – and statistics – as my self-led education continues.

Postscript

I modified the code slightly to understand how long a given puzzle would take to solve. The easy puzzle used in the above post took my relatively modern (though 32-bit) laptop 10 seconds to solve. The extremely difficult puzzle from this link took 13 minutes and 28 seconds.  I never said this code was fast, just that it worked!