A recent paper by Gary McGuire, Bastian Tugemann, Gilles Civario "There is no 16-Clue Sudoku: Solving the Sudoku Minimum Number of Clues Problem" (PDF) shows a new and interesting approach to the problem of the minimum numbers of clues to a normal Sudoku. This has been highlighted in a number of news outlets and scientifc journals (eg Nature). (Note: Nature's article tries to reproduce the 17 Clue example from the paper but they get the 43 in the wrong place, giving an invalid puzzle with 9734 solutions. The Sudoku in the paper is correct).

It has been conjectured for some time that 17 clues is the minimum number of necessary clues to make a single solution Sudoku puzzle. There are about 50,000 such puzzles collected from various sources out and about on the Internet. One such set I use for calibrating and testing.

'Proof' though needs to be qualified. This is not a mathematical proof as such but a brute force computer search through the number space within the Sudoku set, and the author admits that a mathematical proof is still be discovered. What is interesting about the paper is the algorithm to search the space. To search through all possible combinations would take an impractical amount of time. The author identifies the Unavoidable Sets as the key to reducing the search space. These are sets of four cells which potentially could be interchanged to make two solutions - and therefore, minimally, one of those cells must be a clue.

Even with this insight, it is still a challenging algorithm to run. First you must obtain (or generate) all possible unique filled-in Sudoku boards, of which there are 5,472,730,538. The algorithm must also take into account higher order Avoidable Sets (with nine numbers instead of 4). So it is understandable that the computing time was still considerable.

Nevertheless, to get a result in a practical time is an achievement. I have been running some tests on my Solution Count program which shows how the time to check a puzzle using a brute force method becomes exponential with the reduction in the number of clues. Anyone familiar with the solver will know this feature. 17 clue puzzles take on average 6 seconds - although the exact orientation and placement of numbers can make this vary from 0.5 seconds to 30 seconds. Given 22 clues the average time is 0.037 seconds and becomes millisecond or less with 25 clues or more. So a practical search of this space without using a trick like Avoidable Sets is impossible.

Some comments on the Nature page (comments down at the time of writing) assert that the number of clues determines the grade. This is not true - if you using logical strategies for grades as I do. It is possible to have very easy 17 clue puzzles and 'extreme' 30 clue ones. The only effect of clue density it to increase the number of operations required to solve (which is one mildly additive heuristic in my grading).

Overall, an exciting paper.

Andrew Stuart

Overall, an exciting paper.

Andrew Stuart

## Comments

Comments Talk## Saturday 13-Jun-2020

## ... by: Martin Harvey

Hi Andrew,You mention a per solution time of six seconds, I'm assuming this is brute force search of the space, taking the "least branches first" approach to filling the solution tree. I suspect that using the DLX algorithm to solve the "exact cover" problem (which explicitly represents all possible combinations of constraints and dependencies between them), you should be able to get a reasonable solution in milliseconds, particularly if the implementation takes into account the capabilities of modern hardware. I'm just coding this up in the next couple of days I'll get back to you if you want.

If the DLX algorithm does solve puzzles in milliseconds, then I *think* enumerating all possible 17 clue sudoku puzzles (after removing symmetries / permutations) will be amenable to a distributed approach (a la "distributed.net", "folding@home", "BOINC", "roasetta@home" etc etc) : A few thousand volunteers should give you about 150 million CPU hours per year) - I suspect that would make the problem eminently solveable.

http://www.martincharvey.net/

MH.

## Saturday 2-Nov-2019

## ... by: James Havard

My apologies to Andrew Stewart, since I didn't credit him for the unique sudoku board info that I used in my post. Thanks Andrew, I enjoy using your sudoku site everyday.Thanks,

James Havard

## Tuesday 29-Oct-2019

## ... by: James Havard

Unique sudoku boards, I call them UBs, number 5,472,730,538 which is reduced from the actual larger number because of rotating and symmetry. So if the UBs are computer filed, then a (unique solution) puzzle can be solved by matching it's clues with a UB. Kind of like the lotto system for seeing if anyone won the lotto. 2 or more (millions? billions? more) sudoku puzzles can have the same UB. I made 2 sudokus in a matter of minutes, using the same UB, and each of course then has the same solution. But one is rated diabolical and has 24 clues and the other is rated extreme and has 23 clues. Puzzles with 20 clues on a single UB board can be arranged in 81 nCr 20 ways which is 4.69E18. Many of these puzzles would not have single solutions and would be invalid. On the average, if 1 out of a 1000 is valid then that is still 4.69E15 20 clue puzzles for 1 UB. Multiply that by how many UBs there are and that = about 5E24 20 clue puzzles. Now to save time and avoid tedious addition we multiply that by 15 to cover the other clue numbered puzzles arbitrarily (17 to 32). Very roughly, the number of sudoku puzzles is about 7.5E25.(My estimate of the possible sudoku puzzles is based on Andrew Stuart's info about unique boards.)

## Friday 25-Oct-2019

## ... by: James Havard

I wonder if it would be possible for a computer computation of the total number of single solution puzzles from the unique 5,472,730,538 boards?I may be walking on thin ice here but wouldn't each of these uinque boards,UB, have 81 80 clue puzzles?

CLUES Puzzles

80 81x UB = 443,291,173,578

79 3240xUB=1.77E13

78 1,663,740xUB=9.105E15

77 25,621,596xUB=1.4E17

About 100 times more puzzles for each drop in the number of clues, but it would get more complicated with fewer and fewer clues and the number of puzzles would decline. But still adding all the possible puzzles of different clues together would be quite a lot of puzzles.

## Monday 1-Jun-2015

## ... by: ibrahim

What is the total no. of Suduku Puzzle possible i.e in how many ways a 9*9 suduku grid can be filled with the given costraints? How to find?...but

if you take into account all the symmetries the layout enjoys (rotation, reflection, swapping 1 for 2 etc), the number of uniquely different filled sudoku boards is

5,472,730,538

## Saturday 22-Dec-2012

## ... by: bartlm

As only a very recent sudoku addict I was quite surprised that the mathematics had not been fully worked out for what seems a not too complicated combinatorial problem. Clearly it has inner complexity.Where I would take issue with your comment on the 16 clue paper is that it is not a mathematical proof. An exhaustive search may not be elegant but if the search results are incontrovertible and anyone can check them, then the conjecture has been proven. I seem to remember that the four colour theorem was proved in this way about 25 years ago.

Have a great Christmas