The Relative Incidence of Sudoku Strategies

From sudokuwiki.org, the puzzle solver's site
This article was updated in March 2012 and replaces the statistics compiled in December 2009.

A recent question from a reader prompted me to run off some statistics which I think are interesting and worth exploring.

Comment: There is something I am curious about that I really hope you can answer. It's quite subjective and I suppose the answer will be a ballpark figure, but I was hoping a Sudoku expert such as yourself could take your best educated guess.

If I know all of your basic, tough, and diabolical strategies, but don't go as far as any of the evil strategies that you list, what percentage of Sudoku puzzles (in your opinion) do you think I could solve: 80% of all puzzles that I would try? 85%? 90%? 95%? 99%?

What would you guess if you had to estimate? I know it's hard since there are literally trillions of puzzles, but easy, medium, tough, and many diabolical puzzles I can already solve with these current strategies, excluding your evil ones. Do you think the percentage of puzzles where you HAVE to use one or more evil strategies in order to solve the puzzle is a small percentage, perhaps 1%? 2%? 5%? 10%?

Just curious what your opinion is.

There is a lot to grading and scoring a Sudoku puzzle. I've put some thoughts about this into http://www.scanraid.com/Sudoku_Creation_and_Grading.pdf. There is no one-to-one correspondence between the published grade (or the grade on my solver) and the list of strategies; many factors contribute to the grade. My strategy list is partially subjective in that I choose to label certain strategies as 'tough' for ease of explanation and to show what I consider the best order in which to attack a puzzle. It is an attempt at a 'minimum path'.

It should also be noted that just because I don't use strategy X to solve a puzzle in the solver, it does not follow that strategy X could not be used. There are often many ways to solve the same puzzle.

However, it is still interesting to ask what proportion of all puzzles require at least one strategy in each grade group. I've run a count on 200,000 puzzles I created while searching for unsolvables (March 2012). These were produced randomly, and I did not know the grade of each until after I created it, so the sample is fair. The results are:


This confirms my view that the vast majority of puzzles are uninteresting. In order to produce 100 puzzles of each grade I need to overproduce many puzzles, since the incidence of higher grade puzzles is low. Note that the 10% of 'moderate'-only puzzles does not mean moderate strategies are rare. Any hard puzzle will require many more instances of moderate strategies to complete, in addition to the hard ones.
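The distinction between a puzzle's grade and the strategies it uses can be sketched in a few lines of Python. This is only an illustration, not the solver's actual grading code: the grade names are taken from the reader's comment, the `STRATEGY_GRADE` mapping covers just a handful of strategies, and the solve logs are made up. A puzzle is assigned the hardest grade group it needs, even though it may use many easier strategies along the way.

```python
from collections import Counter

# Grade groups in increasing difficulty (names taken from the reader's comment).
GRADE_ORDER = ["basic", "tough", "diabolical", "evil"]

# Illustrative subset of strategies and their groups; the real list is longer.
STRATEGY_GRADE = {
    "Naked Singles": "basic",
    "Hidden Singles": "basic",
    "X-Wing": "tough",
    "XY-Chain": "tough",
    "3D Medusa": "diabolical",
    "Finned X-Wing": "evil",
}


def hardest_grade(strategies_used):
    """Return the highest grade group among the strategies a solve needed."""
    grades = (STRATEGY_GRADE[name] for name in strategies_used)
    return max(grades, key=GRADE_ORDER.index)


def grade_proportions(solve_logs):
    """Fraction of puzzles whose hardest required strategy is in each group."""
    counts = Counter(hardest_grade(log) for log in solve_logs)
    total = len(solve_logs)
    return {grade: counts[grade] / total for grade in GRADE_ORDER}


# Hypothetical solve logs: one list of strategy names per puzzle.
logs = [
    ["Naked Singles"],
    ["Naked Singles", "Hidden Singles"],
    ["Naked Singles", "X-Wing"],
    ["Naked Singles", "Hidden Singles", "XY-Chain", "Finned X-Wing"],
]
props = grade_proportions(logs)
```

In this toy sample only half the puzzles are graded 'basic', yet Naked Singles appears in every log, which is the same effect described above for moderate strategies.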

It follows that I can produce a list of all the Sudoku strategies and a count of their incidence in solving the stock.
STRATEGY                    Count        %
Human Strategy             200000   100.0%
Naked Singles              180387    90.2%
Hidden Singles              99185    49.6%
Naked Pairs                 45175    22.6%
Naked Triples               21359    10.7%
Hidden Pairs                 7942     4.0%
Hidden Triples               1737     0.9%
Naked Quads                   249     0.1%
Hidden Quads                    0     0.0%

Tough Strategies
Intersection Removal        34692    17.3%
X-Wing                       9576     4.8%
Simple Colouring            20030    10.0%
Y-Wings                     16043     8.0%
Sword-Fish                   1313     0.7%
X-Cycles                    11050     5.5%
XY-Chain                    16724     8.4%

Diabolical Strategies
3D Medusa                    5950     3.0%
Jelly-Fish                     24     0.0%
Avoidable Rectangle            22     0.0%
Unique Rectangles            1748     0.9%
Hidden Unique Rectangles     4185     2.1%
XYZ Wing                     1539     0.8%
Aligned Pair Exclusion       2313     1.2%

Evil Strategies
Grouped X-Cycles             1883     0.9%
Empty Rectangles              144     0.1%
Finned X-Wing                1150     0.6%
Finned Sword-Fish             826     0.4%
Franken Sword-Fish              6  0.0030%
Altern. Inference Chains     6777     3.4%
Strong Links                 6301     3.2%
Weak Links                  14514     7.3%
Off-chain                    1951     1.0%
Digit Forcing Chains          961     0.5%
Cell Forcing Chains           912     0.5%
Unit Forcing Chains           278  0.1390%
Sue-de-Coq                      0     0.0%
Almost Locked Sets             12  0.0060%
Death Blossom                   4  0.0020%
Pattern Overlay Method         28  0.0140%
Quad Forcing Chains            52  0.0260%
Bowman Bingo                   49  0.0245%
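
The percentages in the table appear to be each count divided by the 200,000 puzzles in the sample. A minimal Python sketch of that arithmetic, using four rows copied from the table (the function and variable names are my own, for illustration only):

```python
SAMPLE_SIZE = 200_000  # number of puzzles in the March 2012 stock

# A few counts copied from the table above.
counts = {
    "Naked Singles": 180_387,
    "Hidden Singles": 99_185,
    "X-Wing": 9_576,
    "Bowman Bingo": 49,
}


def incidence_percent(count, sample_size=SAMPLE_SIZE):
    """Share of the sample, in percent, for a given strategy count."""
    return 100.0 * count / sample_size


table = {name: incidence_percent(count) for name, count in counts.items()}

for name, pct in table.items():
    # Four decimals so that rare strategies do not all round to 0.0%.
    print(f"{name:<16}{pct:9.4f}%")
```

Rounded to one decimal place these reproduce the published figures; the very rare strategies need the extra decimals to show as anything other than zero.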

So if you were wondering, as I was, how useful certain strategies are, this data is interesting. The only other caveat I'd add is that some strategies are sub-sets of others, or can be expressed in terms of another strategy. For example, Remote Pairs are a special case of XY-Chains, which are a sub-set of AICs. It is useful for the solver to split these out, but when making and grading puzzles I don't do so. So there is some overlap.

The answer to the reader's original question, the incidence of 'evil' strategies, is, I'd say, about 5%.

Andrew Stuart

Article created on 31-December-2009.
This page was last modified on 12-March-2012, at 14:14.
All text is copyright and for personal use only but may be reproduced with the permission of the author.
Copyright Andrew Stuart @ Syndicated Puzzles Inc, 2012