From SudokuWiki.org, the puzzle solver's site

Grading Puzzles

As of the 1st of October 2025 I'm adopting a new scoring and grading system for Sudoku and variants. This is a potentially controversial topic so I thought a page to explain it would be worthwhile. For roughly four fifths of puzzles the grade will be the same. This is not a new ball park: on a large scale the difference is small. But on a case-by-case basis I think the new grades are more justified. What I'm hoping for is greater alignment with human solvers' gut feel for a puzzle.

The Old Scheme

When I first thought about making puzzles I considered two aspects as being important. The strategies necessary to logically solve a puzzle and how many opportunities there were to spot a pattern or find a solution. The former is relatively easy to appreciate - there is a full spectrum of puzzle difficulty if you go beyond guessing. The second aspect is more subtle. Some puzzles collapse quite quickly. Others you have to chip away at candidate by candidate until you can break through, often multiple times.

In the old scheme I considered "rounds" which were based on finding solutions to cells. A new round started when you solved one or more cells. I created a weighting factor based on how strung out the solve path was. Each strategy used scored a number of points. I had points for candidates eliminated and extra points if these led to a solved cell. The sum of these points was multiplied by the weighting factor.

I also introduced a number of 'heuristics' – filtering rules which removed puzzles from the stock pool or promoted or demoted a puzzle by a grade. The most important one was recognising easy puzzles with a very hard bottleneck. The scoring system might give a puzzle a "tough" grade but it is trivial except for one or two 'diabolical' strategies in the middle. These are frustrating for puzzle solvers and we want to avoid them. Some puzzles might rack up a score with a dozen or so Pointing Pairs, say, but these are 'moderate' strategies.

Candidate Density

For the most part the old scheme was relatively successful, although in the early years Fish strategies were over-scored and boosted the difficulty rating. A Sword-Fish could eliminate a lot of candidates in one go, for example. I dialled back the scoring for these some years ago. The other problem was that very hard puzzles that needed a lot of chains under-scored. This didn't matter too much as most such puzzles ended up in the 'Extreme' category and weren't used in newspapers. These are interesting puzzles as they are the coal face for Sudoku theory and some people love to solve them.

The weighting factor based on 'rounds' has become difficult to maintain with so many puzzle variants. My recent insight in assessing the grading system is to look at candidate density - the total number of candidates on the board at any one step. More unsolved cells means more candidates to search for a pattern. The harder puzzles have more candidates per cell than simpler ones. This seems an ideal measure of the board.

On a 9x9 board there are 729 total candidate slots (81 cells x 9 digits). For Killer Sudoku all 729 are in play at the start. I create a factor F by multiplying the fraction C/729 by 20. The factor F is multiplied by the points for that strategy. Since singles are common and easy, I further reduce their contribution by ignoring how many are found in the step and multiplying F by 1 and 2 for Naked and Hidden Singles respectively.

This table is a typical solve path with the number of candidates from start to end.

Step	Candidates C	F = C / 729 x 20
2	219	6.0
3	206	5.7
4	185	5.1
5	162	4.4
6	148	4.1
7	133	3.6
8	123	3.4
9	112	3.1
10	102	2.8
11	96	2.6
12	93	2.6
13	91	2.5
14	84	2.3
15	81	2.2
16	77	2.1
17	74	2.0
18	72	2.0
19	70	1.9
20	68	1.9
21	65	1.8
22	62	1.7
23	58	1.6
24	54	1.5
25	48	1.3
26	39	1.1
27	30	0.8
28	19	0.5
29	14	0.4
30	8	0.2
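
As a rough illustration, the density factor can be sketched in a few lines of Python. This is only a sketch: the function name density_factor and the constant name are made up for the example, and it assumes the 9x9 total is 729 slots (81 cells x 9 digits), which is the value that reproduces the F column above.

```python
# Hypothetical helper, not the solver's actual code.
TOTAL_SLOTS_9X9 = 9 * 9 * 9  # 81 cells x 9 digits = 729 (assumed total)


def density_factor(candidates, total_slots=TOTAL_SLOTS_9X9, scale=20.0):
    """Return F = C / total_slots * scale for a board with C candidates left."""
    return candidates / total_slots * scale


# A few (step, candidates) rows taken from the table above.
rows = [(2, 219), (3, 206), (4, 185), (30, 8)]
for step, c in rows:
    print(step, c, round(density_factor(c), 1))
```

Rounded to one decimal place these rows give 6.0, 5.7, 5.1 and 0.2, matching the table.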

Strategy Scores

Strategy	Score factor
Naked Singles	F
Hidden Singles	F x 2
Naked Pair	5 x F
Naked Triple	10 x F
Hidden Pair	10 etc
Hidden Triple	25
Naked Quad	40
Hidden Quad	60
Pointing Pairs	20
Line/Box Reduction	20
Gurth's Theorem	80
Bi-value Universal Grave	30
X-Wing	30
Unique Rectangle Type 1	20
Chute Remote Pair	25
Simple Colouring	50
Y-Wing	50
Rectangle Elimination	25
Sword-Fish	50
XYZ Wing	60
Tridagon	60
X-Cycle	60 + chain length
XY-Chain	50 + chain length
3D Medusa	80
Jelly-Fish	80
Unique Rectangle 2,3,4,5	50

Strategy	Score factor
Avoidable Rectangle	60
Twinned XY-Chains	100
Fireworks	100
SK Loops	100
Extended Unique Rectangle	90
Hidden Unique Rectangle	100
WXYZ Wing	100
Aligned Pair Exclusion	140
Exocet	300
Grouped X-Cycle	100 + chain length
Finned X-Wing	160
Finned Sword-Fish	190
Franken Sword-Fish	150
Alternating Infer. Chains	100 + chain length
Sue-de-Coq	180
Digit Forcing Chain	120 + chain length
Nishio Forcing Chain	120 + chain length
Cell Forcing Chain	180 + chain length
Unit Forcing Chain	180 + chain length
Almost Locked Sets	140
Death Blossom	200
Pattern Overlay	100
Quad Forcing Chain	200 + chain length
Bowman Bingo	100
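
To show how a single step might be scored from these tables, here is a minimal Python sketch. The names BASE_POINTS, CHAIN_STRATEGIES and step_score are invented for the example, only a handful of strategies are included, and two things are assumptions rather than stated rules: that the 9x9 total is 729 candidate slots, and that the chain length is added to the base points before multiplying by F.

```python
# Hypothetical sketch, not the solver's actual code.
# Base points for a few strategies, copied from the tables above.
BASE_POINTS = {
    "Naked Single": 1,    # F x 1, however many singles are found in the step
    "Hidden Single": 2,   # F x 2, however many singles are found in the step
    "Naked Pair": 5,
    "Pointing Pairs": 20,
    "X-Wing": 30,
    "XY-Chain": 50,       # plus chain length
    "X-Cycle": 60,        # plus chain length
}
CHAIN_STRATEGIES = {"XY-Chain", "X-Cycle"}


def step_score(strategy, candidates, chain_length=0):
    """Score one step: strategy points times the density factor F."""
    factor = candidates / 729 * 20  # F = C / 729 x 20 (729 is assumed)
    points = BASE_POINTS[strategy]
    if strategy in CHAIN_STRATEGIES:
        # Assumption: chain length is added before multiplying by F.
        points += chain_length
    return points * factor


# Example: an X-Wing found while 162 candidates remain on the board.
print(round(step_score("X-Wing", 162), 1))
```

Note that the number of candidates eliminated does not appear anywhere in the calculation.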

In the old scheme points were awarded for eliminations + extra points for solving a cell. I have come round to the idea that if you have found a pattern then it doesn't really matter how many candidates are removed. Dropping this notion is one of the big changes in the new scheme and it should avoid fruitful strategies inflating a grade.

Now the sum of the scores for each step is the puzzle score. For vanilla Sudoku this gives a score anywhere between 20 and 12,000+. To reduce any fixation on too many decimal places I normalize the score to a number roughly between 1 and 10 with a log function.

9x9: Log₅ (score) * 2
6x6: Log₄ (score) * 2
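
The two formulas above can be sketched in Python as follows. The names LOG_BASE, puzzle_score and normalised_score are hypothetical, and the step scores in the usage lines are invented numbers, not taken from a real puzzle.

```python
import math

# Hypothetical sketch of the normalisation, not the solver's actual code.
LOG_BASE = {"9x9": 5, "6x6": 4}  # log bases taken from the formulas above


def puzzle_score(step_scores):
    """The raw puzzle score is the sum of the scores of every step."""
    return sum(step_scores)


def normalised_score(raw_score, board="9x9"):
    """Return log_base(raw_score) * 2 for the given board size."""
    return 2 * math.log(raw_score, LOG_BASE[board])


# Invented step scores, for illustration only.
raw = puzzle_score([6.0, 11.4, 102.0, 88.0, 417.6])
print(raw, round(normalised_score(raw), 2))
```

As a sanity check, a raw 9x9 score of 25 normalises to about 4, and a raw score of 625 to about 8, because each fivefold increase in the raw score adds 2 to the normalised value.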

Currently the division of the spectrum of puzzles into the named grades is as follows - and might change in the near future. Most randomly produced puzzles will be easy with extremes being the fewest.

Puzzle	Kids	Gentle	Moderate	Tough	Diabolical	Extreme
Sudoku	< 3	3 to < 4	4 to < 5	5 to < 7	7 to < 9	9+
Sudoku X	< 4	4 to < 5.5	5.5 to < 8	8 to < 10.5	10.5 to < 12	12+
Jigsaw Sudoku	< 4.5	4.5 to < 7	7 to < 9	9 to < 10	10 to < 11	11+
Killer Sudoku	x	< 8.8	8.8 to < 9.2	9.2 to < 9.6	9.6 to < 10	10+
KenKen 6x6	x	< 8.5	8.5 to < 9	9 to < 9.6	9.6 to < 10.5	10.5+
Killer 6x6	x	< 6.3	6.3 to < 6.6	6.6 to < 7	7+	x
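
Looking up a grade from a normalised score is a simple band search. The Python sketch below uses only the vanilla Sudoku row of the table; the names SUDOKU_BOUNDS, GRADES and sudoku_grade are made up for the example, and the boundaries may change as noted above.

```python
import bisect

# Hypothetical sketch using the vanilla Sudoku row of the table above.
SUDOKU_BOUNDS = [3, 4, 5, 7, 9]  # lower bounds of Gentle .. Extreme
GRADES = ["Kids", "Gentle", "Moderate", "Tough", "Diabolical", "Extreme"]


def sudoku_grade(score):
    """Map a normalised score to a named grade.

    bisect_right counts how many boundaries are <= score, so a score
    exactly on a boundary falls into the higher band ("3 to < 4").
    """
    return GRADES[bisect.bisect_right(SUDOKU_BOUNDS, score)]


for s in (2.9, 3.0, 6.5, 9.0):
    print(s, sudoku_grade(s))
```

The other variants would need their own boundary lists, and the 'x' entries in the table would need to be handled as grades that do not exist for that variant.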

The current distribution of normal Sudoku in my stock table

Points for other strategies:

Jigsaw Strategies	Score factor
Double Pointing Pairs	30 x F
Triple Pointing Pairs	50 x F etc
Double Line/Box Reduction	30
Triple Line/Box Reduction	50
Law of Leftovers	30

Killer Strategies	Score factor
Trivial KenKen Dogleg	10
Killer Innies/Outies	5
Easy Combinations	5
Rule of Parity	50
Killer Cage Splitting	40
Killer Innies (2+ cells)	30
Cage/Unit Overlap	60
Hard Combinations	10
Killer Cage Comparison	50

KenKen Strategies	Score factor
KenKen Dogleg	20
Rule of 21	40
Rule of 720	40

I have grouped strategies on the solver into 'tough', 'diabolical' and 'extreme' but it will be appreciated that the use of a particular strategy does not immediately make that puzzle belong to the grade the strategy is grouped in. I do try to filter away puzzles that score oddly, and in the majority of cases the hardest strategy will be in the same group as the grade, but not always. I believe the new scheme pushes the spectrum further towards matching.

Comments and suggestions are, as ever, appreciated!

Many thanks to Andy Potvin for our discussions on this topic.

Andrew Stuart