Sudoku Solver

December 7, 2022 Reading time: 4 minutes

The Sudoku Solver on the site lets you contribute boards to a shared public library, and lets you choose from a list to see the original board, and its solution.

A Sudoku solver is the perfect project for learning a new scripting language, or trying a new algorithm.  As a puzzle, it can be anywhere from easy to extremely hard to solve, but it yields pretty easily to algorithmic solvers.

I’ve written solvers in Perl, Python, and most recently PHP, using a variety of techniques.  All of which come with a single lesson: don’t over-think the problem.

My first attempt was a rule-based system that tried to encapsulate the heuristics I had developed (so far) while solving the puzzle with pen-and-paper.  It worked, but was kind of clunky, and it was easy to get to a place where I still had cells to fill, but had run out of useful rules. 

To stay faithful to the approach, I needed more (and better) rules to get past the dead-ends.  When I later discovered the YouTube channel “Cracking the Cryptic”, my eyes were opened to whole new ways to expand what had been, in fact, a very paltry rule set. 

Solving Sudoku by hand is like a yoga or meditation practice.  A long-term, persistent practice of trying to improve your mental game for spotting patterns on the board.

One scripting opportunity would have been to try expanding the rule library I started with, and to deploy more exotic rules, but I decided to try instead a few other avenues:

I built an extension on the rule-based system that just added a recursive final step … when the rules tapped out, I identified an open cell with a small number of numbers that might work, and looped through them: plug in a guess and then see if the rule-based system could advance from there.  If it failed, back up, and try the next guess.  Slightly inelegant but it worked and was fast.

When I was learning Python, I started over and wrote a simple Depth First solver, which had been an approach I’d used on some quirky jigsaw puzzles to great success.  The idea here is that instead of using rules, you just proceed to the first blank on the board and loop a number into it, check for any failure states (like the number repeats in the given row or column, etc).  If you find a failure, back up and try the next number in the loop; otherwise move on to the next blank cell and start a loop on that one.  This approach works but turns out to be massively slow.  Although Depth First is a great way to prune away branches of a combinatorial tree, this was still a really big tree.  I ported the script to Perl to see if it was an interpreter issue (also to create a terminal display for no redeeming reason) but …. just really slow.

For the puzzle solver on the current site, I used a trick I’d seen my father use in manual solves:  imagine a keypad, 1 to 9, for each blank cell, and only include numbers in the keypad that you can’t find in the row, column, or local 3x3 matrix for the blank cell.  Inventory the whole board this way. When you find a cell where only one number works (and you usually will), fill that number in, and then refresh your inventory of blank-cell keypads to reflect this addition.  When out of single-number solutions for cells, identify a cell with the smallest number of digits remaining on its keypad, and loop through those remaining digits (fill in and recurse to one level down, popping back up a level when you hit a failure situation).

This final approach turns out to be very fast.  I wrote it first in Python.  The PHP version for the site solves even difficult puzzles and creates graphical versions of the initial board and a the solution to display, all without much of a pause loading the page.

The lesson: if you are a person with only your brain and a pen between you and the sudoku board, you are practicing a meditation and should approach it as such.  If you are using a computer, don’t over-complicate it, and let simple loops and recursion do the heavy lifting.

This several solver scripts (including a couple attempting a depth-first approach) can be found in this GitHub repository.


Mining Words

November 20, 2022 Reading time: 23 minutes

A favorite elementary school word puzzle is the 'how many other words can you make from the letters of this big word' variety. So, for example, the word "bread" will yield: red, are, ear, dare, dear, read, bar, etc. There are lots more.

How to Solve It

How would you get a list of all of the right answers? First, you have to figure out how to ennumerate the complete list of letter subsets found in the "parent"word, and then you have to figure out what words (if any) can be formed by rearranging the letters in any one subset. 

Find our solver Here

To generate an initial list of all possible letter subsets from the "parent" word, we can use binary numbers. 

Any given subset of letters from our original word can be represented as as either used (1) or not used (0) … as a string of 1s and 0s according to whether the letter in the parent word was used or not. Consider the word "east":

#

word

binary

subset

1

east

0001

t

2

east

0010

s

3

east

0011

st

4

east

0100

a

5

east

0101

at

6

east

0110

as

7

east

0111

ast

8

east

1000

e

9

east

1001

et

10

east

1010

es

11

east

1011

est

12

east

1100

ea

13

east

1101

eat

14

east

1110

eas

15

east

1111

east

Map the letters you use to a "1" and any that you don't use to a "0", and the letter combo you use suddenly has a unique binary "signature".

For a word N characters long, there must be 2^N - 1 possible combinations of letters (you subtract one more since one of the combos is all 0s ... no letters used ... so it can't ever help).  (The PHP "decbin" function is a quick way to express each number in the sequence as a binary string.)

For a four letter word like "east", that's 2^4 - 1 = 16 - 1 = 15 combos, and the binary form of each number from 1 to 15 can be used as a map for which letters are included in the subset.

Many "parent" words will have repeating letters, so some of these binary letter subsets will boil down to the same set of letters (just originating from different places in the "parent" word), so we only keep the list of unique letter combos.

Finally, we recognize each subset may represent more than one word, depending on how the letters are arranged (for example, the subset of "e", "a", and "t" can generate words "eat", "ate", and "tea"). But this is just the Jumble problem we've solved elsewhere, so we use that code to get the set of matches from the dictionary for each subset of letters we identify.

The dictionary I use contains over 100,000 words, and sometimes the matches are abbreviations or otherwise suspect.  A lot depends on the word list you compare to.

One challenge to come home from the 2nd Grade was to find the 39 words hidden in "ornaments". Using this script, we found 368! Time to start talking about extra credit ...


Solve the Jumble

October 4, 2022 Reading time: 3 minutes

In the "Jumble" puzzles (copyrighted by David L. Hoyt) the challenge is to unscramble a set of words.  The individual words are usually 5 to 8 characters long. In the official puzzle, selected letters from the individual answers are unscramble to provide the punchline to a horrific pun.

How to Solve It

This is not hard to solve.  Usually the answer pops right into your head after a quick glance. Sometimes it doesn't work and you need tricks to jump-start the puzzle-solving process (like reversing the scrambled letters, or looking for common groupings ... 'th', 'ing', 'er', etc.)

It is a fun Comp Sci 101 exercise, because people can be very creative at finding difficult solutions. On the other hand, you can use a clever hack described by Jon Bentley in the "Aha! Algorithms" essay from his book "Programming Pearls".

To unscramble a word, use this central trick: sort the letters of the scrambled word into alphabetical order, then compare this sorted arrangement of letters with a dictionary of words processed the same way ... an alphabetical list of words each paired with a version of itself with its letters sorted alphabetically.

For example, say that one of our scrambled puzzle words is "lorac". Sort the letters into alphabetical order to obtain "aclor". Then search the dictionary for any five-letter words whose letters also sort to "aclor". (I used a list of about 100,000 words and found four solutions: "calor", "carlo", "carol", and "coral")

The final script used in this page is written in PHP.  The word list and sorted counterparts are kept in a SQLite table, which facilitates the function used to solve the puzzle.  Bentley's version only required a text file (no SQL), adding a step to process the list of words and sorted aliases into a 'compressed' file: each line of the file starts with a unique 'sorted' key, followed by all of the words that share that key, separated by spaces.

You can try it out on the Puzzle Solver page, right here.

I set up a GitHub repository with most of the essential code (PHP/HTML and Python versions of the solver, and a Python script to create the SQLite word table) here: https://github.com/vandinem/jumble


The Quote Page(s)

October 1, 2022 Reading time: 4 minutes

My old site had a page of favorite quotes, inspired by the back-of-door creation by brothers Buddy and Seymour Glass, described in a favorite short story: "Zooey", by J.D. Salinger (page 176 in my Little, Brown mass market paperback):

"He closed the door behind him as tightly as possible, and with an expression implying that the absence of a key in the lock met with his disapproval. He gave the room itself scarcely a glance, once he was inside it. Instead, he turned around and deliberately faced a sheet of what had once been snow-white beaverboard that was nailed uncompromisingly to the back of the door. It was a mammoth specimen, very nearly as long and as wide as the door itself. One could have believed that its whiteness, smoothness, and expanse had at one time cried out rather plaintively for India ink and block lettering. Certainly not in vain, if so. Every inch of visible surface of the board had been decorated, with four somewhat gorgeous-looking columns of quotations from a variety of the world's literatures. The lettering was minute, but jet-black and passionately legible, if just a trifle fancy in spots, and without blots or erasures. The workmanship was no less fastidious even at the bottom of the board, near the doorsill, where the two penmen, each in his turn, had obviously lain on their stomachs. No attempt whatever had been made to assign quotations or authors to categories or groups of any kind. So that to read the quotations from top to bottom, column by column, was rather like walking through an emergency station set up in a flood area, where, for example, Pascal had been unribaldly bedded down with Emily Dickinson, and where, so to speak, Baudelaire's and Thomas a Kempis's toothbrushes were hanging side-by-side."

Now, four columns of anything on the Web is asking for trouble in these days of spinning mobile devices, but two columns seemed reasonable with the hyper-efficient lit.css stylesheet I'm using on the main site. That new page can be found here.

The quotes themselves were originally maintained in a MySQL table, and at less than 150 entries added since the project began, eons ago, this is maybe too much overhead. In the new world, all of the quote content is migrated to a SQLite data file, with columns that provide the quote, author, source, layout hints (left, right or center on the page), and a "Really Use This?" flag. (Some of the quotes don't wear well with time, or just get a little stale, and this is an easy way to filter without deleting anything).

I add quotes to these tables with a standard database program (phpMyAdmin for MySQL, DB Browser for SQLite, or BeeKeeper Studio for either). The page would work as well, and probably faster, just reading from a structured text file, but my goal was to link a few tools together, so this approach is more diverting.

Generating a list of quotes in random order is a simple SQL and PHP task, and the page setup only takes a few minutes. Deciding how you want to arrange the various elements of the quote is what takes up all of the time. You never feel like you've really nailed it.

While working on this, I also came across '58 bytes of CSS to look great nearly everywhere', a great github post by Joey Burzynski. I opted to use his slightly extended 100 bytes of CSS to create an alternate quotes page. I have a real Hate/Hate relationship with CSS, so these efforts to provide basic, clean design really appeal to me.


Book Club Mini

August 17, 2022 Reading time: 2 minutes

A few years ago I created a Bookclub site (you can see the public-facing side of it here. Use the username 'guest' and no password), initially to serve my book club of 20 years and 200+ books. 

We'd been tracking our books and meetings in a shared Google Sheet, and honestly, that's all it needs.  Only two of us even care about tracking the data, unless we start a debate about whether or not we've read a given author or book.

Building the site was actually just an excuse to learn about a responsive CSS framework ... I think I started with Bootstrap but ended up on Bulma ... and a little Javascript.  And it got out of hand, as unplanned applications always do.  It had a home-grown login system, a private Manager interface, tie-ins to an Amazon Associate account, and a MySQL-based data model to support multiple book clubs.  In other words, complete overkill gathering dust on a virtual shelf.

"Book Club Mini" is a revised site that embraces a few simple truths:

  1. There are about 300 books ... this is not Big Data
  2. A bookclub site usually doesn't need member management ... view the lists or don't.  It isn't secret stuff.
  3. If you commit to CSS and JavaScript, it will take over your whole life

The revised site is open to the public.  Books and Meetings are maintained in small, delimited text files.  Graphics are provided by Bookshop.org widgets.  PHP is used for setting up files and lists, and managing the server-side tasks.  And the look and feel leverages Lit.css, a 500 byte responsive CSS file, which is about all the CSS I'm willing to deal with.


About

August 16, 2022 Reading time: ~1 minute

Mark Van Dine's Blog

(for www.vandine.biz)

Charles the Foolish (by Al Van Dine)

mark.vandine on gmail

vanadium on Post.News



Mark's Blog
(for www.vandine.biz)

Charles the Foolish (by Al Van Dine)
mark.vandine on gmail
vandinem on Twitter