Nerd sniping

Randall Munroe, the man behind XKCD, recently launched a new book, Thing Explainer, in which he describes complicated things using only 1000 common English words. Unfortunately, his book tour didn't come to the UK, but to say 'sorry' he created a puzzle for those of us in the UK to pore over.

I spent an otherwise unproductive (and lonely, as my usual lunch buddies were away) lunchtime brainstorming possible solutions to the first clue, before a friend of mine (@imogenwhittam) pointed out to me that each clue seemed to refer to a specific place in each town (e.g. '... was educated here'), without naming the place in question.

This, in conjunction with the distance and implied direction on the map, was enough for me to twig what the answer was. I know Cambridge extremely well, having spent the best part of a decade there, and 50m due east of Trinity College was going to indicate Heffers - a bookshop I have spent many hours in - as the location of the next clue. This made a lot of sense in retrospect, because it was meant to be promoting the Thing Explainer book!

Following the same logic for the London clue, I stepped out of work for the afternoon, and headed down to the Piccadilly branch of Waterstones in the hope of finding the next clue.

My understanding is that there was meant to be a big poster present, but instead there were just A4 photocopies of the poster; in other locations, at least, there had been some confusion over how long the competition was meant to run and the poster had been accidentally thrown away! Maybe I should have just considered myself lucky anything was there at all!

'Clue' two was to write a 250 word short story over the following weekend, using only the words found in Thing Explainer, subject to a collection of extremely arbitrary rules. Each of these rules would award a story points, and the story with the highest number of points would win! This was the point where I was well and truly sniped, and became fixated on giving this as good a shot as I could.

Each of the rules is described below, along with the analysis I conducted to (try and) maximise my score. I have released the code I wrote to help with this process on GitHub; I think it is unlikely I will work on it further, but I may come back to it to polish it up just as an entertaining exercise.

Function one

Ten points per personally-described character whose fate we know by the end of the story

Okay, this is pretty sensible. This is essentially a large bounty to stop people submitting a random string of words that score ludicrously highly, but do not actually form a short story. I neglected to write an actual function for this(!), but simply kept in mind that I should have three characters.

Maximum Score: 30 points
Achieved Score: 30 points

Function two

$\ln(87)$ points per different Oscar-winning movie title included, up to a limit of five.

This was the first function I tackled with the intention of optimising fully. One of my favourite films is Once, which won an Oscar for best song, so I immediately thought that I should be able to get the maximum possible pretty easily. The titles I went with were 'Her', 'Wings', 'Up', 'Once' and 'The Fly', as these were short and even had a common theme running through several of them!

Unfortunately, in writing this retrospective, I realised I managed to fail to include 'The Fly', and my original code was caught out by the fact that there are two films called 'Up' that have won Oscars. Therefore I'm pretty sure I've left some easy points on the table, unless the judges are feeling particularly generous.

Maximum score: 22.33
Achieved score: 17.86 and a lot of self-flagellation
Why ln(87)? There have been 87 Academy Award ceremonies.
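
A minimal sketch of how this function could be scored, assuming titles are matched case-insensitively as whole words. The name `oscar_score` and the short title list are illustrative, not the code from the repository. Collecting matches in a set means a title that appears twice in the source list, as 'Up' does, is only counted once.

```python
import re
from math import log

# Illustrative list only: the five titles mentioned above, with 'Up'
# deliberately duplicated to mimic two Oscar winners sharing a title.
TITLES = ["Her", "Wings", "Up", "Up", "Once", "The Fly"]

def oscar_score(story, titles=TITLES, cap=5):
    """ln(87) points per distinct title found, up to `cap` titles."""
    text = story.lower()
    found = set()  # a set, so duplicate titles cannot be counted twice
    for title in titles:
        # Whole-word, case-insensitive match (an assumption about the rules).
        pattern = r"\b" + re.escape(title.lower()) + r"\b"
        if re.search(pattern, text):
            found.add(title.lower())
    return log(87) * min(len(found), cap)

excerpt = ('The first year after she got her wings were the worst. '
           '"Training always bored me. Up until I saw my brother die." '
           '"Once you have finished that, you can take their life."')
print(round(oscar_score(excerpt), 2))  # 17.86
```

Four of the five titles appear in the excerpt, which reproduces the achieved score above.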

Function three

$10\sqrt{n}$ points for the longest run of $n$ consecutive lines of iambic pentameter (you don't have to put line breaks in)

It's actually not immediately clear what the maximum score here is. As a rough estimate, let's assume that one of the most famous lines of iambic pentameter is representative - "Now is the winter of our discontent", containing seven words. That'd give us at most 35 lines of iambic pentameter, corresponding to 59.16 points.

This was another category I didn't try too hard at, but was probably the second most fun one to code up a scoring function for. I was astonished to discover that CMUdict existed and was free. My scoring function was pretty bad, breaking on a lot of edge cases. Once I got one line in there, I was happy (again, diminishing returns were a real consideration here).

'Maximum' score: 59.16
Achieved score: At least 10
Why 10? A line of iambic pentameter has ten syllables
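
A rough sketch of the kind of check a pronouncing dictionary makes possible. The `STRESS` table below is a hand-written stand-in that borrows the idea of per-syllable stress digits (0 unstressed, 1 primary, 2 secondary); it is not real CMUdict data, and treating every one-syllable word as able to take either stress is a simplifying assumption.

```python
# Hand-written stand-in for a pronouncing dictionary; NOT real CMUdict entries.
STRESS = {
    "now": "1", "is": "1", "the": "0", "winter": "10",
    "of": "1", "our": "1", "discontent": "201",
}

def is_iambic_pentameter(line, stress=STRESS):
    """True if the line scans as ten syllables alternating weak/strong."""
    syllables = []
    for raw in line.split():
        word = raw.strip(".,;:!?\"'").lower()
        pattern = stress.get(word)
        if pattern is None:
            return False  # unknown word: give up rather than guess
        if len(pattern) == 1:
            syllables.append("?")  # monosyllable: allowed in either position
        else:
            syllables.extend(pattern)
    if len(syllables) != 10:
        return False
    for i, s in enumerate(syllables):
        if s == "?":
            continue
        stressed = s in "12"
        if stressed != (i % 2 == 1):  # odd indices must carry the stress
            return False
    return True

print(is_iambic_pentameter("Now is the winter of our discontent"))  # True
```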

Function four

$\sqrt{n}$ points where $n$ is the sum of the squares of the lengths of your sentences which are acrostics of valid words

I thought this category was surprisingly hard. I dodged a lot of issues with the London Underground words (see later) by making sure there was a sentence break across the words. That couldn't be the case here. I ended up settling for a short four-word sentence "Students always want something" to make sure I was scoring something, but I think there was definitely room for improvement.

There was one valid fourteen-letter word ('understandings'), so the maximum score would correspond to 17 fourteen-word sentences and then a twelve-word sentence, for a total score of 58.96 points.

Maximum score: 58.96
Achieved score: 4
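
As a sketch of the scoring, assuming sentences end at `.`, `!` or `?` and that `valid_words` is whatever word list the judges accept (here a one-word placeholder set):

```python
import re
from math import sqrt

def acrostic_score(story, valid_words):
    """sqrt of the sum of squared lengths of sentences whose initials spell a valid word."""
    total = 0
    for sentence in re.split(r"[.!?]+", story):
        words = re.findall(r"[A-Za-z]+", sentence)
        initials = "".join(w[0].lower() for w in words)
        if initials in valid_words:
            total += len(words) ** 2
    return sqrt(total)

text = "Students always want something. In time, it may not be killing you want."
print(acrostic_score(text, {"saws"}))            # 4.0
print(round(sqrt(17 * 14 ** 2 + 12 ** 2), 2))    # 58.96, the quoted maximum
```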

Function five

$\pi\times\sqrt{n}$ points for the longest run of $n$ words in a row whose lengths (mod 10) make up the first $n$ digits of $\pi$

$\pi$, unfortunately, has two '1's early on in its value (3.14159...), which was the biggest hurdle I faced here, even if one of them could be an 11 letter word. I managed 3.14, but I'm sure someone will achieve something truly great with this one given the precedent that there is in the world at large.

Maximum score: 49.67
Achieved score: 5.44
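
A minimal sketch of a scorer for this function, assuming the run must start at the leading 3 and that only letters count towards a word's length. The digit string is truncated to 32 digits, so this sketch cannot report runs longer than that.

```python
import re
from math import pi, sqrt

PI_DIGITS = "31415926535897932384626433832795"  # first 32 digits of pi

def longest_pi_run(text):
    """Length of the longest run of words whose lengths (mod 10) spell pi from the start."""
    words = re.findall(r"[A-Za-z]+", text)
    best = 0
    for start in range(len(words)):
        n = 0
        while (start + n < len(words) and n < len(PI_DIGITS)
               and len(words[start + n]) % 10 == int(PI_DIGITS[n])):
            n += 1
        best = max(best, n)
    return best

n = longest_pi_run("there was a live story about loud noise")
print(n, round(pi * sqrt(n), 2))  # 3 5.44
```

In the story, "was a live" supplies the 3, 1, 4; the following word has five letters rather than one, which ends the run.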

Function six

$\ln\left(118\right) \times \sqrt{n}$ points for the longest run of $n$ words in a row that can be made of chemical element symbols (ignoring spaces and punctuation and capitalisation)

The category with technically the highest possible score available, but only at quite the cost to the available vocabulary! Note that you are allowed to ignore spaces, so individual words don't have to be spellable with chemical symbols, just the string. So for example, you would be allowed 'sea life' (SeAlIFe), even though 'sea' cannot be spelled with chemical symbols. Despite this insight, the string I ended up with did just consist of words that could be spelled individually ('is no other choice with life takers that').

Maximum score: 75.43
Achieved score: 13.49
Why 118? There are 118 currently discovered chemical elements
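
One way to test whether a string can be spelled is a small dynamic programme over prefixes, sketched below. It uses the current list of 118 symbols, which is an assumption: the list the judges used may have differed for the most recently named elements. Function names are illustrative.

```python
from math import log, sqrt

# Current symbols for the 118 elements, lowercased (assumed list, see above).
SYMBOLS = set("""h he li be b c n o f ne na mg al si p s cl ar k ca sc ti v cr
mn fe co ni cu zn ga ge as se br kr rb sr y zr nb mo tc ru rh pd ag cd in sn
sb te i xe cs ba la ce pr nd pm sm eu gd tb dy ho er tm yb lu hf ta w re os ir
pt au hg tl pb bi po at rn fr ra ac th pa u np pu am cm bk cf es fm md no lr
rf db sg bh hs mt ds rg cn nh fl mc lv ts og""".split())

def spellable(text):
    """True if the letters of `text` split exactly into element symbols."""
    s = "".join(ch for ch in text.lower() if ch.isalpha())
    ok = [True] + [False] * len(s)  # ok[i]: the first i letters can be spelled
    for i in range(len(s)):
        if ok[i]:
            for k in (1, 2):
                if i + k <= len(s) and s[i:i + k] in SYMBOLS:
                    ok[i + k] = True
    return ok[len(s)]

def longest_element_run(words):
    """Longest run of consecutive words whose concatenation is spellable."""
    best = 0
    for i in range(len(words)):
        # Only try runs longer than the best so far; a failing short run
        # can still be rescued by a longer one ('sea' vs 'sea life').
        for j in range(i + best + 1, len(words) + 1):
            if spellable("".join(words[i:j])):
                best = j - i
    return best

run = longest_element_run("is no other choice with life takers that".split())
print(run, round(log(118) * sqrt(run), 2))  # 8 13.49
```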

Function seven

$3\times\sqrt{n}$ points for the longest run of $n$ words in a row that are in alphabetical order

This function, I reckon, has a pretty big discrepancy between the theoretical maximum and the practical maximum, so I didn't worry about it too much. Entertainingly, my longest run came from one of my London Underground anagrams ("[Training] always bored me. Up")

Maximum score: 47.43 points
Achieved score: 6 points
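
A short sketch of the run counter, assuming case is ignored and that equal neighbouring words do not break the run (the rules as quoted don't say either way):

```python
import re
from math import sqrt

def longest_alphabetical_run(text):
    """Length of the longest run of consecutive words in alphabetical order."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    best = run = 1 if words else 0
    for prev, cur in zip(words, words[1:]):
        run = run + 1 if prev <= cur else 1
        best = max(best, run)
    return best

n = longest_alphabetical_run("Training always bored me. Up until I saw my brother die.")
print(n, 3 * sqrt(n))  # 4 6.0
```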

Function eight

$1.52 \times \sqrt{n}$ points if a string of consecutive words in your story anagrams to the names of two stations on the same London Underground line, the shortest route between which on that line has $n$ stops. Scored up to a limit of three times on different lines.

This felt like it was going to be another great category to score the maximum possible number of points, and I was right. It took a while for me to figure out how to write the code to score this function, but the solution I eventually came up with felt pretty elegant - use open data to create a graph in Python, and then look for the stations with the largest journeys between them on a single line and the number of stops between them. The best pairs of stations were:

District Line - Ealing Broadway to Upminster - 42 stops, worth 9.85 points
Piccadilly Line - Cockfosters to Uxbridge - 41 stops, worth 9.73 points
Central Line - Epping to West Ruislip - 36 stops, worth 9.12 points
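
The graph approach can be sketched with a plain adjacency dictionary and a breadth-first search. The toy line below uses made-up station names rather than real open data, and counting a journey as the number of hops between adjacent stations is an assumption about what "stops" means.

```python
from collections import deque
from math import sqrt

# Hypothetical single line with one branch; not real Underground data.
TOY_LINE = {
    "Alpha": ["Bravo"],
    "Bravo": ["Alpha", "Charlie"],
    "Charlie": ["Bravo", "Delta", "Echo"],
    "Delta": ["Charlie"],
    "Echo": ["Charlie", "Foxtrot"],
    "Foxtrot": ["Echo"],
}

def stops_from(graph, start):
    """Breadth-first search: fewest hops from `start` to every station on the line."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        here = queue.popleft()
        for nxt in graph[here]:
            if nxt not in dist:
                dist[nxt] = dist[here] + 1
                queue.append(nxt)
    return dist

def longest_journey(graph):
    """(stops, start, end) for the pair of stations furthest apart on the line."""
    best = (0, None, None)
    for start in sorted(graph):
        for end, n in stops_from(graph, start).items():
            if n > best[0]:
                best = (n, start, end)
    return best

stops, a, b = longest_journey(TOY_LINE)
print(a, b, stops, round(1.52 * sqrt(stops), 2))  # Alpha Foxtrot 4 3.04
```

Running this once per line, on a graph built only from that line's stations, gives the best candidate pair for each line.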

Unfortunately, for Cockfosters to Uxbridge and Epping to West Ruislip it proved pretty difficult to find anagrams that I was able to use in my story. I therefore took a minor hit with two of these pairs, and ended up with:

District Line - Ealing Broadway to Upminster - 42 stops, worth 9.85 points - "Training always bored me. Up"
Piccadilly Line - Cockfosters to Hillingdon - 40 stops, worth 9.61 points - "old neck. 'Schooling first"
Central Line - Hainault to West Ruislip - 35 stops, worth 8.99 points - "usual, the lit 'win' pairs"
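
Checking that a phrase really is an anagram of two station names comes down to comparing letter counts. A minimal sketch, assuming spaces, punctuation and capitalisation are ignored:

```python
from collections import Counter
from math import sqrt

def letters(text):
    """Multiset of the letters in `text`, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

def is_station_anagram(phrase, station_a, station_b):
    return letters(phrase) == letters(station_a + station_b)

entries = [
    ("Training always bored me. Up", "Ealing Broadway", "Upminster", 42),
    ("old neck. 'Schooling first", "Cockfosters", "Hillingdon", 40),
    ("usual, the lit 'win' pairs", "Hainault", "West Ruislip", 35),
]
for phrase, a, b, stops in entries:
    print(is_station_anagram(phrase, a, b), round(1.52 * sqrt(stops), 2))
```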

Maximum score: 28.7
Achieved score: 28.45
Why 1.52? The London Underground is 152 years old.

Function nine

$n$ points where $n!$ is the greatest factorial dividing the product of the lengths of all sentences in the story.

This category is pretty interesting to think about. You could think of this as needing a two word sentence, then a three word sentence, then a four word sentence, and so on. But to squeeze out the highest number of points, you want sentence lengths that are prime. For example, instead of a fifteen word sentence, you save words by having a three word sentence and a five word sentence. The sum of the prime factors of 27! (counted with multiplicity) is 243, and 28! would need 254 words, so the highest possible score within 250 words is 27. Note that prime-number sentence lengths have to be included at full length - there's no way around it. The concession I made to this was going back and including a 13 word sentence towards the end of my writing, which earned me two extra points, as I had a spare 14 word sentence just lying around, apparently.

Maximum score: 27
Achieved score: 14
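
Both the score and the word budget are easy to check numerically. The sketch below assumes every sentence length is a positive integer; the function names are illustrative.

```python
from math import prod  # Python 3.8+

def factorial_score(sentence_lengths):
    """Greatest n such that n! divides the product of the sentence lengths."""
    product = prod(sentence_lengths)
    n, fact = 0, 1
    while product % (fact * (n + 1)) == 0:
        n += 1
        fact *= n
    return n

def prime_factors(m):
    """Prime factors of m with multiplicity, by trial division."""
    factors = []
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors.append(p)
            m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors

def min_words_for(n):
    """Fewest words whose sentence lengths multiply to a multiple of n!."""
    return sum(sum(prime_factors(k)) for k in range(2, n + 1))

print(factorial_score([2, 3, 4, 5]))          # 5, since 2*3*4*5 = 120 = 5!
print(min_words_for(27), min_words_for(28))   # 243 254
```

Splitting a length into its prime factors never costs more words, because $a + b \le ab$ whenever both are at least 2.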

Function ten

$n$ points for the greatest $n$ such that you use exactly $n$ different $n$-letter words

Another category with diminishing returns. There were 23 valid eleven-letter words, but only 6 twelve-letter words. I stumbled into this one a bit by accident; about 200 words into my story I got to seven different seven-letter words, and decided that I simply wouldn't use any more. I had to make some deference to my natural writing process, but otherwise this came quite easily.

Maximum score: 11 points
Achieved score: 7 points
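
A sketch of the check, assuming words are compared case-insensitively and a score of zero when no length qualifies:

```python
def self_counting_score(words):
    """Greatest n such that exactly n distinct n-letter words are used."""
    by_length = {}
    for word in {w.lower() for w in words}:
        by_length.setdefault(len(word), set()).add(word)
    return max((n for n, ws in by_length.items() if len(ws) == n), default=0)

print(self_counting_score(["a", "to", "of", "To"]))  # 2
print(self_counting_score(["to", "of", "it"]))       # 0
```

The second example shows why stopping at exactly seven seven-letter words mattered: one extra word of that length wipes out the score for it.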

Function eleven

$15 e^{-\ln^2{\frac{n}{9}}}$ points if you don't use any words that score $n$ in scrabble, for the $n$ which makes this biggest

This function peaks at 9, and tails off slowly for large $n$. I just took these as free points, and didn't bother optimising to any extent. Only six points were left on the table with no effort, which seemed pretty reasonable to me.

Maximum score: 15 points
Achieved score: 8.58 points
Why 15?: A scrabble board is 15x15.
Why 9?: I don't know. The average score of a word in the corpus at Simplewriter was over 10. Insight appreciated!
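
The shape of this function is easy to see numerically. The sketch below assumes standard English Scrabble tile values with no board multipliers, and caps the search at an arbitrary `max_n`; both are assumptions rather than anything stated in the rules.

```python
from math import exp, log

# Standard English Scrabble tile values (assumed to be what the rule intends).
TILE_VALUES = {
    **dict.fromkeys("aeilnorstu", 1), **dict.fromkeys("dg", 2),
    **dict.fromkeys("bcmp", 3), **dict.fromkeys("fhvwy", 4), "k": 5,
    **dict.fromkeys("jx", 8), **dict.fromkeys("qz", 10),
}

def word_score(word):
    return sum(TILE_VALUES.get(ch, 0) for ch in word.lower())

def points_for_missing(n):
    """15 * exp(-(ln(n/9))^2): the reward for using no word that scores n."""
    return 15 * exp(-log(n / 9) ** 2)

def scrabble_function(words, max_n=60):
    used = {word_score(w) for w in words}
    return max(points_for_missing(n) for n in range(1, max_n + 1) if n not in used)

for n in (4, 9, 19, 40):
    print(n, round(points_for_missing(n), 2))
```

Evaluating the formula at $n = 19$ gives 8.58, matching the achieved score above.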

Function twelve

$\frac{(n-13)^2}{13}$ points if you use exactly $n$ of the letters of the alphabet an odd number of times

The final fiddle, in my case at least. I created a list of letters I had used an odd number of times, and then looked for valid adverbs and adjectives that I could add, made up of those letters plus even numbers of any other letters. I then added these to (parts of) sentences that didn't affect the earlier points scores.

Maximum score: 13 points
Achieved score: 13 points
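
A minimal sketch of the bookkeeping involved, assuming only the 26 unaccented letters count and case is ignored:

```python
from collections import Counter

def odd_letters(text):
    """Sorted list of letters that appear an odd number of times."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return sorted(ch for ch, c in counts.items() if c % 2 == 1)

def odd_letter_score(text):
    n = len(odd_letters(text))
    return (n - 13) ** 2 / 13

print(odd_letters("Judge it kindly"))         # letters still needing a partner
print(odd_letter_score("aa bb cc"))           # 13.0, every letter used an even number of times
```

The maximum of 13 points is reached at either extreme: all 26 letters odd, or none of them.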

There was also a final possible category, but it was only worth 0 points - but 100 style points - for having your story accepted in any periodical carried by newsstands. I opted not to worry about this one!


So how do I think my story scored? Out of a rough maximum of 437.68 (which is, of course, impossible - you could not score maximum points in both the acrostic category and the factorial category, for example), I think my story scores at least 160, which I think is pretty respectable. Of course, the other people entering this competition were a self-selecting crowd of XKCD readers that I fully expected to do something incredible. I wasn't even in the top ten! I'm looking forward to reading the winning entries and seeing how they scored - the company that ran the competition is being weirdly coy about the winning scores for now, for some reason, but I'm sure they'll appear sooner or later.

My story, breaking the English language to its very limit in order to score points, is below. Judge it kindly. I'm also not entirely sure how it ended up as essentially X-Men fanfiction, but here we are...

The first year after she got her wings were the worst. "Training always bored me. Up until I saw my brother die. Knowing I would be able to kill the ten men who did it kept me going."

The man with no hair in a chair with wheels turned his old neck. "Schooling first," he said. "Once you have finished that, you can take their life, if you so choose. Students always want something. In time, it may not be killing you want."

A man with short hair on his cheeks and too sharp metal hidden in his arms was busier, stood at the game table that used bars to hit the ball. As usual, the lit "win" pairs of lights meant he was going to get an all time high number of points. "Kill them now", he shouted, "There is no other choice with life takers that kill those close to us."

She rose, nodded and took him by the arm. "Then we go now." They walked out of the door together, as the old man sat, suddenly sad.

Later that day, as the old man watched the news, there was a live story about loud noise and big fire on a piece of land flying in the sky. "No people still alive," the news said.

He would never let anything close to this happen again under his watch. A single tear rolled down his cheek, as he thought about lost friends, crying.

