Wednesday, July 2, 2014

Don't panic Mr Mannering!

If you get the title reference you're probably as old as I am (and also English).  Anyway, there are 20 hours to go until the first qualifiers for the 2014 Worlds, so it's time for the sky to cave in!

First, we've been erroring in every game of 'Sheep & Wolf' on Tiltyard for the past 24 hours or so, so I must have broken something (or Andrew did, but it was probably me - it usually is!).  It was bound to be an obvious bug, right?  I mean, it fails immediately during meta-gaming - how hard could that be to find?

Turns out quite hard.  It was down to a recent 'improvement' to our propnet factory, intended to aggressively trim input propositions (and any logic dangling from them) for always-illegal moves.  This was necessary (well, convenient anyway) for some puzzle analysis that I've been working on for the last couple of weeks (to address a class of games that boil down to marking cells, such as Sudoku).  At some point (possibly not until after the Worlds) I'll be writing a post on the extra puzzle-solving analysis we do now (or will be doing shortly, as some of it is not quite finished), which allows us to handle factorized puzzles (dual Hamilton, multiple whatever, etc.), sparse-goal-state puzzles (8-puzzle etc.), and puzzles with a partitionable decision space (Sudoku etc.).  Anyway, for some reason I still do not understand (I will have to come back to this later), the changes caused a few games (notably Sheep & Wolf) to fail by never terminating, i.e. they broke the terminal network in some obscure fashion I have not managed to figure out yet.  For now I've just backed out the 'improvement' concerned!

Ok, so everything must be fine now surely?

Now to make sure that we pass the 'player tester' that they will be using to filter entry into tomorrow's qualifiers.  How hard can that be?

Hmmm

Harder than you might expect!  We fail BOTH test games.   Aaaaarghhhhhh!

I blame the games of course!  The first issue was that they only give 5 seconds of meta-gaming time, and we do rather a lot of serial analysis during meta-gaming.  When allocating how much time we are prepared to spend on each piece, we assume that we should leave at least (you guessed it) 5 seconds free at the end of the meta-gaming time to allow for network latency, garbage collection, lightning strikes, nuclear attack, and so on.  Unfortunately, carving a 5-second safety margin out of a 5-second total leaves essentially no analysis time at all, and as a result we totally skipped some analysis without which the actual attempt to play even a trivial game utterly fails!

Ho hum.  Modified the code to always give us 1 second regardless of whether that leaves a decent buffer or not.
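To make the arithmetic concrete, here is a minimal sketch of that budgeting rule before and after the change.  It is written in Python purely for illustration, the function and constant names are made up rather than taken from our code, and it assumes the 1 second acts as a floor on the analysis budget.

```python
# Hypothetical names; only the 5-second margin and 1-second floor come from the post.
SAFETY_BUFFER_S = 5.0  # time kept free at the end of meta-gaming
MIN_ANALYSIS_S = 1.0   # analysis time granted regardless of the margin


def old_analysis_budget(metagame_time_s, buffer_s=SAFETY_BUFFER_S):
    """Original rule: whatever is left after reserving the safety margin."""
    return max(metagame_time_s - buffer_s, 0.0)


def analysis_budget(metagame_time_s, buffer_s=SAFETY_BUFFER_S,
                    floor_s=MIN_ANALYSIS_S):
    """Patched rule: same as before, but never less than the floor."""
    return max(metagame_time_s - buffer_s, floor_s)


if __name__ == "__main__":
    for total in (5.0, 30.0):
        print(total, old_analysis_budget(total), analysis_budget(total))
```

With the player tester's 5-second meta-gaming clock the old rule leaves nothing to work with, while the patched rule eats into the margin to get its 1 second.  With a more generous clock the two rules agree.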

Game 1 passed.

In game 2 we seem to have no legal moves.  Maybe we cannot cope with such complex GDL.  Hmm.  This GDL to be precise:

( role white )
( base ( count 1 ) )
( base ( count 2 ) )
( base ( count 3 ) )
( base ( l 1 ) )
( base ( l 2 ) )
( input white move1 )
( input white move2 )
( init ( count 1 ) )
( init ( l 1 ) )
( init ( l 2 ) )
( <= ( legal white move1 ) ( true ( l 1 ) ) )
( <= ( legal white move2 ) ( true ( l 2 ) ) )
( <= ( next ( l 1 ) ) ( does white move1 ) )
( <= ( next ( l 2 ) ) ( does white move2 ) )
( <= ( next ( count 2 ) ) ( true ( count 1 ) ) )
( <= ( next ( count 3 ) ) ( true ( count 2 ) ) )
( goal white 0 )
( <= terminal ( true ( count 3 ) ) )

Nope.  Not overly complex.  Turns out we are optimizing a bit too aggressively in propnet generation.  Basically we throw away stuff that is irrelevant to our goal outcome in puzzles, which in this case is...the entire game!  A little over-zealous maybe.

Skipped the optimization for games where the goal is a fixed value regardless of how it plays out.
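For the curious, the test game is small enough to hand-translate and enumerate.  The sketch below is Python for illustration only: it is not our propnet code, all the function names are invented, and the real check presumably works on the rules themselves rather than by brute-force search.  It shows that white has legal moves in every non-terminal state, and that the only reachable goal value is 0, which is exactly the situation where goal-relevance pruning has nothing left to keep.

```python
# Hand translation of the GDL above; a state is a frozenset of true base propositions.
INIT = frozenset({("count", 1), ("l", 1), ("l", 2)})


def legal_moves(state):
    """(legal white moveN) holds when (l N) is true."""
    return [f"move{n}" for n in (1, 2) if ("l", n) in state]


def next_state(state, move):
    """Apply the four 'next' rules."""
    nxt = set()
    if move == "move1":
        nxt.add(("l", 1))
    if move == "move2":
        nxt.add(("l", 2))
    if ("count", 1) in state:
        nxt.add(("count", 2))
    if ("count", 2) in state:
        nxt.add(("count", 3))
    return frozenset(nxt)


def is_terminal(state):
    return ("count", 3) in state


def goal(state):
    """(goal white 0) is unconditional."""
    return 0


def reachable_goal_values(state=INIT):
    """Collect the goal value of every terminal state reachable from `state`."""
    if is_terminal(state):
        return {goal(state)}
    values = set()
    for move in legal_moves(state):
        values |= reachable_goal_values(next_state(state, move))
    return values


def should_prune_goal_irrelevant_logic(goal_values):
    """Hypothetical guard: only prune when the outcome can actually vary."""
    return len(goal_values) > 1


if __name__ == "__main__":
    print(legal_moves(INIT))
    print(reachable_goal_values())
    print(should_prune_goal_irrelevant_logic(reachable_goal_values()))
```

Note that the game ends after two moves whatever white does, because the counter advances independently of the move chosen.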

Yay, second game passed.  Player tester thinks we're A-Ok!

Time to regression test all the puzzles we know about, starting with the Stanford repository.  After a slight glitch (this is when I actually discovered that backing out the OPNF 'improvement' had broken our partitioning analysis, and left us looking about as effective as Random in Sudoku), they all pass with maximum achievable score again.  Well, almost.  Only 99 on 8-puzzle.  I'll sort that out at the weekend (I just need to finish off some work on a different puzzle-solving approach, which I'm confident will at least get full marks on 8-puzzle reliably once done).
