CodeBetter.Com
CodeBetter.Com
RSS 2.0 via Feedburner
           Do you Twitter? Follow us @CodeBetter

Greg Young [MVP]

September 2006 - Posts

  • PC2:Analysis "Deja Vu"

    Some people got this optimization, others did not. This is the single most important optimization and is the focus of nearly all optimizations when dealing with larger boards.

    If we started with a brute force searcher we can speed it up by orders of magnitude (1000-10000 times speed up is quite normal). If we can make it remember the positions it has seen in the past. The best way to illustrate this is to get out some pennies and place them on a paper board size 5 (or if you were a winner you can do this with your actual cracker barrel board). Place all pegs in then take out the pegs at holes 1 and 15 (top and bottom right). If you are missing a size 5 board here is a picture so you can at least try to mentally follow along.

    Before we get going with this I am going to introduce a notation that we will use from now on for moves. The notation will be starthole-jump-endinghole … As an example 14-9-5 says to move the peg in hole 14 to hole 5 jumping (and removing the peg in hole 9).

    So now that we have our board setup let’s look at how we can end up in the same place. If we were to move 13-14-15 jumping into the bottom right corner and we then moved 4-2-1 jumping into the top corner we would be left in a positions like this …

    then

    then we end up here ..

    If however we were to reverse things from our starting board and move 4-2-1 into the top hole, then move 13-14-15 into the bottom right corner we would be left with the same exact board. By remembering that we have previously seen this board and that it was unsolvable we can prune our tree very efficiently.

    then

    then we end up at the same place

     

    This optimization applies to nearly all searching problems but this one it is especially good for as it has a very high rate of repetition… To see just how high this rate of repetition is I used the problem set of all size 5 boards with 1 peg missing. I have added a counter to my entry as follows to keep track of how many previously found boards it was able to use as cutoffs ..  To do this I added a public long cutoff to the UInt32Board class and in my Search(int) method I added the following line of code.

                public long cutoff = 0;

                if (found > 0) {

                    cutoff++;

                    return false;

                }

    Listing 1: changes to track cutoffs

    To see the cutoffs for yourself simply access this variable when you are done with the test to make things easier, you can just set a break point while it is generating the tests as it will be run for those boards already to determine if they are or are solvable. To give you an idea … in solving the first 15 positions it received 1508 cutoffs, keep in mind that these cutoffs are often quite high in the tree resulting in it not having to evaluate sometime tens of thousands of positions for each cutoff. To give you an idea of the difference, if I remove this optimization from my board my speed drop dramatically.

    To illustrate the speed drop I will deal with a single board loaded using the following code.

            static void SimpleTest() {

                int[] board = new int[28];

                for (int i = 0; i < 28; i++) {

                    if (i != 12) {

                        boardIdea = 1;

                    }

                }

                TestSimpleBoard(board);

            }

            static void TestSimpleBoard(int [] board) {

                Young.UInt32Board b = new Young.UInt32Board(board);

                b.SetStartingPosition(board);

                Stopwatch s = new Stopwatch();

                s.Start();

                Console.WriteLine(b.Search());

                s.Stop();

                Console.WriteLine(s.Elapsed);

            }

    Listing 2: Code for benchmark

    With remembering previously seen positions my engine will finish this test in about 20 seconds (this problem is quite tough as it passes a parity check but is unsolvable). When I disable my previous board check, the code slows down dramatically. To run this test for yourself you can either comment out the code in my search routine building and checking previous positions or you can just comment out the “return false” near the top.

    When we run my version without the remembering of previous positions we get a much different measurement. On my machine without the cutoffs it took over an hour before I killed it and it still was not done… why did it take this long? Well the version that remembered its previous positions benefited from a huge number of cutoffs while going through its search (79,150,041 to be exact which equates to around 1,000,000,000 leaf nodes it never had to search!).

    As we can see from these basic tests remembering what we have seen and keeping ourselves from going down the same road makes a huge difference in how quickly we can process these puzzles. For those who were having a hard time beating brute force, this is the key in most problems. Of course when we get to larger puzzles we can’t always keep all of the positions we have ever seen so we need to get into some smarter algorithms.

    How you store your previous positions is also paramount. We will discuss here for smaller boards and leave how to do it on larger boards for another post. A few people used what I considerred to be the best method which was just to store an array of bits (either as bytes or as UInt32s). You can then keep a single bit for every given position. For the largest puzzle this will work well on (a size 7 board which has 28 pegs) this will require a reasonable amount of memory 2^(28-3)  or 2^25 bytes. This seems like a very big number but its not actually that bad, its only 33Mb of memory give or take. Storing a position and looking up a position are both assured to be O(1) operations as you have a completely unique key (the board itself).

    Others experimented with other methods of storage, an interesting one was Wil who used a binary tree to store this data. By using a binary tree you may in many circumstances end up using less memory as you are only storing the positions you know to be solved (as opposed to every possible position). This will quickly fall to being more memory in cases where you have big problems though. The other issue you may run into with a binary tree is that it is O(log n) in terms of search speed. This can quickly add up if you are getting alot of misses as they will be your worst case.

    One thing which should be noted about the storage of previously saved positions is that they can be shared by any board instances that are dealing with the same size board. It doesn't matter what your starting position is, the same stored positions still apply. For this reason I created a small manager class to deal with these saved positions. In fact these saved positions when done in a table based format (array of bits) are even completely thread safe without locking. The worst thing that can possibly happen is that you lose a saved position in the case that 2 threads are writing the same are of the array at the same time. Since this is only being used to help cutoff search paths this is not a major concern as it will only cost you a cutoff .. it will not in any way affect the correctness of the search.

    We will discuss this more in later posts dealing with making our searching happen in parallel. Before we get there though we will look at some further algorithmic optimizations that we can use for all boards, the next post “Mirror Mirror” will discuss  a major algorithmic optimization that a few people implemented.

  • Performance Contest #2 Results

    Sorry for the delay in getting results posted, the testing ran a bit longer than I expected and I did not have much time following the weekend I had set aside (last weekend I was speaking at a code camp, and this weekend I was camping/kayaking with my girlfriend).

    I ran into a few problems generating the results for the submissions, mainly because none of them could pass my original tests. I try to write my tests before I receive any of the submissions because the method in which I write the tests can favor certain submissions over others. Due to this I was forced to rewrite all of my tests, my initial tests were done mainly on boards that were over size eight or larger.

    There were two reasons why various submissions did not work well on larger problems. The largest was that they only used a brute force based search; the problem space grows exponentially so this quickly becomes a limiting factor. A brute force search could easily take one-hundred years to run on a board size of fifty-five.

    The second limiting factor was people not capping their memory usage; this would quickly become an issue on larger boards as the number of possible positions grows exponentially. They would run out of memory and begin thrashing to disk. For extremely large puzzles you will actually be better off using disk but we will discuss this more later…

    In the next ten or so analysis posts (roughly one a day) we will look more in depth at how we can help get around these two issues and allow our code to run efficiently on larger boards. It is important to note that this problem is NP Complete, whatever optimizations we add will eventually reach a failure point. As an example many used tables in varying forms to remember the positions they had seen in the past but at some point this optimization will fail due to not being able to save all previous positions any more (perhaps it’s a two GB limit for memory or a two TB limit if you are storing on disk either way you will eventually hit a board that you cannot possibly hold every previous position for). We will however continue this discussion in the analysis posts for now let’s get to the results.

    Testing Methodology

    To summarize the tests, all were written to operate upon leaf nodes in an n depth search. For instance a board size of five with a search depth of two would generate every possible board position on a board with 15 holes and 13 pegs. This makes for ease of generating large numbers and tends to help find algorithmic differences as it covers every possible position (and as we will see in analysis helps show one specific optimization).

    Size

    Depth

    Total Positions

    5

    1

    15

    6

    1

    21

    7

    1

    28

    5

    2

    210

    6

    2

    420

    7

    2

    756

    5

    3

    2,730

    6

    3

    7,980

    7

    3

    19,656

    5

    4

    32,760

    5

    5

    360,360

    As I have mentioned, the testing was a bit difficult since the engines were not able to run on the larger puzzles due to this not all submissions were run on the 7 size tests as some submissions would have taken a very long time to run on this test set (on the order of hours to weeks).

    The Results

    15-1

    15-2

    15-3

    15-4

    15-5

    21-1

    21-2

    21-3

    28-1

    28-2

    28-3

    Total

    Errors

    Bowen

    0.00262

    0.53161

    0.53883

    4.523

    37.5729

    0.000206

    76.70967

    10.46

    DNR

    DNR

    DNR

    130.3388

    Yes

    Brooks

    0.00947

    0.02706

    0.04794

    0.38668

    3.26843

    0.01962

    5.29703

    0.96051

    DNR

    DNR

    DNR

    10.01674

    No

    Bushman

    0.01389

    0.31435

    2.14839

    14.05154

    86.34329

    0.00774

    150.8724

    1650.721

    DNR

    DNR

    DNR

    1904.472

    No

    Laan

    0.01272

    0.01865

    0.019807

    0.11418

    2.51365

    0.0078

    0.02761

    0.33106

    2.53936

    8.97131

    32.12311

    46.67926

    No

    Rasmus

    0.0073

    0.0022

    0.02157

    0.25159

    2.282763

    0.00936

    0.0224

    0.12136

    0.1843

    1.67424

    4.23623

    8.813313

    No

    Rustad

    0.04127

    0.42313

    3.17082

    21.40802

    125.3042

    0.00408

    197.6765

    2023.164

    DNR

    DNR

    DNR

    2371.192

    No

    Young

    0.00026

    0.0081

    0.00895

    0.09986

    0.98724

    0.00047

    0.002469

    0.05961

    0.06225

    0.52418

    1.80063

    3.554019

    No

    Rasmus will receive a copy of Michael Abrash's "Zen of Code Optimization" and the managed bonus of "Algorithms" by Sedgewick for coming in with the fastest entry. Wil who won second place will receive a copy of George Polya's classic work "How To Solve It".

    Let me the first to congratulate our winners Rasmus Faber-Espensen  and Wil Laan for their great work on this problem.

    Starting this evening we will delve deep into analysis of this problem.

  • Interface Re-Implementation Bug in 2.0 Runtime

    Came across an interesting bug the other day in the MS newsgroups, it appears to be a bug in the 2.0.50727.42 release of the runtime. If this issue is affecting you, you can either use the updated 2.0.50727.97 revision of the runtime which seems to have this issue resolved or you can avoid the situation as I will show below.

    The following C# code will illustrate the issue.

        using System;

        namespace InterfaceMapTest {

            class MyApp {

                static void Main(string[] args) {

                    MyControl obj = new MyControl();

                    obj.Paint();

                    ((IControl)obj).Paint();

                    IControl iControl = (IControl) obj;

                    iControl.Paint();

                }

            }

            public interface IControl {

                void Paint();

            }

            public class Control : IControl {

                public virtual void Paint() {

                    Console.WriteLine("Control.Paint");

                }

                 void IControl.Paint() {

                    Console.WriteLine("Control.IControl.Paint");

                }

            }

            public class MyControl : Control, IControl {

            }

        }

    Listing 1: Code shows bug in 2.0.50727.42 version of runtime

    C:\>C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\csc.exe foo.cs
    Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.42

    for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
    Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

    C:\>foo
    Control.Paint
    Control.Paint
    Control.Paint

    This code should print Control.IControl.Paint the second two calls. This worked properly in version 1.x of the framework (and in later revisions of 2.0).  At first I thought the C# compiler might be getting a bit confused but after a quick look with reflector I came received the following IL.

         .entrypoint
         .maxstack 1
         .locals init (
               [0] ConsoleApplication21.InterfaceMapTest.MyControl control1,
               [1] ConsoleApplication21.InterfaceMapTest.IControl control2)
         L_0000: nop
         L_0001: newobj instance void
    ConsoleApplication21.InterfaceMapTest.MyControl::.ctor()
         L_0006: stloc.0
         L_0007: ldloc.0
         L_0008: callvirt instance void
    ConsoleApplication21.InterfaceMapTest.Control::Paint()
         L_000d: nop
         L_000e: ldloc.0
         L_000f: callvirt instance void
    ConsoleApplication21.InterfaceMapTest.IControl::Paint()
         L_0014: nop
         L_0015: ldloc.0
         L_0016: stloc.1
         L_0017: ldloc.1
         L_0018: callvirt instance void
    ConsoleApplication21.InterfaceMapTest.IControl::Paint()
         L_001d: nop
         L_001e: ret

    Listing 2: IL of main method

    Which is correct, the ILASM’ed version of the same IL when run in 2.0.50727.97 will work properly (thanks to Bart De Smet for verifying this). Basically what it happening is that the runtime is getting confused by the re-implementation of the interface on the derived class. It only seems to happen when the same signature method on the base is virtual as well. If you run into this problem you may be able to not re-implement the interface on the derived class and the issue will go away. As this item has already been fixed I guess it just needs to be released so there is no feedback you can click through to vote etc.

More Posts