Saturday, June 02, 2012

Sparse Matrix Multiplication

I want the Math.NET Numerics developers to know their work is great: they have put together an easy to use, astoundingly well documented numerical library for .NET. Please know this little criticism comes from a place of respect. It could even be that the code has been updated since your last release and what I'm going to point out is no longer a problem.

I really don't know much about calculus and mathematics at that level. I barely passed A-level maths, and the only time I've used any of the knowledge gained therein was when I had to calculate the first derivative of 1 - e^(-x) at university. My mathematics skills are weak (sadly). So, when, in mid-April, I was asked at work to implement some maths heavy algorithms, I felt suitably challenged. Thankfully the scientist who was feeding me the algorithms understood them really well and was on hand to explain things to me over and over again until we finally got things working yesterday. Yay!

Some of what we did relied on sparse matrices, something I had heard of, but never used. So my first thought was that I needed a third party library to do these calculations. The library we are currently using is the bluebit .NET matrix library; it's not perfect and we'll have to replace it with something faster, but for the moment it makes the code testable. This library was not my first choice, though; ideally I wanted something we didn't have to pay for. My first stop was the Math.NET Numerics library. This, unfortunately, proved to be too slow. I also tried out Extreme Optimization, but this library was also too slow. Other libraries I looked at were ILNumerics, IMSL.NET and Center Space NMath. I looked but I did not test these last three, because each library's API and help were so bad I couldn't figure out how to do what I needed to do. I don't have time to figure out matrix maths; this is why I'm looking for a library. If you want me to choose yours, make it easy to use.

So that was the bulk of the outcome of my foray into numerical libraries. Bluebit is my current choice, but I will have to change it for something faster. This is not the only thing I learned, though. I learned something that I hope, if they haven't already, the Math.NET developers will be able to use in their code. I don't have time to dive into the project and patch it myself — as I've said, my understanding of the maths is not great — so feel free to take the code here and fix it to work in the library.

At work I'm dealing with quite large matrices. The stuff I've been testing with is 8K x 8K points, and the real data will probably be up to 32K x 32K. But these are sparse matrices, so working with them should not be too processor and memory intensive. The major things I need to do are transposition, multiplication and inversion. Inversion is the killer, and understanding it is currently over my head. It's the place where Extreme Optimization fell down, and where bluebit struggles. I need the algorithms to run in a few seconds. Currently, with 16K x 16K points and bluebit, it's taking 2 minutes. The algorithm did not complete with the other two libraries. I waited for over half an hour, and still nothing, and that was with 8K data.

The first problem that Math.NET encountered was with the multiplication of the matrices. This is what I hope I've optimised. All I've done is profile their code and change the bit that took forever: assigning data to a point in the matrix.

My first step was to write these two tests, to make sure I was multiplying the matrices correctly:

[Test]
public void MatrixMultiplication()
{
    var leftM = new double[,] {{4, 5, 6, 7, 8, 1, 2}, {3, 9, 6, 7, 3, 3, 1}, {2, 2, 8, 4, 1, 8, 1}, {1, 9, 9, 4, 3, 1, 2}};
    var rightM = new double[,] {{1, 8, 1}, {2, 6, 2}, {3, 4, 1}, {4, 2, 2}, {5, 1, 1}, {6, 3, 2}, {7, 5, 1}};
    var expectedM = new double[,] {{120, 121, 46}, {107, 133, 51}, {106, 98, 40}, {97, 122, 43}};

    var sm = new SparseMatrix();

    var resultM = sm.MultiplyMatrices(leftM, rightM);

    Assert.AreEqual(expectedM.Rank, resultM.Rank);
    Assert.AreEqual(expectedM.GetLength(0), resultM.GetLength(0));
    Assert.AreEqual(expectedM.GetLength(1), resultM.GetLength(1));

    for(int row = 0; row < 4; row++)
    {
        for(int col = 0; col < 3; col++)
        {
            Assert.AreEqual(expectedM[row, col], resultM[row, col]);
        }
    }
}

[Test]
public void SparseMatrixMultiplication()
{
    var leftM = new double[,] {{1,2,3,0,0,0,0,0,0,0}, {0,0,0,0,0,1,2,0,0,0}, {1,0,4,0,0,5,0,0,0,0}, {0,4,0,5,0,6,0,0,7,0}, {9,0,0,0,0,0,8,0,0,0}};
    var rightM = new double[,] {{0,2,0,4,0}, {1,0,0,1,1}, {3,0,1,3,0}, {4,0,0,0,0}, {0,5,6,0,0}, {0,9,0,6,0}, {0,1,0,3,0}, {0,0,8,0,9}, {0,0,0,0,7}, {0,1,0,0,5}};
    var expectedM = new double[,] {{11,2,3,15,2}, {0,11,0,12,0}, {12,47,4,46,0}, {24,54,0,40,53}, {0,26,0,60,0}};
 
    var sm = new SparseMatrix();

    var resultM = sm.MultiplyMatrices(leftM, rightM);

    for (int row = 0; row < 5; row++)
    {
        for (int col = 0; col < 5; col++)
        {
            Assert.AreEqual(expectedM[row, col], resultM[row, col]);
        }
    }
}

(SparseMatrix isn't really the name of the class, I put the multiplication into the class that was handling the algorithm, but I'm not allowed to talk about that!)

Then I spent ages struggling (because of my ignorance, the code is easy to read) with the Math.NET code to try and understand sparse matrix multiplication - how it could be faster than normal matrix multiplication, and how I could implement it faster. It took a couple of days. I spent a couple of days, rather than giving up and finding a proprietary library right away, because I thought that Math.NET would do the business when it came to inversion. Sadly this isn't the case. Anyway, this is my optimised sparse matrix multiplication method:

private IEnumerable<int> GetNonZeroIndicesForMatrixColumn(double[,] matrix, long col, int rowcount)
{
    for (int row = 0; row < rowcount; row++)
    {
        if (matrix[row, col] != 0)
        {
            yield return row;
        }
    }
}

private IEnumerable<int> GetNonZeroIndicesForMatrixRow(double[,] matrix, int row, int colcount)
{
    for (int col = 0; col < colcount; col++)
    {
        if (matrix[row, col] != 0)
        {
            yield return col;
        }
    }
}
        
/// <summary>
/// Matrix multiplication optimised for sparse matrices
/// </summary>
/// <param name="matrix1">Matrix on the left of the multiplication</param>
/// <param name="matrix2">Matrix on the right of the multiplication</param>
/// <returns>A matrix that is the multiplication of the two passed in</returns>
public double[,] MultiplyMatrices(double[,] matrix1, double[,] matrix2)
{
    int j = matrix1.GetLength(1);
    if (j != matrix2.GetLength(0))
    {
        throw new ArgumentException("matrix1 must have the same number of columns as matrix2 has rows.");
    }

    int m1Rows = matrix1.GetLength(0);
    int m2Cols = matrix2.GetLength(1);
    double[,] result = new double[m1Rows, m2Cols];

    var nonZeroRows = new List<int>[m1Rows];

    Parallel.For(0, m1Rows, row =>
    {
        nonZeroRows[row] = GetNonZeroIndicesForMatrixRow(matrix1, row, j).ToList();
    });

    var nonZeroColumns = new List<int>[m2Cols];

    Parallel.For(0, m2Cols, col =>
    {
        nonZeroColumns[col] = GetNonZeroIndicesForMatrixColumn(matrix2, col, j).ToList();
    });



    Parallel.For(0, m1Rows, row =>
    {
        Parallel.For(0, m2Cols, column =>
        {
            var ns = nonZeroColumns[column].Intersect(nonZeroRows[row]);
            double sum = ns.Sum(n => matrix1[row, n] * matrix2[n, column]);
            result[row, column] = sum;
        });
    });

    return result;
}

As you can see, there is a lot of reliance on the parallel methods that come with .NET 4. That, coupled with the trick of intersecting the non-zero indices of the left matrix's rows with those of the right matrix's columns, seems to be the major advantage of my method over Math.NET's, whose assignments can't be done in parallel. That could be down to Silverlight compatibility issues - I don't know. I don't have to worry about Silverlight.

I have run a benchmark for my code. I created two 5000 x 5000 matrices and filled them at random points with random data (well, pseudo-random). I benchmarked at 5, 50, 150 and 500 non-zero items per row, and ran the test 10 times for each to get a mean. The table shows the results:

Number of non-zeros per row    Mean seconds taken to multiply    Standard Deviation
5                              6.24465716                        0.1037383251
50                             51.10972332                       0.8521258197
150                            93.29733629                       77.751344564
500                            13.18435411                       6.4991175895
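
The means and standard deviations were worked out from the raw timings. Something like the following snippet does that aggregation (a quick sketch, not part of the original benchmark; it assumes the "non-zeros,seconds" lines from all ten runs have been gathered into one results file):

using System;
using System.IO;
using System.Linq;

// Quick sketch (not part of the original benchmark): turn the gathered CSV lines
// into a mean and sample standard deviation per non-zero count. Assumes the
// "non-zeros,seconds" lines from all ten runs have been collected into one file.
class ResultsSummary
{
    static void Main()
    {
        var rows = File.ReadAllLines("MyMX10.results")
                       .Where(line => line.Length > 0 && char.IsDigit(line[0])) // skip header lines
                       .Select(line => line.Split(','))
                       .Select(parts => new { NonZeros = int.Parse(parts[0]), Seconds = double.Parse(parts[1]) });

        foreach (var group in rows.GroupBy(r => r.NonZeros).OrderBy(g => g.Key))
        {
            double mean = group.Average(r => r.Seconds);
            double variance = group.Sum(r => (r.Seconds - mean) * (r.Seconds - mean)) / (group.Count() - 1);
            Console.WriteLine("{0}: mean {1:F4}s, stddev {2:F4}s", group.Key, mean, Math.Sqrt(variance));
        }
    }
}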

I find it strange that the standard deviation for the 150 condition is so high. If anyone can see a problem in my code, I'd be really happy to hear it! The full test is below:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

namespace Math.NetBenchmark
{
    class Program
    {
        private static Random _r = new Random();

        static void Main(string[] args)
        {
            const int rows = 5000;
            const int cols = 5000;
            var nonzerosPerRow = new [] {5, 50, 150, 500};

            Console.WriteLine("started");
            using (var sw = new StreamWriter("MyMX10.results"))
            {
                sw.WriteLine("Number of non-zeros,Time taken");
                foreach (var nzpr in nonzerosPerRow)
                {
                    Console.Write(nzpr+" - making left");
                    var left = MakeMatrix(rows, cols, nzpr);
                    Console.Write("making right");
                    var right = MakeMatrix(rows, cols, nzpr);
                    Console.Write("multiplying...");
                    var startTime = DateTime.Now;
                    MultiplyMatrices(left, right);
                    var endTime = DateTime.Now;
                    var diff = endTime - startTime;
                    sw.WriteLine(nzpr + "," + diff.TotalSeconds);
                    Console.WriteLine("done");
                }
            }

            Console.WriteLine("done");
        }

        private static double[,] MakeMatrix(int rows, int cols, int nonzerosPerRow)
        {
            var result = new double[rows, cols];
            var colsPoss = Enumerable.Range(0, cols).ToArray();
            Parallel.For(0, rows, iRow =>
            {
                // Random isn't thread-safe, so give each row its own instance, seeded from the shared one.
                Random rowRandom;
                lock (_r)
                {
                    rowRandom = new Random(_r.Next());
                }
                var posleft = colsPoss;
                Console.Write(".");
                for (int i = 0; i < nonzerosPerRow; i++)
                {
                    int posindex = rowRandom.Next(posleft.Length);
                    int index = posleft[posindex];
                    result[iRow, index] = 1 + rowRandom.NextDouble();
                    // Drop the chosen position so the same column isn't picked twice for this row.
                    posleft = posleft.Take(posindex).Concat(posleft.Skip(posindex + 1)).ToArray();
                }
            });
            return result;
        }

        private static IEnumerable<int> GetNonZeroIndicesForMatrixColumn(double[,] matrix, long col, int rowcount)
        {
            for (int row = 0; row < rowcount; row++)
            {
                if (matrix[row, col] != 0)
                {
                    yield return row;
                }
            }
        }

        private static IEnumerable<int> GetNonZeroIndicesForMatrixRow(double[,] matrix, int row, int colcount)
        {
            for (int col = 0; col < colcount; col++)
            {
                if (matrix[row, col] != 0)
                {
                    yield return col;
                }
            }
        }

        public static double[,] MultiplyMatrices(double[,] matrix1, double[,] matrix2)
        {
            int j = matrix1.GetLength(1);
            if (j != matrix2.GetLength(0))
            {
                throw new ArgumentException("matrix1 must have the same number of columns as matrix2 has rows.");
            }

            int m1Rows = matrix1.GetLength(0);
            int m2Cols = matrix2.GetLength(1);
            double[,] result = new double[m1Rows, m2Cols];

            var nonZeroRows = new List<int>[m1Rows];

            Parallel.For(0, m1Rows, row =>
            {
                nonZeroRows[row] = GetNonZeroIndicesForMatrixRow(matrix1, row, j).ToList();
            });

            var nonZeroColumns = new List<int>[m2Cols];

            Parallel.For(0, m2Cols, col =>
            {
                nonZeroColumns[col] = GetNonZeroIndicesForMatrixColumn(matrix2, col, j).ToList();
            });

            Parallel.For(0, m1Rows, row =>
            {
                Parallel.For(0, m2Cols, column =>
                {
                    var ns = nonZeroColumns[column].Intersect(nonZeroRows[row]);
                    double sum = ns.Sum(n => matrix1[row, n] * matrix2[n, column]);
                    result[row, column] = sum;
                });
            });

            return result;
        }
    }
}

Friday, April 27, 2012

Object Thinking - Anthropomorphism

This follows on from Object Thinking - Objects have actions

Anthropomorphism is essential for object thinking to take place. Anthropomorphism is when a person attributes human mental states to other, non-human, things. Attributing human-like mental states to objects allows a programmer to treat the object as an agent, as opposed to something inanimate, and so bestow upon it appropriate behaviours, allowing it to act in an appropriate manner within the application, interacting with other objects. The amount of responsibility that you want an object to have will reflect how much you anthropomorphise it. It is important not to give an object too much responsibility, as explained by the Single Responsibility Principle.

That Anthropomorphism occurs is so obvious it doesn't need investigating! So obviously it has been researched by a huge number of people. The paper being looked at here, Making Sense by Making Sentient: Effectance Motivation Increases Anthropomorphism, by Waytz et al. in 2010[1], is one that attempts to explain why and how people anthropomorphise.

Their hypothesis is that one of the reasons people anthropomorphise objects is that they want to increase their effectance motivation. This is the motivation to be an effective social agent. The researchers conduct six experiments based on this hypothesis.

The first experiment asked participants to rate their computers. Half of the participants (A) were asked to rate how much they felt their computer has a mind of its own. The other half (B) were asked to rate how much their computer appeared to behave as if it has its own beliefs and desires. Both sets were asked how often they had problems with the computer or its software. The hypothesis for the study is that the more problems people have with their computer, the more they will anthropomorphise it.

Results showed that, in accordance with the hypothesis, the more often participants in group A had problems with their computers, the more they thought their computers had minds of their own and that the more often participants in group B had problems, the more likely they were to believe their computers had beliefs and desires.

The second experiment asked participants to judge the agency of gadgets that had each been assigned one of two descriptions. The description either made it seem as though what the gadget did was within the control of the user or outside it, but always described the same functionality. There were two groups of participants; both saw the same set of gadgets, but with alternating sets of descriptions. After reading the descriptions, the participants were asked to rate how much control they thought they had over each gadget, and then to assess how much the gadget had a “mind of its own”, had “intentions, free will and consciousness” and appeared to experience emotions, rating these in the same way as they had rated control.

In alignment with their hypothesis, the participants rated the gadgets with low controllability to be more anthropomorphic than those that were perceived to be easier to control.

The third experiment was essentially a replica of the second, but the participants were subject to an fMRI scan while rating the gadgets. This was conducted because the researchers reasoned that people could be using mind as a metaphor for the behaviour they were seeing, rather than actually attributing minds to the objects. By determining the region of the brain in use when anthropomorphising takes place they could rule out certain modes of thinking and give weight to a possible seat for anthropomorphism in the brain. The researchers propose, through reference to previous studies, that the superior temporal sulcus (STS) is involved in social or biological motion, the medial prefrontal cortex (MPFC) is in use when considering people vs objects and considering the mind of another, and the amygdala, inferior parietal lobe and intraparietal sulcus are active when evaluating unpredictability. They therefore hypothesise that the MPFC will increase in activity when anthropomorphising.

The results of the experiment showed the ventral MPFC (vMPFC) to be the most active region, whereas the STS was not active.

The results also showed activation in a network of areas related to mentalising, which strongly resembles a circuit corresponding to processing of self-projection, mentalising and general social cognition, which is what would be expected for anthropomorphism.

This implies that unpredictable gadgets are perceived to have a mind, in an actual rather than metaphorical sense.

The results are inconsistent with the alternative hypotheses: attribution of mind to objects only related to social or biological motion analogies; that processing unpredictability is the cause of the activation; or that the activation is influenced by animism.

The fourth experiment asked participants to evaluate a robot that would answer yes/no questions the participants asked. There were three conditions that the participants were randomly assigned to: the condition where the robot answered yes as often as no, the condition where the robot answered no more often, and the condition where the robot answered yes more often. The second two conditions were the predictable conditions.

After asking the questions and receiving answers, the participants were asked to rate the robot on predictability, then on how much they thought it had free will, its own intentions, consciousness, desires, beliefs and the ability to express emotions. The participants were also asked to rate the robot on attractiveness, efficiency and strength. The ratings were done on a five point scale from “Not at all” (1) to “Extremely” (5).

Results from the experiment showed that participants in the predictable groups found the robot to be predictable, more so than those in the unpredictable group. Also, predictable-no was felt to be more predictable than predictable-yes.

Importantly, anthropomorphism was found to be more prevalent where the robot was found to be less predictable.

The only significant difference between the conditions and the non-anthropomorphic evaluation was that predictable-yes participants found the robot to be more attractive than predictable-no. The researchers do not discuss this finding. There was no significant interaction found between liking the robot and anthropomorphising it.

These results show people anthropomorphise unpredictable agents, and present a causal link between the two. This is important as the previous three experiments could be interpreted as a simple association rather than a clear cognitive process.

Experiment five gave some participants motivation to predict the behaviour of a robot, while the others were asked to predict the behaviour without being motivated. The hypothesis was that increasing motivation should increase the motivation to understand, explain and predict an agent.

Participants evaluated a robot on a computer screen. They watched videos of the robot performing but not completing a task. Participants saw options of what the robot would do next and were asked to pick what they thought would happen. Participants in the motivation condition were offered $1 per correct answer. All participants then evaluated the robot's anthropomorphism. Finally the participants were shown the outcome, and compensated where necessary.

Results showed that motivated participants rated the robot as more anthropomorphic.

This shows that effectance motivation is increased when a person is motivated to understand an agent, and not simply controlled by the predictability of the agent.

The sixth and final experiment was predicated on the hypothesis that anthropomorphism should satisfy effectance motivation, i.e. anthropomorphism should satiate the motivation for mastery and make agents seem more predictable and understandable.

Participants evaluated four stimuli (dog, robot, alarm clock, shapes). Half of the participants were told to evaluate the dog and alarm clock objectively, and the robot and shapes in an anthropomorphic fashion; the other half were given the opposite instructions.

Each participant was shown a video of each stimulus three times. After the third time the participant was asked to evaluate the stimulus on two scales: the extent to which they understood the stimulus and the extent to which they felt capable of predicting its future behaviour.

The results showed that the dog and shapes were found to be easier to understand than the robot or alarm clock.

Importantly, participants perceived greater understanding and predictability of agents they had been told to anthropomorphise. The effect did not seem to depend on the group the participant was in.

This study implies that anthropomorphism satisfies effectance motivation.

It is clear from this paper that anthropomorphism is a natural part of human cognition, one that is used to make the behaviour of objects in the world around us seem more predictable and thus give us a better sense of control. It also shows that there is a neurological basis for this behaviour; the brain is set up to anthropomorphise the world around us.

[1] Making Sense by Making Sentient: Effectance Motivation Increases Anthropomorphism. A. Waytz, C. K. Morewedge, N. Epley, G. Monteleone, J. H. Gao, J. T. Cacioppo. Journal of Personality and Social Psychology 2010, Vol.99, No.3, 410–435

Friday, February 17, 2012

What does #deadbeef; look like?

I've been working with WPF themes a lot this week. My task has been to take a theme from one application and put it into another, otherwise unrelated, application. This is not as easy as it sounds. The themes from the original application do not transplant to other applications without judicious use of a hacksaw.

While going through the theme's various XAML files I noticed things like color="#FF123456", and I looked and I couldn't figure out what colour I was looking at. There are a lot of these hex notation colours and they all seemed opaque to me.

It struck me that it would be nice if I could just hover my mouse over the hex and get the colour to pop up. Sounded like an easy enough task. So I set out to write an extension for Visual Studio to do just that.

My first attempt to write an extension to Firefox met with disappointment - I couldn't figure out how to get started - so I was a little apprehensive about writing an extension for Visual Studio. Luckily extensions for Visual Studio are easy to create (so long as you have Visual Studio).

  1. Download the Visual Studio 2010 SDK (or this one for Service Pack 1)
  2. Install the SDK
  3. Start a new project by selecting from the C#/Extensibility templates - I chose Editor Text Adornment, because I wanted to adorn the editor text with something.
The project comes with code already in place, so you can just hit F5 and you'll be able to see the extension at work right away. Then read the code to see how it works! It's pretty obvious, and with the Intellisense of Visual Studio you can discover all the bits you'll need with ease.

So my task, now that I have the ability to write an extension, was to write an extension that does what I want - i.e. show a colour swatch of the hex notation I'm hovering over.

Step 1: create a regex that picks out the hex. I tried one or two and settled on this one: #(([0-9A-F]{6})|([0-9A-F]{8})|([0-9A-F]{3}))["<;]. There might be ways to write it shorter, and I'm willing to hear them, but I'm not a regex guru, so I'll stick with simple. You'll notice that I've constrained the hex to start with a # and end with ", <, or ;. This way the regex will only pick up hex that is the right length, and not any old length, and is most likely meant to be a colour. All the colour hexes I could see ended in ", < or ;. I could have missed an edge case, but not so far!
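
To see the regex at work, here's a quick sketch (the XAML string is a made-up example, and the snippet needs using System.Text.RegularExpressions;):

// Quick sketch of the regex at work; the input string is a made-up example.
string xaml = "<SolidColorBrush Color=\"#FF123456\"/><Setter Value=\"#abc\"/>";
string pattern = "#(([0-9A-F]{6})|([0-9A-F]{8})|([0-9A-F]{3}))[\"<;]";

foreach (Match match in Regex.Matches(xaml, pattern, RegexOptions.IgnoreCase))
{
    // Group 1 holds the hex digits without the leading # or the trailing ", < or ;
    Console.WriteLine(match.Groups[1].Value); // prints FF123456, then abc
}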

Step 2: turn that string into a colour. There might be a library function for doing this, but I couldn't find it (would be glad if someone were to tell me about it!). I wrote my own:

private Tuple<byte, byte, byte, byte> BytesFromColourString(string colour)
{
    string alpha;
    string red;
    string green;
    string blue;

    if (colour.Length == 8)
    {
        alpha = colour.Substring(0, 2);
        red = colour.Substring(2, 2);
        green = colour.Substring(4, 2);
        blue = colour.Substring(6, 2);
    }
    else if (colour.Length == 6)
    {
        red = colour.Substring(0, 2);
        green = colour.Substring(2, 2);
        blue = colour.Substring(4, 2);
        alpha = "FF";
    }
    else if (colour.Length == 3)
    {
        red = colour.Substring(0, 1) + colour.Substring(0, 1);
        green = colour.Substring(1, 1) + colour.Substring(1, 1);
        blue = colour.Substring(2, 1) + colour.Substring(2, 1);
        alpha = "FF";
    }
    else
    {
        throw new ArgumentException(String.Format("The colour string may be 8, 6 or 3 characters long, the one passed in is {0}", colour.Length));
    }
    return new Tuple<byte, byte, byte, byte>( Convert.ToByte(alpha, 16)
                                            , Convert.ToByte(red, 16)
                                            , Convert.ToByte(green, 16)
                                            , Convert.ToByte(blue, 16));
}

OK, so this actually returns a Tuple<byte, byte, byte, byte>. I'm not entirely sure why I chose that over returning an actual colour. I might refactor that later. Anyway, turning the tuple into a System.Windows.Media.Color is a trivial call to the static method Color.FromArgb(byte, byte, byte, byte). Also, the above method is a brute force approach to breaking down the colour string into bytes; there could well be a better way. I'm sticking with what works until I'm shown something better.
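
Put together, going from the hex string to a colour looks like this (a tiny sketch; "DEADBEEF" is just an example value):

// Tiny sketch: turning a hex string into a System.Windows.Media.Color using the method above.
// "DEADBEEF" is just an example value.
var bytes = BytesFromColourString("DEADBEEF");
Color colour = Color.FromArgb(bytes.Item1, bytes.Item2, bytes.Item3, bytes.Item4);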

My next hurdle was figuring out how to place the colour swatch where I wanted it. I was able to return the position in text the mouse was hovering over, which would give me a single character, but I couldn't think of how to use that position and character to get the hex colour string.

In the end I opted for a two stage approach. Stage one: when the layout updates, find the start and end positions for any colours in the view. Stage two: when the mouse is hovering somewhere, see if its position is in any of the ranges previously stored.

Stage one looks like this:
private void OnLayoutChanged(object sender, TextViewLayoutChangedEventArgs e)
{
    _colourPositions = new List<Tuple<int, int, Color>>();
    var matches = Regex.Matches(_view.TextSnapshot.GetText(), "#(([0-9A-F]{6})|([0-9A-F]{8})|([0-9A-F]{3}))[\"<;]", RegexOptions.IgnoreCase);
    foreach(var m in matches)
    {
        var match = m as Match;
        var mgrp = match.Groups[1] as Group;
        var colourbytes = BytesFromColourString(mgrp.Value);
        var colour = Color.FromArgb(colourbytes.Item1, colourbytes.Item2, colourbytes.Item3, colourbytes.Item4);
        _colourPositions.Add(new Tuple<int,int,Color>(mgrp.Index, mgrp.Index + mgrp.Length, colour));
    }
}
I went with a list to store the position of the colours because I think it makes cleaner code than a dictionary would.
Stage two's like this:
private void ShowColourSwatch(int position, IMappingPoint textPosition, ITextView textView)
{
    _layer.RemoveAllAdornments();
    SnapshotPoint? snapPoint = textPosition.GetPoint(textPosition.AnchorBuffer, PositionAffinity.Predecessor);
    if (snapPoint.HasValue)
    {
        SnapshotSpan charSpan = textView.GetTextElementSpan(snapPoint.Value);
        var colourPos = _colourPositions.Find(cp => (cp.Item1 <= charSpan.Start) && (cp.Item2 >= charSpan.Start));
        if(colourPos != null)
        {
            Image image = CreateSwatchImage(colourPos, charSpan);

            _layer.AddAdornment(AdornmentPositioningBehavior.TextRelative, charSpan, null, image, null);
            Thread t = new Thread(p =>
            {
                Thread.Sleep(3500);
                lock (lockObject)
                {
                    Application.Current.Dispatcher.Invoke(new Action(() =>
                    {
                        _layer.RemoveAdornmentsByVisualSpan(charSpan);
                    }), new object[]{});
                }
            });
            t.Start();
        }
    }
}

The Thread in there just makes sure that the colour swatch disappears after three and a half seconds. CreateSwatchImage uses a lot of the code from the example project that Visual Studio gives you to start with, and just draws the colour swatch on a black and white background for contrast.
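
For the curious, the gist of drawing a swatch with WPF primitives goes roughly like this (a sketch only, not the actual CreateSwatchImage from the repository; the sizes are arbitrary and the real method also sizes and positions the image against the hovered text span):

// Rough sketch of building a swatch image on a black and white background.
// Not the actual CreateSwatchImage; the sizes here are arbitrary.
// (Needs the System.Windows, System.Windows.Controls, System.Windows.Media and System.Windows.Media.Imaging namespaces.)
var visual = new DrawingVisual();
using (DrawingContext dc = visual.RenderOpen())
{
    dc.DrawRectangle(Brushes.White, null, new Rect(0, 0, 10, 20));  // white half of the background
    dc.DrawRectangle(Brushes.Black, null, new Rect(10, 0, 10, 20)); // black half of the background
    dc.DrawRectangle(new SolidColorBrush(colourPos.Item3), null, new Rect(2, 2, 16, 16));
}
var bitmap = new RenderTargetBitmap(20, 20, 96, 96, PixelFormats.Pbgra32);
bitmap.Render(visual);
var image = new Image { Source = bitmap };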

That is pretty much all the important code that I wrote in constructing the extension. There is one last snippet: I had to modify a single line in the auto-generated factory class so that the swatch would be above the text: [Order(After = PredefinedAdornmentLayers.Text, Before = PredefinedAdornmentLayers.Caret)]. Before that, the property made the adornment go behind the text, which looked silly for my purposes.

The last thing that tripped me up was installing the extension. Obviously I can't sign my extension, because I'm too cheap to pay for a certificate, so I can't get it put on the online extensions thing. However I was sure I could find a way.

My first attempt was to double click on the .vsix file that Visual Studio had generated for me. This looked promising - it ran me through an install process and told me it had been successful - so I loaded up Visual Studio, but my extension was nowhere to be found. I tried rebooting my computer, just in case, but to no avail. So I sought out where the extension had been placed and deleted it - which is how you are meant to uninstall extensions, by the way - and went online to find out The Right Way™.

A few places told me to put the extension in a folder under %appdata%, but that didn't seem to work. Eventually I found an MSDN page that explained I should be putting it under %localappdata%, which sorted me right out. Essentially the path should go something like %localappdata%\Microsoft\VisualStudio\10.0\Extensions\[company]\[extensionName]\[version]\ although you can probably leave out [company] and [version] and it will still work.

Once I put the extension there and loaded up Visual Studio, I checked the Extension Manager in the Tools menu and it was there, but needed enabling. After being enabled, and restarting Visual Studio, the extension was working like a charm! No more wondering about what a hex colour string means for me.

what #deadbeef; looks like


To view all the code for my extension, and download it for yourself, visit my Github repository.

Wednesday, October 19, 2011

Object Thinking - Objects have actions

This post follows on from Object Thinking - Objects: a neurological basis

The paper being reviewed is Micro-affordance: The potentiation of components of action by seen objects (Ellis and Tucker, 2000)[1]
 
The paper focuses on two experiments. The first is concerned with power and precision micro-affordance, and the second with wrist rotation micro-affordance.

In the first experiment the participants were told to memorise objects as they were shown them. They were then tested on the objects halfway through the experiment and at the end. During the memorisation phase, whenever they heard a tone, the participant was to either squeeze a cylindrical button with their whole hand, or pinch a small button between their index finger and thumb.

The type of grip response would be dependent on the type of tone: high or low. So there were two mappings known to the participants: high – large grip, low – small grip, and high – small grip, low – large grip. There were also two unknown mappings: high – large object, low – small object, and high – small object, low – large object.

Each participant was assigned one mapping from each of the two groups and this was sustained throughout the experiment.

In the results from the experiment there was a statistically significant positive correlation between grip type and object type.

The second experiment was set up much the same as the first. The differences were that instead of large or small grips, the participant would make clockwise or anticlockwise wrist rotations dependent on tone, and the objects were categorised as ones more easily grasped with an anticlockwise or clockwise wrist rotation.

The results showed a statistically significant positive correlation between wrist rotation and object type.

The paper classifies micro-affordance (MA) as the state of an observer that gives rise to stimulus-response compatibility (SRC) between what the viewer sees and what actions they perform regardless of their intention. The theory is meant as a solution to the symbol grounding problem. (The reference to this problem in the paper is Harnad, 1990[2].)

The paper explains that SRC is demonstrated in many previous experiments, by various researchers, in forced choice reaction time tests. For example an advantage is gained when reaching for something on the left with the left hand, and similarly for the right. In fact an advantage is gained even in non-reaching tasks, where the location of the stimulus gives an advantage when it is on the same side as the response; this is known as the Simon Effect.

Previous experiments by Ellis and Tucker show that location is not the only action related feature encoded in this way.

This preparedness for action is thought to be a coordination of the what and where pathways in the brain.

The paper reports that the theoretical implications of the results of the study are:
  1. MA are different from Gibsonian affordance in that they suggest the affordance is encoded in the viewer's nervous system (not the object being viewed), they only apply to grasping, and only grasping appropriate to the object.
  2. SRC works because what is being responded to is unrelated to what is causing the compatibility effect. SRC theories suggest that stimulus → response options elicit particular mental codes, so the location of an object elicits a left or right handed response. MA, however, can be evoked without evoking a coherent action.
    This means that MA should interfere with SRC experiments.
    SRC effects have been modelled as ecological relations between visual properties and actions. They have also been modelled as effect codes that can be combined into whole actions.
    MA and these two approaches share the assumption that a compatibility effect arises from visual objects and possible, real-world actions that can be performed on them.
    MA diverges from the ecological approach by retaining representation of objects, and from effect codes by having a direct connection between vision and action. MA diverges from both because it states that actions are potentiated whenever an object is seen, regardless of the intention of the viewer.
  3. Developmentally, MA fits in well with the popular theory of Neural Darwinism. Development of adaptive behaviours requires integration of sensory and motor processes. The paper proposes that learning coordinated actions results from gradual adaptation of the neuron groups involved. This leads to coupling of motor and sensory systems.
    The implication of the experiments is that MA reflect the involvement of the motor components of the global mapping, which have come to represent visual objects.

So what does this tell us about how natural object thinking is? Object thinking requires that you understand the objects you are working with in terms of the behaviours that they can perform. You need to be able to create your objects so that discovering what behaviours are available is intuitive — i.e. when others come to your API they aren't spending hours going through the documentation, they can just get on and use it.

Ellis and Tucker show that the brain is well suited to understanding and preparing for expected behaviours. When we see an object, we immediately know the actions that the object has available, and are primed to use them.

This implies that once we have a good understanding of a problem domain, we should be able to model the behaviours of the objects in the domain intuitively, and anyone else with a good understanding of the problem domain will be able to intuitively discover each object and its behaviours.

The behaviour driven aspects of object thinking are intrinsic to how the human mind works at the brain level.

The next section deals with anthropomorphism, why OT needs it and where it comes from: Object Thinking - Anthropomorphism.

[1] Micro-affordance: The potentiation of components of action by seen objects; Rob Ellis, Mike Tucker. British Journal of Psychology (2000), 91, 451-471
[2] Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335-346. (As cited in [1])

Thursday, September 22, 2011

Object Thinking - Objects: a neurological basis



This post deals with how the brain perceives the world as objects.

A neurological perspective on how perception works, via studying perceptual disorders, is covered in chapter two of Neuropsychology: from theory to practice [1]. This is a review of that chapter.

Studying perceptual disorders tells us how we work by looking at damaged brains in people, or damaging brains in animals, and seeing how that affects what is perceived.

The chapter concentrates largely on visual perception, due to “the natural dominance of our visual sensory system”. It starts out by identifying two major pathways in the brain: the “what” pathway, which is responsible for identification of objects, and the “where” pathway, which is responsible for location, position and motion. These were originally identified in monkeys in 1983 by Mishkin, Ungerleider and Macko. Milner and Goodale (1995) expanded on this model to explain that the “where” pathway is dedicated to the preparation of movement.

This demonstrates that humans understand the world as objects and actions. 

The chapter goes on to explain that these two pathways are linked; essentially the flow of data goes primary visual cortex → “what” pathway → “where” pathway → motor cortex. The system also gets feedback, via other pathways, from interactions with the environment to aid in learning. This of course means that we get better at performing actions the more we do them.

The next section of the chapter deals with sensation versus perception. It is not particularly relevant to this discussion. In short summary: sensation occurs before perception, and is not consciously recognised. In vision the sensation pathways are those that link the retina to the visual cortex. People with damage to these pathways will not notice that they don't see something, unless they are made aware of it appearing and disappearing from view.

Discussion of the hierarchy of the visual cortex follows on. This has quite a strong neurological focus, and describes a lot of the brain's structure in this area. The key point relevant here is that the brain is modular and parallel, which means that human thinking is modular and parallel, which is clearly analogous to separation of concerns. The parallelism is accomplished through pathways that allow feedback between modules. This could be thought of as message passing, although it might be a stretch to say it scales up to conscious thought.

Next the chapter discusses what certain disorders show us about visual perception. The two types of disorder covered are apperceptive agnosia – a condition that means the patient has a difficulty distinguishing between objects – and associative agnosia – in which the patient is unable to recognise objects or their functions.

Apperceptive agnosia, and its milder counterpart, categorisation deficit, give strong evidence that the mind perceives the world as objects. People with these disorders cannot discern one object from another. This impedes problem solving, as the person with the condition does not know how to act on what they see. In fact, in the case of apperceptive agnosia, it can be equivalent to blindness, as those with the condition find it easier to navigate with their eyes shut.

Associative agnosia prevents people from being able to recognise objects or their functions. This class of agnosia can affect any of the senses. The book focuses on vision.

People with associative agnosia can copy (e.g. by drawing) and match objects, but they cannot recognise them. So it appears that primary perceptual processing is intact.

The current theory for what causes this agnosia is that the “what” pathway has become disconnected from the memory store for associative meaning. People with this condition can write something down, such as their name or address, but are completely unable to read it back. This is clear evidence that we use background knowledge to solve problems.

The chapter gives an example (p. 53) of a patient, with associative visual agnosia, who can only tell what a banana is after eating it, and even then only through logical deduction: “...and here I go right back to the stage where I say well if it's not a banana, we wouldn't have this fruit.”

The next section of the chapter discusses object and face recognition. The focus is on how this works at a neurological level, and the difference between face recognition and object recognition. The key point it makes is that the left hemisphere of the brain deals with parts of objects, and the right deals with objects as a whole. (Faces are a special case, however, as they seem to be perceived as a whole, and not as parts, i.e. most of facial recognition is done in the right hemisphere.) The brain is set up to understand composition.

The rest of the chapter focuses on describing top down (using past experience to influence perception) and bottom up (working from first principles) processing of visual information, and comes to a conclusion about how the left and right hemispheres interact to give what we see meaning. Essentially they work together: the left hemisphere identifies objects and the meaning of objects, while the right analyses structural form and orientation and does holistic analysis of an object.

So, in conclusion, the chapter lays out clearly that human beings perceive the world as objects, even at a neurological level. This is our nature. Thus it makes sense when designing software to think of our problem space in terms of the objects in it.

The next section will deal with why action is integral to how we think about the world, and can be found here: Object Thinking - Objects have actions.

[1] Neuropsychology: from theory to practice, David Andrewes (2001, Psychology Press)

Saturday, July 09, 2011

Object Thinking is the natural way to think. Introduction

Preface
I don't know why I'm up so early on a Saturday, but I am. *yawn*. So I've been writing a paper reviewing other texts, to explain why Object Thinking is the natural way to think.
I am doing this because I do not want to lose an internet argument. I know. I've already lost. Both sides have. That's how internet arguments work.
The argument is at Programmers, particularly my answer to the question "is OOP hard because it is not natural?" SK-Logic is zealously anti OO, and I am equally zealously pro OO.
Then the other day I was discussing what I'm writing with Pierre 303, in the Programmers' chat room, and he suggested that I make it into several 'blog articles, because then it would be easier to digest. I agree, so that's what I'm doing. I still don't know why I'm up so early, but at least I'm doing something.


Introduction
Object Thinking; it's been around for decades as a paradigm for software design, but what is it? When presented with a problem, someone using object thinking will start to decompose the problem into discrete sections that can interact with each other. You could, for example, be forced to change the tyre on your car. A simple task, certainly, but to do it you must understand the tools and relevant components of your car, and how they need to work together to achieve your goal.

It might take several attempts to achieve a fine grained enough understanding to effectively solve the problem. Your first pass at the above example might leave you with the idea to take the wheel off your car. A second pass might make you realise that you need to lift the car off the floor to do that, and so on.

One thing that can give you a head start in solving a problem using object thinking is background knowledge. Knowing about your problem domain, what the objects in it are capable of, makes it easier to plan how to use them. Not knowing enough can cause issues, however, if assumptions are made based on incomplete knowledge.

For example: You are asked to stick a poster to the wall, without leaving holes in the wall. You are given a hamster, newspaper and some Blu Tack®, along with the poster. If you don't know what Blu Tack® is for then your understanding of the problem domain is incomplete and you could end up using the hamster to chew up newspaper into balls, and use those to stick the poster to the wall.

It is also important to note that not everything present in your problem domain will necessarily be used to solve the problem. So, in the previous example, you might not use the newspaper or hamster at all (or, of course, you might find the hamster solution better, as it reuses the newspaper, which is more ecological).

So how does this apply to software design? Software is just “algorithms and data structures”, right? Well, at the end maybe, but you've still got to design it. Software is the output of people's attempt to solve a problem. Solving a problem with object thinking is the natural way, as this series of posts hopes to demonstrate, because it uses people's natural problem solving techniques.

Object thinking is a core tenet of Object Oriented Design (OOD), a well known software design paradigm. The inventors of OOD set out to fix what they saw as the main problem with software design – software design was taught to make people think like computers, so that they could write software for computers.
 
A book that extensively covers the meaning and practical aspects of object thinking is Object Thinking by David West (2004, Microsoft Press). In it he likens the way that traditional programmers use OOD to writing lots of small COBOL programmes [1]. Objects in this sense have been turned into data structures with algorithms wrapped around them. While modularising code is better than having one large function, it only makes designing software a little easier. It still focuses the attention of the design on how a computer works and not how the problem should be solved.

So what makes reasoning about large systems easier? Focusing on the problem space and decomposing it into several smaller problems helps. But what is easier to think about? Is it easier to think how those problems translate into code? Perhaps in the short term, but you will end up solving the same problems over and over again, and your code will probably be inflexible.

Would it be better to think about software design the same way you think about solid world problems? That way you can use your innate problem solving skills to frame and express your design.

It turns out that the way people reason about real world problems is to break them down into smaller parts, using their background understanding of the problem space, treat those parts as objects that can do things and have things done to them, and find ways for the objects to interact. [2]

This works well because people like to anthropomorphise objects, so that they can imagine the object doing things under its own agency, even if in the end it's a person causing the action.[3]

How can you be sure this is how you think, and is therefore the more sensible way to approach software design? Well it turns out that there is an oft ignored backwater science known as Cognitive Psychology, and scientists in this field have been studying people for decades, to find out how they work.

Future posts in this series will review certain cognitive psychology and neuropsychology texts and expand on how this applies to object thinking. The end goal is to demonstrate that object thinking is innate and therefore the best strategy for designing software.

Next post in the series: Object Thinking - Objects: a neurological basis

References
[1] Object Thinking, D. West (2004, Microsoft Press) p9
[2] Problem Solving from an Evolutionary Perspective visited 9th July 2011
[3] Object Thinking, D. West (2004, Microsoft Press) p101

Blu-Tack is a registered trademark of Bostik. I am not affiliated with Bostik.

Friday, April 29, 2011

Networking client / server example

At work I have been writing a lot of code relating to sending data over a TCP connection.

I have also seen a couple of questions, recently, on Stack Overflow asking about why networking code wasn't working. Unfortunately I didn't have time to answer them, but it did make me think that there must be a dearth of good samples of networking code online.

Allow me to make that dearth one sample fewer! (Does that make sense?)

For the full listing visit my Github repository: https://github.com/Mellen/Networking-Samples

One problem that sparked my interest was how to keep the server running when a client disconnects, because the server needs to know when a client disconnects and not just choke and die. A client disconnecting is not an exceptional circumstance.

The first problem is not letting the server die when a client disconnects; the second is keeping the server looking for new connections, so that it can carry on being a server.

Keep it alive!

My solution to the disconnection problem got generalised to both the client and the server classes, because it makes sense to not have the client die if the server disappears. The user might want to try to reconnect.

You'll find this code in the file NetworkSampleLibrary/NetworkStreamHandler.cs

protected void ReadFromStream(object worker, DoWorkEventArgs args)
{
    BackgroundWorker streamWorker = worker as BackgroundWorker;
    NetworkStream stream = args.Argument as NetworkStream;
    try
    {
        HandleStreamInput(stream);
    }
    catch (Exception ex)
    {
        if (ex is IOException || ex is ObjectDisposedException || ex is InvalidOperationException)
        {
            streamWorker.CancelAsync();
        }

        if (ex is IOException || ex is InvalidOperationException)
        {
            stream.Dispose();
        }

        if (StreamError != null)
        {
            StreamError(ex, stream);
        }
    }
}

You might have noticed that the method is an event handler. More on that below.

As you can see, there are three types of exception that can happen if a client disconnects from the server: IOException, ObjectDisposedException and InvalidOperationException. I found this out through trial and error.

The most common exception that gets thrown when a client disconnects is IOException. This is because the server will be trying to read from the client when it leaves.

Because of the threaded nature of the system, ObjectDisposedException gets thrown when another exception gets thrown and the server still tries to read from the stream in the meantime.

I'm not entirely sure why InvalidOperationException gets thrown, and it doesn't happen a lot, but it is always when the client disconnects.

My strategy is to catch all exceptions, deal with the disconnection exceptions by disposing of the stream if necessary and cancelling the process that reads from the stream, then raising an event that contains the exception and the stream that threw it. I could create a custom exception here, but I settled on an event just in case something that wouldn't catch an exception wanted to know about it.
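
For reference, the event only needs to carry the exception and the stream it came from, so its declaration can be as simple as this (a sketch; the actual declaration in the repository may differ):

// Sketch of a suitable event declaration; the repository's NetworkStreamHandler may declare it differently.
public event Action<Exception, NetworkStream> StreamError;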

All are welcome

The next part of the puzzle is to make sure that more than one client can connect to your server.

This is achieved in the NetworkServer class. This can be found at NetworkServerSample / NetworkServer.cs

The pertinent parts are listed below:

public NetworkServer(int port)
{
    _listener = new TcpListener(IPAddress.Any, port);
    _listener.Start();
    _listener.BeginAcceptTcpClient(AcceptAClient, _listener);
    DataAvilable += SendDataToAll;

    StreamError += (ex, stream) =>
        {
            if (ex is IOException || ex is InvalidOperationException || ex is ObjectDisposedException)
            {
                _streams.Remove(stream);
                Console.WriteLine("lost connection {0}", ex.GetType().Name);
            }
            else
            {
                throw ex;
            }
        };
}

private void AcceptAClient(IAsyncResult asyncResult)
{
    TcpListener listener = asyncResult.AsyncState as TcpListener;

    try
    {
        TcpClient client = listener.EndAcceptTcpClient(asyncResult);

        Console.WriteLine("Got a connection from {0}.", client.Client.RemoteEndPoint);

        HandleNewStream(client.GetStream());
    }
    catch (ObjectDisposedException)
    {
        Console.WriteLine("Server has shutdown.");
    }

    if (!_disposed)
    {
        listener.BeginAcceptTcpClient(AcceptAClient, listener);
    }
}

private void HandleNewStream(NetworkStream networkStream)
{
    _streams.Add(networkStream);
    BackgroundWorker streamWorker = new BackgroundWorker();
    streamWorker.WorkerSupportsCancellation = true;
    streamWorker.DoWork += ReadFromStream;
    streamWorker.RunWorkerCompleted += (s, a) =>
                                        {
                                            if (_streams.Contains(networkStream) && !a.Cancelled)
                                            {
                                                streamWorker.RunWorkerAsync(networkStream);
                                            }
                                        };
    streamWorker.RunWorkerAsync(networkStream);
}

In the constructor, the server is set up to listen on a particular port for incoming connections and handle the connection requests asynchronously. It also creates an event handler for when the network stream throws an exception, as explained above. This makes sure that the stream is removed from the list of streams, so that it doesn't try to get disposed of when the server is disposed, and that no data gets broadcast down it.

The method that deals with the asynchronous requests for connection (AcceptAClient) has to make sure that the server hasn't been disposed of when the connection attempt is made, hence the try-catch block. Once the connection request has been handled then the method starts listening for another connection attempt. This is all it takes, essentially asynchronous recursion.

The HandleNewStream method also uses asynchronous recursion to read each message from the client. It sets up a BackgroundWorker instance that asynchronously calls the ReadFromStream method in the previous section, and when the work is complete, the worker will call the method again, so long as the stream is in the list of streams on the server and the worker has not been cancelled.

That's the meat of the server. Accepting and handling input from more than one client is achieved with a list and asynchronous recursion. Dealing with clients disconnecting is done with exception handling and events.
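
For completeness, the client side of a connection is only a few lines; here is a bare-bones sketch (an illustration only, not the client class from the repository - the host and port are example values):

// Bare-bones client sketch: connect, send one line, disconnect.
// An illustration only, not the client class from the repository.
// (Needs using System.IO; and using System.Net.Sockets;)
using (var client = new TcpClient("localhost", 4242)) // example host and port
using (var stream = client.GetStream())
using (var writer = new StreamWriter(stream))
{
    writer.WriteLine("Hello from a client");
}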

Thursday, April 28, 2011

Really basic programming maths (part 1)

So I've been trying to mentally do hexadecimal addition. I've found that I'm not very good at it.

I'm going to slowly explain how I go about working stuff out, with the hope that it will stick in my head and get easier. (Binary is written with the most significant bit first, and all numbers are unsigned.)

First of all there is how to think about numbers in binary and hex.

Decimal numbers get split up into multiples of powers of ten.

For example 4181 can be broken down as:
  • 4 x 10^3
  • 1 x 10^2
  • 8 x 10^1
  • 1 x 10^0

Remembering that any number raised to the power of 0 is 1.

This applies to both binary and hexadecimal too.

So 0xFEED breaks down to:
  • F(15) x 10(16)^3
  • E(14) x 10(16)^2
  • E(14) x 10(16)^1
  • D(13) x 10(16)^0

The numbers in parentheses are the decimal representations of the hexadecimal numbers.

And 0b1101 breaks down to:
  • 1(1) x 10(2)^3
  • 1(1) x 10(2)^2
  • 0(0) x 10(2)^1
  • 1(1) x 10(2)^0

The numbers in parentheses are the decimal representations of the binary numbers.

Next up is the easy way to transition from hex to binary and back.

Since an individual hex digit takes a maximum of four bits, all hex numbers can be represented as collections of four-bit numbers.

So 0x4432 can be broken down into 0b0100, 0b0100, 0b0011, 0b0010

This can be reversed. Say you have the 32-bit number 0b10011100110100110101101011110011.

If you break it down into four bit chunks you get:
  • 0b1001
  • 0b1100
  • 0b1101
  • 0b0011
  • 0b0101
  • 0b1010
  • 0b1111
  • 0b0011

Each chunk can be represented as a hex digit:
  • 0x9
  • 0xC
  • 0xD
  • 0x3
  • 0x5
  • 0xA
  • 0xF
  • 0x3

Which gives us the number 0x9CD35AF3.
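
If you want to check a conversion like that programmatically, here's a small C# sketch (my own illustration, not something from the original working) that walks a binary string four bits at a time and maps each chunk to its hex digit:

using System;
using System.Linq;

class BinaryToHex
{
    static void Main()
    {
        // The 32-bit number from above, written as a string of bits.
        string binary = "10011100110100110101101011110011";

        string hex = string.Concat(
            Enumerable.Range(0, binary.Length / 4)
                      .Select(i => binary.Substring(i * 4, 4))    // four-bit chunk, e.g. "1001"
                      .Select(chunk => Convert.ToInt32(chunk, 2)) // chunk as a number, e.g. 9
                      .Select(value => value.ToString("X")));     // number as a hex digit, e.g. "9"

        Console.WriteLine(hex); // prints 9CD35AF3
    }
}

Going the other way is the same idea in reverse: take each hex digit and write it out as four bits.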

The difficult part comes in getting that number as decimal.

To do it from hex, you need to add up all the powers of sixteen that there are:
  • 9 x 16^7
  • 12 x 16^6
  • 13 x 16^5
  • 3 x 16^4
  • 5 x 16^3
  • 10 x 16^2
  • 15 x 16^1
  • 3 x 16^0

Which turns out to be: 2631097075. Not easy to calculate in your head. To do it from binary would take even longer as you would need to add up all the powers of two from 31 to 0.
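
Doing that sum in your head is no fun, but it's easy to check in C#. This sketch (again, my own illustration rather than part of the original working) adds up the powers of sixteen explicitly and then asks the framework for a second opinion:

using System;

class HexToDecimal
{
    static void Main()
    {
        string hex = "9CD35AF3";
        ulong total = 0;

        for (int i = 0; i < hex.Length; i++)
        {
            // Value of this digit (0-15) and the power of 16 it sits at.
            ulong digit = (ulong)Convert.ToInt32(hex[i].ToString(), 16);
            int power = hex.Length - 1 - i;

            total += digit * (ulong)Math.Pow(16, power);
        }

        Console.WriteLine(total);                     // 2631097075
        Console.WriteLine(Convert.ToUInt32(hex, 16)); // 2631097075, done for us
    }
}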

Thus endeth part one.

Monday, December 13, 2010

Addresses in databases


Whenever I see something like this:

  • Address Line 1:
  • Address Line 2:
  • Address Line 3:
  • City:
  • Country:
  • Post Code:

I want to find the database designer and smack them.

What is it about addresses that makes people think they don't need normalising?

No! Of course! The solution to storing addresses is to create a table and force all addresses to fit into five lines plus a postal code. Brilliant. Really smart.

There is one mandatory field in the address: country. That's the only one. Everyone lives in a country. I don't want to get into stupid arguments like "Wales isn't a country it's a principality", etc., when you put it in an address it's a country.

You know something people know? How many lines there are in their address. So don't force them to have 3, 4, 5, xty mumble-jillion, or however many you think is sufficient.

This is what I want to see from now on:

Address


If you do the post / zip / whatever code search thing, then great, but be sure to store the address lines in a sensible manner.

address_id  line_id  text
1           1        My House Name
1           2        My Street Name
1           3        My City Name
1           4        My Post Code
1           5        My Country
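
To show that variable-length addresses really are painless to consume, here's a rough C# sketch (illustrative only; the tuples just stand in for rows from the table above) that rebuilds an address in line order, however many lines it happens to have:

using System;
using System.Collections.Generic;
using System.Linq;

class AddressPrinter
{
    static void Main()
    {
        // (address_id, line_id, text) rows, deliberately out of order.
        var rows = new List<Tuple<int, int, string>>
        {
            Tuple.Create(1, 3, "My City Name"),
            Tuple.Create(1, 1, "My House Name"),
            Tuple.Create(1, 5, "My Country"),
            Tuple.Create(1, 2, "My Street Name"),
            Tuple.Create(1, 4, "My Post Code"),
        };

        string address = string.Join(Environment.NewLine,
            rows.Where(r => r.Item1 == 1) // all the lines for address 1
                .OrderBy(r => r.Item2)    // in line order
                .Select(r => r.Item3));

        Console.WriteLine(address);
    }
}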

Thursday, December 02, 2010

Re: quick idea

It's not trivial. There is no easy way to convert a file like jpg/png/gif into icon format. Arbitraried!

Sunday, November 14, 2010

No coding Sundays

I've decided that I'm going to not code on Sundays.

I'll try and cut out Stack Overflow too, except for next Sunday because that is my 99th consecutive day. I NEED MY BADGE.

Sundays will be given over to something else. Anything else.

It's not that I've stopped loving coding. I think I love it too much. I'm going to see what else there is.

Friday, October 22, 2010

Quick idea

I think it should be trivial to make a png/jpeg/gif/bmp -> icon creator

I'm going to work on one.

Friday, October 08, 2010

Solving Sudoku

I was chatting with my manager the other day, just shooting the breeze, and we got on to how he knocked together a python script to prove to his girlfriend that programmatically solving sudoku puzzles is easy.

I disagreed for a moment and then realised I was thinking of generating sudoku puzzles, which we agreed isn't easy.

I had tried to make a sudoku helper app before, to practice MVVM and WPF, but had messed up in some calculation or other. Probably at the point where I was calculating which block a square was in. Anyway I had deleted that one, but my boss had spurred my interest in doing it again.

I'm a better programmer than I was that first time - I understand both WPF and MVVM better now, so this little solver is pretty sweet. (Unless you look at the code.)

It has all the features I need. I can fill in the known numbers, delete mistakes, and click a button to solve the unknowns (once the knowns are in place).
Sometimes you don't even need the button, since the programme eliminates possibilities as you type. One puzzle I tried was solved before I typed in all the known numbers!

So my amazing solver has two simple algorithms doing the solving:
  1. Each square has an event that fires when its number of possible values reaches 1, either programmatically or by user intervention. This event is subscribed to by all the squares related to it (row, column, block), and so each related square will remove this value from its own list of possible values. This can cause a chain reaction of updates, solving the sudoku puzzle when enough knowns are typed in. (There's a rough sketch of this below, after the list.)
  2. If elimination alone doesn't do the job then the second algorithm is just a button click away. I might have overthought this one:
    1. Create a list of squares that have at least 2 possible values, sorted in ascending order of number of possible values
    2. Take the first square and find all the squares in the same block
    3. Add these squares to a checked block list
    4. Flatten the lists of potential values into one list
    5. Find any unique values in that list
    6. If there are any unique values then these represent solved squares so break out of the loop and update the squares related to those values.
    7. If there isn't a unique value then repeat 3, 4 and 5 for the row, then the column of the current square.
    8. If after that there still isn't a unique value, move onto the next square that hasn't been checked yet.
If at the end of the second algorithm a number hasn't been updated then the programme lets the user know that it needs more knowns, otherwise it starts the second algorithm again until all the squares are filled.
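
For the curious, here's a rough C# sketch of how that first, event-driven elimination could be wired up. It's my own simplification, not code lifted from the project linked below:

using System;
using System.Collections.Generic;
using System.Linq;

class Square
{
    private readonly HashSet<int> _possibleValues = new HashSet<int>(Enumerable.Range(1, 9));

    // Fired when this square is down to exactly one possible value.
    public event Action<Square, int> Solved;

    public bool IsSolved { get { return _possibleValues.Count == 1; } }

    public int Value { get { return _possibleValues.Single(); } }

    // Called when the user types a known number into this square.
    public void SetValue(int value)
    {
        _possibleValues.Clear();
        _possibleValues.Add(value);
        OnSolved();
    }

    // Called by related squares (same row, column or block) when they become solved.
    public void Eliminate(int value)
    {
        if (!IsSolved && _possibleValues.Remove(value) && IsSolved)
        {
            OnSolved();
        }
    }

    private void OnSolved()
    {
        var handler = Solved;
        if (handler != null)
        {
            handler(this, Value);
        }
    }
}

Wiring it up is just a matter of subscribing every square to its related squares, something like a.Solved += (square, value) => b.Eliminate(value); for each related pair. Typing a known number calls SetValue, which fires the event, which removes that value from the related squares, which can fire their own events in turn - the chain reaction described above.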

I know what you're thinking. You're thinking that if a user makes a mistake inputting a value, then when they delete it and input a new value the possible values list for the related squares will be wrong. Fear not! Deleting a value fires an event that does the opposite of inserting a value, so things go back to the way they were. Phew!

If you want to look at the code it's on github here: http://github.com/Mellen/SudokuSolver

The code is C#. The project is a Visual Studio 2010 project that runs on the .NET 4.0 framework. It even has a couple of unit tests. Yes, I'm that guy. I unit test toy projects.

The executable is available from github: SudokuSolver1.0.2.zip. It requires .NET version 4.0.

Anyway! This was a fun little diversion. It makes me happy that I got it right the second time.

Monday, September 20, 2010

Thinking about learning

So, my lack of knowledge needs to take a bit of a beating.

If I'm to get significantly better at writing C#, I need to understand the specification.

It seems like a daunting task, but I think if I try and tackle a point at a time, writing small programmes to demonstrate my understanding, I'll get a much deeper understanding of how my programmes hang together and how to write them better.

Wish me luck!

Saturday, August 28, 2010

Learning to see patterns in my own behaviour

So, a week and a half ago I was looking at a question on Stack Overflow (Algorithm to calculate the number of combinations to form 100 ). I set about solving it in Haskell, and came up against a block to my success:

Given a list of numbers xs and another number n, generate a list of all the possible combinations lists of length n that contain the numbers from xs.

So, given the list [1,2] and the number 3, the function should generate this list of lists: [[1,1,1],[1,1,2],[1,2,1],[1,2,2],[2,1,1],[2,1,2],[2,2,1],[2,2,2]]

I was pretty sure that this had been done before, but because I'm trying to get better at deducing algorithms, I'm stubborn, and I'm doing this for fun, I decided to figure out the algorithm for myself.

It wasn't as easy as it seemed.

I sat down and wrote out the outputs for a few different sets of inputs. I looked at them, then I looked some more. I could see a couple of patterns, namely that (length of xs)^n is the length of the final output and that you could create a rectangle of answers with width (length of xs) and height (length of xs)^(n-1). Neither of these was helpful.

I left the problem alone for a while, hoping that time would give me perspective. I was surprised how hard I was finding it to find the pattern.

Today I came back to it with a fresh brain and time to kill. I took a walk to the park, sat down, and started to write out the output where the input is a list of length 3 and n is 3. As I was writing, I had the realisation that the way to solve this was to figure out the algorithm of how to write it down. The problem in my previous examples of output was that I hadn't written it in a good enough pattern. I started writing out the output for a different input: a list of length 4, with n of 4 (256 items, for those keeping count). This time I was very systematic about how I wrote out the output. I got to the 44th list in the list and stopped to see if I could see it yet. I could: the last element in the individual lists was repeating every 4 items.

I stood up and, as is my wont when I am thinking, I started pacing. I must have looked a little unhinged, as I was pacing in a small circle around my bag.

It took me a few minutes, but eventually I figured out how to represent what I was seeing in my written output as an algorithm: the first time through, each item of xs is appended to an empty list, for each subsequent time through, each item in xs is appended to each list in the list of lists.

In Haskell, I came up with this function to do the work:

makeallsets :: Integral a => [a] -> a -> [[a]]
makeallsets xs n = mas (addtoonelist [] xs) xs (n - 1)
    where mas yss _ 0 = yss
          mas yss xs (n + 1) = mas (addtoeachlist yss xs) xs n
              where addtoeachlist [] xs = []
                    addtoeachlist (ys:yss) xs = (addtoonelist ys xs) ++ (addtoeachlist yss xs)
          addtoonelist ys [] = []
          addtoonelist ys (x:xs) = (x : ys) : (addtoonelist ys xs)

This allowed me to create an answer to the Stack Overflow problem. (Although there's no point posting it for 3 very good reasons: 1. it's not in the target language (which is Scala); 2. It uses the brute force approach; 3. There is already a better answer.)

Score 1 for perseverance!

P.s. if anyone would like to show me a better way, I'd be very glad to hear it.

Sunday, July 25, 2010

Update to ToDoList

I have made an update to the ToDoList WPF application I wrote some time ago.

ToDoList version 1.2.0.0

Changes:

  • Created a ViewModel for the To Do List object and To Do List items.
  • Set up templates in the MainWindow XAML that display the ViewModel.
  • Added in an edit window.
  • Added in a context menu for items that allows for editing, deletion and marking as done.
  • Added in edit and delete functionality.

I think the final addition will be to allow users to view done items. I'll get around to this at some point :D

Wednesday, May 05, 2010

Memoizing functions in c++

I was thinking about memoization, and how I'd not yet used it. I thought this was a bad thing simply because not using it might lead to me forgetting about it. So I'm putting together this blog post to help me solidify the concept.

A long while ago I realised a simple fact about square numbers: x² = (x-1)² + (x-1) + x, x ∈ N. I.e. for any positive integer, its square is the square of the previous integer plus the previous integer plus itself. (e.g. 17*17 = 16*16 + 16 + 17)

This is something that is unlikely to be interesting or useful, except that I can use it to demonstrate memoization.

From the above formula you can write a recursive function:

int square(int n)
{
    if(1 == n)
    {
        return 1;
    }
    return (square(n - 1) + (n - 1) + n);
}

As you can see, this is a very long-winded way to get the square of a number, and not a function that would ever be used in reality, but it is a good candidate for memoization.

Memoization in this instance is very easy. Simply add in a static map<int, int> and update it for each number you haven't calculated yet:

int square(int n)
{
 static std::map<int, int> results;
 if(1==n)
 {
  return 1;
 }
 if(0 == results[n])
 {
  results[n] = square(n-1) + n-1 + n;
 }
 return results[n];
}

It might be that you'll want to make the results variable on the heap with some sort of smart pointer, so that it automatically deletes itself, but other than that this second version should give a performance increase over the original.

I carried out some simple timing tests with std::clock(). The programme had to calculate the squares from 1 to 32767 using the memoized and non-memoized functions, in a loop:

#include <map>
#include <iostream>
#include <ctime>

int calcSqr(int);
int calcSqrSlow(int);

int main()
{
 clock_t start1 = std::clock();
 for(int i = 1; i <= 32767; ++i)
 {
  calcSqrSlow(i);
 }
 clock_t start2 = std::clock();
 std::cout << "Ticks taken (slow): " << start2 - start1 << std::endl;
 clock_t start3 = std::clock();
 for(int i = 1; i <= 32767; ++i)
 {
  calcSqr(i);
 }
 clock_t start4 = std::clock();
 std::cout << "Ticks taken (memo): " << start4 - start3 << std::endl;
 return 0;
}

int calcSqrSlow(int n)
{
 if(1 == n)
 {
  return 1;
 }
 
 return (calcSqrSlow(n - 1) + (n - 1) + n);
}

int calcSqr(int n)
{
 static std::map<int, int> results;
 
 if(1==n)
 {
  return 1;
 }
 
 if(0 == results[n])
 {
  results[n] = calcSqr(n-1) + n-1 + n;
 }
 
 return results[n];
}

Ticks taken for the normal function: 3120
Ticks taken for the memoized function: 78

Obviously this test was biased towards the memoized function, but I really did it to show the potential benefits of memoizing a function where the results can be reused.

Tuesday, March 23, 2010

SVG + Javascript drag and zoom

Recently I've been working on a project that uses SVG (Scalable Vector Graphics).

I have been using SVGWeb (http://code.google.com/p/svgweb/) so that the SVG will work in all the major browsers.

It is a fantastic library and I am so grateful to the people who work on it.

The things I found difficult were figuring out how to get zooming with the mouse wheel and dragging to work. I had it working in Firefox, using its native SVG renderer; however, SVGWeb does things differently. It took me a while to work out how. I'm going to share what I found here. (Hooking the mouse wheel is actually explained on the SVGWeb mailing list: Mouse Wheel Events.)

With dragging, I knew I needed to store the old X and Y values of the position of the mouse and take the difference between them and the new mouse position. For some reason setting global variables for the old X and Y values didn't quite work - the delta was very small, approximately 7.5 times too small.

With zooming, the SVGWeb library doesn't pick up the mouse wheel event. The way to get around this is to attach the mouse wheel event to the container tag (e.g. div) that is surrounding the object tag that is holding the SVG on the HTML page.

On to the code!

I did not come up with the Javascript - I took it from various places; mostly the SVGWeb mailing list entry above and the "photos" demo that comes with SVGWeb.

This is the main HTML and Javascript for the page that is holding the SVG:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
        <title>SVG Example</title>
        <meta name="svg.render.forceflash" content="true" />
        <link rel="SHORTCUT ICON" href="favicon.ico" />
    </head>
    <body onload="loaded()">
        <div id="svgContainer">
            <!--[if IE]>
            <object id="svgImage" src="example.svg" classid="image/svg+xml" width="100%" height="768px">
            <![endif]-->
            <!--[if !IE]>-->
            <object id="svgImage" data="example.svg" type="image/svg+xml" width="100%" height="768px">
            <!--<![endif]-->
            </object>
        </div>
        <script type="text/javascript" src="svg/src/svg.js" data-path="svg/src/" ></script>
        <script type="text/javascript">
            function loaded()
            {
                hookEvent("svgContainer", "mousewheel", onMouseWheel);
            }
            function hookEvent(element, eventName, callback)
            {
              if(typeof(element) == "string")
                element = document.getElementById(element);
              if(element == null)
                return;
              if(element.addEventListener)
              {
                if(eventName == 'mousewheel')
                  element.addEventListener('DOMMouseScroll', callback, false);
                element.addEventListener(eventName, callback, false);
              }
              else if(element.attachEvent)
                element.attachEvent("on" + eventName, callback);
            }
            function cancelEvent(e)
            {
                e = e ? e : window.event;
                if(e.stopPropagation)
                    e.stopPropagation();
                if(e.preventDefault)
                    e.preventDefault();
                e.cancelBubble = true;
                e.cancel = true;
                e.returnValue = false;
                return false;
            }
            function onMouseWheel(e)
            {
                var doc = document.getElementById("svgImage").contentDocument;  
                e = e ? e : window.event;
                doc.defaultView.onMouseWheel(e);
                return cancelEvent(e);
            }
        </script>
    </body>
</html>

This is the SVG and Javascript:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg version="1.0" xmlns="http://www.w3.org/2000/svg" onload="loaded()" id="svgMain" >
    <script type="text/javascript" language="javascript">
    <![CDATA[
        var isDragging = false;
        var mouseCoords = { x: 0, y: 0 };
        var gMain = 0;
       
        function loaded()
        {
            var onloadFunc = doload;

            if (top.svgweb)
            {
                top.svgweb.addOnLoad(onloadFunc, true, window);
            }
            else
            {
                onloadFunc();
            }
        }
       
        function doload()
        {
            hookEvent('mover', 'mousedown', onMouseDown);
            hookEvent('mover', 'mouseup', onMouseUp);
            hookEvent('mover', 'mousemove', onMouseMove);
            hookEvent('mover', 'mouseover', onMouseOver);
            gMain = document.getElementById('gMain');
            gMain.vScale = 1.0;
            gMover = document.getElementById('mover');
            gMover.vTranslate = [50,50];
            setupTransform();
        }
       
        function onMouseDown(e)
        {
            isDragging = true;
        }
       
        function onMouseUp(e)
        {
            isDragging = false;
        }
       
        function onMouseOver(e)
        {
            mouseCoords = {x: e.clientX, y: e.clientY};
        }
       
        function onMouseMove(e)
        {
            if(isDragging == true)
            {
                var g = e.currentTarget;
                var pos = g.vTranslate;
                var xd = (e.clientX - mouseCoords.x)/gMain.vScale;
                var yd = (e.clientY - mouseCoords.y)/gMain.vScale;
                g.vTranslate = [ pos[0] + xd, pos[1] + yd ];
                g.setAttribute("transform", "translate(" + g.vTranslate[0] + "," + g.vTranslate[1] + ")");
            }
           
            mouseCoords = {x: e.clientX, y: e.clientY};
           
            return cancelEvent(e);
        }
       
        function setupTransform()
        {
            gMain.setAttribute("transform", "scale(" + gMain.vScale + "," + gMain.vScale + ")");
        }
       
        function hookEvent(element, eventName, callback)
        {
            if(typeof(element) == "string")
                element = document.getElementById(element);
            if(element == null)
                return;
            if(eventName == 'mousewheel')
            {
                element.addEventListener('DOMMouseScroll', callback, false);
            }
            else
            {
                element.addEventListener(eventName, callback, false);
            }
        }
       
        function cancelEvent(e)
        {
            e = e ? e : window.event;
            if(e.stopPropagation)
                e.stopPropagation();
            if(e.preventDefault)
                e.preventDefault();
            e.cancelBubble = true;
            e.cancel = true;
            e.returnValue = false;
            return false;
        }
       
        function onMouseWheel(e)
        {
            e = e ? e : window.event;
            var wheelData = e.detail ? e.detail * -1 : e.wheelDelta / 40;
           
            if((gMain.vScale > 0.1) || (wheelData > 0))
            {
                gMain.vScale += (0.02 * wheelData);
            }
           
            setupTransform();
           
            return cancelEvent(e);
        }
    ]]>
    </script>
    <g id="gMain">
        <g transform="translate(50,50)" id="mover">
            <circle stroke-width="2" stroke="black" cx="0" cy="0"  r="20" fill="red"/>
            <text font-family="verdana" text-anchor="middle" transform="translate(0,40)" fill="black" stroke-width="1" font-size="12" >Drag me!</text>
        </g>
    </g>
</svg>

There is some overlap in the Javascript presented there; this is just to keep things simple if you're copy/pasting this to test for yourself.

This Javascript in the main file passes the mouse wheel event info to the SVG document:
function onMouseWheel(e)
{
   var doc = document.getElementById("svgImage").contentDocument;   
   e = e ? e : window.event;
   doc.defaultView.onMouseWheel(e);
   return cancelEvent(e);
}
The rest of the important Javascript is in the SVG document.
To get dragging to work, first define a global object to hold position information:
var mouseCoords = { x: 0, y: 0 };
When the mouse moves over the desired element, update the object:
function onMouseOver(e)
{
    mouseCoords = {x: e.clientX, y: e.clientY};
}
There also needs to be a global boolean to switch dragging on and off. I called mine isDragging. Toggle dragging when the mouse is up or down on the element.
function onMouseDown(e)
{
    isDragging = true;
}
      
function onMouseUp(e)
{
    isDragging = false;
}
When moving the mouse with dragging on, change the position of the element and update the object. Notice that the delta is being divided by the scale. This prevents the movement from becoming erratic.
function onMouseMove(e)
{
    if(isDragging == true)
    {
        var g = e.currentTarget;
        var pos = g.vTranslate;
        var xd = (e.clientX - mouseCoords.x)/gMain.vScale;
        var yd = (e.clientY - mouseCoords.y)/gMain.vScale;
        g.vTranslate = [ pos[0] + xd, pos[1] + yd ];
        g.setAttribute("transform", "translate(" + g.vTranslate[0] + "," + g.vTranslate[1] + ")");
    }
  
    mouseCoords = {x: e.clientX, y: e.clientY};
  
    return cancelEvent(e);
}

And that's how it works.