Monday, June 24, 2013

More AI Ramblings

(This is technically after lunch, don't judge me!)

The other problem I thought up today was that I don't really know what each node of the distributed processes would do.

Should they all do the same thing? Neurons are all basically the same, should each processing unit be basically the same? Or should a unit be more like a part of the brain? So a unit would be like Broca's area, or the visual cortex, etc. But not those things exactly because they are human building blocks, not blocks of the programme.

Let's say there are N types of block. That is analogous to the modular approach to neurology and cognitive psychology. But they need to be resilient to defect, so any block's functions should, over time, be able to be transferred to some other block, as required. This is sort of analogous to the monolithic approach. Meeting in the middle seems par for the course with psychology. Shades of grey are inherent in defining consciousness. Maybe it's even more exciting than a grey scale, perhaps it involves all values of colour.

So the blocks' functions would be mutable. That's a scary thought, but practical.

I shall carry this on after today.

Sunday, June 02, 2013

AI Rambling

I've always had delusions of grandeur. That's what inspired me to start this blog: that I would be able to chronicle my development of AI software. Of course I have since found that I'm not quite smart enough to do that. However today I will indulge myself.
I have recently been reading New Scientist articles on consciousness and the analogical nature of thought.

Firstly: It (finally?) occurred to me that AI should be layered. I had always had in mind a distributed model, but it had always been on one layer. But if some processes seemed "unconscious" and others "conscious", for example the aggregation of input vs the perception of input, then it would be easier to combine input into conscious thoughts because of the specialised nature of the "conscious" and "unconscious" processing units. The AI would only have thoughts that made sense to it (so the theory goes).

The distributed model would have various components, trying to produce analogies of things like "the seat of consciousness". Which brings me to my second thought: how analogy fits in. Douglas Hofstadter says that "Analogy is the machinery that allows us to use our past fluidly to orient ourselves in the present." So analogy can be used not only as the storage mechanism for thought, but as the transport too. It's important to remember that "storage" is shorthand not only for long term or short term memories, but also for the information currently being processed. The thought currently bubbling through your prefrontal cortex, etc. is analogical.

The problem for me is trying to figure out what to base analogies for a programme on. Humans use feeling. What is a good analogy for feeling in a programme? Processor strain? Amount of memory being used? What are good feelings? What are bad feelings? Could it be some sort of arbitrary value? I don't think an arbitrary value would work because of its ungrounded nature. I would prefer things based on what represents reality for the programme, not some abstract idealism concocted by my imagination. What feels good or bad for a programme won't be the same as what feels good or bad for me, but there will be a way to link them together through analogy.

I think I will continue this after lunch.

Saturday, June 02, 2012

Sparse Matrix Multiplication

I want the Math.NET Numerics developers to know their work is great, they put together an easy to use, astoundingly well documented numerical library for .NET. Please know this little criticism comes from a place of respect. It could even be that the code has been updated since your last release and what I'm going to point out is no longer a problem.

I really don't know much about calculus and mathematics at that level. I barely passed A-level maths, and the only time I've used any of the knowledge gained therein was when I had to calculate the first derivative of 1-e-x at university. My mathematics skills are weak (sadly). So, when, in mid-April, I was asked at work to implement some maths heavy algorithms, I felt suitably challenged. Thankfully the scientist who was feeding me the algorithms understood them really well and was on hand to explain things to me over and over again until we finally got things working yesterday. Yay!

Some of what we did relied on sparse matrices, something I had heard of, but never used. So my first thought was that I needed a third party library to do these calculations. The library we are currently using is the bluebit .NET matrix library, it's not perfect and we'll have to replace it with something faster, but for the moment it makes the code testable. This matrix was not my first choice, ideally I wanted something we didn't have to pay for. My first stop was the Math.NET Numerics library. This, unfortunately proved to be too slow. I also tried out Extreme Optimization, but this library was also too slow. Other libraries I looked at were ILNumerics, IMSL.NET and Center Space NMath. I looked but I did not test these last three because each library's API and help were so bad I couldn't figure out how to do what I needed to do. I don't have time to figure out matrix maths, this is why I'm looking for a library. If you want me to choose yours, make it easy to use.

So that was the bulk of the outcome of my foray in numerical libraries. Bluebit is my current choice, but I will have to change it for something faster. This is not the only thing I learned. I learned something that I hope, if they haven't already, the Math.NET developers will be able to use in their code. I've not time to dive into the project, and patch it myself — as I've said, my understanding of the maths is not great — so feel free to take the code here and fix it to work in the library.

At work I'm dealing with quite large matrices. The stuff I've been testing with is 8K x 8K points, and the real data will probably be up to 32K x 32K. But these are sparse matrices, so working with them should not be too processor and memory intensive. The major things I need to do are transposition, multiplication and inversion. Inversion is the killer, and understanding it is currently over my head. It's the place where Extreme Optimization fell down, and where bluebit struggles. I need the algorithms to run in a few seconds. Currently, with 16K x 16K points and bluebit, it's taking 2 minutes. The algorithm did not complete with the other two libraries. I waited for over half an hour, and still nothing, and that was with 8K data.

The first problem that Math.NET encountered was with the multiplication of the matrices. This is what I hope I've optimised. All I've done is profile their code and change the bit that took forever - assigning data to a point in the matrix

My first step was to write these two tests, to make sure I was multiplying the matrices correctly:

[Test]
public void MatrixMultiplication()
{
    var leftM = new double[,] {{4, 5, 6, 7, 8, 1, 2}, {3, 9, 6, 7, 3, 3, 1}, {2, 2, 8, 4, 1, 8, 1}, {1, 9, 9, 4, 3, 1, 2}};
    var rightM = new double[,] {{1, 8, 1}, {2, 6, 2}, {3, 4, 1}, {4, 2, 2}, {5, 1, 1}, {6, 3, 2}, {7, 5, 1}};
    var expectedM = new double[,] {{120, 121, 46}, {107, 133, 51}, {106, 98, 40}, {97, 122, 43}};

    var sm = new SparseMatrix();

    var resultM = sm.MultiplyMatrices(leftM, rightM);

    Assert.AreEqual(expectedM.Rank, resultM.Rank);
    Assert.AreEqual(expectedM.GetLength(0), resultM.GetLength(0));
    Assert.AreEqual(expectedM.GetLength(1), resultM.GetLength(1));

    for(int row = 0; row < 4; row++)
    {
        for(int col = 0; col < 3; col++)
        {
            Assert.AreEqual(expectedM[row, col], resultM[row, col]);
        }
    }
}

[Test]
public void SparseMatrixMultiplication()
{
    var leftM = new double[,] {{1,2,3,0,0,0,0,0,0,0}, {0,0,0,0,0,1,2,0,0,0}, {1,0,4,0,0,5,0,0,0,0}, {0,4,0,5,0,6,0,0,7,0}, {9,0,0,0,0,0,8,0,0,0}};
    var rightM = new double[,] {{0,2,0,4,0}, {1,0,0,1,1}, {3,0,1,3,0}, {4,0,0,0,0}, {0,5,6,0,0}, {0,9,0,6,0}, {0,1,0,3,0}, {0,0,8,0,9}, {0,0,0,0,7}, {0,1,0,0,5}};
    var expectedM = new double[,] {{11,2,3,15,2}, {0,11,0,12,0}, {12,47,4,46,0}, {24,54,0,40,53}, {0,26,0,60,0}};
 
    var sm = new SparseMatrix();

    var resultM = bc.MultiplyMatrices(leftM, rightM);

    for (int row = 0; row < 4; row++)
    {
        for (int col = 0; col < 3; col++)
        {
            Assert.AreEqual(expectedM[row, col], resultM[row, col]);
        }
    }
}

(SparseMatrix isn't really the name of the class, I put the multiplication into the class that was handling the algorithm, but I'm not allowed to talk about that!)

Then I spent ages struggling (because of my ignorance, the code is easy to read) with the Math.NET code to try and understand sparse matrix multiplication - how it could be faster than normal matrix multiplication, and how I could implement it faster. It took a couple of days. I spent a couple of days, rather than giving up and finding a proprietary library right away, because I thought that Math.NET would do the business when it came to inversion. Sadly this isn't the case. Anyway, this is my optimised sparse matrix multiplication method:

private IEnumerable GetNonZeroIndicesForMatrixColumn(double[,] matrix, long col, int rowcount)
{
    for (int row = 0; row < rowcount; row++)
    {
        if (matrix[row, col] != 0)
        {
            yield return row;
        }
    }
}

private IEnumerable GetNonZeroIndicesForMatrixRow(double[,] matrix, int row, int colcount)
{
    for (int col = 0; col < colcount; col++)
    {
        if (matrix[row, col] != 0)
        {
            yield return col;
        }
    }
}
        
/// <summary>
/// Matrix multiplication optimised for sparse matrices
/// </summary>
/// <param name="matrix1">Matrix on the left of the multiplication</param>
/// <param name="matrix2">Matrix on the right of the multiplication</param>
/// <returns>A matrix that is the multiplication of the two passed in</returns>
public double[,] MultiplyMatrices(double[,] matrix1, double[,] matrix2)
{
    int j = matrix1.GetLength(1);
    if (j != matrix2.GetLength(0))
    {
        throw new ArgumentException("matrix1 must have the same number of columns as matrix2 has rows.");
    }

    int m1Rows = matrix1.GetLength(0);
    int m2Cols = matrix2.GetLength(1);
    double[,] result = new double[m1Rows, m2Cols];

    var nonZeroRows = new List[m1Rows];

    Parallel.For(0, m1Rows, row =>
    {
        nonZeroRows[row] = GetNonZeroIndicesForMatrixRow(matrix1, row, j).ToList();
    });

    var nonZeroColumns = new List[m2Cols];

    Parallel.For(0, m2Cols, col =>
    {
        nonZeroColumns[col] = GetNonZeroIndicesForMatrixColumn(matrix2, col, j).ToList();
    });



    Parallel.For(0, m1Rows , row =>
    {
        Parallel.For(0, m2Cols, column =>
        {
            var ns = nonZeroColumns[column].Intersect(nonZeroRows[row]);
            double sum = ns.Sum(n => matrix1[row, n] * matrix2[n, column]);
            result[row, column] = sum;
        });
    });

    return result;
}

As you can see, there is a lot of reliance on the parallel methods that come with .NET 4. That, coupled with the trick of getting the intersection of the non-zeros in the rows of the left matrix with the columns of the right matrix, seems to be the major advantage of my method over Math.NET, because their assignments can't be done in parallel. This could be to do with Silverlight compatibility issues, I don't know. I don't have to worry about Silverlight.

I have run a benchmark for my code. I created a 5000 x 5000 point matrix and filled it at random points with random data (well, pseudo-random). I benchmarked at 5, 50, 150 and 500 non-zero items per row. I ran the test 10 times, to get a mean. The table shows the results:

Number of non-zeros per rowMean seconds taken to multiplyStandard Deviation
56.244657160.1037383251
5051.109723320.8521258197
15093.2973362977.751344564
50013.184354116.4991175895

I find it strange that the standard deviation for the 150 condition is so high. If anyone can see a problem in my code, I'd be really happy to hear it! The full test is below:

toggle test code
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

namespace Math.NetBenchmark
{
    class Program
    {
        private static Random _r = new Random();

        static void Main(string[] args)
        {
            const int rows = 5000;
            const int cols = 5000;
            var nonzerosPerRow = new [] {5, 50, 150, 500};

            Console.WriteLine("started");
            using (var sw = new StreamWriter("MyMX10.results"))
            {
                sw.WriteLine("Number of non-zeros,Time taken");
                foreach (var nzpr in nonzerosPerRow)
                {
                    Console.Write(nzpr+" - making left");
                    var left = MakeMatrix(rows, cols, nzpr);
                    Console.Write("making right");
                    var right = MakeMatrix(rows, cols, nzpr);
                    Console.Write("multiplying...");
                    var startTime = DateTime.Now;
                    MultiplyMatrices(left, right);
                    var endTime = DateTime.Now;
                    var diff = endTime - startTime;
                    sw.WriteLine(nzpr + "," + diff.TotalSeconds);
                    Console.WriteLine("done");
                }
            }

            Console.WriteLine("done");
        }

        private static double[,] MakeMatrix(int rows, int cols, int nonzerosPerRow)
        {
            var result = new double[rows, cols];
            var colsPoss = Enumerable.Range(0, cols).ToArray();
            Parallel.For(0, rows, iRow =>
            {
                var posleft = colsPoss;
                Console.Write(".");
                for (int i = 0; i < nonzerosPerRow; i++)
                {
                    int posindex = _r.Next(posleft.Length);
                    int index = posleft[posindex];
                    result[iRow, index] = 1+_r.NextDouble();
                    posleft = posleft.Take(index).Concat(posleft.Skip(index+1)).ToArray();
                }
            });
            return result;
        }

        private static IEnumerable GetNonZeroIndicesForMatrixColumn(double[,] matrix, long col, int rowcount)
        {
            for (int row = 0; row < rowcount; row++)
            {
                if (matrix[row, col] != 0)
                {
                    yield return row;
                }
            }
        }

        private static IEnumerable GetNonZeroIndicesForMatrixRow(double[,] matrix, int row, int colcount)
        {
            for (int col = 0; col < colcount; col++)
            {
                if (matrix[row, col] != 0)
                {
                    yield return col;
                }
            }
        }

        public static double[,] MultiplyMatrices(double[,] matrix1, double[,] matrix2)
        {
            int j = matrix1.GetLength(1);
            if (j != matrix2.GetLength(0))
            {
                throw new ArgumentException("matrix1 must have the same number of columns as matrix2 has rows.");
            }

            int m1Rows = matrix1.GetLength(0);
            int m2Cols = matrix2.GetLength(1);
            double[,] result = new double[m1Rows, m2Cols];

            var nonZeroRows = new List[m1Rows];

            Parallel.For(0, m1Rows, row =>
            {
                nonZeroRows[row] = GetNonZeroIndicesForMatrixRow(matrix1, row, j).ToList();
            });

            var nonZeroColumns = new List[m2Cols];

            Parallel.For(0, m2Cols, col =>
            {
                nonZeroColumns[col] = GetNonZeroIndicesForMatrixColumn(matrix2, col, j).ToList();
            });

            Parallel.For(0, m1Rows, row =>
            {
                Parallel.For(0, m2Cols, column =>
                {
                    var ns = nonZeroColumns[column].Intersect(nonZeroRows[row]);
                    double sum = ns.Sum(n => matrix1[row, n] * matrix2[n, column]);
                    result[row, column] = sum;
                });
            });

            return result;
        }
    }
}

Friday, April 27, 2012

Object Thinking - Anthropomorphism

This follows on from Object Thinking - Objects have actions

Anthropomorphism is essential for object thinking to take place. Anthropomorphism is when a person attributes human mental states to other, non-human, things. Attributing human-like mental states to objects allows a programmer to treat the object as an agent, as opposed to something inanimate, and so bestow upon it appropriate behaviours allowing it to act in an appropriate manner within the application, interacting with other object. The amount of responsibility that you want an object to have will reflect how much you anthropomorphise it. It is important not to give an object too much responsibility, as explained by the Single Responsibility Principle.

That Anthropomorphism occurs is so obvious it doesn't need investigating! So obviously it has been researched by a huge number of people. The paper being looked at here, Making Sense by Making Sentient: Effectance Motivation Increases Anthropomorphism, by Waytz et al. in 2010[1], is one that attempts to explain why and how people anthropomorphise.

Their hypothesis is that one of the reasons people anthropomorphise objects because they want to increase their effectance motivation. This is the motivation to be an effective social agent. The researchers conduct six experiments based on this hypothesis.

The first experiment asked participants to rate their computers. Half of the participants (A) were asked to rate how much they felt their computer has a mind of its own. The other half (B) were asked to rate how much their computer appeared to behave as if it has its own beliefs and desires. Both sets were asked how often they had problems with the computer or its software. The hypothesis for the study is that the more problems people have with their computer, the more they will anthropomorphise it.

Results showed that, in accordance with the hypothesis, the more often participants in group A had problems with their computers, the more they thought their computers had minds of their own and that the more often participants in group B had problems, the more likely they were to believe their computers had beliefs and desires.

The second experiment asked participants to judge the agency of gadgets that had been assigned one of two descriptions about it. The gadget's description either made it seems as though what it did was within or outside the control of the user, but always described the same functionality. There were two groups of participants. They all saw the same set of gadgets, there were alternating sets of descriptions. After reading the descriptions, the participants were asked to rate how much control they thought they had over the gadgets, and then to assess how much the gadget had a “mind of its own”, “intentions, free will and consciousness” and appeared to experience emotions, in the same way they had to rate how much control they thought they had over the gadget.

In alignment with their hypothesis, the participants rated the gadgets with low controllability to be more anthropomorphic than those that were perceived to be more easy to control.

The third experiment was essentially replica of the second, but the participants were subject to an fMRI scan while rating the gadgets. This was conducted because the researched reasoned that people could be using mind as a metaphor for the behaviour they were seeing, rather than actually attributing minds to the objects. By determining the region of the brain in use when anthropomorphising takes place they could rule out certain modes of thinking and give weight to a possible seat for anthropomorphism in the brain. The researchers propose, through reference to previous studies, that the superior temporal sulcus (STS) is involved in social or biological motion, the medial prefrontal cortex (MPFC) is in use when considering people vs objects and considering the mind of another, and the amygdala, inferior parietal lobe and intraparietal sulcus are active when evaluating unpredictability. They therefore hypothesise that the MPFC will increase in activity when anthropomorphising.

The results of the experiment showed the ventral MPFC (vMPFC) to be the most active region, whereas the STS was not active.

The results also showed activation in a network of areas related to mentalising, which strongly resembles a circuit corresponding to processing of self-projection, mentalising and general social cognition, which is what would be expected for anthropomorphism.

This implies that unpredictable gadgets are perceived to have a mind, in an actual rather than metaphorical sense.

The results are inconsistent with the alternative hypotheses: attribution of mind to objects only related to social or biological motion analogies; that processing unpredictability is the cause of the activation; or that the activation is influenced by animism.

The fourth experiment asked participants to evaluate a robot that would answer yes/no questions the participants asked. There were three conditions that the participants were randomly assigned to: the condition where the robot answered yes as often as no, the condition where the robot answered no more often, and the condition where the robot answered yes more often. The second two conditions were the predictable conditions.

After asking the questions and receiving answers, the participants were asked to rate the robot on predictability, then on how much they thought it had free will, its own intentions, consciousness, desires, beliefs and the ability to express emotions. The participants were also asked to rate the robot on attractiveness, efficiency and strength. The ratings were done on a five point scale from “Not at all” (1) to “Extremely” (5).

Results from the experiment showed that participants in the predictable groups found the robot to be predictable, more-so than those in the unpredictable group. Also predicable-no was felt to be more predictable than predictable-yes.

Importantly anthropomorphism was found to be more prevalent where the robot would found to be less predictable.

The only significant difference between the conditions and the non-anthropomorphic evaluation was that predictable-yes participants found the robot to be more attractive than predictable-no. The researchers do not discuss this finding. There was no significant interaction found between liking the robot and anthropomorphising it.

These results show people anthropomorphise unpredictable agents, and present a causal link between the two. This is important as the previous three experiments could be interpreted as a simple association rather than a clear cognitive process.

Experiment five gave some participants motivation to predict the behaviour of a robot, and the others were asked to predict the behaviour with out being motivated. The hypothesis was that increasing motivation should increase motivation to understand, explain and predict an agent.

Participants evaluated a robot on a computer screen. They watched videos of the robot perform but not complete a task. Participants saw options of what the robot would do next and were asked to pick what they thought would happen. Participants in the motivation condition were offered $1 per correct answer. All participants then evaluated the robot's anthropomorphism. Finally the participants were shown the outcome, and compensated where necessary.

Results showed that motivated participants rated the robot as more anthropomorphic.

This shows that effectance motivation is increased when a person is motivated to understand an agent, and not simply controlled by the predictability of the agent.

The sixth and final experiment was predicated by the hypothesis that anthropomorphism should satisfy effectance motivation, i.e. anthropomorphism should satiate the motivation for mastery and make agents seem more predictable and understandable.

Participants evaluated four stimuli (dog, robot, alarm clock, shapes). Half of the participants were told to evaluate the dog and alarm clock objectively, and the robot and shapes in an anthropomorphic fashion, the other half were given the opposite instructions.

Each participant was shown a video of each stimulus three times. After the third time the participant was asked to evaluate the stimulus on two scales: the extent to which they understood the stimulus and the extent to which they felt capable of predicting its future behaviour.

The results showed that the dog and shapes were found to be easier to understand than the robot or alarm clock.

Importantly, participants perceived greater understanding and predictability of agents they had been told to anthropomorphise. The effect did not seem to depend on the group the participant was in.

This study implies that anthropomorphism satisfies effectance motivation.

It is clear from this paper that anthropomorphism is a natural part of human cognition, that is used to make behaviour of objects in the world around us seem more predictable and thus give us a better sense of control. It also shows that there is a neurological basis for this behaviour; the brain is set-up to anthropomorphise the world around us.

[1] Making Sense by Making Sentient: Effectance Motivation Increases Anthropomorphism. A. Waytz, C. K. Morewedge, N. Epley, G. Monteleone, J. H. Gao, J. T. Cacioppo. Journal of Personality and Social Psychology 2010, Vol.99, No.3, 410–435

Friday, February 17, 2012

What does #deadbeef; look like?

I've been working with WPF themes a lot this week. My task has been to take a theme from one application and put it into another, otherwise unrelated, application. This is not as easy as it sounds. The themes from the original application do not transplant to other applications without judicious use of a hacksaw.

While going through the theme's various XAML files I noticed things like color="#FF123456", and I looked and I couldn't figure out what colour I was looking at. There are a lot of these hex notation colours and they all seemed opaque to me.

It struck me that it would be nice if I could just hover my mouse over the hex and get the colour to pop up. Sounded like an easy enough task. So I set out to write an extension for Visual Studio to do just that.

My first attempt to write an extension to Firefox met with disappointment - I couldn't figure out how to get started - so I was a little apprehensive about writing an extension for Visual Studio. Luckily extensions for Visual Studio are easy to create (so long as you have Visual Studio).

  1. Download the Visual Studio 2010 SDK (or this one for Service Pack 1)
  2. Install the SDK
  3. Start a new project by selecting from the C#/Extensibility templates - I chose Editor Text Adornment, because I wanted to adorn the editor text with something.
The project comes with code already in place, so you can just hit F5 and you'll be able to see the extension at work right away. Then read the code to see how it works! It's pretty obvious, and with the Intellisense of Visual Studio you can discover all the bits you'll need with ease.

So my task, now that I have the ability to write an extension, was to write an extension that does what I want - i.e. show a colour swatch of the hex notation I'm hovering over.

Step 1: create a regex that picks out the hex. I tried one or two and settled on this one: #(([0-9A-F]{6})|([0-9A-F]{8})|([0-9A-F]{3}))["<;]. There might be ways to write it shorter, and I'm willing to hear them, but I'm not a regex guru, so I'll stick with simple. You'll notice that I've constrained the hex to start with a # and end with ", <, or ;. This way the regex will only pick up hex that is the right length, and not any old length, and is most likely meant to be a colour. All the colour hexes I could see ended in ", < or ;. I could have missed an edge case, but not so far!

Step 2: turn that string into a colour. There might be a library function for doing this, but I couldn't find it (would be glad if someone were to tell me about it!). I wrote my own:

private Tuple<byte, byte, byte, byte> BytesFromColourString(string colour)
{
    string alpha;
    string red;
    string green;
    string blue;

    if (colour.Length == 8)
    {
        alpha = colour.Substring(0, 2);
        red = colour.Substring(2, 2);
        green = colour.Substring(4, 2);
        blue = colour.Substring(6, 2);
    }
    else if (colour.Length == 6)
    {
        red = colour.Substring(0, 2);
        green = colour.Substring(2, 2);
        blue = colour.Substring(4, 2);
        alpha = "FF";
    }
    else if (colour.Length == 3)
    {
        red = colour.Substring(0, 1) + colour.Substring(0, 1);
        green = colour.Substring(1, 1) + colour.Substring(1, 1);
        blue = colour.Substring(2, 1) + colour.Substring(2, 1);
        alpha = "FF";
    }
    else
    {
    throw new ArgumentException(String.Format("The colour string may be 8, 6 or 3 characters long, the one passed in is {0}", colour.Length));
    }
    return new Tuple<byte, byte, byte, byte>( Convert.ToByte(alpha, 16)
                                            , Convert.ToByte(red, 16)
                                            , Convert.ToByte(green, 16)
                                            , Convert.ToByte(blue, 16));
}

OK, so this actually returns a Tuple<byte, byte, byte, byte>. I'm not entirely sure why I chose that over returning an actual colour. I might refactor that later. Anyway, turning the tuple into a System.Windows.Media.Color is a trivial call to the static method Color.FromArgb(byte, byte, byte, byte). Also, the above method is a brute force approach to breaking down the colour string into bytes, there could well be a better way. I'm sticking with what works until I'm shown something better.

My next hurdle was figuring out how to place the colour swatch where I wanted it. I was able to return the position in text the mouse was hovering over, which would give me a single character, but I couldn't think of how to use that position and character to get the hex colour string.

In the end I opted for a two stage approach. Stage one: when the layout updates, find the start and end positions for any colours in the view. Stage two: when the mouse is hovering somewhere, see if it's position is in any of the ranges previously stored.

Stage one looks like this:
private void OnLayoutChanged(object sender, TextViewLayoutChangedEventArgs e)
{
    _colourPositions = new List<Tuple<int, int, Color>>();
    var matches = Regex.Matches(_view.TextSnapshot.GetText(), "#(([0-9A-F]{6})|([0-9A-F]{8})|([0-9A-F]{3}))[\"<;]", RegexOptions.IgnoreCase);
    foreach(var m in matches)
    {
        var match = m as Match;
        var mgrp = match.Groups[1] as Group;
        var colourbytes = BytesFromColourString(mgrp.Value);
        var colour = Color.FromArgb(colourbytes.Item1, colourbytes.Item2, colourbytes.Item3, colourbytes.Item4);
        _colourPositions.Add(new Tuple<int,int,Color>(mgrp.Index, mgrp.Index + mgrp.Length, colour));
    }
}
I went with a list to store the position of the colours because I think it makes cleaner code than a dictionary would.
Stage two's like this:
private void ShowColourSwatch(int position, IMappingPoint textPosition, ITextView textView)
{
    _layer.RemoveAllAdornments();
    SnapshotPoint? snapPoint = textPosition.GetPoint(textPosition.AnchorBuffer, PositionAffinity.Predecessor);
    if (snapPoint.HasValue)
    {
        SnapshotSpan charSpan = textView.GetTextElementSpan(snapPoint.Value);
        var colourPos = _colourPositions.Find(cp => (cp.Item1 <= charSpan.Start) && (cp.Item2 >= charSpan.Start));
        if(colourPos != null)
        {
            Image image = CreateSwatchImage(colourPos, charSpan);

            _layer.AddAdornment(AdornmentPositioningBehavior.TextRelative, charSpan, null, image, null);
            Thread t = new Thread(p =>
            {
                Thread.Sleep(3500);
                lock (lockObject)
                {
                    Application.Current.Dispatcher.Invoke(new Action(() =>
                    {
                        _layer.RemoveAdornmentsByVisualSpan(charSpan);
                    }), new object[]{});
                }
            });
            t.Start();
        }
    }
}

The Thread in there just makes sure that the colour swatch disappears after three and a half seconds. CreateSwatchImage uses a lot of the code from the example project that Visual Studio gives you to start with, and just draws the colour swatch on a black and white background for contrast.

That is pretty much all the important code that I wrote in constructing the extension. There is one last snippet, I had to modify a single line in the auto-generated factory class so that the swatch would be above the text: [Order(After = PredefinedAdornmentLayers.Text, Before = PredefinedAdornmentLayers.Caret)]. Before that the property made the adornment go behind the text, which looked silly for my purposes.

The last thing that tripped me up was installing the extension. Obviously I can't sign my extension because I'm too cheap to pay for a certificate to do that with, so I can't get it put on the online extensions thing. However I was sure I could find a way. My first attempt was to double click on the .vsix file that Visual Studio had generated for me. This looked promising - it ran me through an install process and told me it had been successful, so I loaded up Visual Studio but my extension was no where to be found. I tried rebooting my computer, just in case, but to no avail. So I sought out where the extension had been placed and deleted it - which is how you are meant to uninstall extension, by the way - and went online to find out The Right Way™. A few places told me to put the extension in a folder under %appdata%, but that didn't seem to work. Eventually I found an MSDN page that explained I should be putting it under %localappdata%, which sorted me right out. Essentially the path should go something like %localappdata%Microsoft\VisualStudio\10.0\Extensions\[company]\[extensionName]\[version]\ although you can probably leave out [company] and [version] and it will still work. Once I put the extension there and loaded up Visual Studio, I checked the Extensions Manager in the tools menu and it was there, but needed enabling. After being enabled, and restarting Visual Studio, the extension was working like a charm! No more wondering about what a hex colour string means for me.

what #deadbeef; looks like


To view all the code for my extension, and download it for yourself, visit my Github repository.

Wednesday, October 19, 2011

Object Thinking - Objects have actions

This post follows on from Object Thinking - Objects: a neurological basis

The paper being reviewed is Micro-affordance: The potentiation of components of action by seen objects (Ellis and Tucker, 2000)[1]
 
The paper focuses on two experiments. The first is concerned with power and precision micro-affordance, and the second with wrist rotation micro-affordance.

In the first experiment the participants were told to memorise objects as they were shown them. They were then tested on the objects halfway through the experiment and at the end. During the memorisation phase, whenever they heard a tone, the participant was to either squeeze a cylindrical button with their whole hand, or pinch a small button between their index finger and thumb.

The type of grip response would be dependant on the type of tone; high or low. So there were two mappings known to the participants: high – large grip, low – small grip, and high – small grip, low – large grip. There were also two unknown mappings: high – large object, low – small object, and high – small object, low large object.

Each participant was assigned one mapping from each of the two groups and this was sustained throughout the experiment.

In the results from the experiment there was a statistically significant positive correlation between grip type and object type.

The second experiment was set up much the same as the first. The differences were that instead of large or small grips, the participant would make clockwise or anticlockwise wrist rotations dependant on tone, and the objects were categorised as ones more easily grasped with an anticlockwise or clockwise wrist rotation.

The results showed a statistically significant positive correlation between wrist rotation and object type.

The paper classifies micro-affordance (MA) as the state of an observer that gives rise to stimulus-response compatibility (SRC) between what the viewer sees and what actions they perform regardless of their intention. The theory is meant as a solution to the symbol grounding problem. (The reference to this problem in the paper is Harnad, 1990[2].)

The paper explains that SRC is demonstrated in many previous experiments, by various researchers, in forced choice reaction time tests. For example an advantage is gained when reaching for something on the left with the left hand, and similarly for the right. In fact an advantage is gained even in non-reaching tasks, where the location of the stimulus gives an advantage when it is on the same side as the response, this is known as the Simon Effect.

Previous experiments by Ellis and Tucker show that location is not the only action related feature encoded in this way.

This preparedness for action is thought to be a coordination of the what and where pathways in the brain.

The paper reports that the theoretical implications of the results of the study are:
  1. MA are different from Gibsonian affordance in that they suggest the affordance is encoded in the viewer's nervous system (not the object being viewed), they only apply to grasping, and only grasping appropriate to the object.
  2. SRC works because what is being responded to is unrelated to what is causing the compatibility effect. SRC theories suggest that stimulus → response options elicit particular mental codes, so the location of an object elicits a left or right handed response. MA, however, can be evoked without evoking a coherent action.
    This means that MA should interfere with SRC experiments.
    SRC effects have been modelled as ecological relations between visual properties and actions. They have also been modelled as effect codes that can be combined into whole actions.
    MA and these two approaches share the assumption that a compatibility effect arises from visual objects and possible, real-world actions that can be performed on them.
    MA diverges from the ecological approach by retaining representation of objects, and from effect codes by having a direct connection between vision and action. MA diverges from both because it states that actions are potentiated whenever an object is seen, regardless of the intention of the viewer.
  3. Developmentally, MA fits in well with the popular theory of Neural Darwinism. Development of adaptive behaviours requires integration of sensory and motor processes. The paper proposes learning coordinated actions result from gradual adaption of the neuron groups involved. This leads to coupling of motor and sensory systems.
    The implication of the experiments is that MA reflect the involvement of the motor components of the global mapping, which have come to represent visual objects.

So what does this tell us about how natural object thinking is? Object thinking requires that you understand the objects your are working with in terms of the behaviours that they can perform. You need to be able to create your objects so that discovering what behaviours are available is intuitive — i.e. when others come to your API they aren't spending hours going through the documentation, they can just get on and use it.

Ellis and Tucker show that the brain is well suited to understanding and preparing for expected behaviours. When we see an object, we immediately know the actions that the object has available, and are primed to use them.

This implies that once we have a good understanding of a problem domain, we should be able to model the behaviours of the objects in the domain intuitively, and anyone else with a good understanding of the problem domain will be able to intuitively discover each object and its behaviours.

The behaviour driven aspects of object thinking are intrinsic to how the human mind works at the brain level.

The next section deals with anthropomorphism, why OT needs it and where it comes from: Object Thinking - Anthropomorphism.

[1] Micro-affordance: The potentiation of components of action by seen objects; Rob Ellis, Mike Tucker. British Journal of Psychology (2000), 91, 451-471
[2] Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335±346. (As sited in [1])

Thursday, September 22, 2011

Object Thinking - Objects: a neurological basis



This post deals with how the brain perceives the world as objects.

A neurological perspective of how perception work, via studying perceptual disorders, is covered in chapter two of Neuropsychology: from theory to practice [1]. This is a review of that chapter.

Studying perceptual disorders tell us how we work by looking at damaged brains in people, or damaging brains in animals, and seeing how that affects what is perceived.

The chapter concentrates largely on visual perception, due to “the natural dominance of our visual sensory system”. It starts out by identifying two major pathways in the brain, the “what” pathway, which is responsible for identification of objects, and the “where” pathway, which is responsible for location position and motion. These were originally identified in monkeys in 1983 by Mishkin, Ungerleider and Macko. Milner and Goodale (1995) expanded on this model to explain that the “where” pathway is dedicated to the preparation of movement.

This demonstrates that humans understand the world as objects and actions. 

The chapter goes onto explain that these two pathways are linked, essentially the flow of data goes primary visual cortex → “what” pathway → “where” pathway → motor cortex. The system also gets feedback, via other pathways, from interactions with the environment to aid in learning. This of course means that we get better at performing actions the more we do them.

The next section of the chapter deals with sensation versus perception. It is not particularly relevant to this discussion. In short summary: sensation occurs before perception, and is not consciously recognised. In vision the sensation pathways are those that link the retina to the visual cortex. People with damage to these pathway will not notice that they don't see something, unless they are made aware of it appearing and disappearing from view.

Discussion of the hierarchy of the visual cortex follows on. This has quite a strong neurological focus, and describes a lot of the brain's structure in this area. The key point relevant here is that the brain is modular and parallel, which means that human thinking is modular and parallel, which is clearly analogous to separation of concerns. The parallelism is accomplished through pathways that allow feedback between modules. This could be thought of as message passing, although it might be a stretch to say it scales up to conscious thought.

Next the chapter discusses what certain disorders show us about visual perception. The two types of disorder covered are apperceptive agnosia – a condition that means the patient has a difficulty distinguishing between objects – and associative agnosia – in which the patient is unable to recognise objects or their functions.

Apperceptive agnosia, and its milder counter part; categorisation deficit, give strong evidence that the mind perceives the world as objects. People with these disorders cannot discern one object from another. This impedes problem solving, as the person with the condition does not know how to act on what they see. In fact, in the case of apperceptive agnosia, it can be equivalent to blindness, as those with the condition find it easier to navigate with their eyes shut.

Associative agnosia, prevents people from being able to recognise objects or their functions. This class of agnosia can affect any of the senses. The book focuses on vision.

People with associative agnosia can copy (e.g. by drawing) and match objects, but they cannot recognise. So it appears that primary perceptual processing is intact.

The current theory for what causes this agnosia is that the “what” pathway has become disconnected from the memory store for associative meaning. People with this condition can write something down, such as their name or address, but are completely unable to read it back. This is clear evidence that we use background knowledge to solve problems.

The chapter gives an example (p. 53) of a patient, with associative visual agnosia, who can only tell what a banana is after eating it, and even then only through logical deduction: “...and here I go right back to the stage where I say well if it's not a banana, we wouldn't have this fruit.”

The next section of the chapter discusses object and face recognition. The focus is on how this works at a neurological level, and the difference between face recognition and object recognition. The key point it makes is that the left hemisphere of the brain deals with parts of objects, and the right deals with objects as a whole. (Faces, are a special case, however, as they seem to be perceived as a whole, and not as parts, i.e. most of facial recognition is done in the right hemisphere.) The brain is set up to understand about composition.

The rest of the chapter focuses on describing top down (using past experience to influence perception) and bottom up (working from first principles) processing of visual information, and come to a conclusion about how the left and right hemispheres interact to give what we see meaning. Essentially they work together, the left hemisphere identifying objects and the meaning of objects, while the right analyses structural form, orientation and does holistic analysis of an object.

So, in conclusion, the chapter lays out clearly that human beings perceive the world as objects, even at a neurological level. This is our nature. Thus is makes sense when designing software to think of our problem space in terms of the objects in it. 

The next section will deal with why action is integral to how we think about the world, and can be found here: Object Thinking - Objects have actions.

[1] Neuropsychology: from theory to practice, David Andrewes (2001, Psychology Press)