Software Programming

Kunuk Nykjaer

Clustering K-means and Grid with C# example and html-canvas (part 2)

with 3 comments


This is a follow up to my previous post.
https://kunuk.wordpress.com/2011/09/15/clustering-grid-cluster

The next post with updated code can be seen here.
https://kunuk.wordpress.com/2011/09/20/markerclusterer-with-c-example-and-html-canvas-part-3/

I have implemented K-means clustering algorithm which automatically increments the K value until the errorlevel is low enough to be accepted. The running time of the K-means runs reasonable fast. It’s best used in combination of chosing the right amount of points and setting the max_error. If the max_error is small and the number of points are big then the running time will be very slow. There are max iterations set to avoid very long running time as I believe K-means is NP-hard.

The insertion of the extra cluster point is linear and works by inserting them at the best place and keeping the current cluster points. It finds the current bad performing cluster point regarding error value and insert a new cluster point in that region farthest away from the bad performing cluster point. By doing this the new points are inserted at place which seems logical while minimizing the overall error level.

The Grid clustering algorithm runs pretty fast. Even with 100.000 points the time is below 2 sec. For 10.000 points the time is below 1 sec.

There is added an optional functionality to move the cluster point position to the closest point in the region of the clustered area after the final centroid calculation. By using this your cluster point will be positioned visually on a true existing point in the region.

When used with maps and websites the grid clustering algorithm is a good candidate of chose because of it’s fast running time which is O(m*n) where m is the number of grids and n is the number of points, please comment if I have made a wrong statement 🙂
The K-means can also be used when you know the expected points and setting the max_error accordingly.

Below are some image examples and the C# source code bundled in one file to keep it simple.

K-means 0.7 sec 10.000 points, max_error 100

K-means 2.2 sec 10.000 points, max_error 50

K-means 0.2 sec 150 points, max_error 30

K-means, an example of how K-means can’t “see” all the true clusters. Here the K is manually set to 3.
This will not happen with the current code, but this is mainly to show some weakness of the K-means algorithm.

To many points view

Grid clustering algorithm with grid 7×6 took 0.04 sec. with 160 points.
The cluster points location are moved to the nearest point in the region after the centroid calculation in this example. Some grid merging has also occurred because the clusters were too close to each other.

Final clustered result view

Performance test of the grid by using lots of points.
Grid 8×7, 1.3 sec, 100.000 points.
With Grid 6×5, it’s about 3 sec, without centroid calculation and without saving to file for 1.000.000 points.

Algorithm.cs

 
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;

// By Kunuk Nykjaer
// Clustering package, Grid clustering and auto increasing K-means clustering
// Version 0.2
public class Algorithm
{
    public static DateTime Starttime;
    static void Main(string[] args)
    {
        Starttime = DateTime.Now;
        var algo = new Algorithm();

        // run kmeans or grid
        algo.Run(ClusterType.KMeans); 
        //algo.Run(ClusterType.Grid);

        algo.GenerateJavascriptDrawFile(); //create draw.js

        var timespend = DateTime.Now.Subtract(Starttime).TotalSeconds;     
        Console.WriteLine(timespend+" sec. press a key ...");
        Console.ReadKey();
    }

    // use centroid or semi-centroid cluster point placement visualization?
    public const bool DoUpdateAllCentroidsToNearestContainingPoint = false;
    public const bool UseProfiling = false; //debug, output time spend

    public enum ClusterType { Unknown = -1, Grid, KMeans } ;
    static readonly CultureInfo Culture_enUS = new CultureInfo("en-US");
    const string S = "G";

    readonly List<XY> _points = new List<XY>();
    List<XY> _clusters = new List<XY>();

    public static readonly Random _rand = new Random();
    public const double MinX = 10;
    public const double MinY = 10;
    public const double MaxX = 400;
    public const double MaxY = 300;

    // used with Grid cluster algo
    public const int GridX = 8;
    public const int GridY = 7;

    const double AbsSizeX = MaxX - MinX;
    const double AbsSizeY = MaxY - MinY;

    // for bucket placement calc, grid cluster algo
    const double DeltaX = AbsSizeX / GridX;
    const double DeltaY = AbsSizeY / GridY;

    static private readonly string NL = Environment.NewLine;
    private const string ctx = "ctx"; // javascript canvas
    public bool DisplayGridInCanvas; //autoset by used cluster type

    public Algorithm()
    {
        CreateDataSet();
    }

    public static string GetId(int idx, int idy) //O(1)
    {
        return idx + ";" + idy;
    }

    void CreateDataSet() //O(1)
    {
        // CREATE DATA SET

        // Create random scattered points
        //for (int i = 0; i < 150; i++)
        //{
        //    var x = MinX + _rand.NextDouble() * (AbsSizeX);
        //    var y = MinY + _rand.NextDouble() * (AbsSizeY);
        //    _points.Add(new XY(x, y));
        //}
        
        // Create random region of clusters
        for (int i = 0; i < 50; i++)
        {
            var x = MinX + _rand.NextDouble() * 100;
            var y = MinY + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }

        for (int i = 0; i < 50; i++)
        {
            var x = MinX + 200 + _rand.NextDouble() * 100;
            var y = MinY + 10 + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }

        for (int i = 0; i < 50; i++)
        {
            var x = MinX + 50 + _rand.NextDouble() * 100;
            var y = MinY + 150 + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }         
        
    }

    public void Run(ClusterType clustertype)
    {
        switch (clustertype)
        {
            case ClusterType.Grid:
                _clusters = new GridBaseCluster(_points).GetCluster();
                DisplayGridInCanvas = true;
                break;
            case ClusterType.KMeans:
                _clusters = new KMeans(_points).GetCluster();
                DisplayGridInCanvas = false;
                break;
        }
    }

    public static string GetRandomColor()
    {
        int r = _rand.Next(10, 240);
        int g = _rand.Next(10, 240);
        int b = _rand.Next(10, 240);
        return string.Format("'rgb({0},{1},{2})'", r, g, b);
    }

    static void CreateFile(string s)
    {
        var path = new FileInfo(@"C:\temp\draw.js");
        var isCreated = FileUtil.WriteFile(s, path);
        Console.WriteLine(isCreated + " = File is Created");
    }
    public void GenerateJavascriptDrawFile()
    {
        var sb = new StringBuilder();

        // markers
        var head = NL + "function drawMarkers(ctx) {" + NL;
        var tail = NL + "}" + NL;

        sb.Append(head);

        // grid
        if (DisplayGridInCanvas)
        {
            sb.Append("ctx.beginPath();" + NL);
            for (int i = 0; i < GridX + 1; i++)
            {
                sb.Append(String.Format("ctx.moveTo({0}, {1});{2}",
                                        ToStringEN(MinX + i * DeltaX), 
                                        ToStringEN(MinY), NL));
                sb.Append(String.Format("ctx.lineTo({0}, {1});{2}",
                                        ToStringEN(MinX + i * DeltaX), 
                                        ToStringEN(MaxY), NL));
            }
            for (int j = 0; j < GridY + 1; j++)
            {
                sb.Append(String.Format("ctx.moveTo({0}, {1});{2}",
                                        ToStringEN(MinX), 
                                        ToStringEN(MinY + j * DeltaY), NL));
                sb.Append(String.Format("ctx.lineTo({0}, {1});{2}",
                                        ToStringEN(MaxX), 
                                        ToStringEN(MinY + j * DeltaY), NL));
            }
            sb.Append("ctx.stroke(); " + NL);
        }

        // markers
        if (_points != null)
            foreach (var p in _points)
                sb.Append(string.Format("drawMark({0}, {1}, {2}, {3});{4}",
                    ToStringEN(p.X), ToStringEN(p.Y), p.Color, ctx, NL));

        string clusterInfo = "0";
        if (_clusters != null)
        {
            foreach (var c in _clusters)
                sb.Append(string.Format("drawCluster({0}, {1}, {2}, {3}, {4});{5}",
                    ToStringEN(c.X), ToStringEN(c.Y), c.Color,
                                        c.Size, ctx, NL));

            clusterInfo = _clusters.Count + string.Empty;
        }

        // bottom text                
        sb.Append("ctx.fillStyle = 'rgb(0,0,0)';" + NL);
        sb.Append(string.Format("ctx.fillText('Clusters = ' + {0}, {1}, {2});{3}",
            clusterInfo, ToStringEN(MinX + 10), ToStringEN(MaxY + 20), NL));

        sb.Append(tail);
        CreateFile(sb.ToString());
        //Console.WriteLine(sb.ToString());
    }
    public static string ToStringEN(double d)
    {
        double rounded = Math.Round(d, 3);
        return rounded.ToString(S, Culture_enUS);
    }
    public static void Profile(string s)
    {
        if(!UseProfiling)
            return;        
        var timespend = DateTime.Now.Subtract(Starttime).TotalSeconds;
        Console.WriteLine(timespend + " sec. " + s);
    }//O(1)

    public class XY : IComparable
    {
        public double X { get; set; }
        public double Y { get; set; }
        public string Color { get; set; }
        public int Size { get; set; }
        public string XToString { get { return X.ToString(S, Culture_enUS); } }
        public string YToString { get { return Y.ToString(S, Culture_enUS); } }
        public XY(double x, double y)
        {
            X = x;
            Y = y;
            Color = "'rgb(0,0,0)'";//default
            Size = 1;
        }
        public XY(XY p) //clone
        {
            this.X = p.X;
            this.Y = p.Y;
            this.Color = p.Color;
            this.Size = p.Size;
        }

        public int CompareTo(object o) // if used in sorted list
        {
            if (this.Equals(o))
                return 0;

            var other = (XY)o;
            if (this.X > other.X)
                return -1;
            if (this.X < other.X)
                return 1;

            return 0;
        }
        public override int GetHashCode()
        {
            var x = X * 10000;
            var y = Y * 10000;
            var r = x * 17 + y * 37;
            return (int)r;
        }
        private const int ROUND = 6;
        public override bool Equals(Object o)
        {
            if (o == null)
                return false;
            var other = o as XY;
            if (other == null)
                return false;

            var x = Math.Round(this.X, ROUND) == Math.Round(other.X, ROUND);
            var y = Math.Round(this.Y, ROUND) == Math.Round(other.Y, ROUND);
            return x && y;
        }
    }
    public class Bucket
    {
        public string Id { get; private set; }
        public List<XY> Points { get; private set; }
        public XY Centroid { get; set; }
        public int Idx { get; private set; }
        public int Idy { get; private set; }
        public double ErrorLevel { get; set; } // clusterpoint and points avg dist
        private bool _IsUsed;
        public bool IsUsed
        {
            get { return _IsUsed && Centroid != null; }
            set{_IsUsed = value; }
        }
        public Bucket(string id)
        {
            IsUsed = true;
            Centroid = null;
            Points = new List<XY>();
            Id = id;
        }
        public Bucket(int idx, int idy)
        {
            IsUsed = true;
            Centroid = null;
            Points = new List<XY>();
            Idx = idx;
            Idy = idy;
            Id = GetId(idx, idy);
        }
    }

    public abstract class BaseClusterAlgorithm
    {
        public List<XY> BaseDataset; // all points
        //id, bucket
        public readonly Dictionary<string, Bucket> BaseBucketsLookup = 
            new Dictionary<string, Bucket>();
        public abstract List<XY> GetCluster();
        //O(k?? random fn can be slow)
        public static XY[] BaseGetRandomCentroids(List<XY> list, int k)
        {
            var set = new HashSet<XY>();
            int i = 0;
            var kcentroids = new XY[k];

            int MAX = list.Count;
            while (MAX >= k)
            {
                int index = _rand.Next(0, MAX - 1);
                var xy = list[index];
                if (set.Contains(xy))
                    continue;

                set.Add(xy);
                kcentroids[i++] = new XY(xy.X, xy.Y) { Color = GetRandomColor() };

                if (i >= k)
                    break;
            }

            Profile("BaseGetRandomCentroids");
            return kcentroids;
        }

        public List<XY> BaseGetClusterResult()//O(k*n)
        {
            var clusterPoints = new List<XY>();
            // collect used buckets and return the result

            if (DoUpdateAllCentroidsToNearestContainingPoint)
                BaseUpdateAllCentroidsToNearestContainingPoint();

            foreach (var item in BaseBucketsLookup)
            {
                var bucket = item.Value;
                if (bucket.IsUsed)
                    clusterPoints.Add(bucket.Centroid);
            }

            Profile("BaseGetClusterResult");
            return clusterPoints;
        }
        public static XY BaseGetCentroidFromCluster(List<XY> list) //O(n)
        {
            int count = list.Count;
            if (list == null || count == 0)
                return null;

            // color is set for the points and the cluster point here
            var color = GetRandomColor(); //O(1)
            XY centroid = new XY(0, 0) { Size = list.Count };//O(1)
            foreach (XY p in list)
            {
                p.Color = color;
                centroid.X += p.X;
                centroid.Y += p.Y;
            }
            centroid.X /= count;
            centroid.Y /= count;
            var cp = new XY(centroid.X, centroid.Y) { Size = count, Color = color };

            Profile("BaseGetCentroidFromCluster");
            return cp;
        }
        //O(k*n)
        public static void BaseSetCentroidForAllBuckets(IEnumerable<Bucket> buckets)
        {
            foreach (var item in buckets)
            {
                var bucketPoints = item.Points;
                var cp = BaseGetCentroidFromCluster(bucketPoints);
                item.Centroid = cp;
            }
            Profile("BaseSetCentroidForAllBuckets");
        }
        public double BaseGetTotalError()//O(k)
        {
            int centroidsUsed = BaseBucketsLookup.Values.Count(b => b.IsUsed);
            double sum = BaseBucketsLookup.Values.Where(b => b.IsUsed).Sum(b => b.ErrorLevel);
            return sum / centroidsUsed;
        }
        public string BaseGetMaxError() //O(k)
        {
            double maxError = -double.MaxValue;
            string id = string.Empty;
            foreach (var b in BaseBucketsLookup.Values)
            {
                if (!b.IsUsed || b.ErrorLevel <= maxError) 
                    continue;

                maxError = b.ErrorLevel;
                id = b.Id;
            }
            return id;
        }
         public XY BaseGetClosestPoint(XY from, List<XY> list) //O(n)
         {
             double min = double.MaxValue;
             XY closests = null;
             foreach (var p in list)
             {
                 var d = MathTool.Distance(from, p);
                 if (d >= min) 
                     continue;

                 // update
                 min = d;
                 closests = p;
             }
             return closests;
         }        
         public XY BaseGetLongestPoint(XY from, List<XY> list) //O(n)
         {
             double max = -double.MaxValue;
             XY longest = null;
             foreach (var p in list)
             {
                 var d = MathTool.Distance(from, p);
                 if (d <= max)
                     continue;

                 // update
                 max = d;
                 longest = p;
             }
             return longest;
         }

        // update centroid location to nearest point, 
        // e.g. if you want to show cluster point on a real existing point area
         //O(n)
         public void BaseUpdateCentroidToNearestContainingPoint(Bucket bucket) 
         {
             if(bucket==null || bucket.Centroid==null || 
                 bucket.Points==null||bucket.Points.Count==0)
                 return;

             var closest = BaseGetClosestPoint(bucket.Centroid, bucket.Points);
             bucket.Centroid.X = closest.X;
             bucket.Centroid.Y = closest.Y;
             Profile("BaseUpdateCentroidToNearestContainingPoint");
         }
         //O(k*n)
         public void BaseUpdateAllCentroidsToNearestContainingPoint() 
        {
             foreach (var bucket in BaseBucketsLookup.Values)             
                 BaseUpdateCentroidToNearestContainingPoint(bucket);
             Profile("BaseUpdateAllCentroidsToNearestContainingPoint");
        }
    }

    // O(exponential) ~ can be slow when n or k is big
    public class KMeans : BaseClusterAlgorithm
    {
        private const int _InitClusterSize = 1; // start from this cluster points

        // heuristic, set tolerance for cluster density, has effect on running time.
        // Set high for many points in dataset, can be lower for fewer points
        private const double MAX_ERROR = 30;             
            
        private const int _MaxIterations = 100; // cluster point optimization iterations
        private const int _MaxClusters = 100;

        public KMeans(List<XY> dataset)
        {
            if (dataset==null || dataset.Count==0)
                throw new ApplicationException(
                    string.Format("KMeans dataset is null or empty") );

            this.BaseDataset = dataset;            
        }

        public override List<XY> GetCluster()
        {
            var cluster = RunClusterAlgo();
            return cluster;
        }

        List<XY> RunClusterAlgo()
        {
            /*
            ITERATIVE LINEAR ADDING CLUSTER UNTIL 
            REPEATE iteration of clusters convergence 
             * until the max error is small enough
            if not, insert a new cluster at worst place 
            (farthest in region of worst cluster area) and run again 
            keeping current cluster points              
             
             // one iteration of clusters convergence is defined as ..
          1) Random k centroids
          2) Cluster data by euclidean distance to centroids
          3) Update centroids by clustered data,     
          4) Update cluster
          5) Continue last two steps until error is small, error is sum of diff
              between current and updated centroid                          
           */

            var clusterPoints = new List<XY>();
            // params invalid
            if (BaseDataset == null || BaseDataset.Count == 0)
                return clusterPoints;

            RunAlgo();         
            return BaseGetClusterResult();
        }

        void RunAlgo()
        {
            // Init clusters
            var centroids = BaseGetRandomCentroids(BaseDataset, _InitClusterSize);
            for (int i = 0; i < centroids.Length; i++)
            {
                var newbucket = new Bucket(i.ToString()) { Centroid = centroids[i] };
                BaseBucketsLookup.Add(i.ToString(), newbucket);
            }            

            //
            double currentMaxError = double.MaxValue;
            while (currentMaxError > MAX_ERROR && BaseBucketsLookup.Count < _MaxClusters)
            {
                RunIterationsUntilKClusterPlacementAreDone();                

                var id = BaseGetMaxError();
                var bucket = BaseBucketsLookup[id];
                currentMaxError = bucket.ErrorLevel; //update
                if (currentMaxError > MAX_ERROR)
                {                                        
// Here it is linear speed when putting one new centroid at a time
// should be semi-fast because the new point is inserted at best area 
//from current centroids view.
// possible improvement exists by log2 search by inserting multiple centroids and 
// reducing centroid again by using closest pair algo which can be O(nlogn) to 
// find closest cluster dist and analysing the dist length, if its to small, 
//then reduce centroids.

// put new centroid in area where maxError but farthest away from current centroid in area
                    var longest = BaseGetLongestPoint(bucket.Centroid, bucket.Points);
                    var newcentroid = new XY(longest);
                    var newid = BaseBucketsLookup.Count.ToString();
                    var newbucket = new Bucket(newid) { Centroid = newcentroid };
                    BaseBucketsLookup.Add(newid, newbucket);
                }
            }                     
        }

        void RunIterationsUntilKClusterPlacementAreDone()
        {
            double prevError = Double.MaxValue;
            double currError = Double.MaxValue;

            for (int i = 0; i < _MaxIterations; i++)
            {
                prevError = currError;
                currError = RunOneIteration();
                if (currError >= prevError) // no improvement then stop
                    break;
            }
            Profile("RunIterationsUntilKClusterPlacementAreDone");
        }

        double RunOneIteration() //O(k*n)
        {
            // update points, assign points to cluster
            UpdatePointsByCentroid();
            // update centroid pos by its points
            BaseSetCentroidForAllBuckets(BaseBucketsLookup.Values);//O(k*n)
            var clustersCount = BaseBucketsLookup.Count;
            for (int i = 0; i < clustersCount; i++)
            {
                var currentBucket = BaseBucketsLookup[i.ToString()];
                if (currentBucket.IsUsed == false)
                    continue;

                //update centroid                
                var newcontroid = BaseGetCentroidFromCluster(currentBucket.Points);
                //no need to update color, autoset
                currentBucket.Centroid = newcontroid; 
                currentBucket.ErrorLevel = 0;
                //update error                
                foreach (var p in currentBucket.Points)
                {
                    var dist = MathTool.Distance(newcontroid, p);
                    currentBucket.ErrorLevel += dist;
                }
                var val = currentBucket.ErrorLevel/currentBucket.Points.Count;
                currentBucket.ErrorLevel = val; //Math.Sqrt(val);
            }

            Profile("RunOneIteration");
            return BaseGetTotalError();
        }
        void UpdatePointsByCentroid()//O(n*k)
        {
            int count = BaseBucketsLookup.Count();
          
            // clear points in the buckets, they will be re-inserted
            foreach (var bucket in BaseBucketsLookup.Values)
                bucket.Points.Clear();

            foreach (XY p in BaseDataset)
            {
                double minDist = Double.MaxValue;
                int index = -1;
                for (int i = 0; i < count; i++)
                {
                    var bucket = BaseBucketsLookup[i.ToString()];
                    if (bucket.IsUsed == false)
                        continue;

                    var centroid = bucket.Centroid;
                    var dist = MathTool.Distance(p, centroid);
                    if (dist < minDist)
                    {
                        // update
                        minDist = dist;
                        index = i;
                    }
                }
                //update color for point to match centroid and re-insert
                var closestBucket = BaseBucketsLookup[index.ToString()];
                p.Color = closestBucket.Centroid.Color;
                closestBucket.Points.Add(p);
            }

            Profile("UpdatePointsByCentroid");
        }
    }

    public class GridBaseCluster : BaseClusterAlgorithm
    {
        public GridBaseCluster(List<XY> dataset)
        {
            BaseDataset = dataset;
        }

        public override List<XY> GetCluster()
        {
            var cluster = RunClusterAlgo(GridX, GridY);
            return cluster;
        }

        // O(k*n)
        void MergeClustersGrid()
        {
            foreach (var key in BaseBucketsLookup.Keys)
            {
                var bucket = BaseBucketsLookup[key];
                if (bucket.IsUsed == false)
                    continue;

                var x = bucket.Idx;
                var y = bucket.Idy;

                // get keys for neighbors
                var N = GetId(x, y + 1);
                var NE = GetId(x + 1, y + 1);
                var E = GetId(x + 1, y);
                var SE = GetId(x + 1, y - 1);
                var S = GetId(x, y - 1);
                var SW = GetId(x - 1, y - 1);
                var W = GetId(x - 1, y);
                var NW = GetId(x - 1, y - 1);
                var neighbors = new[] { N, NE, E, SE, S, SW, W, NW };

                MergeClustersGridHelper(key, neighbors);
            }
        }
        void MergeClustersGridHelper(string currentKey, string[] neighborKeys)
        {
            double minDistX = DeltaX / 2.0;//heuristic, higher value gives less merging
            double minDistY = DeltaY / 2.0;
            //if clusters in grid are too close to each other, merge them
            double minDist = MathTool.Max(minDistX, minDistY);

            foreach (var neighborKey in neighborKeys)
            {
                if (!BaseBucketsLookup.ContainsKey(neighborKey))
                    continue;
                
                var neighbor = BaseBucketsLookup[neighborKey];
                if (neighbor.IsUsed == false)
                    continue;

                var current = BaseBucketsLookup[currentKey];
                var dist = MathTool.Distance(current.Centroid, neighbor.Centroid);
                if (dist > minDist)
                    continue;

                // merge
                var color = current.Centroid.Color;
                foreach (var p in neighbor.Points)
                {
                    //update color
                    p.Color = color;
                }

                current.Points.AddRange(neighbor.Points);//O(n)

                // recalc centroid
                var cp = BaseGetCentroidFromCluster(current.Points);
                current.Centroid = cp;
                neighbor.IsUsed = false; //merged, then not used anymore
            }
        }

        public List<XY> RunClusterAlgo(int gridx, int gridy)
        {
            var clusterPoints = new List<XY>();
            // params invalid
            if (BaseDataset == null || BaseDataset.Count == 0 || 
                gridx <= 0 || gridy <= 0)
                return clusterPoints;

            // put points in buckets
            foreach (var p in BaseDataset)
            {
                // find relative val
                var relativeX = p.X - MinX;
                var relativeY = p.Y - MinY;

                int idx = (int)(relativeX / DeltaX);
                int idy = (int)(relativeY / DeltaY);

                // bucket id
                string id = GetId(idx, idy);

                // bucket exists, add point
                if (BaseBucketsLookup.ContainsKey(id))
                    BaseBucketsLookup[id].Points.Add(p);

                // new bucket, create and add point
                else
                {
                    Bucket bucket = new Bucket(idx, idy);
                    bucket.Points.Add(p);
                    BaseBucketsLookup.Add(id, bucket);
                }
            }

            // calc centrod for all buckets
            BaseSetCentroidForAllBuckets(BaseBucketsLookup.Values);

            // merge if gridpoint is to close
            MergeClustersGrid();

            return BaseGetClusterResult();
        }
    }

    public class MathTool
    {
        private const double Exp = 2; // 2=euclid, 1=manhatten
        //Minkowski dist
        public static double Distance(Algorithm.XY a, Algorithm.XY b)
        {
            return Math.Pow(Math.Pow(Math.Abs(a.X - b.X), Exp) +
                Math.Pow(Math.Abs(a.Y - b.Y), Exp), 1.0 / Exp);
        }

        public static double Min(double a, double b)
        {
            return a <= b ? a : b;
        }
        public static double Max(double a, double b)
        {
            return a >= b ? a : b;
        }
    }

    public static class FileUtil
    {
        public static bool WriteFile(string data, FileInfo fileInfo)
        {
            bool isSuccess = false;
            try
            {
                using (StreamWriter streamWriter = 
                    File.CreateText(fileInfo.FullName))
                {
                    streamWriter.Write(data);
                    isSuccess = true;
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.StackTrace + "\nPress a key ... ");
                Console.ReadKey();
            }
            return isSuccess;
        }
    }
}

Put both the html and javascript file in the c:\temp folder.
The C# program creates a new draw.js file in the c:\temp folder.

Open the html file with a modern browser that support canvas, like Firefox or Chrome.

canvas.html

<html>
<head>
    <title>Cluster test</title>
    <script type="text/javascript">
        var G = {
            yellow: "#FF0",
            red: "#F00",
            green: "#0F0",
            blue: "#00F",
            black: "#000"
        }       

        window.onload = function draw() {
            var canvas = document.getElementById('canvas');
            if (canvas.getContext) {
                var ctx = canvas.getContext('2d');
                ctx.strokeStyle = G.black;
                ctx.font = "10pt Arial";
                drawMarkers(ctx); //use fn defined in draw.js
            }
        }

        function drawMark(cx, cy, color, ctx) {
            //draw a circle
            var radius = 4;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.fillStyle = color;
            ctx.fill();
            ctx.stroke();
        }

        function drawCluster(cx, cy, color, n, ctx) {
            //draw a bigger circle
            var radius = 12;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.stroke();
            ctx.fillStyle = color;
            ctx.fillText("" + n, cx-8, cy+2);
        }
    </script>
    <script type="text/javascript" src="draw.js"></script> 

    <style type="text/css">
        body{ margin-left: 10px; margin-top:10px;}
        canvas  {  border: 1px solid red; }
    </style>
</head>
<body>
    <canvas id="canvas" width="410" height="330">
        <p>Your browser doesn't support canvas.</p>
    </canvas>
</body>
</html>

The javascript file will be overwritten when you run the C# code.
draw.js

function drawMarkers(ctx) {
    drawMark(161.089, 215.198, 'rgb(82,128,16)', ctx);
    drawMark(283.556, 12.712, 'rgb(92,49,66)', ctx);
    drawMark(98.915, 37.67, 'rgb(181,136,150)', ctx);
    drawMark(318.432, 102.905, 'rgb(92,49,66)', ctx);
    drawMark(378.091, 238.713, 'rgb(182,83,25)', ctx);
    drawMark(318.736, 80.199, 'rgb(92,49,66)', ctx);
    drawMark(122.702, 252.349, 'rgb(82,128,16)', ctx);
    drawMark(173.836, 44.669, 'rgb(181,136,150)', ctx);
    drawMark(141.157, 172.273, 'rgb(82,128,16)', ctx);
    drawMark(102.749, 56.737, 'rgb(181,136,150)', ctx);
    drawCluster(306.908, 65.272, 'rgb(92,49,66)', 3, ctx);
    drawCluster(378.091, 238.713, 'rgb(182,83,25)', 1, ctx);
    drawCluster(141.649, 213.273, 'rgb(82,128,16)', 3, ctx);
    drawCluster(125.167, 46.359, 'rgb(181,136,150)', 3, ctx);
    ctx.fillStyle = 'rgb(0,0,0)';
    ctx.fillText('Clusters = ' + 4, 20, 320);
}
Advertisements

Written by kunuk Nykjaer

September 17, 2011 at 9:35 am

3 Responses

Subscribe to comments with RSS.

  1. […] with one comment Grid clustering With merging supported if the centroid of the grid clusters are too close to each other. The second post with updated code can be seen here. https://kunuk.wordpress.com/2011/09/17/clustering-k-means-and-grid-with-c-example-and-html-canvas-par… […]

  2. I would like to use this Kmeans simulation with a SQL server database as a datasource. It is possible? Would you kindly suggest where should I start or what step to get started in doing so. Thanks in advanced. (Sorry my english, and still a newbie in programming c#)

    Gynehs

    August 23, 2013 at 6:47 pm


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: