Software Programming

Kunuk Nykjaer

Posts Tagged ‘clustering

K-nearest neighbor in 2D dimension space

with 4 comments

K-nearest neighbor

There are miscellaneous algorithms for searching nearest neighbors.

An alternative method is to use grid indexing strategy.
Slowly expand the grid boxes from the center to find the k-nearest neighbors.
The grid is used as a filtering mechanism to reduce the search space.

This works fast for distributed data across the space and is a good alternative algorithm for dynamic data that changes position frequently.

Reference links:

Project implemented in C# is available at Github:
A survey of techniques for fixed radius near neighbor searching:

knn

The origin is in ring 0.
The origin can anywhere in the box and the distance to test on first iteration must be at least 1 x grid distance. The first iteration starts with data from ring 0 and 1.

The algorithm goes like this:

i = 0
Point origin
list currRing = empty
list nextRing = empty
list temp
while all rings not explored
   i = i + 1
   temp = empty
   for all point in nextRing test distance between origin and point
       if distance is within i * grid put it in currRing
       else put it in temp

   nextRing = empty
   Add all from temp to nextRing

   temp = empty
   if(i==1) temp = points from ring 0 and ring 1
   else temp = points from ring i

   For all point in temp test distance between origin and point
       if distance is within i * grid put it in currRing
       else put it in nextRing

   If there are at least k points in currRing then exit while loop
end while loop

if currRing count is < k then add all from nextRing to currRing

sort currRing by distance
take first k from currRing


knn example

Grid version algorithm:
Apply algorithm described above.
Searching: O(n * m) where m is grid cells and k << n.

Naive version algorithm:
For origin point check the distance to every other point.
Take k first items sorted by distance.
Searching: O(n log n)

Advertisements

Written by kunuk Nykjaer

January 21, 2013 at 9:36 pm

Single detection in 2D dimension

leave a comment »

Single detection

Here I define single detection as data which has a certain minimum distance to every other data.
It can be used for anomaly detection or outlier detection of data set. It can also be used for collision detection.

Reference links:

Project implemented in C# is available at Github:
A survey of techniques for fixed radius near neighbor searching:

Given a list of points in 2D dimension how can you find the ones which are not close to the other points?

I will include my Big O notation calculations and some test runs of the implementations.

There is a naive implementation in O(n2).
There are also better alternatives which runs faster than the naive version.

First generate a grid of squares like this and put all your data in the grid according to their positions in the dimension.

grid layout

In the figure above there is a single item illustrated with a red dot inside the blue box.

For anomaly detection iterate the grid boxes and find those which has a single item inside a box. Those items are candidate as an anomaly item. If there are multiple items inside the blue box then it cannot be a candidate as an anomaly item because of the distance constraint. The candidate can be anywhere inside the box, thus the outer circle shows which areas that must be examined. If there is no other item inside the outer circle, then we know the candidate is an anomaly. On worst case 20 boxes must be examined as illustrated above.


Euclidean distance
is used.
The relation between the max distance and grid square distance can be calculated using Pythagoras.
Max Distance = c
Grid box distance = a

c2 = a2 + a2 => a = sqrt( (c2) / 2)

Naive version algorithm:
For all the points check the distance to every other point.
Searching: O(n2)

Grid version algorithm:
Insert all the items in the grid buckets.
Take the neighbor items of the candidate item’s grid. You will select items from 20 boxes as a worst case. Then iterate all the items and test the distance. If all the distance are above the maximum allowed then the candidate is detected as an anomaly.

This algorithm is much better than the naive version, runs very fast and is relatively simple to implement. Each grid box is a bucket of items. Use a Hash table for each bucket.
Inserting the items into the grids take O(n). Removing and inserting an item takes O(1).
Searching for anomalies take O(m * n).

If there are A anomalies in the data set and m grids this algorithm will run in:
Initialization: O(n + m) + Searching: O(m * A * n/A) => O(m * n)

K-d tree with nearest neighbor algorithm:
Insert all the items in the grid buckets takes O(n).
Do also apply a K-d tree data structure. From the candidate item’s in the box apply the first nearest neighbor algorithm. If the distance is above maximum allowed then it is detected as an anomaly.

It take O(n log n) time to generate the K-d tree. The nearest neighbor search takes
O(log n).
Removing and inserting an item takes O(log n). If there are A anomalies in the data set and m grids then searching for anomalies take O(m * A * log n).

This algorithm will run in:
initialization: O(m + n * log n) + Searching: O(m * A * log n)

Test results
With my test runs of items randomly distributed data the grid version performs best. The naive version is very slow. The K-d Tree version is much faster than the naive version.

I found this C# K-d Tree implementation with nearest neighbor search which I used for testing the algorithm.

Sample picture:
test result


The test runs are the time to detect the anomalies, the searching running time.
The data structure initialization is not included.
For n = 1.000.000.
K-d Tree it took about 30 seconds.
Grid and Naive about 2 seconds.

Random distributed, n = 100
Algorithm:     Milli seconds:   Single detection found:
NeighborGrid       12                   40
K-d Tree          128                   40
Naive               2                   40


Random distributed, n = 10.000
Algorithm:     Milli seconds:   Single detection found:
NeighborGrid       141                  8384
K-d Tree          2543                  8384
Naive            36421                  8384


Random distributed, n = 20.000
Grid: 4286 x 3572
MaxDistance: 0.2

Algorithm:     Milli seconds:   Single detection found:
NeighborGrid       1434                   18682
K-d Tree           5878                   18682
Naive             129401                  18682


Random distributed, n = 30.000
Grid: 1705 x 1421
MaxDistance: 0.5

Algorithm:     Milli seconds:   Single detection found:
NeighborGrid        521                     27100
K-d Tree            7936                    27100
Naive               slow                  


grid size is smaller thus the NeighborGrid runs fast.


Random distributed, n = 50.000
Grid: 4286 x 3572
MaxDistance: 0.2

Algorithm:     Milli seconds:   Single detection found:
NeighborGrid        1472                   42268
K-d Tree           13361                   42268
Naive              very slow                  


Random distributed, n = 1.000.000
Grid: 1705 x 1421
MaxDistance: 0.5

Algorithm:     Milli seconds:   Single detection found:
NeighborGrid           983              35824
K-d Tree             23966              35824
Naive              forever                  


Conclusion
These are just sample runs from random distributed data.
If the number of anomalies were very high and the data set were different then the K-d Tree algorithm might run faster. I would guess that the data set would have to be very specific and not likely to be a common scenario. I have given my best estimated Big O running time analysis for the algorithms.

The algorithm runs fast because of the single item grid box test. First they iterate the grid and skip every grid box except for those who has a single item inside. If you expect few anomalies for large data set then this would run very fast because most grid boxes would be skipped for further examination. Thus the running time for anomaly search is closer to O(m) for A << n.

If you are working with dynamic data where positions changes, then the Grid algorithm will run faster because of the O(1) operations. With K-d tree you will have to rebuild the tree occasionally and delete/update takes O(log n).

Sample picture for n = 100
test result

Written by kunuk Nykjaer

January 13, 2013 at 8:14 am

Google Maps Server-side Clustering with Asp.net C#

with 35 comments

  • This project is available at Github
  • Original project location is available at Google Code

For a project in my work I made a server-side clustering of data to be displayed in Google Maps.

The clustering is a grid-based clustering using geo-stationary grid-placement.
I was impressed by the performance of the solution from Maptimize
They have this example webside Crunch Panorama

I wanted to make something similar and have made an example project of server-side clustering. The demo can be seen here Google Maps Clustering. The grid has been included in the view for demonstration. Clusters that are close to each other are merged.

The search options is included by example code from Cibul Tech Blog.


googlemapclustering

Google Maps Clustering Viewport:

Here is the overview of how the viewport works.
googlemaps-clustering-viewport

Client-server Ajax communication demonstration:

This Google Maps clustering project uses two webservice methods: GetMarkers and GetMarkerDetail.
The GetMarkers method returns the marker and cluster information and GetMarkerDetail method returns specific information for a selected marker.

googlemapclustering SD

GetMarkers method:
The client sends the boundary of the screen using Norht-east and South-west latlon point and the zoom level.
Client json get markers request:

 
{"nelat":"62.511244050310154","nelon":"30.008460312500006",
  "swlat":"47.574956142655","swlon":"-5.235680312499994","zoomlevel":"5"}

The server does the server-side clustering based on the information from the client and returns only the data that is to be displayed to the client. Here it is two markers with latlon information. When the count (the C) is larger than 1, then it is a cluster marker, else it is a details marker. The details marker has the I and T information set. The I is the id and the T is type information.
Server json show markers reply:

 
{"Points":[
  {"Y":"56.05777","X":"9.30294","C":1624,"I":"","T":""},
  {"Y":"55.09446","X":"15.11401","C":"1","I":"aa005c3b-9301-4040-bac3-3dee8cb29b48","T":"2"}
]}

The client examines the reply and removes markers that should no longer be visible and only adds new marker to the screen. The latlon and count information are used to build a key for identifying unique marker objects.

GetMarkerDetail method:
The client sends a details request for a specific marker using an id-information.
Client json get marker detail: request

 
  {"id":"aa005c3b-9301-4040-bac3-3dee8cb29b48"}

The server replies by giving content for the specific marker.
Server json show marker detail reply:

 
  {"Content":"CONTENT INFORMATION HERE"}

Math information

The centroid calculation is based on circular mean. This is because the longitude overlaps at -180 and 180 degrees. There are no overlapping using latitude with Google Maps (haven’t seen or noticed it).

The distances are calculated using Euclidean distance. This calculation is fast and it is good enough for clustering the points into a cluster.

You usually only need as a maximum 6 decimal precision when using latlon and Google Maps. This gives within 1 meter precision.

The grid-size is fixed and adapts to the zoom level.
By doing one step zooming in Google Maps the latlon window range is doubled or halved.
The geo-stationary grid-placement indexes are achieved using divide and modulus calculation.

Time complexity is on average(m*n) and space complexity is O(m*n)
where n is the number of points used in total and m is the number of grids returned to the client.
You are welcome to analyse it yourself and correct me. I use hashset and hashmap to minimize the time complexity. For fast enough response time (below 1 sec) the number of markers should be max 300.000.
Time complexity is ~ O(n^2) on worst case but extremely unlikely,
happens if most centroids are merged with neighbor-centroids.

The DB in this project is a simple file with list of points which is loaded in the memory. The algorithm uses linear search and can (probably) be improved using spatial data structure.

Written by kunuk Nykjaer

November 5, 2011 at 6:46 pm

MarkerClusterer with C# example and html-canvas (part 3)

with 2 comments

This is part 3 of my topic on point clustering.
I will look at the marker clustering algorithm.

This is a follow up of my previous post.
https://kunuk.wordpress.com/2011/09/17/clustering-k-means-and-grid-with-c-example-and-html-canvas-part-2/

The MarkerClusterer algorithm is simple. It iterates the point and put them in boxes.
The first item is put in a box. The following items are only put in a box if they are within a specified distance from any existing box. If not then a new box is created and the item is put in the new box.

The algorithm is also explained here:
https://developers.google.com/maps/articles/toomanymarkers

My implementation running time is O(k*n) where k is number of boxes and n is number of points.
(Please make a comment if I am wrong).
It is possible to manipulate the running time by setting the box-size. If the box-size is big the k will be smaller and the running time faster.

Imagine you have a set of point like this.
And you are interested in clustering to avoid overlapping points and for reducing the visual overload.

Here’s and example of how marker clustering works.
Every point has been put in box by running the defined algorithm above. the boxsize is user-defined.
The distance of the cluster point can minimum be boxsize / 2.
Thus we don’t need to consider merging clusters if they should be to close to each other and overlap.

Here is the box-area removed for a more clean look.
Notice that the points are not all inside the nearest box. This usually happens but there is a simple fix explained later if needed.

And the points are removed for an even more clean look. Now there are only cluster points with numbers that tells how many points they contain.

As the last step in the algorithm all the points are iterated and compared to the existing cluster points. They are assigned to the nearest cluster. This takes O(k*n). The cluster points are located at the center of the boxes.
Here you can see an example how it looks like but this image is for another dataset.
Notice at top left corner there is one red point inside the green box which is not inside it’s ‘correct’ box. This is not an error but because the nearest cluster point for the red point is the red box, the distances are calculated by comparing the center of the boxes to the points. If needed it can also be ‘fixed’ but that is usually not needed.

Alternatively I could also have used a distance-based approach, where you use circles instead of boxes. This is explained in the google code reference link at the top and I will skip the description of the algorithm because it is very similar to the marker clusterer. But I have included the source code for the distance clustering at the bottom. Here’s an image of how the clustering would like like.

Here are some performance test for the marker clusterer algorithm executed from my laptop.
about 0.15 sec for 10.000 points.
about 7.5 sec for 1.000.000 points where the k = 4.

With a bigger box area which can contain all the points, i.e. k = 1, the performance is about 4.8 sec for 1.000.000 points and 0.4 sec. for 100.000 points.

Summary
For the task of clustering overlapping points.
In summary of the 4 clusters: grid, k-means, marker and distance I would usually prefer grid, marker or distance clustering as they have fast running time O(k*n) and you can easily manipulate the running time by adjusting the k-value by defining the grid-size, boxsize or distance-size. K-means can have poor running time depending on your dataset and max_error level. K-means will run slower than the other 3 algorithms for large n.

This is my bundle file that contains the 4 clustering algorithms (grid, k-means, marker, distance) in one package.
The other two clustering algorithms (grid and k-means) are explained in my previous post.
Out of the box, the C# file runs marker clustering with similar result as the images above. It generates a draw.js
file with Canvas drawing information used by the canvas.html file.
All the classes are put in one file for simple distribution. Refactor as you like.

AlgorithmClusterBundle.cs

 
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;

namespace ClusterBundle
{
    // By Kunuk Nykjaer
    // Clustering package, Grid clustering, MarkerClusterer and auto increasing K-means clustering
    // should split all class to each file and refactor all global const away
    // quick source code bundle version
    // Version 0.4
    //
    // for visualization this generates a draw.js file which is used by the canvas.html file
    // the canvas.html is distributed elsewhere, e.g. at my blog at https://kunuk.wordpress.com
    public class AlgorithmClusterBundle
    {
        public static DateTime Starttime;
        static void Main(string[] args)
        {
            Starttime = DateTime.Now;
            var algo = new AlgorithmClusterBundle();

            // run clustertype
            algo.Run(clusterType);
            algo.GenerateJavascriptDrawFile(clusterType); //create draw.js

            var timespend = DateTime.Now.Subtract(Starttime).TotalSeconds;
            Console.WriteLine(timespend + " sec. press a key ...");
            Console.ReadKey();
        }

        //------------------------------
        // USER CONFIG, CUSTOMIZE BELOW
        public const ClusterType clusterType = ClusterType.MarkerClusterer;
        public static readonly int N = 100; //number of points

        // use centroid or semi-centroid cluster point placement visualization?
        public const bool DoUpdateAllCentroidsToNearestContainingPoint = false;
        public const bool UseProfiling = false; //debug, output time spend

        // window frame boundary, aka viewport
        public const double MinX = 10;
        public const double MinY = 10;
        public const double MaxX = 400;
        public const double MaxY = 300;
        // where the files are saved or loaded
        public const string FolderPath = @"c:\temp\";

        // K-MEANS config
        // heuristic, set tolerance for cluster density, has effect on running time.
        // Set high for many points in dataset, can be lower for fewer points
        public const double MAX_ERROR = 50;

        // MARKERCLUSTERER config
        // box area i.e. cluster size
        public const int MARKERCLUSTERER_SIZE = 100;

        // DISTANCECLUSTERER config
        // radius size i.e. cluster size
        public const int DISTANCECLUSTERER_SIZE = 62;

        // GRIDCLUSTER config
        // grid size
        public const int GridX = 7;
        public const int GridY = 6;        
        public const bool DoMergeGridIfCentroidsAreCloseToEachOther = true;
        // USER CONFIG, CUSTOMIZE ABOVE
        //------------------------------



        public enum ClusterType { Unknown = -1, Grid, KMeans, MarkerClusterer, DistanceClusterer } ;
        static readonly CultureInfo Culture_enUS = new CultureInfo("en-US");
        const string S = "G";

        readonly List<XY> _points = new List<XY>();
        List<XY> _clusters = new List<XY>();

        public static readonly Random Rand = new Random();

        const double AbsSizeX = MaxX - MinX;
        const double AbsSizeY = MaxY - MinY;

        // for bucket placement calc, grid cluster algo
        const double DeltaX = AbsSizeX / GridX;
        const double DeltaY = AbsSizeY / GridY;

        static private readonly string NL = Environment.NewLine;
        private const string ctx = "ctx"; // javascript canvas
        public bool DisplayGridInCanvas = false; //autoset by used cluster type

        public AlgorithmClusterBundle()
        {
            CreateTestDataSet(N); // create points in memory
            //SaveDataSetToFile(_points); // save test points to file
            //_points = LoadDataSetFromFile(); // load test points from file            
        }

        // dictionary lookup key used by grid cluster algo
        public static string GetId(int idx, int idy) //O(1)
        {
            return idx + ";" + idy;
        }

        // save points from mem to file
        private const string DatasetSerializeName = "dataset.ser";
        static void SaveDataSetToFile(List<XY> dataset)
        {
            var objectToSerialize = new DatasetToSerialize { Dataset = dataset };
            new Serializer().SerializeObject(FolderPath + DatasetSerializeName, objectToSerialize);
        }
        // load points from file to mem
        static List<XY> LoadDataSetFromFile()
        {
            var objectToSerialize = (DatasetToSerialize)
                (new Serializer().DeSerializeObject(FolderPath + DatasetSerializeName));
            return objectToSerialize.Dataset;
        }

        void CreateTestDataSet(int n) //O(n)
        {
            // CREATE DATA SET

            // Create random scattered points
            //for (int i = 0; i < n; i++)
            //{
            //    var x = MinX + Rand.NextDouble() * (AbsSizeX);
            //    var y = MinY + Rand.NextDouble() * (AbsSizeY);
            //    _points.Add(new XY(x, y));
            //}


            // Create random region of clusters
            for (int i = 0; i < n / 3; i++)
            {
                var x = MinX + 50 + Rand.NextDouble() * 100;
                var y = MinY + 50 + Rand.NextDouble() * 100;
                _points.Add(new XY(x, y));
            }

            for (int i = 0; i < n / 3; i++)
            {
                var x = MinX + 260 + Rand.NextDouble() * 100;
                var y = MinY + 110 + Rand.NextDouble() * 100;
                _points.Add(new XY(x, y));
            }

            for (int i = 0; i < n / 3; i++)
            {
                var x = MinX + 100 + Rand.NextDouble() * 100;
                var y = MinY + 200 + Rand.NextDouble() * 100;
                _points.Add(new XY(x, y));
            }
        }

        public void Run(ClusterType clustertype)
        {
            switch (clustertype)
            {
                case ClusterType.Grid:
                    _clusters = new GridCluster(_points).GetCluster();
                    DisplayGridInCanvas = true;
                    break;
                case ClusterType.KMeans:
                    _clusters = new KMeans(_points).GetCluster();
                    break;
                case ClusterType.MarkerClusterer:
                    _clusters = new MarkerClusterer(_points).GetCluster();
                    break;
                case ClusterType.DistanceClusterer:
                    _clusters = new DistanceClusterer(_points).GetCluster();
                    break;
                case ClusterType.Unknown:
                    break;
            }
        }

        public static string GetRandomColor()
        {
            int r = Rand.Next(10, 250);
            int g = Rand.Next(10, 250);
            int b = Rand.Next(10, 250);
            return string.Format("'rgb({0},{1},{2})'", r, g, b);
        }

        static void CreateFile(string s)
        {
            var path = new FileInfo(FolderPath + "draw.js");
            var isCreated = FileUtil.WriteFile(s, path);
            Console.WriteLine(isCreated + " = File is Created");
        }
        public void GenerateJavascriptDrawFile(ClusterType clusterType)
        {
            var sb = new StringBuilder();

            // markers
            var head = "function drawMarkers(ctx) {" + NL;
            var tail = NL + "}" + NL;

            sb.Append(head);

            // grid
            if (DisplayGridInCanvas)
            {
                sb.Append("ctx.beginPath();" + NL);
                for (int i = 0; i < GridX + 1; i++)
                {
                    sb.Append(String.Format("ctx.moveTo({0}, {1});{2}",
                                            ToStringEN(MinX + i * DeltaX),
                                            ToStringEN(MinY), NL));
                    sb.Append(String.Format("ctx.lineTo({0}, {1});{2}",
                                            ToStringEN(MinX + i * DeltaX),
                                            ToStringEN(MaxY), NL));
                }
                for (int j = 0; j < GridY + 1; j++)
                {
                    sb.Append(String.Format("ctx.moveTo({0}, {1});{2}",
                                            ToStringEN(MinX),
                                            ToStringEN(MinY + j * DeltaY), NL));
                    sb.Append(String.Format("ctx.lineTo({0}, {1});{2}",
                                            ToStringEN(MaxX),
                                            ToStringEN(MinY + j * DeltaY), NL));
                }
                sb.Append("ctx.stroke(); " + NL);
            }

            // if to many points, the canvas can not be drawn or is slow, 
            // use max points and clusters for drawing
            const int max = 10000;

            // markers
            if (_points != null)
            {
                sb.Append(NL);
                for (int i = 0; i < _points.Count; i++)
                {
                    var p = _points[i];
                    sb.Append(string.Format("drawMark({0}, {1}, {2}, {3});{4}",
                              ToStringEN(p.X), ToStringEN(p.Y), p.Color, ctx, NL));
                    if (i > max)
                        break;
                }
            }
            string clusterInfo = "0";
            if (_clusters != null)
            {
                sb.Append(NL);

                if (clusterType == ClusterType.Grid || clusterType == ClusterType.KMeans)
                {
                    for (int i = 0; i < _clusters.Count; i++)
                    {
                        var c = _clusters[i];
                        sb.Append(string.Format("drawCluster({0}, {1}, {2}, {3}, {4});{5}",
                                                ToStringEN(c.X), ToStringEN(c.Y), c.Color,
                                                c.Size, ctx, NL));

                        if (i > max)
                            break;
                    }
                }

                else if (clusterType == ClusterType.MarkerClusterer)
                {
                    for (int i = 0; i < _clusters.Count; i++)
                    {
                        var c = _clusters[i];
                        sb.Append(string.Format(
                            "drawMarkerCluster({0}, {1}, {2}, {3}, {4}, {5});{6}",
                                                ToStringEN(c.X), ToStringEN(c.Y), c.Color,
                                                c.Size, MARKERCLUSTERER_SIZE, ctx, NL));

                        if (i > max)
                            break;
                    }
                }
                else if (clusterType == ClusterType.DistanceClusterer)
                {
                    for (int i = 0; i < _clusters.Count; i++)
                    {
                        var c = _clusters[i];
                        sb.Append(string.Format(
                            "drawDistanceCluster({0}, {1}, {2}, {3}, {4}, {5});{6}",
                                                ToStringEN(c.X), ToStringEN(c.Y), c.Color,
                                                c.Size, DISTANCECLUSTERER_SIZE, ctx, NL));

                        if (i > max)
                            break;
                    }
                }

                clusterInfo = _clusters.Count + string.Empty;
            }

            // bottom text                
            sb.Append("ctx.fillStyle = 'rgb(0,0,0)';" + NL);
            sb.Append(string.Format("ctx.fillText('Clusters = ' + {0}, {1}, {2});{3}",
                clusterInfo, ToStringEN(MinX + 10), ToStringEN(MaxY + 20), NL));

            sb.Append(tail);
            CreateFile(sb.ToString());
            //Console.WriteLine(sb.ToString());
        }
        public static string ToStringEN(double d)
        {
            double rounded = Math.Round(d, 3);
            return rounded.ToString(S, Culture_enUS);
        }
        public static void Profile(string s)//O(1)
        {
            if (!UseProfiling)
                return;
            var timespend = DateTime.Now.Subtract(Starttime).TotalSeconds;
            Console.WriteLine(timespend + " sec. " + s);
        }


        [Serializable()]
        public class XY : IComparable, ISerializable
        {
            public double X { get; set; }
            public double Y { get; set; }
            public string Color { get; set; }
            public int Size { get; set; }

            //public string XToString(){return X.ToString(S, Culture_enUS);}
            //public string YToString(){return Y.ToString(S, Culture_enUS);}

            public XY() { }
            public XY(double x, double y)
            {
                X = x;
                Y = y;
                Color = "'rgb(0,0,0)'";//default            
            }
            public XY(XY p) //clone
            {
                this.X = p.X;
                this.Y = p.Y;
                this.Color = p.Color;
                this.Size = p.Size;
            }

            public int CompareTo(object o) // if used in sorted list
            {
                if (this.Equals(o))
                    return 0;

                var other = (XY)o;
                if (this.X > other.X)
                    return -1;
                if (this.X < other.X)
                    return 1;

                return 0;
            }

            // used by k-means random distinct selection of cluster point
            public override int GetHashCode()
            {
                var x = X * 10000; //make the decimals be important
                var y = Y * 10000;
                var r = x * 17 + y * 37;
                return (int)r;
            }
            private const int ROUND = 6;
            public override bool Equals(Object o)
            {
                if (o == null)
                    return false;
                var other = o as XY;
                if (other == null)
                    return false;

                // rounding could be skipped
                // depends on granularity of wanted decimal precision
                // note, 2 points with same x,y is regarded as being equal
                var x = Math.Round(this.X, ROUND) == Math.Round(other.X, ROUND);
                var y = Math.Round(this.Y, ROUND) == Math.Round(other.Y, ROUND);
                return x && y;
            }


            public XY(SerializationInfo info, StreamingContext ctxt)
            {
                this.X = (double)info.GetValue("X", typeof(double));
                this.Y = (double)info.GetValue("Y", typeof(double));
                this.Color = (string)info.GetValue("Color", typeof(string));
                this.Size = (int)info.GetValue("Size", typeof(int));
            }

            public void GetObjectData(SerializationInfo info, StreamingContext ctxt)
            {
                info.AddValue("X", this.X);
                info.AddValue("Y", this.Y);
                info.AddValue("Color", this.Color);
                info.AddValue("Size", this.Size);
            }
        }
        public class Bucket
        {
            public string Id { get; private set; }
            public List<XY> Points { get; private set; }
            public XY Centroid { get; set; }
            public int Idx { get; private set; }
            public int Idy { get; private set; }
            public double ErrorLevel { get; set; } // clusterpoint and points avg dist
            private bool _IsUsed;
            public bool IsUsed
            {
                get { return _IsUsed && Centroid != null; }
                set { _IsUsed = value; }
            }
            public Bucket(string id)
            {
                IsUsed = true;
                Centroid = null;
                Points = new List<XY>();
                Id = id;
            }
            public Bucket(int idx, int idy)
            {
                IsUsed = true;
                Centroid = null;
                Points = new List<XY>();
                Idx = idx;
                Idy = idy;
                Id = GetId(idx, idy);
            }
        }


        public abstract class BaseClusterAlgorithm
        {
            public List<XY> BaseDataset; // all points
            //id, bucket
            public readonly Dictionary<string, Bucket> BaseBucketsLookup =
                new Dictionary<string, Bucket>();

            public BaseClusterAlgorithm(){}
            public BaseClusterAlgorithm(List<XY> dataset)
            {
                if (dataset == null || dataset.Count == 0)
                    throw new ApplicationException(
                        string.Format("dataset is null or empty"));

                BaseDataset = dataset;
            }

            public abstract List<XY> GetCluster();
            //O(k?? random fn can be slow, but is not slow because atm the k is always 1)
            public static XY[] BaseGetRandomCentroids(List<XY> list, int k)
            {
                Profile("BaseGetRandomCentroids beg");
                var set = new HashSet<XY>();
                int i = 0;
                var kcentroids = new XY[k];

                int MAX = list.Count;
                while (MAX >= k)
                {
                    int index = Rand.Next(0, MAX - 1);
                    var xy = list[index];
                    if (set.Contains(xy))
                        continue;

                    set.Add(xy);
                    kcentroids[i++] = new XY(xy.X, xy.Y) { Color = GetRandomColor() };

                    if (i >= k)
                        break;
                }
                return kcentroids;
            }

            public List<XY> BaseGetClusterResult()
            {
                Profile("BaseGetClusterResult beg");                

                // collect used buckets and return the result
                var clusterPoints = new List<XY>();
                foreach (var item in BaseBucketsLookup)
                {
                    var bucket = item.Value;
                    if (bucket.IsUsed)
                    {
                        bucket.Centroid.Size = bucket.Points.Count;
                        clusterPoints.Add(bucket.Centroid);
                    }
                }

                return clusterPoints;
            }
            public static XY BaseGetCentroidFromCluster(List<XY> list) //O(n)
            {
                Profile("BaseGetCentroidFromCluster beg");
                int count = list.Count;
                if (list == null || count == 0)
                    return null;

                // color is set for the points and the cluster point here
                var color = GetRandomColor(); //O(1)
                XY centroid = new XY(0, 0) { Size = list.Count };//O(1)
                foreach (XY p in list)
                {
                    p.Color = color;
                    centroid.X += p.X;
                    centroid.Y += p.Y;
                }
                centroid.X /= count;
                centroid.Y /= count;
                var cp = new XY(centroid.X, centroid.Y) { Size = count, Color = color };

                return cp;
            }
            //O(k*n)
            public static void BaseSetCentroidForAllBuckets(IEnumerable<Bucket> buckets)
            {
                Profile("BaseSetCentroidForAllBuckets beg");
                foreach (var item in buckets)
                {
                    var bucketPoints = item.Points;
                    var cp = BaseGetCentroidFromCluster(bucketPoints);
                    item.Centroid = cp;
                }
            }
            public double BaseGetTotalError()//O(k)
            {
                int centroidsUsed = BaseBucketsLookup.Values.Count(b => b.IsUsed);
                double sum = BaseBucketsLookup.Values.
                    Where(b => b.IsUsed).Sum(b => b.ErrorLevel);
                return sum / centroidsUsed;
            }
            public string BaseGetMaxError() //O(k)
            {
                double maxError = -double.MaxValue;
                string id = string.Empty;
                foreach (var b in BaseBucketsLookup.Values)
                {
                    if (!b.IsUsed || b.ErrorLevel <= maxError)
                        continue;

                    maxError = b.ErrorLevel;
                    id = b.Id;
                }
                return id;
            }
            public XY BaseGetClosestPoint(XY from, List<XY> list) //O(n)
            {
                double min = double.MaxValue;
                XY closests = null;
                foreach (var p in list)
                {
                    var d = MathTool.Distance(from, p);
                    if (d >= min)
                        continue;

                    // update
                    min = d;
                    closests = p;
                }
                return closests;
            }
            public XY BaseGetLongestPoint(XY from, List<XY> list) //O(n)
            {
                double max = -double.MaxValue;
                XY longest = null;
                foreach (var p in list)
                {
                    var d = MathTool.Distance(from, p);
                    if (d <= max)
                        continue;

                    // update
                    max = d;
                    longest = p;
                }
                return longest;
            }
            // assign all points to nearest cluster
            public void BaseUpdatePointsByCentroid()//O(n*k)
            {
                Profile("UpdatePointsByCentroid beg");
                int count = BaseBucketsLookup.Count();

                // clear points in the buckets, they will be re-inserted
                foreach (var bucket in BaseBucketsLookup.Values)
                    bucket.Points.Clear();

                foreach (XY p in BaseDataset)
                {
                    double minDist = Double.MaxValue;
                    string index = string.Empty;
                    //for (int i = 0; i < count; i++)
                    foreach (var i in BaseBucketsLookup.Keys)
                    {
                        var bucket = BaseBucketsLookup[i];
                        if (bucket.IsUsed == false)
                            continue;

                        var centroid = bucket.Centroid;
                        var dist = MathTool.Distance(p, centroid);
                        if (dist < minDist)
                        {
                            // update
                            minDist = dist;
                            index = i;
                        }
                    }
                    //update color for point to match centroid and re-insert
                    var closestBucket = BaseBucketsLookup[index];
                    p.Color = closestBucket.Centroid.Color;
                    closestBucket.Points.Add(p);
                }
            }

            // update centroid location to nearest point, 
            // e.g. if you want to show cluster point on a real existing point area
            //O(n)
            public void BaseUpdateCentroidToNearestContainingPoint(Bucket bucket)
            {
                Profile("BaseUpdateCentroidToNearestContainingPoint beg");
                if (bucket == null || bucket.Centroid == null ||
                    bucket.Points == null || bucket.Points.Count == 0)
                    return;

                var closest = BaseGetClosestPoint(bucket.Centroid, bucket.Points);
                bucket.Centroid.X = closest.X;
                bucket.Centroid.Y = closest.Y;
            }
            //O(k*n)
            public void BaseUpdateAllCentroidsToNearestContainingPoint()
            {
                Profile("BaseUpdateAllCentroidsToNearestContainingPoint beg");
                foreach (var bucket in BaseBucketsLookup.Values)
                    BaseUpdateCentroidToNearestContainingPoint(bucket);
            }
        }

        public class DistanceClusterer : BaseClusterAlgorithm
        {
            public DistanceClusterer(List<XY> dataset)
                : base(dataset)
            {                
            }

            public override List<XY> GetCluster()
            {
                var cluster = RunClusterAlgo();
                return cluster;
            }

            // O(k*n)
            List<XY> RunClusterAlgo()
            {
                // put points in buckets     
                int allPointsCount = BaseDataset.Count;
                var firstPoint = BaseDataset[0];
                firstPoint.Color = GetRandomColor();
                var firstId = 0.ToString();
                var firstBucket = new Bucket(firstId) { Centroid = firstPoint };
                BaseBucketsLookup.Add(firstId, firstBucket);

                for (int i = 1; i < allPointsCount; i++)
                {
                    var set = new HashSet<string>(); //cluster candidate list
                    var p = BaseDataset[i];
                    // iterate clusters and collect candidates
                    foreach (var bucket in BaseBucketsLookup.Values)
                    {                        
                        var isInCluster = MathTool.DistWithin(p, bucket.Centroid, DISTANCECLUSTERER_SIZE);
                        if (!isInCluster)
                            continue;

                        set.Add(bucket.Id);
                        //use first, short dist will be calc at last step before returning data
                        break;
                    }

                    // if not within box area, then make new cluster   
                    if (set.Count == 0)
                    {
                        var pid = i.ToString();
                        p.Color = GetRandomColor();
                        var newbucket = new Bucket(pid) { Centroid = p };
                        BaseBucketsLookup.Add(pid, newbucket);
                    }
                }

                //important, align all points to closest cluster point
                BaseUpdatePointsByCentroid();

                return BaseGetClusterResult();
            }

        }

        public class MarkerClusterer : BaseClusterAlgorithm
        {
            public MarkerClusterer(List<XY> dataset) : base(dataset)
            {                
            }

            public override List<XY> GetCluster()
            {
                var cluster = RunClusterAlgo();
                return cluster;
            }

            // O(k*n)
            List<XY> RunClusterAlgo()
            {                
                // put points in buckets     
                int allPointsCount = BaseDataset.Count;
                var firstPoint = BaseDataset[0];
                firstPoint.Color = GetRandomColor();
                var firstId = 0.ToString();
                var firstBucket = new Bucket(firstId) { Centroid = firstPoint };                
                BaseBucketsLookup.Add(firstId, firstBucket);
                
                for (int i = 1; i < allPointsCount; i++)
                {
                    var set = new HashSet<string>(); //cluster candidate list
                    var p = BaseDataset[i];
                    // iterate clusters and collect candidates
                    foreach (var bucket in BaseBucketsLookup.Values)
                    {
                        var isInCluster = MathTool.BoxWithin(p, bucket.Centroid, MARKERCLUSTERER_SIZE);
                        if (!isInCluster) 
                            continue;

                        set.Add(bucket.Id);
                        //use first, short dist will be calc at last step before returning data
                        break;
                    }

                    // if not within box area, then make new cluster   
                    if (set.Count == 0)
                    {
                        var pid = i.ToString();
                        p.Color = GetRandomColor();
                        var newbucket = new Bucket(pid) { Centroid = p };                        
                        BaseBucketsLookup.Add(pid, newbucket);
                    }
                }

                //important, align all points to closest cluster point
                BaseUpdatePointsByCentroid();
               
                return BaseGetClusterResult();
            }
        }

        // O(exponential) ~ can be slow when n or k is big
        public class KMeans : BaseClusterAlgorithm
        {
            private readonly int _InitClusterSize; // start from this cluster points
            // Rule of thumb k = sqrt(n/2)        

            // cluster point optimization iterations
            private const int _MaxIterations = 100; 
            private const int _MaxClusters = 100;

            public KMeans(List<XY> dataset) : base(dataset)
            {                
                _InitClusterSize = 1;
            }

            public override List<XY> GetCluster()
            {
                var cluster = RunClusterAlgo();
                return cluster;
            }

            List<XY> RunClusterAlgo()
            {
                /*
                ITERATIVE LINEAR ADDING CLUSTER UNTIL 
                REPEATE iteration of clusters convergence 
                   until the max error is small enough
                if not, insert a new cluster at worst place 
                (farthest in region of worst cluster area) and run again 
                keeping current cluster points              
             
                 // one iteration of clusters convergence is defined as ..
              1) Random k centroids
              2) Cluster data by euclidean distance to centroids
              3) Update centroids by clustered data,     
              4) Update cluster
              5) Continue last two steps until error is small, error is sum of diff
                  between current and updated centroid                          
               */

                RunAlgo();

                if (DoUpdateAllCentroidsToNearestContainingPoint)
                    BaseUpdateAllCentroidsToNearestContainingPoint();
                return BaseGetClusterResult();
            }

            void RunAlgo()
            {
                // Init clusters
                var centroids = BaseGetRandomCentroids(BaseDataset, _InitClusterSize);
                for (int i = 0; i < centroids.Length; i++)
                {
                    var pid = i.ToString();
                    var newbucket = new Bucket(pid) { Centroid = centroids[i] };
                    BaseBucketsLookup.Add(pid, newbucket);
                }

                //
                double currentMaxError = double.MaxValue;
                while (currentMaxError > MAX_ERROR && BaseBucketsLookup.Count < _MaxClusters)
                {
                    RunIterationsUntilKClusterPlacementAreDone();

                    var id = BaseGetMaxError();
                    var bucket = BaseBucketsLookup[id];
                    currentMaxError = bucket.ErrorLevel; //update
                    if (currentMaxError > MAX_ERROR)
                    {
// Here it is linear speed when putting one new centroid at a time
// should be semi-fast because the new point is inserted at best area 
//from current centroids view.
// possible improvement exists by log2 search by inserting multiple centroids and 
// reducing centroid again if needed

// put new centroid in area where maxError but farthest away from current centroid in area
                        var longest = BaseGetLongestPoint(bucket.Centroid, bucket.Points);
                        var newcentroid = new XY(longest);
                        var newid = BaseBucketsLookup.Count.ToString();
                        var newbucket = new Bucket(newid) { Centroid = newcentroid };
                        BaseBucketsLookup.Add(newid, newbucket);
                    }
                }
            }

            void RunIterationsUntilKClusterPlacementAreDone()
            {
                Profile("RunIterationsUntilKClusterPlacementAreDone beg");
                double prevError = Double.MaxValue;
                double currError = Double.MaxValue;

                for (int i = 0; i < _MaxIterations; i++)
                {
                    prevError = currError;
                    currError = RunOneIteration();
                    if (currError >= prevError) // no improvement
                        break;
                }
            }

            double RunOneIteration() //O(k*n)
            {
                Profile("RunOneIteration beg");

                // update points, assign points to cluster
                BaseUpdatePointsByCentroid();
                // update centroid pos by its points
                BaseSetCentroidForAllBuckets(BaseBucketsLookup.Values);//O(k*n)
                var clustersCount = BaseBucketsLookup.Count;
                for (int i = 0; i < clustersCount; i++)
                {
                    var currentBucket = BaseBucketsLookup[i.ToString()];
                    if (currentBucket.IsUsed == false)
                        continue;

                    //update centroid                
                    var newcontroid = BaseGetCentroidFromCluster(currentBucket.Points);
                    //no need to update color, autoset
                    currentBucket.Centroid = newcontroid;
                    currentBucket.ErrorLevel = 0;
                    //update error                
                    foreach (var p in currentBucket.Points)
                    {
                        var dist = MathTool.Distance(newcontroid, p);
                        currentBucket.ErrorLevel += dist;
                    }
                    var val = currentBucket.ErrorLevel / currentBucket.Points.Count;
                    currentBucket.ErrorLevel = val; //Math.Sqrt(val);
                }

                return BaseGetTotalError();
            }

        }

        public class GridCluster : BaseClusterAlgorithm
        {
            public GridCluster(List<XY> dataset) : base(dataset)
            {                
            }

            public override List<XY> GetCluster()
            {
                var cluster = RunClusterAlgo(GridX, GridY);
                return cluster;
            }

            // O(k*n)
            void MergeClustersGrid()
            {
                foreach (var key in BaseBucketsLookup.Keys)
                {
                    var bucket = BaseBucketsLookup[key];
                    if (bucket.IsUsed == false)
                        continue;

                    var x = bucket.Idx;
                    var y = bucket.Idy;

                    // get keys for neighbors
                    var N = GetId(x, y + 1);
                    var NE = GetId(x + 1, y + 1);
                    var E = GetId(x + 1, y);
                    var SE = GetId(x + 1, y - 1);
                    var S = GetId(x, y - 1);
                    var SW = GetId(x - 1, y - 1);
                    var W = GetId(x - 1, y);
                    var NW = GetId(x - 1, y - 1);
                    var neighbors = new[] { N, NE, E, SE, S, SW, W, NW };

                    MergeClustersGridHelper(key, neighbors);
                }
            }
            void MergeClustersGridHelper(string currentKey, string[] neighborKeys)
            {
                double minDistX = DeltaX / 2.0;//heuristic, higher value gives less merging
                double minDistY = DeltaY / 2.0;
                //if clusters in grid are too close to each other, merge them
                double minDist = MathTool.Max(minDistX, minDistY);

                foreach (var neighborKey in neighborKeys)
                {
                    if (!BaseBucketsLookup.ContainsKey(neighborKey))
                        continue;

                    var neighbor = BaseBucketsLookup[neighborKey];
                    if (neighbor.IsUsed == false)
                        continue;

                    var current = BaseBucketsLookup[currentKey];
                    var dist = MathTool.Distance(current.Centroid, neighbor.Centroid);
                    if (dist > minDist)
                        continue;

                    // merge
                    var color = current.Centroid.Color;
                    foreach (var p in neighbor.Points)
                    {
                        //update color
                        p.Color = color;
                    }

                    current.Points.AddRange(neighbor.Points);//O(n)

                    // recalc centroid
                    var cp = BaseGetCentroidFromCluster(current.Points);
                    current.Centroid = cp;
                    neighbor.IsUsed = false; //merged, then not used anymore
                    neighbor.Points.Clear(); //clear mem
                }
            }

            public List<XY> RunClusterAlgo(int gridx, int gridy)
            {                
                // params invalid
                if (gridx <= 0 || gridy <= 0)
                    throw new ApplicationException("GridCluster.RunClusterAlgo gridx or gridy is <= 0");

                // put points in buckets
                foreach (var p in BaseDataset)
                {
                    // find relative val
                    var relativeX = p.X - MinX;
                    var relativeY = p.Y - MinY;

                    int idx = (int)(relativeX / DeltaX);
                    int idy = (int)(relativeY / DeltaY);

                    // bucket id
                    string id = GetId(idx, idy);

                    // bucket exists, add point
                    if (BaseBucketsLookup.ContainsKey(id))
                        BaseBucketsLookup[id].Points.Add(p);

                    // new bucket, create and add point
                    else
                    {
                        Bucket bucket = new Bucket(idx, idy);
                        bucket.Points.Add(p);
                        BaseBucketsLookup.Add(id, bucket);
                    }
                }

                // calc centrod for all buckets
                BaseSetCentroidForAllBuckets(BaseBucketsLookup.Values);

                // merge if gridpoint is to close
                if (DoMergeGridIfCentroidsAreCloseToEachOther)
                    MergeClustersGrid();

                if (DoUpdateAllCentroidsToNearestContainingPoint)
                {
                    BaseUpdateAllCentroidsToNearestContainingPoint();                                        
                }

                // check again
                // merge if gridpoint is to close
                if (DoMergeGridIfCentroidsAreCloseToEachOther 
                    && DoUpdateAllCentroidsToNearestContainingPoint)
                {
                    MergeClustersGrid();
                    // and again set centroid to closest point in bucket 
                    BaseUpdateAllCentroidsToNearestContainingPoint();
                }

                return BaseGetClusterResult();
            }
        }

        public class MathTool
        {
            private const double Exp = 2; // 2=euclid, 1=manhatten
            //Minkowski dist        
            public static double Distance(XY a, XY b)
            {
                return Math.Pow(Math.Pow(Math.Abs(a.X - b.X), Exp) +
                    Math.Pow(Math.Abs(a.Y - b.Y), Exp), 1.0 / Exp);
            }

            public static double Min(double a, double b)
            {
                return a <= b ? a : b;
            }
            public static double Max(double a, double b)
            {
                return a >= b ? a : b;
            }

            public static bool DistWithin(XY a, XY b, double d)
            {
                var dist = Distance(a, b);
                return dist < d;
            }

            public static bool BoxWithin(XY a, XY b, double boxsize)
            {
                var d = boxsize / 2;
                var withinX = a.X - d <= b.X && a.X + d >= b.X;
                var withinY = a.Y - d <= b.Y && a.Y + d >= b.Y;
                return withinX && withinY;
            }
        }

        public static class FileUtil
        {
            public static bool WriteFile(string data, FileInfo fileInfo)
            {
                bool isSuccess = false;
                try
                {
                    using (StreamWriter streamWriter =
                        File.CreateText(fileInfo.FullName))
                    {
                        streamWriter.Write(data);
                        isSuccess = true;
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.StackTrace + "\nPress a key ... ");
                    Console.ReadKey();
                }
                return isSuccess;
            }
        }

        [Serializable()]
        public class DatasetToSerialize : ISerializable
        {
            private const string Name = "Dataset";
            public List<XY> Dataset { get; set; }
            public DatasetToSerialize()
            {
                Dataset = new List<XY>();
            }
            public DatasetToSerialize(SerializationInfo info, StreamingContext ctxt)
            {
                this.Dataset = (List<XY>)info.GetValue(Name, typeof(List<XY>));
            }
            public void GetObjectData(SerializationInfo info, StreamingContext ctxt)
            {
                info.AddValue(Name, this.Dataset);
            }
        }

        public class Serializer
        {
            public void SerializeObject(string filename, object objectToSerialize)
            {
                try
                {
                    using (Stream stream = File.Open(filename, FileMode.Create))
                    {
                        BinaryFormatter bFormatter = new BinaryFormatter();
                        bFormatter.Serialize(stream, objectToSerialize);
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.StackTrace + "\nPress a key ... ");
                    Console.ReadKey();
                }
            }
            public object DeSerializeObject(string filename)
            {
                try
                {
                    using (Stream stream = File.Open(filename, FileMode.Open))
                    {
                        BinaryFormatter bFormatter = new BinaryFormatter();
                        var objectToSerialize = bFormatter.Deserialize(stream);
                        return objectToSerialize;
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.StackTrace + "\nPress a key ... ");
                    Console.ReadKey();
                }
                return null;
            }
        }
    }
}

Put both the html and javascript file in the c:\temp folder.
The C# program creates a new draw.js file in the c:\temp folder.

Open the html file with a modern browser that support canvas, like Firefox or Chrome.
If you want to test with Internet Explorer on your local machine, you should download the excanvas javascript plugin
canvas.html

<html>
<head>
    <title>Cluster Test</title>
    <!-- IE canvas not supported fallback -->
    <!--[if lt IE 10]>
	<script src="excanvas.js"></script>
    <![endif]-->
    <script type="text/javascript" src="draw.js"></script>
    <!-- default -->
    <script type="text/javascript">
        var G = {
            yellow: "#FF0",
            red: "#F00",
            green: "#0F0",
            blue: "#00F",
            black: "#000"
        }

        function drawMark(cx, cy, color, ctx) {
            //draw a circle
            var radius = 4;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.fillStyle = color;
            ctx.fill();
            //ctx.stroke();
        }

        function drawCluster(cx, cy, color, n, ctx) {
            //draw a bigger circle
            var radius = 12;
            ctx.strokeStyle = color;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.stroke();
            ctx.fillStyle = color;
            ctx.fillText("" + n, cx - 8, cy + 2);
        }

        function drawMarkerCluster(cx, cy, color, n, size, ctx) {
            drawCluster(cx, cy, color, n, ctx);
            // draw rect
            var a = cx - (size / 2);
            var b = cy - (size / 2);
            var len = size;
            ctx.lineWidth = 2;
            ctx.strokeStyle = color;
            ctx.strokeRect(a, b, len, len);
        }

        function drawDistanceCluster(cx, cy, color, n, size, ctx) {
            drawCluster(cx, cy, color, n, ctx);
            // draw radius circle
            var radius = size;
            ctx.lineWidth = 2;
            ctx.strokeStyle = color;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.stroke();
        }

        window.onload = function draw() {
            var canvas = document.getElementById('canvas');
            if (canvas != null && canvas != undefined && canvas.getContext) {
                var ctx = canvas.getContext('2d');
                ctx.strokeStyle = G.black;
                ctx.font = "10pt Arial";
                drawMarkers(ctx);
            }
        }
    </script>
    <style type="text/css">
        body
        {
            margin-left: 10px;
            margin-top: 10px;
        }
        canvas
        {
            border: 1px solid red;
        }
    </style>
</head>
<body>
    <canvas id="canvas" width="610" height="430">
        <p>Your browser doesn't support canvas. Try Firefox, Chrome or an another modern browser.</p>
    </canvas>
</body>
</html>

The javascript file will be overwritten when you run the C# code.
draw.js

function drawMarkers(ctx) {
    drawMark(156.434, 157.921, 'rgb(182,114,144)', ctx);
    drawMark(156.788, 139.939, 'rgb(182,114,144)', ctx);
    drawMark(150.712, 67.882, 'rgb(67,235,136)', ctx);
    drawMark(445.846, 165.556, 'rgb(153,85,202)', ctx);
    drawMark(382.982, 179.917, 'rgb(153,85,202)', ctx);
    drawMark(466.365, 172.098, 'rgb(153,85,202)', ctx);
    drawMark(302.81, 243.478, 'rgb(48,234,127)', ctx);
    drawMark(235.35, 265.31, 'rgb(48,234,127)', ctx);
    drawMark(262.092, 258.321, 'rgb(48,234,127)', ctx);

    drawMarkerCluster(156.434, 157.921, 'rgb(182,114,144)', 2, 160, ctx);
    drawMarkerCluster(150.712, 67.882, 'rgb(67,235,136)', 1, 160, ctx);
    drawMarkerCluster(445.846, 165.556, 'rgb(153,85,202)', 3, 160, ctx);
    drawMarkerCluster(302.81, 243.478, 'rgb(48,234,127)', 3, 160, ctx);
    ctx.fillStyle = 'rgb(0,0,0)';
    ctx.fillText('Clusters = ' + 4, 20, 320);
}

Written by kunuk Nykjaer

September 20, 2011 at 10:49 pm

Clustering K-means and Grid with C# example and html-canvas (part 2)

with 3 comments

This is a follow up to my previous post.
https://kunuk.wordpress.com/2011/09/15/clustering-grid-cluster

The next post with updated code can be seen here.
https://kunuk.wordpress.com/2011/09/20/markerclusterer-with-c-example-and-html-canvas-part-3/

I have implemented K-means clustering algorithm which automatically increments the K value until the errorlevel is low enough to be accepted. The running time of the K-means runs reasonable fast. It’s best used in combination of chosing the right amount of points and setting the max_error. If the max_error is small and the number of points are big then the running time will be very slow. There are max iterations set to avoid very long running time as I believe K-means is NP-hard.

The insertion of the extra cluster point is linear and works by inserting them at the best place and keeping the current cluster points. It finds the current bad performing cluster point regarding error value and insert a new cluster point in that region farthest away from the bad performing cluster point. By doing this the new points are inserted at place which seems logical while minimizing the overall error level.

The Grid clustering algorithm runs pretty fast. Even with 100.000 points the time is below 2 sec. For 10.000 points the time is below 1 sec.

There is added an optional functionality to move the cluster point position to the closest point in the region of the clustered area after the final centroid calculation. By using this your cluster point will be positioned visually on a true existing point in the region.

When used with maps and websites the grid clustering algorithm is a good candidate of chose because of it’s fast running time which is O(m*n) where m is the number of grids and n is the number of points, please comment if I have made a wrong statement 🙂
The K-means can also be used when you know the expected points and setting the max_error accordingly.

Below are some image examples and the C# source code bundled in one file to keep it simple.

K-means 0.7 sec 10.000 points, max_error 100

K-means 2.2 sec 10.000 points, max_error 50

K-means 0.2 sec 150 points, max_error 30

K-means, an example of how K-means can’t “see” all the true clusters. Here the K is manually set to 3.
This will not happen with the current code, but this is mainly to show some weakness of the K-means algorithm.

To many points view

Grid clustering algorithm with grid 7×6 took 0.04 sec. with 160 points.
The cluster points location are moved to the nearest point in the region after the centroid calculation in this example. Some grid merging has also occurred because the clusters were too close to each other.

Final clustered result view

Performance test of the grid by using lots of points.
Grid 8×7, 1.3 sec, 100.000 points.
With Grid 6×5, it’s about 3 sec, without centroid calculation and without saving to file for 1.000.000 points.

Algorithm.cs

 
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;

// By Kunuk Nykjaer
// Clustering package, Grid clustering and auto increasing K-means clustering
// Version 0.2
public class Algorithm
{
    public static DateTime Starttime;
    static void Main(string[] args)
    {
        Starttime = DateTime.Now;
        var algo = new Algorithm();

        // run kmeans or grid
        algo.Run(ClusterType.KMeans); 
        //algo.Run(ClusterType.Grid);

        algo.GenerateJavascriptDrawFile(); //create draw.js

        var timespend = DateTime.Now.Subtract(Starttime).TotalSeconds;     
        Console.WriteLine(timespend+" sec. press a key ...");
        Console.ReadKey();
    }

    // use centroid or semi-centroid cluster point placement visualization?
    public const bool DoUpdateAllCentroidsToNearestContainingPoint = false;
    public const bool UseProfiling = false; //debug, output time spend

    public enum ClusterType { Unknown = -1, Grid, KMeans } ;
    static readonly CultureInfo Culture_enUS = new CultureInfo("en-US");
    const string S = "G";

    readonly List<XY> _points = new List<XY>();
    List<XY> _clusters = new List<XY>();

    public static readonly Random _rand = new Random();
    public const double MinX = 10;
    public const double MinY = 10;
    public const double MaxX = 400;
    public const double MaxY = 300;

    // used with Grid cluster algo
    public const int GridX = 8;
    public const int GridY = 7;

    const double AbsSizeX = MaxX - MinX;
    const double AbsSizeY = MaxY - MinY;

    // for bucket placement calc, grid cluster algo
    const double DeltaX = AbsSizeX / GridX;
    const double DeltaY = AbsSizeY / GridY;

    static private readonly string NL = Environment.NewLine;
    private const string ctx = "ctx"; // javascript canvas
    public bool DisplayGridInCanvas; //autoset by used cluster type

    public Algorithm()
    {
        CreateDataSet();
    }

    public static string GetId(int idx, int idy) //O(1)
    {
        return idx + ";" + idy;
    }

    void CreateDataSet() //O(1)
    {
        // CREATE DATA SET

        // Create random scattered points
        //for (int i = 0; i < 150; i++)
        //{
        //    var x = MinX + _rand.NextDouble() * (AbsSizeX);
        //    var y = MinY + _rand.NextDouble() * (AbsSizeY);
        //    _points.Add(new XY(x, y));
        //}
        
        // Create random region of clusters
        for (int i = 0; i < 50; i++)
        {
            var x = MinX + _rand.NextDouble() * 100;
            var y = MinY + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }

        for (int i = 0; i < 50; i++)
        {
            var x = MinX + 200 + _rand.NextDouble() * 100;
            var y = MinY + 10 + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }

        for (int i = 0; i < 50; i++)
        {
            var x = MinX + 50 + _rand.NextDouble() * 100;
            var y = MinY + 150 + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }         
        
    }

    public void Run(ClusterType clustertype)
    {
        switch (clustertype)
        {
            case ClusterType.Grid:
                _clusters = new GridBaseCluster(_points).GetCluster();
                DisplayGridInCanvas = true;
                break;
            case ClusterType.KMeans:
                _clusters = new KMeans(_points).GetCluster();
                DisplayGridInCanvas = false;
                break;
        }
    }

    public static string GetRandomColor()
    {
        int r = _rand.Next(10, 240);
        int g = _rand.Next(10, 240);
        int b = _rand.Next(10, 240);
        return string.Format("'rgb({0},{1},{2})'", r, g, b);
    }

    static void CreateFile(string s)
    {
        var path = new FileInfo(@"C:\temp\draw.js");
        var isCreated = FileUtil.WriteFile(s, path);
        Console.WriteLine(isCreated + " = File is Created");
    }
    public void GenerateJavascriptDrawFile()
    {
        var sb = new StringBuilder();

        // markers
        var head = NL + "function drawMarkers(ctx) {" + NL;
        var tail = NL + "}" + NL;

        sb.Append(head);

        // grid
        if (DisplayGridInCanvas)
        {
            sb.Append("ctx.beginPath();" + NL);
            for (int i = 0; i < GridX + 1; i++)
            {
                sb.Append(String.Format("ctx.moveTo({0}, {1});{2}",
                                        ToStringEN(MinX + i * DeltaX), 
                                        ToStringEN(MinY), NL));
                sb.Append(String.Format("ctx.lineTo({0}, {1});{2}",
                                        ToStringEN(MinX + i * DeltaX), 
                                        ToStringEN(MaxY), NL));
            }
            for (int j = 0; j < GridY + 1; j++)
            {
                sb.Append(String.Format("ctx.moveTo({0}, {1});{2}",
                                        ToStringEN(MinX), 
                                        ToStringEN(MinY + j * DeltaY), NL));
                sb.Append(String.Format("ctx.lineTo({0}, {1});{2}",
                                        ToStringEN(MaxX), 
                                        ToStringEN(MinY + j * DeltaY), NL));
            }
            sb.Append("ctx.stroke(); " + NL);
        }

        // markers
        if (_points != null)
            foreach (var p in _points)
                sb.Append(string.Format("drawMark({0}, {1}, {2}, {3});{4}",
                    ToStringEN(p.X), ToStringEN(p.Y), p.Color, ctx, NL));

        string clusterInfo = "0";
        if (_clusters != null)
        {
            foreach (var c in _clusters)
                sb.Append(string.Format("drawCluster({0}, {1}, {2}, {3}, {4});{5}",
                    ToStringEN(c.X), ToStringEN(c.Y), c.Color,
                                        c.Size, ctx, NL));

            clusterInfo = _clusters.Count + string.Empty;
        }

        // bottom text                
        sb.Append("ctx.fillStyle = 'rgb(0,0,0)';" + NL);
        sb.Append(string.Format("ctx.fillText('Clusters = ' + {0}, {1}, {2});{3}",
            clusterInfo, ToStringEN(MinX + 10), ToStringEN(MaxY + 20), NL));

        sb.Append(tail);
        CreateFile(sb.ToString());
        //Console.WriteLine(sb.ToString());
    }
    public static string ToStringEN(double d)
    {
        double rounded = Math.Round(d, 3);
        return rounded.ToString(S, Culture_enUS);
    }
    public static void Profile(string s)
    {
        if(!UseProfiling)
            return;        
        var timespend = DateTime.Now.Subtract(Starttime).TotalSeconds;
        Console.WriteLine(timespend + " sec. " + s);
    }//O(1)

    public class XY : IComparable
    {
        public double X { get; set; }
        public double Y { get; set; }
        public string Color { get; set; }
        public int Size { get; set; }
        public string XToString { get { return X.ToString(S, Culture_enUS); } }
        public string YToString { get { return Y.ToString(S, Culture_enUS); } }
        public XY(double x, double y)
        {
            X = x;
            Y = y;
            Color = "'rgb(0,0,0)'";//default
            Size = 1;
        }
        public XY(XY p) //clone
        {
            this.X = p.X;
            this.Y = p.Y;
            this.Color = p.Color;
            this.Size = p.Size;
        }

        public int CompareTo(object o) // if used in sorted list
        {
            if (this.Equals(o))
                return 0;

            var other = (XY)o;
            if (this.X > other.X)
                return -1;
            if (this.X < other.X)
                return 1;

            return 0;
        }
        public override int GetHashCode()
        {
            var x = X * 10000;
            var y = Y * 10000;
            var r = x * 17 + y * 37;
            return (int)r;
        }
        private const int ROUND = 6;
        public override bool Equals(Object o)
        {
            if (o == null)
                return false;
            var other = o as XY;
            if (other == null)
                return false;

            var x = Math.Round(this.X, ROUND) == Math.Round(other.X, ROUND);
            var y = Math.Round(this.Y, ROUND) == Math.Round(other.Y, ROUND);
            return x && y;
        }
    }
    public class Bucket
    {
        public string Id { get; private set; }
        public List<XY> Points { get; private set; }
        public XY Centroid { get; set; }
        public int Idx { get; private set; }
        public int Idy { get; private set; }
        public double ErrorLevel { get; set; } // clusterpoint and points avg dist
        private bool _IsUsed;
        public bool IsUsed
        {
            get { return _IsUsed && Centroid != null; }
            set{_IsUsed = value; }
        }
        public Bucket(string id)
        {
            IsUsed = true;
            Centroid = null;
            Points = new List<XY>();
            Id = id;
        }
        public Bucket(int idx, int idy)
        {
            IsUsed = true;
            Centroid = null;
            Points = new List<XY>();
            Idx = idx;
            Idy = idy;
            Id = GetId(idx, idy);
        }
    }

    public abstract class BaseClusterAlgorithm
    {
        public List<XY> BaseDataset; // all points
        //id, bucket
        public readonly Dictionary<string, Bucket> BaseBucketsLookup = 
            new Dictionary<string, Bucket>();
        public abstract List<XY> GetCluster();
        //O(k?? random fn can be slow)
        public static XY[] BaseGetRandomCentroids(List<XY> list, int k)
        {
            var set = new HashSet<XY>();
            int i = 0;
            var kcentroids = new XY[k];

            int MAX = list.Count;
            while (MAX >= k)
            {
                int index = _rand.Next(0, MAX - 1);
                var xy = list[index];
                if (set.Contains(xy))
                    continue;

                set.Add(xy);
                kcentroids[i++] = new XY(xy.X, xy.Y) { Color = GetRandomColor() };

                if (i >= k)
                    break;
            }

            Profile("BaseGetRandomCentroids");
            return kcentroids;
        }

        public List<XY> BaseGetClusterResult()//O(k*n)
        {
            var clusterPoints = new List<XY>();
            // collect used buckets and return the result

            if (DoUpdateAllCentroidsToNearestContainingPoint)
                BaseUpdateAllCentroidsToNearestContainingPoint();

            foreach (var item in BaseBucketsLookup)
            {
                var bucket = item.Value;
                if (bucket.IsUsed)
                    clusterPoints.Add(bucket.Centroid);
            }

            Profile("BaseGetClusterResult");
            return clusterPoints;
        }
        public static XY BaseGetCentroidFromCluster(List<XY> list) //O(n)
        {
            int count = list.Count;
            if (list == null || count == 0)
                return null;

            // color is set for the points and the cluster point here
            var color = GetRandomColor(); //O(1)
            XY centroid = new XY(0, 0) { Size = list.Count };//O(1)
            foreach (XY p in list)
            {
                p.Color = color;
                centroid.X += p.X;
                centroid.Y += p.Y;
            }
            centroid.X /= count;
            centroid.Y /= count;
            var cp = new XY(centroid.X, centroid.Y) { Size = count, Color = color };

            Profile("BaseGetCentroidFromCluster");
            return cp;
        }
        //O(k*n)
        public static void BaseSetCentroidForAllBuckets(IEnumerable<Bucket> buckets)
        {
            foreach (var item in buckets)
            {
                var bucketPoints = item.Points;
                var cp = BaseGetCentroidFromCluster(bucketPoints);
                item.Centroid = cp;
            }
            Profile("BaseSetCentroidForAllBuckets");
        }
        public double BaseGetTotalError()//O(k)
        {
            int centroidsUsed = BaseBucketsLookup.Values.Count(b => b.IsUsed);
            double sum = BaseBucketsLookup.Values.Where(b => b.IsUsed).Sum(b => b.ErrorLevel);
            return sum / centroidsUsed;
        }
        public string BaseGetMaxError() //O(k)
        {
            double maxError = -double.MaxValue;
            string id = string.Empty;
            foreach (var b in BaseBucketsLookup.Values)
            {
                if (!b.IsUsed || b.ErrorLevel <= maxError) 
                    continue;

                maxError = b.ErrorLevel;
                id = b.Id;
            }
            return id;
        }
         public XY BaseGetClosestPoint(XY from, List<XY> list) //O(n)
         {
             double min = double.MaxValue;
             XY closests = null;
             foreach (var p in list)
             {
                 var d = MathTool.Distance(from, p);
                 if (d >= min) 
                     continue;

                 // update
                 min = d;
                 closests = p;
             }
             return closests;
         }        
         public XY BaseGetLongestPoint(XY from, List<XY> list) //O(n)
         {
             double max = -double.MaxValue;
             XY longest = null;
             foreach (var p in list)
             {
                 var d = MathTool.Distance(from, p);
                 if (d <= max)
                     continue;

                 // update
                 max = d;
                 longest = p;
             }
             return longest;
         }

        // update centroid location to nearest point, 
        // e.g. if you want to show cluster point on a real existing point area
         //O(n)
         public void BaseUpdateCentroidToNearestContainingPoint(Bucket bucket) 
         {
             if(bucket==null || bucket.Centroid==null || 
                 bucket.Points==null||bucket.Points.Count==0)
                 return;

             var closest = BaseGetClosestPoint(bucket.Centroid, bucket.Points);
             bucket.Centroid.X = closest.X;
             bucket.Centroid.Y = closest.Y;
             Profile("BaseUpdateCentroidToNearestContainingPoint");
         }
         //O(k*n)
         public void BaseUpdateAllCentroidsToNearestContainingPoint() 
        {
             foreach (var bucket in BaseBucketsLookup.Values)             
                 BaseUpdateCentroidToNearestContainingPoint(bucket);
             Profile("BaseUpdateAllCentroidsToNearestContainingPoint");
        }
    }

    // O(exponential) ~ can be slow when n or k is big
    public class KMeans : BaseClusterAlgorithm
    {
        private const int _InitClusterSize = 1; // start from this cluster points

        // heuristic, set tolerance for cluster density, has effect on running time.
        // Set high for many points in dataset, can be lower for fewer points
        private const double MAX_ERROR = 30;             
            
        private const int _MaxIterations = 100; // cluster point optimization iterations
        private const int _MaxClusters = 100;

        public KMeans(List<XY> dataset)
        {
            if (dataset==null || dataset.Count==0)
                throw new ApplicationException(
                    string.Format("KMeans dataset is null or empty") );

            this.BaseDataset = dataset;            
        }

        public override List<XY> GetCluster()
        {
            var cluster = RunClusterAlgo();
            return cluster;
        }

        List<XY> RunClusterAlgo()
        {
            /*
            ITERATIVE LINEAR ADDING CLUSTER UNTIL 
            REPEATE iteration of clusters convergence 
             * until the max error is small enough
            if not, insert a new cluster at worst place 
            (farthest in region of worst cluster area) and run again 
            keeping current cluster points              
             
             // one iteration of clusters convergence is defined as ..
          1) Random k centroids
          2) Cluster data by euclidean distance to centroids
          3) Update centroids by clustered data,     
          4) Update cluster
          5) Continue last two steps until error is small, error is sum of diff
              between current and updated centroid                          
           */

            var clusterPoints = new List<XY>();
            // params invalid
            if (BaseDataset == null || BaseDataset.Count == 0)
                return clusterPoints;

            RunAlgo();         
            return BaseGetClusterResult();
        }

        void RunAlgo()
        {
            // Init clusters
            var centroids = BaseGetRandomCentroids(BaseDataset, _InitClusterSize);
            for (int i = 0; i < centroids.Length; i++)
            {
                var newbucket = new Bucket(i.ToString()) { Centroid = centroids[i] };
                BaseBucketsLookup.Add(i.ToString(), newbucket);
            }            

            //
            double currentMaxError = double.MaxValue;
            while (currentMaxError > MAX_ERROR && BaseBucketsLookup.Count < _MaxClusters)
            {
                RunIterationsUntilKClusterPlacementAreDone();                

                var id = BaseGetMaxError();
                var bucket = BaseBucketsLookup[id];
                currentMaxError = bucket.ErrorLevel; //update
                if (currentMaxError > MAX_ERROR)
                {                                        
// Here it is linear speed when putting one new centroid at a time
// should be semi-fast because the new point is inserted at best area 
//from current centroids view.
// possible improvement exists by log2 search by inserting multiple centroids and 
// reducing centroid again by using closest pair algo which can be O(nlogn) to 
// find closest cluster dist and analysing the dist length, if its to small, 
//then reduce centroids.

// put new centroid in area where maxError but farthest away from current centroid in area
                    var longest = BaseGetLongestPoint(bucket.Centroid, bucket.Points);
                    var newcentroid = new XY(longest);
                    var newid = BaseBucketsLookup.Count.ToString();
                    var newbucket = new Bucket(newid) { Centroid = newcentroid };
                    BaseBucketsLookup.Add(newid, newbucket);
                }
            }                     
        }

        void RunIterationsUntilKClusterPlacementAreDone()
        {
            double prevError = Double.MaxValue;
            double currError = Double.MaxValue;

            for (int i = 0; i < _MaxIterations; i++)
            {
                prevError = currError;
                currError = RunOneIteration();
                if (currError >= prevError) // no improvement then stop
                    break;
            }
            Profile("RunIterationsUntilKClusterPlacementAreDone");
        }

        double RunOneIteration() //O(k*n)
        {
            // update points, assign points to cluster
            UpdatePointsByCentroid();
            // update centroid pos by its points
            BaseSetCentroidForAllBuckets(BaseBucketsLookup.Values);//O(k*n)
            var clustersCount = BaseBucketsLookup.Count;
            for (int i = 0; i < clustersCount; i++)
            {
                var currentBucket = BaseBucketsLookup[i.ToString()];
                if (currentBucket.IsUsed == false)
                    continue;

                //update centroid                
                var newcontroid = BaseGetCentroidFromCluster(currentBucket.Points);
                //no need to update color, autoset
                currentBucket.Centroid = newcontroid; 
                currentBucket.ErrorLevel = 0;
                //update error                
                foreach (var p in currentBucket.Points)
                {
                    var dist = MathTool.Distance(newcontroid, p);
                    currentBucket.ErrorLevel += dist;
                }
                var val = currentBucket.ErrorLevel/currentBucket.Points.Count;
                currentBucket.ErrorLevel = val; //Math.Sqrt(val);
            }

            Profile("RunOneIteration");
            return BaseGetTotalError();
        }
        void UpdatePointsByCentroid()//O(n*k)
        {
            int count = BaseBucketsLookup.Count();
          
            // clear points in the buckets, they will be re-inserted
            foreach (var bucket in BaseBucketsLookup.Values)
                bucket.Points.Clear();

            foreach (XY p in BaseDataset)
            {
                double minDist = Double.MaxValue;
                int index = -1;
                for (int i = 0; i < count; i++)
                {
                    var bucket = BaseBucketsLookup[i.ToString()];
                    if (bucket.IsUsed == false)
                        continue;

                    var centroid = bucket.Centroid;
                    var dist = MathTool.Distance(p, centroid);
                    if (dist < minDist)
                    {
                        // update
                        minDist = dist;
                        index = i;
                    }
                }
                //update color for point to match centroid and re-insert
                var closestBucket = BaseBucketsLookup[index.ToString()];
                p.Color = closestBucket.Centroid.Color;
                closestBucket.Points.Add(p);
            }

            Profile("UpdatePointsByCentroid");
        }
    }

    public class GridBaseCluster : BaseClusterAlgorithm
    {
        public GridBaseCluster(List<XY> dataset)
        {
            BaseDataset = dataset;
        }

        public override List<XY> GetCluster()
        {
            var cluster = RunClusterAlgo(GridX, GridY);
            return cluster;
        }

        // O(k*n)
        void MergeClustersGrid()
        {
            foreach (var key in BaseBucketsLookup.Keys)
            {
                var bucket = BaseBucketsLookup[key];
                if (bucket.IsUsed == false)
                    continue;

                var x = bucket.Idx;
                var y = bucket.Idy;

                // get keys for neighbors
                var N = GetId(x, y + 1);
                var NE = GetId(x + 1, y + 1);
                var E = GetId(x + 1, y);
                var SE = GetId(x + 1, y - 1);
                var S = GetId(x, y - 1);
                var SW = GetId(x - 1, y - 1);
                var W = GetId(x - 1, y);
                var NW = GetId(x - 1, y - 1);
                var neighbors = new[] { N, NE, E, SE, S, SW, W, NW };

                MergeClustersGridHelper(key, neighbors);
            }
        }
        void MergeClustersGridHelper(string currentKey, string[] neighborKeys)
        {
            double minDistX = DeltaX / 2.0;//heuristic, higher value gives less merging
            double minDistY = DeltaY / 2.0;
            //if clusters in grid are too close to each other, merge them
            double minDist = MathTool.Max(minDistX, minDistY);

            foreach (var neighborKey in neighborKeys)
            {
                if (!BaseBucketsLookup.ContainsKey(neighborKey))
                    continue;
                
                var neighbor = BaseBucketsLookup[neighborKey];
                if (neighbor.IsUsed == false)
                    continue;

                var current = BaseBucketsLookup[currentKey];
                var dist = MathTool.Distance(current.Centroid, neighbor.Centroid);
                if (dist > minDist)
                    continue;

                // merge
                var color = current.Centroid.Color;
                foreach (var p in neighbor.Points)
                {
                    //update color
                    p.Color = color;
                }

                current.Points.AddRange(neighbor.Points);//O(n)

                // recalc centroid
                var cp = BaseGetCentroidFromCluster(current.Points);
                current.Centroid = cp;
                neighbor.IsUsed = false; //merged, then not used anymore
            }
        }

        public List<XY> RunClusterAlgo(int gridx, int gridy)
        {
            var clusterPoints = new List<XY>();
            // params invalid
            if (BaseDataset == null || BaseDataset.Count == 0 || 
                gridx <= 0 || gridy <= 0)
                return clusterPoints;

            // put points in buckets
            foreach (var p in BaseDataset)
            {
                // find relative val
                var relativeX = p.X - MinX;
                var relativeY = p.Y - MinY;

                int idx = (int)(relativeX / DeltaX);
                int idy = (int)(relativeY / DeltaY);

                // bucket id
                string id = GetId(idx, idy);

                // bucket exists, add point
                if (BaseBucketsLookup.ContainsKey(id))
                    BaseBucketsLookup[id].Points.Add(p);

                // new bucket, create and add point
                else
                {
                    Bucket bucket = new Bucket(idx, idy);
                    bucket.Points.Add(p);
                    BaseBucketsLookup.Add(id, bucket);
                }
            }

            // calc centrod for all buckets
            BaseSetCentroidForAllBuckets(BaseBucketsLookup.Values);

            // merge if gridpoint is to close
            MergeClustersGrid();

            return BaseGetClusterResult();
        }
    }

    public class MathTool
    {
        private const double Exp = 2; // 2=euclid, 1=manhatten
        //Minkowski dist
        public static double Distance(Algorithm.XY a, Algorithm.XY b)
        {
            return Math.Pow(Math.Pow(Math.Abs(a.X - b.X), Exp) +
                Math.Pow(Math.Abs(a.Y - b.Y), Exp), 1.0 / Exp);
        }

        public static double Min(double a, double b)
        {
            return a <= b ? a : b;
        }
        public static double Max(double a, double b)
        {
            return a >= b ? a : b;
        }
    }

    public static class FileUtil
    {
        public static bool WriteFile(string data, FileInfo fileInfo)
        {
            bool isSuccess = false;
            try
            {
                using (StreamWriter streamWriter = 
                    File.CreateText(fileInfo.FullName))
                {
                    streamWriter.Write(data);
                    isSuccess = true;
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.StackTrace + "\nPress a key ... ");
                Console.ReadKey();
            }
            return isSuccess;
        }
    }
}

Put both the html and javascript file in the c:\temp folder.
The C# program creates a new draw.js file in the c:\temp folder.

Open the html file with a modern browser that support canvas, like Firefox or Chrome.

canvas.html

<html>
<head>
    <title>Cluster test</title>
    <script type="text/javascript">
        var G = {
            yellow: "#FF0",
            red: "#F00",
            green: "#0F0",
            blue: "#00F",
            black: "#000"
        }       

        window.onload = function draw() {
            var canvas = document.getElementById('canvas');
            if (canvas.getContext) {
                var ctx = canvas.getContext('2d');
                ctx.strokeStyle = G.black;
                ctx.font = "10pt Arial";
                drawMarkers(ctx); //use fn defined in draw.js
            }
        }

        function drawMark(cx, cy, color, ctx) {
            //draw a circle
            var radius = 4;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.fillStyle = color;
            ctx.fill();
            ctx.stroke();
        }

        function drawCluster(cx, cy, color, n, ctx) {
            //draw a bigger circle
            var radius = 12;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.stroke();
            ctx.fillStyle = color;
            ctx.fillText("" + n, cx-8, cy+2);
        }
    </script>
    <script type="text/javascript" src="draw.js"></script> 

    <style type="text/css">
        body{ margin-left: 10px; margin-top:10px;}
        canvas  {  border: 1px solid red; }
    </style>
</head>
<body>
    <canvas id="canvas" width="410" height="330">
        <p>Your browser doesn't support canvas.</p>
    </canvas>
</body>
</html>

The javascript file will be overwritten when you run the C# code.
draw.js

function drawMarkers(ctx) {
    drawMark(161.089, 215.198, 'rgb(82,128,16)', ctx);
    drawMark(283.556, 12.712, 'rgb(92,49,66)', ctx);
    drawMark(98.915, 37.67, 'rgb(181,136,150)', ctx);
    drawMark(318.432, 102.905, 'rgb(92,49,66)', ctx);
    drawMark(378.091, 238.713, 'rgb(182,83,25)', ctx);
    drawMark(318.736, 80.199, 'rgb(92,49,66)', ctx);
    drawMark(122.702, 252.349, 'rgb(82,128,16)', ctx);
    drawMark(173.836, 44.669, 'rgb(181,136,150)', ctx);
    drawMark(141.157, 172.273, 'rgb(82,128,16)', ctx);
    drawMark(102.749, 56.737, 'rgb(181,136,150)', ctx);
    drawCluster(306.908, 65.272, 'rgb(92,49,66)', 3, ctx);
    drawCluster(378.091, 238.713, 'rgb(182,83,25)', 1, ctx);
    drawCluster(141.649, 213.273, 'rgb(82,128,16)', 3, ctx);
    drawCluster(125.167, 46.359, 'rgb(181,136,150)', 3, ctx);
    ctx.fillStyle = 'rgb(0,0,0)';
    ctx.fillText('Clusters = ' + 4, 20, 320);
}

Written by kunuk Nykjaer

September 17, 2011 at 9:35 am

Clustering – Grid cluster with C# and html-canvas example (part 1)

with 3 comments

Grid clustering With merging supported if the centroid of the grid clusters are too close to each other.
The second post with updated code can be seen here.
https://kunuk.wordpress.com/2011/09/17/clustering-k-means-and-grid-with-c-example-and-html-canvas-part-2

Sometimes when you want to visualize points, there can be to many points and they overlap.
To avoid this problem you can use different clustering methods and group points which are close to each. A cluster contains information such as the number of points. The idea is to apply interaction method with a click-event for the cluster where you will zoom in. The view gets expanded and the points stop overlapping.

Here I look at a grid version of clustering. You divide the screen into grid sections and put the points in a grid cluster. Then you apply a centroid calculation to put the location of the cluster point.

In this example I will visualize both the points and the cluster point. Normally you would only visualize the cluster points.
For the visualization I will use html and canvas. The clustering algorithm is implemented in C#. The C# program generates a draw.js file with point and cluster points drawing information.

The running time is something like O(m*n) where m is number of grids and n is number of points.
This clustering visualization works good enough but is not always perfect. Perfect would be always putting points to the closest cluster point.
I will compare this with K-means at a later time.

Here’s an example of how the points are clustered in grids.

The cluster points has a number which tells how many points it contains. Notice the bottom cluster with size 8. It has merged with it’s neighbor cluster where you can see the points are spread in two grid area.

The code examples will generate an image similar to above when run out of the box.

Here are the source codes.

Program.cs

using System;

// By Kunuk Nykjaer
class Program
{
    static void Main(string[] args)
    {
        var algo = new Algorithm();
        algo.Run("gridcluster");
        algo.GenerateDataString();

        Console.WriteLine("press a key ...");
        //Console.ReadKey();
    }
}

Algorithm.cs

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;

// By Kunuk Nykjaer
public class Algorithm
{
    static readonly CultureInfo Culture_enUS = new CultureInfo("en-US");
    const string S = "G";

    readonly List<XY> _points = new List<XY>();
    List<XY> _clusters = new List<XY>();

    public static readonly Random _rand = new Random();
    public const double MinX = 10;
    public const double MinY = 10;
    public const double MaxX = 400;
    public const double MaxY = 300;
    public const int GridX = 4;
    public const int GridY = 3;

    const double AbsSizeX = MaxX - MinX;
    const double AbsSizeY = MaxY - MinY;

    // for bucket placement calc
    const double DeltaX = AbsSizeX / GridX;
    const double DeltaY = AbsSizeY / GridY;

    static private readonly string NL = Environment.NewLine;
    private const string ctx = "ctx";

    public Algorithm()
    {
        CreateDataSet();
    }

    void CreateDataSet()
    {
        // CREATE DATA SET

        // Create random spread points
        //for (int i = 0; i < 100; i++)
        //{        
        //    var x = MinX + rand.NextDouble() * (MaxX);
        //    var y = MinY + rand.NextDouble() * (MaxY);
        //    _points.Add(new XY(x, y));
        //}

        // Create random region of clusters
        for (int i = 0; i < 10; i++)
        {
            var x = MinX + _rand.NextDouble() * 100;
            var y = MinY + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }

        for (int i = 0; i < 10; i++)
        {
            var x = MinX + 200 + _rand.NextDouble() * 100;
            var y = MinY + 10 + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }

        for (int i = 0; i < 10; i++)
        {
            var x = MinX + 50 + _rand.NextDouble() * 100;
            var y = MinY + 150 + _rand.NextDouble() * 100;
            _points.Add(new XY(x, y));
        }
    }

    public void Run(string cluster)
    {
        if (cluster == "gridcluster")
        {
            GridCluster grid = new GridCluster(_points);
            _clusters = grid.GetCluster();
        }
    }

    public static string GetRandomColor()
    {
        int r = _rand.Next(10, 240);
        int g = _rand.Next(10, 240);
        int b = _rand.Next(10, 240);
        return string.Format("'rgb({0},{1},{2})'", r, g, b);
    }

    public static void SetCentroidForAllBuckets(IEnumerable<Bucket> buckets)
    {
        foreach (var item in buckets)
        {
            var bucketPoints = item.Points;
            var cp = GetCentroidFromCluster(bucketPoints);
            item.Centroid = cp;
        }
    }

    public static string GetId(int idx, int idy)
    {
        return idx + ";" + idy;
    }

    static void CreateFile(string s)
    {       
        var path = new FileInfo(@"C:\temp\draw.js");
        var isCreated = FileUtil.WriteFile(s, path);
        Console.WriteLine(isCreated + " = File is Created");
    }

    static XY GetCentroidFromCluster(List<XY> list)
    {
        var color = GetRandomColor();

        if (list.Count == 0)
            return null;

        XY centroid = new XY(0, 0);
        foreach (XY p in list)
        {
            p.Color = color;
            centroid.X += p.X;
            centroid.Y += p.Y;
        }

        int count = list.Count();
        centroid.X /= count;
        centroid.Y /= count;

        var cp = new XY(centroid.X, centroid.Y) { Size = count, Color = color };
        return cp;
    }

    public void GenerateDataString()
    {
        var sb = new StringBuilder();

        // markers
        var head = NL + "function drawMarkers(ctx) {" + NL;
        var tail = NL + "}" + NL;

        sb.Append(head);

        // grid
        sb.Append("ctx.beginPath();" + NL);
        for (int i = 0; i < GridX + 1; i++)
        {
            sb.Append(String.Format("ctx.moveTo({0}, {1});{2}", 
                ToStringEN(MinX + i * DeltaX), ToStringEN(MinY), NL));
            sb.Append(String.Format("ctx.lineTo({0}, {1});{2}", 
                ToStringEN(MinX + i * DeltaX), ToStringEN(MaxY), NL));
        }
        for (int j = 0; j < GridY + 1; j++)
        {
            sb.Append(String.Format("ctx.moveTo({0}, {1});{2}", 
                ToStringEN(MinX), ToStringEN(MinY + j * DeltaY), NL));
            sb.Append(String.Format("ctx.lineTo({0}, {1});{2}", 
                ToStringEN(MaxX), ToStringEN(MinY + j * DeltaY), NL));
        }
        sb.Append("ctx.stroke(); " + NL);


        // markers
        if (_points != null)
            foreach (var p in _points)
                sb.Append(string.Format("drawMark({0}, {1}, {2}, {3});{4}", 
                    ToStringEN(p.X), ToStringEN(p.Y), p.Color, ctx, NL));


        string clusterInfo = "0";
        if (_clusters != null)
        {
            foreach (var c in _clusters)
                sb.Append(string.Format("drawCluster({0}, {1}, {2}, {3}, {4});{5}", 
                    ToStringEN(c.X), ToStringEN(c.Y), c.Color,
                                        c.Size, ctx, NL));

            clusterInfo = _clusters.Count + string.Empty;
        }
        // bottom text                


        sb.Append("ctx.fillStyle = 'rgb(0,0,0)';" + NL);
        sb.Append(string.Format("ctx.fillText('Clusters = ' + {0}, {1}, {2});{3}", 
            clusterInfo, ToStringEN(MinX + 10), ToStringEN(MaxY + 16), NL));

        sb.Append(tail);

        CreateFile(sb.ToString());
        //Console.WriteLine(sb.ToString());
    }

    public static string ToStringEN(double d)
    {
        double rounded = Math.Round(d, 3);
         return rounded.ToString(S, Culture_enUS); 
    }

    public class XY
    {
        public double X { get; set; }
        public double Y { get; set; }
        public string Color { get; set; }
        public int Size { get; set; }
        public string XToString { get { return X.ToString(S, Culture_enUS); } }
        public string YToString { get { return Y.ToString(S, Culture_enUS); } }
        public XY(double x, double y)
        {
            X = x;
            Y = y;
            Color = "'rgb(0,0,0)'";//default
            Size = 1;
        }
    }
    public class Bucket
    {
        public string Id { get; private set; }
        public List<XY> Points { get; private set; }
        public XY Centroid { get; set; }
        public int Idx { get; private set; }
        public int Idy { get; private set; }
        private bool _IsUsed;
        public bool IsUsed
        {
            get { return _IsUsed && Centroid != null; }
            set { _IsUsed = value; }
        }

        public Bucket(int idx, int idy)
        {
            IsUsed = true;
            Centroid = null;
            Points = new List<XY>();
            Idx = idx;
            Idy = idy;
            Id = GetId(idx, idy);
        }
    }

    public abstract class ClusterAlgorithm
    {
        //id, bucket
        public readonly Dictionary<string, Bucket> bucketsLookup = new Dictionary<string, Bucket>(); 
        public abstract List<XY> GetCluster();
    }
    public class KMeans : ClusterAlgorithm
    {
        //NOT IMPLE YET
        public override List<XY> GetCluster()
        {
            return null;
        }
        /* 
        1) Random k centroids
        2) Cluster data by euclidean distance to centroids
        3) Update centroids by clustered data, 
             http://en.wikipedia.org/wiki/Centroid#Of_a_finite_set_of_points
        4) Update cluster
        5) Continue last two steps until error is small, error is sum of diff 
            between current and updated centroid         
         */
    }

    public class GridCluster : ClusterAlgorithm
    {
        private readonly List<XY> _dataset;
        public GridCluster(List<XY> dataset)
        {
            _dataset = dataset;
        }

        public override List<XY> GetCluster()
        {
            var cluster = GridClusterAlgo(GridX, GridY);
            return cluster;
        }


        void MergeClustersGrid()
        {
            foreach (var key in bucketsLookup.Keys)
            {
                var bucket = bucketsLookup[key];
                if (bucket.IsUsed == false)
                    continue;

                var x = bucket.Idx;
                var y = bucket.Idy;

                // get keys for neighbors
                var N = GetId(x, y + 1);
                var NE = GetId(x + 1, y + 1);
                var E = GetId(x + 1, y);
                var SE = GetId(x + 1, y - 1);
                var S = GetId(x, y - 1);
                var SW = GetId(x - 1, y - 1);
                var W = GetId(x - 1, y);
                var NW = GetId(x - 1, y - 1);
                var neighbors = new[] { N, NE, E, SE, S, SW, W, NW };

                MergeClustersGridHelper(key, neighbors);
            }
        }

        void MergeClustersGridHelper(string currentKey, string[] neighborKeys)
        {
            double minDistX = DeltaX / 2.0;//heuristic
            double minDistY = DeltaY / 2.0;
            //if clusters in grid are too close to each other, merge them
            double minDist = MathTool.Min(minDistX, minDistY); 

            foreach (var neighborKey in neighborKeys)
            {
                if (!bucketsLookup.ContainsKey(neighborKey))
                    continue;

                var current = bucketsLookup[currentKey];
                var neighbor = bucketsLookup[neighborKey];
                if (neighbor.IsUsed == false)
                    continue;

                var dist = MathTool.Distance(current.Centroid, neighbor.Centroid);
                if (dist >= minDist)
                    continue;

                // merge
                var color = current.Centroid.Color;
                foreach (var p in neighbor.Points)
                {
                    //update color
                    p.Color = color;
                }

                current.Points.AddRange(neighbor.Points);

                // recalc centroid
                var cp = GetCentroidFromCluster(current.Points);
                current.Centroid = cp;
                neighbor.IsUsed = false; //merged, then not used anymore
            }
        }


        public List<XY> GridClusterAlgo(int gridx, int gridy)
        {
            var clusterPoints = new List<XY>();
            // params invalid
            if (_dataset == null || _dataset.Count == 0 || gridx <= 0 || gridy <= 0) 
                return clusterPoints;

            // put points in buckets
            foreach (var p in _dataset)
            {
                // find relative val
                var relativeX = p.X - MinX;
                var relativeY = p.Y - MinY;

                int idx = (int)(relativeX / DeltaX);
                int idy = (int)(relativeY / DeltaY);

                // bucket id
                string id = GetId(idx, idy);

                // bucket exists, add point
                if (bucketsLookup.ContainsKey(id))
                    bucketsLookup[id].Points.Add(p);

                // new bucket, create and add point
                else
                {
                    Bucket bucket = new Bucket(idx, idy);
                    bucket.Points.Add(p);
                    bucketsLookup.Add(id, bucket);
                }
            }

            // calc centrod for all buckets
            SetCentroidForAllBuckets(bucketsLookup.Values);

            // merge if gridpoint is to close
            MergeClustersGrid();

            // collect buckets non-empty
            foreach (var item in bucketsLookup)
            {
                var bucket = item.Value;
                if (bucket.IsUsed)
                    clusterPoints.Add(bucket.Centroid);
            }
            return clusterPoints;
        }
    }


    public class MathTool
    {
        private const double Exp = 2; // 2=euclid, 1=manhatten
        //Minkowski dist
        public static double Distance(Algorithm.XY a, Algorithm.XY b) 
        {
            return Math.Pow(Math.Pow(Math.Abs(a.X - b.X), Exp) + 
                Math.Pow(Math.Abs(a.Y - b.Y), Exp), 1.0 / Exp);
        }

        public static double Min(double a, double b)
        {
            return a <= b ? a : b;
        }
    }

    public static class FileUtil
    {
        public static bool WriteFile(string data, FileInfo fileInfo)
        {
            bool isSuccess = false;
            try
            {
                using (StreamWriter streamWriter = File.CreateText(fileInfo.FullName))
                {
                    streamWriter.Write(data);
                    isSuccess = true;
                }
            }
            catch (Exception ex) { Console.WriteLine(ex.StackTrace+"\nPress a key ... ");
                Console.ReadKey(); }

            return isSuccess;
        }
    }
}

Put both the html and javascript file in the c:\temp folder.
The C# program creates a new draw.js file in the c:\temp folder.

Open the html file with a modern browser that support canvas, like Firefox or Chrome.
The html file will display the same as the example image above out of the box.

canvas.html

<html>
<head>
    <title>Cluster test</title>            
    <script type="text/javascript">
        var G = {
            yellow: "#FF0",
            red: "#F00",
            green: "#0F0",
            blue: "#00F",
            black: "#000"
        }       

        window.onload = function draw() {
            var canvas = document.getElementById('canvas');
            if (canvas.getContext) {
                var ctx = canvas.getContext('2d');
                ctx.strokeStyle = G.black;
                ctx.font = "10pt Arial";
                drawMarkers(ctx); //use fn defined in draw.js
            }
        }

        function drawMark(cx, cy, color, ctx) {
            //draw a circle
            var radius = 4;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();
            ctx.fillStyle = color;
            ctx.fill();
            ctx.stroke();
        }

        function drawCluster(cx, cy, color, n, ctx) {
            //draw a bigger circle            
            var radius = 12;
            ctx.beginPath();
            ctx.arc(cx, cy, radius, 0, Math.PI * 2, true);
            ctx.closePath();            
            ctx.stroke();
            ctx.fillStyle = color;
            ctx.fillText("" + n, cx-8, cy+2);
        }	  
    </script>
    <script type="text/javascript" src="draw.js"></script> 

    <style type="text/css">
        body{ margin-left: 10px; margin-top:10px;}
        canvas  {  border: 1px solid red; }        
    </style>
</head>
<body>
    <canvas id="canvas" width="410" height="330">
        <p>Your browser doesn't support canvas.</p>
    </canvas>
</body>
</html>

The javascript file will be overwritten when you run the C# code.

draw.js

function drawMarkers(ctx) {
    // By Kunuk Nykjaer
    ctx.beginPath();
    ctx.moveTo(10, 10);
    ctx.lineTo(10, 300);
    ctx.moveTo(107.5, 10);
    ctx.lineTo(107.5, 300);
    ctx.moveTo(205, 10);
    ctx.lineTo(205, 300);
    ctx.moveTo(302.5, 10);
    ctx.lineTo(302.5, 300);
    ctx.moveTo(400, 10);
    ctx.lineTo(400, 300);
    ctx.moveTo(10, 10);
    ctx.lineTo(400, 10);
    ctx.moveTo(10, 106.667);
    ctx.lineTo(400, 106.667);
    ctx.moveTo(10, 203.333);
    ctx.lineTo(400, 203.333);
    ctx.moveTo(10, 300);
    ctx.lineTo(400, 300);
    ctx.stroke();
    drawMark(24.566, 48.809, 'rgb(197,120,199)', ctx);
    drawMark(63.55, 22.318, 'rgb(197,120,199)', ctx);
    drawMark(36.17, 66.005, 'rgb(197,120,199)', ctx);
    drawMark(30.083, 23.226, 'rgb(197,120,199)', ctx);
    drawMark(96.758, 53.378, 'rgb(197,120,199)', ctx);
    drawMark(97.994, 13.421, 'rgb(197,120,199)', ctx);
    drawMark(91.573, 15.839, 'rgb(197,120,199)', ctx);
    drawMark(59.451, 23.006, 'rgb(197,120,199)', ctx);
    drawMark(63.699, 81.094, 'rgb(197,120,199)', ctx);
    drawMark(21.665, 15.711, 'rgb(197,120,199)', ctx);
    drawMark(298.417, 77.83, 'rgb(11,24,61)', ctx);
    drawMark(291.38, 86.759, 'rgb(11,24,61)', ctx);
    drawMark(281.345, 95.517, 'rgb(11,24,61)', ctx);
    drawMark(284.307, 92.13, 'rgb(11,24,61)', ctx);
    drawMark(240.746, 106.23, 'rgb(11,24,61)', ctx);
    drawMark(247.004, 85.546, 'rgb(11,24,61)', ctx);
    drawMark(291.715, 33.158, 'rgb(11,24,61)', ctx);
    drawMark(216.012, 33.45, 'rgb(11,24,61)', ctx);
    drawMark(260.559, 46.371, 'rgb(11,24,61)', ctx);
    drawMark(239.061, 103.01, 'rgb(11,24,61)', ctx);
    drawMark(67.791, 210.702, 'rgb(111,163,157)', ctx);
    drawMark(146.852, 211.327, 'rgb(111,163,157)', ctx);
    drawMark(82.629, 164.594, 'rgb(106,75,235)', ctx);
    drawMark(71.372, 244.351, 'rgb(111,163,157)', ctx);
    drawMark(110.755, 255.688, 'rgb(111,163,157)', ctx);
    drawMark(144.035, 185.499, 'rgb(96,173,79)', ctx);
    drawMark(116.13, 217.092, 'rgb(111,163,157)', ctx);
    drawMark(112.972, 216.736, 'rgb(111,163,157)', ctx);
    drawMark(117.428, 246.791, 'rgb(111,163,157)', ctx);
    drawMark(100.972, 210.653, 'rgb(111,163,157)', ctx);
    drawCluster(58.551, 36.281, 'rgb(197,120,199)', 10, ctx);
    drawCluster(265.055, 76, 'rgb(11,24,61)', 10, ctx);
    drawCluster(105.534, 226.668, 'rgb(111,163,157)', 8, ctx);
    drawCluster(82.629, 164.594, 'rgb(106,75,235)', 1, ctx);
    drawCluster(144.035, 185.499, 'rgb(96,173,79)', 1, ctx);
    ctx.fillStyle = 'rgb(0,0,0)';
    ctx.fillText('Clusters = ' + 5, 20, 316);
}

Written by kunuk Nykjaer

September 15, 2011 at 11:24 pm