Archive for the ‘Visualization’ Category
Favorite material design button with or without JS
I am having fun creating things at Codepen.io.
I have created a toggle button with JS and without using JS.
For the non-JS version I used the radio-button hack technique with stacked radio buttons.
Without JS
With JS
Being creative – Making logos with css
My latest hobby is to find logos as images and re-create them with css and responsive design.
I have added mouse over effects to make the logos dynamic.
You can resize the width of the browser window and you should see the logos adapts to the screen width.
Build for modern browsers, tested with latest Chrome and Firefox
(I don’t want to spend too much time with browser-testing in my leisure time).
Here are some of the results.
Lego Kaizen at work
I have been working a couple of years now using agile methodologies (Scrum).
I started in a new team recently where they use the Scrum methodology.
The team works in the spirit of agile mindset and is constantly finding small improvements. Besides the daily scrum tasks we have the flexibility to take upon Kaizen tasks with the purpose of improving the daily workflow which are not necessarily related to the user stories.
The next thing recently just introduced are the ability to flag your current mode at your desk.
- Green – Available if you have questions or want help
- Yellow – Only contact me if it is important
- Red – Busy at the moment
You take a lego piece and display your current mode.
With this hopefully the team can be more effective where communications is directed at those available and the busy ones have ability to flag that they want to be left alone to complete their tasks without being disturbed.
K-nearest neighbor in 2D dimension space
K-nearest neighbor
There are miscellaneous algorithms for searching nearest neighbors.
An alternative method is to use grid indexing strategy.
Slowly expand the grid boxes from the center to find the k-nearest neighbors.
The grid is used as a filtering mechanism to reduce the search space.
This works fast for distributed data across the space and is a good alternative algorithm for dynamic data that changes position frequently.
Reference links:
Project implemented in C# is available at Github:
A survey of techniques for fixed radius near neighbor searching:
The origin is in ring 0.
The origin can anywhere in the box and the distance to test on first iteration must be at least 1 x grid distance. The first iteration starts with data from ring 0 and 1.
The algorithm goes like this:
i = 0 Point origin list currRing = empty list nextRing = empty list temp while all rings not explored i = i + 1 temp = empty for all point in nextRing test distance between origin and point if distance is within i * grid put it in currRing else put it in temp nextRing = empty Add all from temp to nextRing temp = empty if(i==1) temp = points from ring 0 and ring 1 else temp = points from ring i For all point in temp test distance between origin and point if distance is within i * grid put it in currRing else put it in nextRing If there are at least k points in currRing then exit while loop end while loop if currRing count is < k then add all from nextRing to currRing sort currRing by distance take first k from currRing
Grid version algorithm:
Apply algorithm described above.
Searching: O(n * m)
where m is grid cells and k << n.
Naive version algorithm:
For origin point check the distance to every other point.
Take k first items sorted by distance.
Searching: O(n log n)
Single detection in 2D dimension
Single detection
Here I define single detection as data which has a certain minimum distance to every other data.
It can be used for anomaly detection or outlier detection of data set. It can also be used for collision detection.
Reference links:
Project implemented in C# is available at Github:
A survey of techniques for fixed radius near neighbor searching:
Given a list of points in 2D dimension how can you find the ones which are not close to the other points?
I will include my Big O notation calculations and some test runs of the implementations.
There is a naive implementation in O(n2)
.
There are also better alternatives which runs faster than the naive version.
First generate a grid of squares like this and put all your data in the grid according to their positions in the dimension.
In the figure above there is a single item illustrated with a red dot inside the blue box.
For anomaly detection iterate the grid boxes and find those which has a single item inside a box. Those items are candidate as an anomaly item. If there are multiple items inside the blue box then it cannot be a candidate as an anomaly item because of the distance constraint. The candidate can be anywhere inside the box, thus the outer circle shows which areas that must be examined. If there is no other item inside the outer circle, then we know the candidate is an anomaly. On worst case 20 boxes must be examined as illustrated above.
Euclidean distance is used.
The relation between the max distance and grid square distance can be calculated using Pythagoras.
Max Distance = c
Grid box distance = a
c2 = a2 + a2 => a = sqrt( (c2) / 2)
Naive version algorithm:
For all the points check the distance to every other point.
Searching: O(n2)
Grid version algorithm:
Insert all the items in the grid buckets.
Take the neighbor items of the candidate item’s grid. You will select items from 20 boxes as a worst case. Then iterate all the items and test the distance. If all the distance are above the maximum allowed then the candidate is detected as an anomaly.
This algorithm is much better than the naive version, runs very fast and is relatively simple to implement. Each grid box is a bucket of items. Use a Hash table for each bucket.
Inserting the items into the grids take O(n)
. Removing and inserting an item takes O(1)
.
Searching for anomalies take O(m * n)
.
If there are A anomalies in the data set and m grids this algorithm will run in:
Initialization: O(n + m)
+ Searching: O(m * A * n/A)
=> O(m * n)
K-d tree with nearest neighbor algorithm:
Insert all the items in the grid buckets takes O(n).
Do also apply a K-d tree data structure. From the candidate item’s in the box apply the first nearest neighbor algorithm. If the distance is above maximum allowed then it is detected as an anomaly.
It take O(n log n)
time to generate the K-d tree. The nearest neighbor search takes
O(log n)
.
Removing and inserting an item takes O(log n)
. If there are A anomalies in the data set and m grids then searching for anomalies take O(m * A * log n)
.
This algorithm will run in:
initialization: O(m + n * log n)
+ Searching: O(m * A * log n)
Test results
With my test runs of items randomly distributed data the grid version performs best. The naive version is very slow. The K-d Tree version is much faster than the naive version.
I found this C# K-d Tree implementation with nearest neighbor search which I used for testing the algorithm.
Sample picture:
The test runs are the time to detect the anomalies, the searching running time.
The data structure initialization is not included.
For n = 1.000.000.
K-d Tree it took about 30 seconds.
Grid and Naive about 2 seconds.
Random distributed, n = 100 Algorithm: Milli seconds: Single detection found: NeighborGrid 12 40 K-d Tree 128 40 Naive 2 40
Random distributed, n = 10.000 Algorithm: Milli seconds: Single detection found: NeighborGrid 141 8384 K-d Tree 2543 8384 Naive 36421 8384
Random distributed, n = 20.000 Grid: 4286 x 3572 MaxDistance: 0.2 Algorithm: Milli seconds: Single detection found: NeighborGrid 1434 18682 K-d Tree 5878 18682 Naive 129401 18682
Random distributed, n = 30.000 Grid: 1705 x 1421 MaxDistance: 0.5 Algorithm: Milli seconds: Single detection found: NeighborGrid 521 27100 K-d Tree 7936 27100 Naive slow
grid size is smaller thus the NeighborGrid runs fast.
Random distributed, n = 50.000 Grid: 4286 x 3572 MaxDistance: 0.2 Algorithm: Milli seconds: Single detection found: NeighborGrid 1472 42268 K-d Tree 13361 42268 Naive very slow
Random distributed, n = 1.000.000 Grid: 1705 x 1421 MaxDistance: 0.5 Algorithm: Milli seconds: Single detection found: NeighborGrid 983 35824 K-d Tree 23966 35824 Naive forever
Conclusion
These are just sample runs from random distributed data.
If the number of anomalies were very high and the data set were different then the K-d Tree algorithm might run faster. I would guess that the data set would have to be very specific and not likely to be a common scenario. I have given my best estimated Big O running time analysis for the algorithms.
The algorithm runs fast because of the single item grid box test. First they iterate the grid and skip every grid box except for those who has a single item inside. If you expect few anomalies for large data set then this would run very fast because most grid boxes would be skipped for further examination. Thus the running time for anomaly search is closer to O(m)
for A << n
.
If you are working with dynamic data where positions changes, then the Grid algorithm will run faster because of the O(1)
operations. With K-d tree you will have to rebuild the tree occasionally and delete/update takes O(log n)
.
Sample picture for n = 100
Percent distribution of numbers with C#
You have a list of n integers and you want to display the percentage distribution of the values.
The requirement:
You must round the numbers such the percent sum is 100 and the percent values must be integer values.
The percent values must be fair distributed according to the values.
How would you do it?
If you calculate the distribution and round the numbers the sum might not be equal to 100.
Here is one way to do it.
program.cs (algorithm)
using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Text; class Program { const string DrawFilePath = @"C:\temp\"; static void Main(string[] args) { // Insert test data here var list = new List<int> { 30, 70, 10, 90 }; var result = GetValuesMapped(list); GenerateJavascriptDrawFile(result); Console.WriteLine("Open canvas.html to watch the result\npress a key to exit .."); Console.ReadKey(); } static List<Data> GetValuesMapped(List<int> list) { const int MAPTO = 100; // 100 % const int min = 0; const int max = 10000; var mapped = new List<MapData>(); var noresult = new Data[list.Count].ToList(); double sumList = list.Sum(); // No value if (sumList <= 0) { return noresult; } // Sanitize data for (int i = 0; i < list.Count; i++) { if (list[i] < min) { list[i] = min; } if (list[i] > max) { list[i] = max; } } // Map data int id = 0; foreach (var i in list) { var percent = Map(i, 0, sumList, 0, MAPTO); mapped.Add(new MapData { Id = id++, Modulus = percent - Math.Truncate(percent), Divisor = (int)percent }); } // Adjustment is needed // Iterate increment by 1 until sum is correct // sort by decimal values var sorted = mapped.OrderByDescending(i => i.Modulus).ToList(); int sumFloor = mapped.Sum(i => i.Divisor); int addsNeeded = MAPTO - sumFloor; // Add need values until for (int i = 0; i < addsNeeded; i++) { sorted[i % sorted.Count].Divisor++; } // Sort back by id var ordered = sorted.OrderBy(i => i.Id).ToList(); // Populate data var result = list.Select((t, i) => new Data { Value = t, Percent = ordered[i].Divisor }).ToList(); return result; } internal class MapData { public int Id { get; set; } public double Modulus { get; set; } public int Divisor { get; set; } } internal class Data { public int Value { get; set; } public int Percent { get; set; } } /// <summary> /// Value x in range [a;b] is mapped to a new value in range [c;d] /// </summary> static double Map(double x, double a, double b, double c, double d) { var r = (x - a) / (b - a) * (d - c) + c; return r; } internal static class FileUtil { public static void WriteFile(string data, FileInfo fileInfo) { try { using (StreamWriter streamWriter = File.CreateText(fileInfo.FullName)) { streamWriter.Write(data); } } catch { throw; } } } // Create canvas data in draw.js file static void GenerateJavascriptDrawFile(List<Data> list) { var sb = new StringBuilder("function drawData(ctx) {\n"); const int xbeg = 30; // start x coord const int ybeg = 100; const int lenMultiply = 5; // percentage length display const int yline = 40; // next line offset const int txtX = 50; // text display offset const int txtY = 4; const int colorMin = 20; // min color range const int colorMax = 200; const int topY = -70; const string textCount = "count"; const string textPercent = "%"; if (list != null && list.Count > 0) { // draw top sb.Append("\t// top\n"); sb.AppendFormat("\tctx.fillText('{0}{1}', {2}, {3});\n", 100, textPercent, xbeg, topY + ybeg - txtY); sb.AppendFormat("\tdrawLine({0}, {1}, 'rgb(0,0,0)', {2}, ctx);\n", xbeg, ybeg + topY, (100 * lenMultiply) + 1); // draw lines var rand = new Random(); sb.Append("\n\t// lines\n"); for (int i = 0; i < list.Count(); i++) { var color = string.Format("'rgb({0},{1},{2})'", rand.Next(colorMin, colorMax), rand.Next(colorMin, colorMax), rand.Next(colorMin, colorMax)); sb.AppendFormat("\tdrawLine({0}, {1}, {2}, {3}, ctx);\n", xbeg, (ybeg + i * yline), color, (list[i].Percent * lenMultiply) + 1); } // draw text sb.Append("\n\t// text\n"); sb.Append("\tctx.fillStyle = 'rgb(0,0,0)';\n"); for (int i = 0; i < list.Count(); i++) { sb.AppendFormat("\tctx.fillText('{0}{1}', {2}, {3});\n", list[i].Percent, textPercent, xbeg, i * yline + ybeg - txtY); sb.AppendFormat("\tctx.fillText('{0}: {1}', {2}, {3});\n", textCount, list[i].Value, xbeg + txtX, i * yline + ybeg - txtY); } } sb.Append("}"); var path = new FileInfo(string.Concat(DrawFilePath, "draw.js")); FileUtil.WriteFile(sb.ToString(), path); } }
Put both the html and javascript file in the c:\temp folder.
The C# program creates a new draw.js file in the c:\temp folder.
Open the html file with a modern browser that support canvas, like Firefox, Chrome or Internet Explorer 10+.
canvas.html (visualization)
<html> <head> <title>Canvas</title> <script type="text/javascript" src="draw.js?v=1"></script> <script type="text/javascript"> function drawLine(x, y, color, len, ctx) { ctx.lineWidth = 2; ctx.strokeStyle = color; ctx.strokeRect(x, y, len, 2); } window.onload = function draw() { var canvas = document.getElementById('canvas'); if (canvas != null && canvas != undefined && canvas.getContext) { var ctx = canvas.getContext('2d'); ctx.strokeStyle = "black"; ctx.font = "10pt Arial"; drawData(ctx); } } </script> <style type="text/css"> body { margin-left: 10px; margin-top: 10px; } canvas { border: 1px solid red; } </style> </head> <body> <canvas id="canvas" width="600" height="400"> <p> Your browser doesn't support canvas. Try Firefox, Chrome, Internet Explorer 10+ or an another modern browser. </p> </canvas> </body> </html>
The javascript file will be overwritten when you run the C# code.
draw.js (data)
function drawData(ctx) { // top ctx.fillText('100%', 30, 26); drawLine(30, 30, 'rgb(0,0,0)', 501, ctx); // lines drawLine(30, 100, 'rgb(101,78,132)', 76, ctx); drawLine(30, 140, 'rgb(32,101,168)', 176, ctx); drawLine(30, 180, 'rgb(139,151,71)', 26, ctx); drawLine(30, 220, 'rgb(163,73,143)', 226, ctx); // text ctx.fillStyle = 'rgb(0,0,0)'; ctx.fillText('15%', 30, 96); ctx.fillText('count: 30', 80, 96); ctx.fillText('35%', 30, 136); ctx.fillText('count: 70', 80, 136); ctx.fillText('5%', 30, 176); ctx.fillText('count: 10', 80, 176); ctx.fillText('45%', 30, 216); ctx.fillText('count: 90', 80, 216); }
For the list = 30, 70, 10, 90
gives this result: