Software Programming

Kunuk Nykjaer

Archive for the ‘Csharp’ Category

Custom Configuration Section Samples – CSharp


I think creating a custom configuration section for a web.config or app.config file can be cumbersome.

I have created code samples for various custom configurations.
You can use them for inspiration if you need to create a custom configuration section.

Sometimes I think it’s easier to read and run code samples than to read lengthy API documentation.

https://github.com/kunukn/CustomSection-CSharp

<?xml version="1.0" encoding="utf-8" ?>
<configuration>

    <configSections>                    
        <section name="companySection" 
           type="CompanyApplication.Code.CompanySection, CompanyApplication" />
      <!-- type = "full namespace path, assembly name" -->  
    </configSections>
    
    <companySection>
        <companies>
            <company name="Google" code="googl"/>
            <company name="Apple" code="aapl"/>
            <company name="Microsoft" code="msft"/>
        </companies>
    </companySection>
    
</configuration>
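A section like the one above needs matching classes on the C# side. The following is a minimal sketch, not the code from the linked repository: CompanySection, companies, company, name and code come from the XML, while the class names CompanyCollection, CompanyElement and CompanyReader are assumptions chosen for illustration. It relies on the System.Configuration assembly being referenced.

```csharp
using System;
using System.Configuration;

namespace CompanyApplication.Code
{
    // Maps the <companySection> element.
    public class CompanySection : ConfigurationSection
    {
        // AddItemName lets the collection use <company> instead of the default <add>.
        [ConfigurationProperty("companies")]
        [ConfigurationCollection(typeof(CompanyCollection), AddItemName = "company")]
        public CompanyCollection Companies
        {
            get { return (CompanyCollection)this["companies"]; }
        }
    }

    // Maps the <companies> element (hypothetical class name).
    public class CompanyCollection : ConfigurationElementCollection
    {
        protected override ConfigurationElement CreateNewElement()
        {
            return new CompanyElement();
        }

        protected override object GetElementKey(ConfigurationElement element)
        {
            return ((CompanyElement)element).Name;
        }
    }

    // Maps one <company name="..." code="..."/> element (hypothetical class name).
    public class CompanyElement : ConfigurationElement
    {
        [ConfigurationProperty("name", IsRequired = true, IsKey = true)]
        public string Name { get { return (string)this["name"]; } }

        [ConfigurationProperty("code", IsRequired = true)]
        public string Code { get { return (string)this["code"]; } }
    }

    // Hypothetical helper showing how the section could be read.
    public static class CompanyReader
    {
        public static void PrintCompanies()
        {
            var section = (CompanySection)ConfigurationManager.GetSection("companySection");
            foreach (CompanyElement company in section.Companies)
            {
                Console.WriteLine(company.Name + " " + company.Code);
            }
        }
    }
}
```

Because name is marked as the key, two company elements with the same name would clash in this sketch; pick a different key if that does not fit your data.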


I recommend these articles if you want to read some guides about custom sections.


Written by kunuk Nykjaer

October 12, 2014 at 11:23 am

Posted in Csharp


How to write unit tests – mvc, test framework and mock example


The code examples are also available at

https://github.com/kunukn/UnitTestExample-CSharp


Synopsis
You have been given the task of writing tests for the methods in a class.

You are working with an MVC web app framework, a test framework and a mocking framework.
For this example, I will use ASP.NET MVC, Visual Studio Unit Testing Framework and Rhino Mocks.

Your task is to write unit tests for the actions in a controller class.
You have to unit test that each action returns the correct outcome for the possible inputs.

This is the controller class you must test.


DataController

using System.Collections.Generic;
using System.Net;
using System.Web.Mvc;
using MyApp.Services;

namespace MyApp.Controllers
{
    public class DataController : Controller
    {
        public ActionResult GetData(string subject)
        {
            if (subject == "foo")
            {
                new ReportService().ReportAbuseUsage(subject);
                return new HttpStatusCodeResult(HttpStatusCode.Forbidden);
            }

            IList<string> data = new DataService().GetData(subject);
                        
            return Json(data, JsonRequestBehavior.AllowGet);
        }
    }
}


Analysis

The GetData action has two possible outcomes.

  • When the subject is foo, it invokes the ReportAbuseUsage method and returns an HTTP Forbidden result
  • When the subject is anything else, it fetches data from a service and returns a JSON result

The unit tests should cover those two cases: input foo and input other than foo.
Then inspect the result and check that the outcome was as expected.

The action is dependent on two services. DataService and ReportService.
To unit-test the action you are not supposed to test those two services.
If you do then it is no longer a unit test of the action method but an integration test.

What you have is a method that is not unit testable. You must refactor the action and mock the services it depends on.
The class is tightly coupled to the services.
Loose coupling is often preferred for various reasons. One is better testability of your code.

There are various ways to refactor the code to make it more testable.
I will show the following techniques: constructor injection and property injection.



Refactoring

You start by adding interfaces to your services.


IDataService

using System.Collections.Generic;
namespace MyApp.Interfaces
{
    public interface IDataService
    {
        IList<string> GetData(string subject);
    }
}

IReportService

namespace MyApp.Interfaces
{
    public interface IReportService
    {
        void ReportAbuseUsage(string subject);
    }
}

DataService

using System.Collections.Generic;
using MyApp.Interfaces;

namespace MyApp.Services
{
    public class DataService : IDataService
    {
        public IList<string> GetData(string subject)
        {
            // returns data by subject, simulate get data 
            return new List<string> { "apple", "orange", "banana" };
        }
    }
}

ReportService

using MyApp.Interfaces;

namespace MyApp.Services
{
    public class ReportService : IReportService
    {
        public void ReportAbuseUsage(string subject)
        {
            // simulate report something to a repository
        }
    }
}


Property injection example

The controller class has been refactored to use property injection

using System.Collections.Generic;
using System.Net;
using System.Web.Mvc;
using MyApp.Interfaces;
using MyApp.Services;

namespace MyApp.Controllers
{
    public class DataController : Controller
    {
        private IDataService _dataService;
        public IDataService DataService
        {
            get { return _dataService ?? new DataService(); }
            set { _dataService = _dataService ?? value; }
        }

        private IReportService _reportService;
        public IReportService ReportService
        {
            get { return _reportService ?? new ReportService(); }
            set { _reportService = _reportService ?? value; }
        }
        
        public ActionResult GetData(string subject)
        {            
            if (subject == "foo")
            {
                ReportService.ReportAbuseUsage(subject);
                return new HttpStatusCodeResult(HttpStatusCode.Forbidden);
            }

            IList<string> data = DataService.GetData(subject);

            return Json(data, JsonRequestBehavior.AllowGet);
        }
    }
}

The getters only create a new instance of the service if none has been set first.
The setters only accept the first non-null value; once set, later assignments are ignored.
The injection is done by setting the value before using the getters of the properties.
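The getter and setter behaviour described above can be tried in isolation. This is a small sketch with hypothetical types (IGreeter, DefaultGreeter, FakeGreeter, Consumer) that use the same property pattern as the controller, without any MVC dependency.

```csharp
using System;

public interface IGreeter { string Greet(); }

public class DefaultGreeter : IGreeter
{
    public string Greet() { return "default"; }
}

public class FakeGreeter : IGreeter
{
    public string Greet() { return "fake"; }
}

public class Consumer
{
    private IGreeter _greeter;

    public IGreeter Greeter
    {
        // Falls back to the real implementation when nothing was injected.
        get { return _greeter ?? new DefaultGreeter(); }
        // Keeps the first non-null value; later assignments are ignored.
        set { _greeter = _greeter ?? value; }
    }
}

public static class PropertyInjectionDemo
{
    public static void Main()
    {
        var untouched = new Consumer();
        Console.WriteLine(untouched.Greeter.Greet()); // default

        var injected = new Consumer { Greeter = new FakeGreeter() };
        injected.Greeter = new DefaultGreeter();      // ignored, a value is already set
        Console.WriteLine(injected.Greeter.Greet());  // fake
    }
}
```

One thing the sketch makes visible: when nothing has been injected, this getter hands out a fresh default instance on every read, because the fallback is never stored in the backing field.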


Unit testable state

Now the controller depends on interfaces and not on the implementations.
From here you are able to write unit tests where you mock the services.

A Unit Test project has been created with references to the project and System.Web.Mvc,
and the NuGet package has been installed: Install-Package RhinoMocks


Unit test implementation

I will use the Arrange Act Assert (AAA) Pattern.

The service methods have been mocked
(the reason I call them mocks and not stubs is that I assert against them: whether they have been called and with what arguments).
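To make the stub/mock distinction concrete without a mocking framework, here is a hand-written sketch. It is not how Rhino Mocks works internally, and every type name in it (IDataSource, IAbuseReporter, StubDataSource, RecordingReporter, Gatekeeper) is hypothetical. Gatekeeper is a simplified stand-in for the controller action with the same branching and no MVC types.

```csharp
using System;
using System.Collections.Generic;

public interface IDataSource { IList<string> GetData(string subject); }
public interface IAbuseReporter { void Report(string subject); }

// Stub: only supplies canned data so the code under test can run.
public class StubDataSource : IDataSource
{
    public IList<string> GetData(string subject) { return new List<string>(); }
}

// Hand-written mock: records calls so the test can assert on them.
public class RecordingReporter : IAbuseReporter
{
    public readonly List<string> Calls = new List<string>();
    public void Report(string subject) { Calls.Add(subject); }
}

public class Gatekeeper
{
    private readonly IDataSource _data;
    private readonly IAbuseReporter _reporter;

    public Gatekeeper(IDataSource data, IAbuseReporter reporter)
    {
        _data = data;
        _reporter = reporter;
    }

    // Returns null for the forbidden subject, otherwise the data.
    public IList<string> GetData(string subject)
    {
        if (subject == "foo")
        {
            _reporter.Report(subject);
            return null;
        }
        return _data.GetData(subject);
    }
}

public static class MockVersusStubDemo
{
    public static void Main()
    {
        var reporter = new RecordingReporter();
        var gatekeeper = new Gatekeeper(new StubDataSource(), reporter);

        gatekeeper.GetData("news");
        Console.WriteLine(reporter.Calls.Count); // 0

        gatekeeper.GetData("foo");
        Console.WriteLine(reporter.Calls.Count); // 1
    }
}
```

The stub is never inspected by the test; the recording reporter is, which is what makes it a mock in the sense used here.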

For the first unit test, I test that the action returns a json result for every input but foo.
I also verify that the DataService method was invoked and verify the ReportService method was not invoked.

For the second unit test, I test that the action returns HTTP Forbidden for foo input.
I also verify that the DataService method was not invoked and verify the ReportService method was invoked.

This is the unit test implementation using Visual Studio Unit Testing Framework and Rhino Mocks.

Test naming standard used is MethodName_StateUnderTest_ExpectedBehavior

using System.Collections.Generic;
using System.Net;
using System.Web.Mvc;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using MyApp.Controllers;
using MyApp.Interfaces;
using Rhino.Mocks;

namespace UnitTestMyApp
{
    [TestClass]
    public class DataControllerTest
    {
        // Injected services
        private IDataService dataService;
        private IReportService reportService;

        [TestInitialize]
        public void Setup()
        {
            dataService = MockRepository.GenerateStub<IDataService>();
            reportService = MockRepository.GenerateStub<IReportService>();

            dataService
                .Stub(s => s.GetData(Arg<string>.Is.Anything))
                .Return(new List<string>());
        }

        [TestMethod]
        public void 
        GetData_WhenCalledWithAnythingButFoo_InvokeGetDataAndReturnsJsonResult()
        {
            // Arrange            
            var controller = new DataController
            {
                DataService = dataService,
                ReportService = reportService
            };

            // Act
            ActionResult news = controller.GetData(subject: "news");
            ActionResult fooish = controller.GetData(subject: "fooish");

            // Assert            
            Assert.IsNotNull(news as JsonResult);
            Assert.IsNotNull(fooish as JsonResult);
            
            dataService
                .AssertWasCalled(s => s.GetData(Arg<string>.Is.Anything));
            
            reportService
                .AssertWasNotCalled(s => s.ReportAbuseUsage(Arg<string>.Is.Anything));
        }


        [TestMethod]
        public void 
        GetData_WhenCalledWithFoo_InvokeReportAbuseUsageAndReturnsHttpDenied()
        {
            // Arrange            
            var controller = new DataController
            {
                DataService = dataService,
                ReportService = reportService
            };

            var forbidden = new HttpStatusCodeResult(HttpStatusCode.Forbidden);

            // Act
            ActionResult foo = controller.GetData(subject: "foo");
            var fooHttpStatusCodeResult = foo as HttpStatusCodeResult;

            // Assert            
            Assert.IsNotNull(fooHttpStatusCodeResult);
            Assert.AreEqual(forbidden.StatusCode, fooHttpStatusCodeResult.StatusCode);
            
            dataService
                .AssertWasNotCalled(s => s.GetData(Arg<string>.Is.Anything));
            
            reportService
                .AssertWasCalled(s => s.ReportAbuseUsage(Arg<string>.Is.Anything));
        }
    }
}


Constructor injection example

The unit test for the constructor injection example is very similar to the one for property injection.
The DataController class also looks similar to the property injection example.

The refactored DataController class looks like this.

DataController

using System.Collections.Generic;
using System.Net;
using System.Web.Mvc;
using MyApp.Interfaces;
using MyApp.Services;

namespace MyApp.Controllers
{
    public class DataController : Controller
    {
        private readonly IDataService dataService;
        private readonly IReportService reportService;

        // Default constructor used at runtime: wires up the real services.
        public DataController()
            : this(new DataService(), new ReportService())
        {
        }

        public DataController(IDataService dataService, IReportService reportService)
        {
            this.dataService = dataService;
            this.reportService = reportService;
        }
        
        public ActionResult GetData(string subject)
        {            
            if (subject == "foo")
            {
                reportService.ReportAbuseUsage(subject);
                return new HttpStatusCodeResult(HttpStatusCode.Forbidden);
            }

            IList<string> data = dataService.GetData(subject);
            
            return Json(data, JsonRequestBehavior.AllowGet);
        }
    }
}


The refactoring for the DataControllerTest class looks like this.

DataControllerTest

using System.Collections.Generic;
using System.Net;
using System.Web.Mvc;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using MyApp.Controllers;
using MyApp.Interfaces;
using Rhino.Mocks;

namespace UnitTestMyApp
{
    [TestClass]
    public class DataControllerTest
    {
        // Injected services
        private IDataService dataService;
        private IReportService reportService;

        [TestInitialize]
        public void Setup()
        {
            dataService = MockRepository.GenerateStub<IDataService>();
            reportService = MockRepository.GenerateStub<IReportService>();

            dataService
                .Stub(s => s.GetData(Arg<string>.Is.Anything))
                .Return(new List<string>());
        }

        [TestMethod]
        public void 
        GetData_WhenCalledWithAnythingButFoo_InvokeGetDataAndReturnsJsonResult()
        {
            // Arrange            
            var controller = new DataController(dataService, reportService);
            
            // Act
            ActionResult news = controller.GetData(subject: "news");
            ActionResult fooish = controller.GetData(subject: "fooish");

            // Assert            
            Assert.IsNotNull(news as JsonResult);
            Assert.IsNotNull(fooish as JsonResult);

            dataService
                .AssertWasCalled(s => s.GetData(Arg<string>.Is.Anything));

            reportService
                .AssertWasNotCalled(s => s.ReportAbuseUsage(Arg<string>.Is.Anything));
        }


        [TestMethod]
        public void 
        GetData_WhenCalledWithFoo_InvokeReportAbuseUsageAndReturnsHttpDenied()
        {
            // Arrange            
            var controller = new DataController(dataService, reportService);            
            var forbidden = new HttpStatusCodeResult(HttpStatusCode.Forbidden);

            // Act
            ActionResult foo = controller.GetData(subject: "foo");
            var fooHttpStatusCodeResult = foo as HttpStatusCodeResult;

            // Assert            
            Assert.IsNotNull(fooHttpStatusCodeResult);
            Assert.AreEqual(forbidden.StatusCode, fooHttpStatusCodeResult.StatusCode);

            dataService
                .AssertWasNotCalled(s => s.GetData(Arg<string>.Is.Anything));

            reportService
                .AssertWasCalled(s => s.ReportAbuseUsage(Arg<string>.Is.Anything));
        }
    }
}


Written by kunuk Nykjaer

September 7, 2014 at 7:31 pm

Monads for C# – tool for cleaner code


Tedious code

You want to extract a phone number from a data source.

 

Do you ever write code like this?


Building building = GetBuildingFromSomeSource();            
string phone = null;
if (building != null && building.Manager != null && building.Manager.ContactInfo != null)
{
    phone = building.Manager.ContactInfo.PhoneNumber;
}

All those null checks can be tedious, and the code doesn’t look ‘clean’.

 

Here is an alternative way to write the same idea.

var phone = building == null
    ? null
    : (building.Manager == null
        ? null
        : (building.Manager.ContactInfo == null
            ? null
            : building.Manager.ContactInfo.PhoneNumber));

The code still looks tedious.

 

Here is another alternative for the same idea.

var phone = building != null 
    && building.Manager != null 
    && building.Manager.ContactInfo != null 
        ? building.Manager.ContactInfo.PhoneNumber 
        : null;

Still looks pretty tedious.

 

Naïve

What if you just write it like this?

var phone = building.Manager.ContactInfo.PhoneNumber;

The problem is that you could get a null reference exception, because any property in the chain could be null
(which you knew, because you used all those null checks in your code).

 

Then what about this?

Try catch

string phone = null;
try { phone = building.Manager.ContactInfo.PhoneNumber; }
catch { }

No, that is wrong. You should not use exception handling to control the flow.

More about this here

 

Null propagating operator

What if you could write like this?

var phone = building?.Manager?.ContactInfo?.PhoneNumber;

Unfortunately it was not available when this post was written (June 2014).
You can read more about the operator here.
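For reference, the operator later shipped in C# 6 as the null-conditional operator. Below is a minimal sketch of how it reads, assuming a C# 6 or newer compiler. The Building, Manager and ContactInfo classes are reconstructed from the property names used in this post, and the phone number is made-up sample data.

```csharp
using System;

public class ContactInfo { public string PhoneNumber { get; set; } }
public class Manager { public ContactInfo ContactInfo { get; set; } }
public class Building { public Manager Manager { get; set; } }

public static class NullConditionalDemo
{
    public static string PhoneOf(Building building)
    {
        // Each ?. short-circuits to null when its left-hand side is null.
        return building?.Manager?.ContactInfo?.PhoneNumber;
    }

    public static void Main()
    {
        var complete = new Building
        {
            Manager = new Manager
            {
                ContactInfo = new ContactInfo { PhoneNumber = "555-0100" }
            }
        };
        var noManager = new Building();

        Console.WriteLine(PhoneOf(complete));               // 555-0100
        Console.WriteLine(PhoneOf(noManager) ?? "unknown"); // unknown
        Console.WriteLine(PhoneOf(null) ?? "unknown");      // unknown
    }
}
```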

 

Monad extension

What we want is something like the safe navigation operator where the code is clean.

var phone = building.=> Manager.=> ContactInfo.=> PhoneNumber;

That is not valid syntax.


If we add some parentheses and some _ to make the syntax valid we could do this.

var phone = building._(_=>_.Manager)._(_=>_.ContactInfo)._(_=>_.PhoneNumber);


This looks a little cryptic, I admit, but the intent should be clear.
You want to safely extract the PhoneNumber without getting a null reference exception.

This syntax is possible if you use this extension method.

MonadExtension.cs

using System;

public static class MonadExtension
{
    // Applies evaluator to input, or returns default(TTo) when input is null.
    public static TTo _<TFrom, TTo>(this TFrom input, Func<TFrom, TTo> evaluator)
        where TFrom : class
    {
        return input == null ? default(TTo) : evaluator(input);
    }
}

This is an example of a Maybe Monad.

You can read more about Monads here.

 

Cleaner syntax

The syntax in the example above is not very clean and is usually too cryptic for most people.
Monad extensions in C# are usually written in this style instead.

var phone = building
            .With( b => b.Manager)
            .With( m => m.ContactInfo)
            .With( c => c.PhoneNumber);

If you want to return a default value in case there is a null somewhere in the property chain,
the Return extension method is used.

var phone = building
            .With( b => b.Manager)
            .With( m => m.ContactInfo)
            .Return( c => c.PhoneNumber, "unknown");

MonadExtension.cs

using System;

public static class MonadExtension
{
    // Applies evaluator to input, or returns default(TTo) when input is null.
    public static TTo With<TFrom, TTo>(this TFrom input, Func<TFrom, TTo> evaluator)
        where TFrom : class
    {
        return input == null ? default(TTo) : evaluator(input);
    }

    // Like With, but returns failureValue instead of default(TTo) when input is null.
    public static TTo Return<TFrom, TTo>(
        this TFrom input, Func<TFrom, TTo> evaluator, TTo failureValue)
        where TFrom : class
    {
        return input == null ? failureValue : evaluator(input);
    }
}

This is an example of a Monad implementation for C#. I encourage you to read this channel9 post. It might inspire you to write cleaner code.
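To try the With and Return extensions end to end, here is a self-contained sketch. The extension methods repeat the ones from this post; the Building, Manager and ContactInfo classes are reconstructed from the property names used in the examples, and the phone number is made-up sample data.

```csharp
using System;

public class ContactInfo { public string PhoneNumber { get; set; } }
public class Manager { public ContactInfo ContactInfo { get; set; } }
public class Building { public Manager Manager { get; set; } }

public static class MonadExtension
{
    public static TTo With<TFrom, TTo>(this TFrom input, Func<TFrom, TTo> evaluator)
        where TFrom : class
    {
        return input == null ? default(TTo) : evaluator(input);
    }

    public static TTo Return<TFrom, TTo>(
        this TFrom input, Func<TFrom, TTo> evaluator, TTo failureValue)
        where TFrom : class
    {
        return input == null ? failureValue : evaluator(input);
    }
}

public static class MonadDemo
{
    public static void Main()
    {
        var complete = new Building
        {
            Manager = new Manager
            {
                ContactInfo = new ContactInfo { PhoneNumber = "555-0100" }
            }
        };
        var noContact = new Building { Manager = new Manager() };

        string phone = complete
            .With(b => b.Manager)
            .With(m => m.ContactInfo)
            .With(c => c.PhoneNumber);
        Console.WriteLine(phone); // 555-0100

        string fallback = noContact
            .With(b => b.Manager)
            .With(m => m.ContactInfo)
            .Return(c => c.PhoneNumber, "unknown");
        Console.WriteLine(fallback); // unknown
    }
}
```

Note that Return only substitutes the failure value when its input object is null. If ContactInfo exists but its PhoneNumber is null, the result is still null.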


EDIT:
There is also this alternative found at http://stackoverflow.com/a/4281533/815507

var phone = building.GetValueOrDefault(b => b.Manager.ContactInfo.PhoneNumber);

Written by kunuk Nykjaer

June 4, 2014 at 10:38 am

Posted in Csharp


Use the DI container as a way to monitor and maintain the health of the architecture


Imagine you have been given the role of technical architect for a large project.

Are you going to use a DI container?
If not, you should consider using one.

DI containers make your architecture more manageable.
They not only help with the implementation details, they also let you control how your developers implement and use the services in the project.

Imagine you have N developers and they are given various tasks to implement.
How do you manage, monitor, handle changes and orchestrate their usage and implementation of the various architectural services such as
caching, repositories, CMS services, feed services and logging?

How do you make sure the developers don’t create duplicate implementations, and how do the developers know where to find the existing implementations used across the architectural layers?

What you can do is let the developers express their needs and let a framework (e.g. a DI container) provide the services they need.

Taken from How to explain dependency injection to a 5-year old?

“When you go and get things out of the refrigerator for yourself, you can cause problems. You might leave the door open, you might get something Mommy or Daddy doesn’t want you to have. You might even be looking for something we don’t even have or which has expired.

What you should be doing is stating a need, “I need something to drink with lunch,” and then we will make sure you have something when you sit down to eat.”

Here the developers are the 5-year olds – metaphorically 🙂
They might break things and do stuff not approved by the architect.
What the architect should do is to provide the services which the developers need.

I am using ASP.NET MVC as an example (an MVC-based website).
You are building a news website.

Imagine you have a NewsController with the action Politics and a SportController with the action Tennis.

The developer needs to read data from a CMS system and return a view with a populated model.
The developer has a need (read data from the CMS system).

Example 1 – using DI

    public class NewsController : Controller
    {
        private readonly ICmsService _cmsService;
        public NewsController(ICmsService cmsService)
        {
            _cmsService = cmsService;
        }
		
        public ActionResult Politics()
        {
            var viewModel = new ViewModel { Data = _cmsService.GetData("politics") };
            return View(viewModel);
        }
    }
	
    public class SportController : Controller
    {
        private readonly ICmsService _cmsService;
        public SportController(ICmsService cmsService)
        {
            _cmsService = cmsService;
        }
        public ActionResult Tennis()
        {
            var viewModel = new ViewModel { Data = _cmsService.GetData("tennis") };
            return View(viewModel);
        }
    }

The developers simply add the needed service in the constructor and the service will be provided.
The developers don’t have to update the code if the service implementation is changed.
The developers must use the correct interface and the object instance is provided from elsewhere, i.e. injected.
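How "the object instance is provided from elsewhere" can work is sketched below with a deliberately tiny container. This is an illustration only and not a real DI container: TinyContainer, InMemoryCmsService and NewsPage are hypothetical names, NewsPage stands in for a controller so that the sketch has no MVC dependency, and ICmsService is assumed to return a string here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ICmsService { string GetData(string subject); }

public class InMemoryCmsService : ICmsService
{
    public string GetData(string subject) { return "content for " + subject; }
}

// Stand-in for a controller: it states its need in the constructor.
public class NewsPage
{
    private readonly ICmsService _cmsService;
    public NewsPage(ICmsService cmsService) { _cmsService = cmsService; }
    public string Politics() { return _cmsService.GetData("politics"); }
}

public class TinyContainer
{
    private readonly Dictionary<Type, Func<object>> _factories =
        new Dictionary<Type, Func<object>>();

    // The architect decides in one place which implementation backs a service.
    public void Register<TService>(Func<TService> factory) where TService : class
    {
        _factories[typeof(TService)] = () => factory();
    }

    public T Resolve<T>() { return (T)Resolve(typeof(T)); }

    private object Resolve(Type type)
    {
        Func<object> factory;
        if (_factories.TryGetValue(type, out factory)) return factory();

        if (type.IsAbstract || type.IsInterface)
            throw new InvalidOperationException("No registration for " + type.Name);

        // Unregistered concrete type: pick the public constructor with the most
        // parameters and resolve each parameter recursively.
        var constructor = type.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();
        var arguments = constructor.GetParameters()
            .Select(p => Resolve(p.ParameterType))
            .ToArray();
        return constructor.Invoke(arguments);
    }
}

public static class ContainerDemo
{
    public static void Main()
    {
        var container = new TinyContainer();
        container.Register<ICmsService>(() => new InMemoryCmsService());

        var page = container.Resolve<NewsPage>();
        Console.WriteLine(page.Politics()); // content for politics
    }
}
```

Swapping the CMS implementation means changing the single Register call; NewsPage stays untouched. Real containers add lifetime management, configuration and diagnostics on top of this idea.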

Imagine the alternative version

Example 2 – not using DI

    public class NewsController : ServiceController
    {
        private readonly ICmsServiceImplementedByDeveloper1 _cmsService;
        public NewsController()
        {
            _cmsService = new CmsServiceImplementedByDeveloper1();
        }

        public ActionResult Politics()
        {
            var viewModel = new ViewModel { Data = _cmsService.GetData("politics") };
            return View(viewModel);
        }
    }


    public class SportController : Controller
    {
        private readonly ICmsServiceImplementedByDeveloper2 _cmsService;
        public SportController()
        {
            _cmsService = new CmsServiceImplementedByDeveloper2();   
        }
        public ActionResult Tennis()
        {
            var viewModel = new ViewModel { Data = _cmsService.GetData("tennis") };
            return View(viewModel);
        }
    }

Here the developers have implemented their own interfaces because none was provided, and they made their own implementations to read from the CMS due to lack of communication or because they needed an implementation that was not available at the time. Here we see code duplication. Worse, when the CMS service changes, all the developers must update their own implementation of the CMS service. This makes the system harder to maintain and adjust.

Written by kunuk Nykjaer

March 11, 2014 at 12:01 am

Posted in Csharp


Dependency Injection in .NET


I have worked with miscellaneous IoC containers for C# and wanted to get more in-depth knowledge of the IoC topic.

If you are a software developer, then the IoC concept is a must for you.
It’s a pattern used for loose coupling, and IoC containers will make your software architecture easier to manage.

Here are some benchmarks of the various containers, which are frequently updated.

I recently read a book called Dependency Injection in .NET, which I highly recommend if you are into C#. It will get geeky about the DI topic, and geeky is good if you want to learn 🙂

The book covers the concept and introduces you to a poor man’s DI container. Then it demonstrates IoC patterns and anti-patterns and ends with a review of 5 different DI containers.

Written by kunuk Nykjaer

June 28, 2013 at 10:44 am

Posted in Csharp, Framework


Get best k items from n items as fast as possible


SortedList2 – data structure example using C#

Reference: Selection algorithm

data structure

Recently I needed something like a SortedSet or SortedDictionary which supports multiple identical values.
I explored the BCL but could not find the data structure I needed.

The scenario:
I have a dataset of size n from which I want the k best items.
This is an alternative to using a selection algorithm.
Using a selection algorithm is also much faster than the naive approach.

I will use Big O notation (beginners guide).

A sorted data structure should have the following operations (Binary search tree):

Insert(item) -> O(log n)
Exists(item) -> O(log n)
Remove(item) -> O(log n)
Max -> O(1)
Min -> O(1)
Count -> O(1)

None of SortedList, SortedSet or SortedDictionary supports both identical values and the listed operations.
The C5 collections library has the TreeBag data structure, which can be used for value types.

Naive version
Sort the data and take the k best items.
The worst case is O(n * log n).
The best-case running time is Ω(n) (if the dataset is already sorted).

What if we already have the best k items after the first k iterations?
Inserting the first k items takes O(k * log k).

Checking for the max item takes O(1).
For the remaining n – k iterations: checking whether a better item exists takes O(1) per item, so O(n - k) in total.

In the best case this gives: Ω(k * log k + (n - k)).

For k << n
that is Ω(n).

In the average case for randomly distributed data where k << n the running time is bounded by
O(n * log k).
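The idea analysed above (keep at most k items in a sorted structure and only touch it when a new item beats the current worst) can be sketched as follows. This is not the SortedList2 from the attached source: it is a simplified illustration on plain integers where smaller is better, and the TopK class and its method name are hypothetical. Duplicates are kept apart by pairing each value with a sequence number, which requires C# 7 tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopK
{
    // Returns the k smallest values (duplicates allowed) in ascending order.
    public static List<int> Smallest(IEnumerable<int> source, int k)
    {
        if (k <= 0) return new List<int>();

        // (value, sequence) pairs are unique even when values repeat.
        var best = new SortedSet<(int Value, long Seq)>();
        long seq = 0;
        int worst = 0; // only meaningful once best holds k items

        foreach (var value in source)
        {
            if (best.Count < k)
            {
                best.Add((value, seq++));
                if (best.Count == k) worst = best.Max.Value;
            }
            else if (value < worst)
            {
                // The new item beats the current worst: swap them.
                best.Remove(best.Max);
                best.Add((value, seq++));
                worst = best.Max.Value;
            }
            // Otherwise the item is rejected with a single comparison.
        }

        return best.Select(entry => entry.Value).ToList();
    }

    public static void Main()
    {
        var result = Smallest(new[] { 5, 2, 2, 9, 1, 2 }, 3);
        Console.WriteLine(string.Join(", ", result)); // 1, 2, 2
    }
}
```

Caching the current worst value keeps the rejection path at one comparison per item, which matches the O(1) check assumed in the analysis; only accepted items pay for the tree operations.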

I will implement a data structure SortedList2 which supports multiple identical comparable values
and test its running time vs. a naive implementation.

I will use the SortedSet and the Dictionary structure.

The best item in this example is defined as: smallest even number.

Test cases

best case input

random input

worst case input

The results show how SortedList2 performs for various k values versus the naive version.

To avoid the worst case input you can run the data through a randomizer filter, which takes O(n).
Then the running time would be similar to that of the random input (it’s implemented in the attached source code).
When k < 10% of n, SortedList2 performs better.
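One way to build such an O(n) randomizer is a Fisher-Yates shuffle. The sketch below is an assumption about what such a filter could look like, not the implementation from the attached source; the Randomizer class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class Randomizer
{
    // Fisher-Yates shuffle: a single pass with one swap per position, in place.
    public static void Shuffle<T>(IList<T> items, Random random)
    {
        for (var i = items.Count - 1; i > 0; i--)
        {
            var j = random.Next(i + 1); // 0 <= j <= i
            var tmp = items[i];
            items[i] = items[j];
            items[j] = tmp;
        }
    }

    public static void Main()
    {
        var data = new List<int> { 10, 8, 6, 4, 2 }; // reverse sorted input
        Shuffle(data, new Random());
        Console.WriteLine(string.Join(", ", data)); // same items, random order
    }
}
```

Running the shuffle before feeding the data to the bounded sorted structure makes a pre-sorted worst case input behave like the random case on average.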

n = 1.000.000
k = 5

Data distribution: best case

SortedList2 Elapsed: 556 msec.
UId: 003                Comparer: 0             Name: n0
UId: 001                Comparer: 2             Name: duplicate
UId: 002                Comparer: 2             Name: duplicate
UId: 004                Comparer: 2             Name: n1
UId: 005                Comparer: 4             Name: n2

Naive Elapsed: 2707 msec.
UId: 003                Comparer: 0             Name: n0
UId: 001                Comparer: 2             Name: duplicate
UId: 002                Comparer: 2             Name: duplicate
UId: 004                Comparer: 2             Name: n1
UId: 005                Comparer: 4             Name: n2


I assume the Naive version runs fast because the data is already sorted (CPU branch prediction).
The OrderBy runs faster than its O(n * log n) worst case.


Data distribution: random

SortedList2 Elapsed: 523 msec.
UId: 001                Comparer: 2             Name: duplicate
UId: 002                Comparer: 2             Name: duplicate
UId: 773997             Comparer: 2             Name: n773994
UId: 142607             Comparer: 6             Name: n142604
UId: 757235             Comparer: 6             Name: n757232

Naive Elapsed: 8483 msec.
UId: 001                Comparer: 2             Name: duplicate
UId: 002                Comparer: 2             Name: duplicate
UId: 773997             Comparer: 2             Name: n773994
UId: 142607             Comparer: 6             Name: n142604
UId: 757235             Comparer: 6             Name: n757232


I ran this multiple times and the results were similar.
The Naive version is clearly slower here.


Data distribution: worst case

SortedList2 Elapsed: 3269 msec.
UId: 001                Comparer: 2             Name: duplicate
UId: 002                Comparer: 2             Name: duplicate
UId: 1000002            Comparer: 2             Name: n999999
UId: 1000001            Comparer: 4             Name: n999998
UId: 1000000            Comparer: 6             Name: n999997

Naive Elapsed: 2967 msec.
UId: 001                Comparer: 2             Name: duplicate
UId: 002                Comparer: 2             Name: duplicate
UId: 1000002            Comparer: 2             Name: n999999
UId: 1000001            Comparer: 4             Name: n999998
UId: 1000000            Comparer: 6             Name: n999997


Here the Naive version is the fastest for the worst case input.
I assume the Naive version runs fast because the data is reverse sorted.
The OrderBy runs faster than its O(n * log n) worst case.


n = 1.000.000
k = 100.000

Data distribution: best case

SortedList2 Elapsed: 1768 msec.
Naive Elapsed: 2675 msec.


Data distribution: random

SortedList2 Elapsed: 6364 msec.
Naive Elapsed: 6064 msec.


Data distribution: worst case

SortedList2 Elapsed: 16478 msec.
Naive Elapsed: 2590 msec.


Conclusion

If you want something fast for k << n, then SortedList2 (or a selection algorithm) is a better option than the naive approach.

Source code

Program.cs

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.Linq;
using System.Threading;

namespace Datastructure
{
    public class Program
    {
        static readonly Action<object> CW = Console.WriteLine;
        const int MaxSize = 5;
        const int N = 2 * 100 * 1000;

        public static void Main(string[] args)
        {
            Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("en-US");

            var stopwatch = new Stopwatch();
            stopwatch.Start();

            Run();

            stopwatch.Stop();

            CWF("\nSec: {0}\nPress a key to exit", stopwatch.Elapsed.ToString());
            Console.ReadKey();
        }

        static void CWF(string s, params object[] a)
        {
            Console.WriteLine(s, a);
        }

        static void Run()
        {
            var comparer = new ObjComparer<Comparer>();
            
            var rand = new Random();
            var datas = new List<IObj>();
            
            for (var i = 0; i < N; i++)
            {
                //datas.Add(new Obj { Comparer = new Comparer(i * 2), Name = "n" + i }); // best case
                datas.Add(new Obj { Comparer = new Comparer(rand.Next(1, N)), Name = "n" + i }); // random
                //datas.Add(new Obj { Comparer = new Comparer((N - i) * 2), Name = "n" + i }); // worst case
            }

            const bool displayList = true;

            // --- Run sortedlist2
            var sw = new Stopwatch();
            sw.Start();

            var sorted = new SortedList2(comparer);

            //sorted.AddAll(datas, MaxSize, false); // method 1
            foreach (var i in datas) sorted.Add(i, MaxSize); // method 2

            var result = sorted.GetAll();
            sw.Stop();
            CWF("SortedList2 Elapsed: {0} msec.", sw.ElapsedMilliseconds);
            if (displayList) foreach (var i in result) CW(i);            

            // --- Run naive
            sw = new Stopwatch();
            sw.Start();

            datas.Sort(ObjComparer<IObj>.DoCompare);
            result = datas.Take(MaxSize).ToList(); // method 1
            //result = datas.OrderBy(i => i.Comparer).Take(MaxSize).ToList(); // method 2
            
            sw.Stop();

            CWF("\nNaive Elapsed: {0} msec.", sw.ElapsedMilliseconds);
            if (displayList) foreach (var i in result) CW(i);


            // --- Run selection algo
            sw = new Stopwatch();
            sw.Start();

            var s = new Selection { List = datas, K = MaxSize };
            s.Algo();
            result = s.GetAll();

            sw.Stop();

            CWF("\nSelection algo Elapsed: {0} msec.", sw.ElapsedMilliseconds);
            if (displayList) foreach (var i in result) CW(i);
        }

    }

    public class Comparer : IComparable
    {
        public Comparer(int i) { Value = i; }

        public long Value { get; set; }
        public override int GetHashCode() { return this.Value.GetHashCode(); }
        public override bool Equals(object obj)
        {
            var other = obj as Comparer;
            if (other == null) return false;

            // Compare the values directly; hash codes of different long values can collide
            return this.Value == other.Value;
        }
        public override string ToString()
        {
            return Value.ToString();
        }

        /// <summary>
        /// Comparison algo is implemented here
        /// 
        /// Even is best
        /// If both or none are even then smallest is best
        /// </summary>
        /// <param name="obj"></param>
        /// <returns></returns>
        public int CompareTo(object obj)
        {
            var other = obj as Comparer;
            if (other == null) return -1;

            var a = (this.Value & 1) == 0; // is even?
            var b = (other.Value & 1) == 0; // is even?

            if (a && !b) return -1; // this is even, other is not
            if (!a && b) return 1; // this is not even, other is

            return this.Value.CompareTo(other.Value);
        }
    }

    public class Obj : AObj, IObj
    {
        // Insert your custom properties here
        public string Name { get; set; }
        public override string ToString()
        {
            return string.Format("UId: {0:000} \t\tComparer: {1} \t\tName: {2}",
                Uid, Comparer, Name);
        }

        public override int GetHashCode()
        {
            return Comparer.GetHashCode();
        }

        public override bool Equals(object obj)
        {
            // Two objects are considered equal when their comparers are equal
            var other = obj as IObj;
            return other != null && object.Equals(this.Comparer, other.Comparer);
        }
    }

    public interface IObj : IComparable
    {
        string Name { get; set; }
        Comparer Comparer { get; set; }
    }

    public abstract class AObj : IComparable
    {
        private static int _counter;
        public virtual int Uid { get; private set; }
        protected AObj() { Uid = ++_counter; }

        public Comparer Comparer { get; set; }

        public int CompareTo(object obj)
        {
            var other = obj as AObj;
            if (other == null) return -1;

            return ObjComparer<IObj>.DoCompare(this.Comparer, other.Comparer);
        }
    }

    /// <summary>
    /// Thread safe
    /// </summary>
    public class SortedList2
    {
        private readonly object _lock = new object();

        private int _count;
        private readonly Dictionary<Comparer, LinkedList<IObj>> _lookup =
            new Dictionary<Comparer, LinkedList<IObj>>();
        private readonly SortedSet<Comparer> _set;
        private readonly IComparer<Comparer> _comparer;

        public SortedList2(IComparer<Comparer> comparer)
        {
            _comparer = comparer;
            _set = new SortedSet<Comparer>(comparer);
        }

        // O(log n)
        public bool Add(IObj i)
        {
            return this.Add(i, long.MaxValue);
        }

        // O(log n)
        public bool Add(IObj i, long k)
        {
            lock (_lock)
            {
                if (i == null || k <= 0) return false;

                Comparer val = i.Comparer;

                if (_count < k) _count++;
                else
                {
                    Comparer max = _set.Max;
                    if (_comparer.Compare(val, max) >= 0) return false; // Don't add

                    // Remove old
                    this.Remove(max);
                }

                if (_set.Contains(val))
                {
                    _lookup[val].AddLast(i); // Append
                }
                else
                {
                    // Insert new
                    _set.Add(val);

                    var ps = new LinkedList<IObj>();
                    ps.AddLast(i);
                    _lookup.Add(val, ps);
                }

                return true;
            }
        }

        public void AddAll(List<IObj> objs, bool randomizeFirst = false)
        {
            AddAll(objs, int.MaxValue, randomizeFirst);
        }

        public void AddAll(List<IObj> objs, int k, bool randomizeFirst = false)
        {
            if (randomizeFirst)
            {
                var list = objs;

                #region maintain input order                
                //list = new List<IObj>();
                //list.AddRange(objs);
                #endregion 

                Randomize(list);
                foreach (var i in list) Add(i, k);
            }
            else foreach (var i in objs) Add(i, k);
        }

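        // http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
        private static void Randomize(IList<IObj> list)
        {
            var rand = new Random();
            // Walk down from the end and swap each slot with a random slot at or below it.
            // Picking j from the whole range on every step would give a biased shuffle.
            for (var i = list.Count - 1; i > 0; i--)
            {
                var j = rand.Next(i + 1); // 0 <= j <= i
                var tmp = list[i];
                list[i] = list[j];
                list[j] = tmp;
            }
        }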

        // O(n)
        public List<IObj> GetAll()
        {
            lock (_lock)
            {
                var all = new List<IObj>();
                var dists = _set.ToList();
                foreach (var dist in dists) all.AddRange(_lookup[dist]);
                return all;
            }
        }

        public int Count
        {
            get
            {
                lock (_lock) return _count;
            }
        }

        // O(log n)
        public bool Remove(IObj i)
        {
            lock (_lock)
            {
                if (i == null) return false;
                var isRemoved = this.Remove(i.Comparer);
                if (isRemoved) _count--;

                return isRemoved;
            }
        }

        // O(log n)
        public bool Remove(Comparer val)
        {
            lock (_lock)
            {
                return this.RemoveHelper(val);
            }
        }

        // O(log n)
        private bool RemoveHelper(Comparer val)
        {
            if (_set.Contains(val))
            {
                var bag = _lookup[val];
                bag.RemoveLast(); // O(1)
                if (bag.Count == 0)
                {
                    _lookup.Remove(val); // O(1)
                    _set.Remove(val); // O(log n)
                }

                return true;
            }
            return false;
        }
    }

    public class ObjComparer<T> : IComparer<T> where T : IComparable
    {
        public int Compare(T a, T b)
        {
            return DoCompare(a, b);
        }
        public static int DoCompare<U>(U a, U b) where U : IComparable
        {
            return a.CompareTo(b); // ascending
            //return b.CompareTo(a); // descending
        }
    }


    // http://en.wikipedia.org/wiki/Selection_algorithm
    public class Selection
    {
        public List<IObj> List = new List<IObj>();
        public int K = 1;

        public List<IObj> GetAll()
        {
            return List.Take(K).ToList();
        }

        /*     
      function select(list[1..n], k)
     for i from 1 to k
         minIndex = i
         minValue = list[i]
         for j from i+1 to n
             if list[j] < minValue
                 minIndex = j
                 minValue = list[j]
         swap list[i] and list[minIndex]
     return list[k]
     */
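        public void Algo()
        {
            var n = List.Count;
            var k = Math.Min(K, n); // guard against K > n, which would index past the end
            for (int i = 0; i < k; i++)
            {
                // Find the smallest element in List[i..n-1] and move it to position i
                var minIndex = i;
                var minValue = List[i];
                for (int j = i + 1; j < n; j++)
                {
                    if (List[j].CompareTo(minValue) < 0)
                    {
                        minIndex = j;
                        minValue = List[j];
                    }
                }
                Swap(i, minIndex);
            }
        }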

        void Swap(int i, int j)
        {
            var tmp = List[i];
            List[i] = List[j];
            List[j] = tmp;
        }
    }
}

Written by kunuk Nykjaer

February 23, 2013 at 2:18 pm

Posted in Algorithm, Csharp

Facebook Hacker Cup 2013 Round 1 Solution part 1

Card Game

References:
FB hacker cup
Analysis

John is playing a game with his friends. The game’s rules are as follows: There is a deck of N cards from which each person is dealt a hand of K cards. Each card has an integer value representing its strength. A hand’s strength is determined by the value of the highest card in the hand. The person with the strongest hand wins the round. Bets are placed before each player reveals the strength of their hand.

John needs your help to decide when to bet. He decides he wants to bet when the strength of his hand is higher than the average hand strength. Hence John wants to calculate the average strength of ALL possible sets of hands. John is very good at division, but he needs your help in calculating the sum of the strengths of all possible hands.

Problem
You are given an array a with N ≤ 10 000 different integer numbers and a number, K, where 1 ≤ K ≤ N. For all possible subsets of a of size K find the sum of their maximal elements modulo 1 000 000 007.

Input
The first line contains the number of test cases T, where 1 ≤ T ≤ 25

Each case begins with a line containing integers N and K. The next line contains N space-separated numbers 0 ≤ a [i] ≤ 2 000 000 000, which describe the array a.

Output
For test case i, numbered from 1 to T, output “Case #i: “, followed by a single integer, the sum of maximal elements for all subsets of size K modulo 1 000 000 007.

Example
For a = [3, 6, 2, 8] and N = 4 and K = 3, the maximal numbers among all triples are 6, 8, 8, 8 and the sum is 30.

Example input

5
4 3
3 6 2 8
5 2
10 20 30 40 50
6 4
0 1 2 3 5 8
2 2
1069 1122
10 5
10386 10257 10432 10087 10381 10035 10167 10206 10347 10088

Example output

Case #1: 30
Case #2: 400
Case #3: 103
Case #4: 1122
Case #5: 2621483
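
For small inputs the example output can be cross-checked by brute force. The following is only a sketch, not part of the original solution: the class name BruteForceCheck and the method SumOfMaxima are hypothetical helpers, and the 20-element limit is an arbitrary safety bound. It enumerates every subset as a bitmask and sums the maximum of those with exactly k elements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for checking small cases; far too slow for N up to 10 000.
static class BruteForceCheck
{
    const long Modulus = 1000000007;

    // Sums the maximum of every k-element subset of a (mod Modulus)
    // by enumerating all 2^n bitmasks. Assumes k >= 1 and non-negative values.
    public static long SumOfMaxima(IList<long> a, int k)
    {
        if (a.Count > 20) throw new ArgumentException("brute force is limited to 20 elements");

        long sum = 0;
        for (int mask = 0; mask < (1 << a.Count); mask++)
        {
            int size = 0;
            long max = long.MinValue;
            for (int i = 0; i < a.Count; i++)
            {
                if ((mask & (1 << i)) == 0) continue; // element i is not in this subset
                size++;
                if (a[i] > max) max = a[i];
            }
            if (size == k) sum = (sum + max % Modulus) % Modulus;
        }
        return sum;
    }
}
```

Calling BruteForceCheck.SumOfMaxima(new long[] { 3, 6, 2, 8 }, 3) gives 30, which matches Case #1.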

Solution by Facebook
reference:

This was the simplest problem in the competition, with a success rate of 60%. 
For a given array a of n distinct integers, 
we need to print the sum of the maximum values over all possible subsets with k elements. 
The final number should be computed modulo MOD=1000000007, which is a prime number. 
First we should sort all numbers, such that a [1] < a [2] < ... < a [n].
 
Let's see how many times the number a [i] appears as the maximum of a subset, 
provided that i >= k. The other k - 1 elements can be chosen freely from the i - 1 numbers less than a [i], 
and the number of such choices is exactly bin [i - 1][k - 1], where bin [n][k] is a binomial coefficient 
(see http://en.wikipedia.org/wiki/Binomial_coefficient). 
Therefore, the final solution is the sum of a [i] * bin [i - 1][k - 1], 
where i goes from k to n, and we need to compute all binomial coefficients 
bin [k - 1][k - 1], ..., bin [n - 1][k - 1]. 
That can be done in many ways. 
The simplest way is to precompute all binomial coefficients using the simple recurrence
 
  bin [0][0] = 1;
  for (n = 1; n < MAXN; n++) {
    bin [n][0] = 1;
    bin [n][n] = 1;
    for (k = 1; k < n; k++) {
      bin [n][k] = bin [n - 1][k] + bin [n - 1][k - 1];
      if (bin [n][k] >= MOD) {
        bin [n][k] -= MOD;
      }
    }
  }
 
  qsort (a, n, sizeof(long), compare);
  sol = 0;
  for (int i = k - 1; i < n; i++) {
    sol += ((long long) (a [i] % MOD)) * bin [i][k - 1];
    sol = sol % MOD;
  } 
 
Note that we are not using the % operator in the calculation of the binomial coefficients, 
as subtraction is much faster. 
The overall time complexity is O (n log n) for sorting and O (n^2) 
for computing the binomial coefficients.
 
Another way is to use the recurrence 
bin [n + 1][k] = ((n + 1) / (n + 1 - k)) * bin [n][k] 
and use Big Integer arithmetic involving division. As this might be too slow, 
these values can be precomputed modulo MOD and stored in a temporary file, 
as the table is independent of the actual input and thus needs to be computed only once.
Since MOD is a prime number, one can calculate the inverse of the number (n + 1 - k) 
using the Extended Euclidean algorithm (see http://en.wikipedia.org/wiki/Modular_multiplicative_inverse) 
and multiply by the inverse instead of dividing. This yields an O(n log n) solution.
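
The inverse-based recurrence can be sketched in C# roughly as follows. This is an illustration under assumptions, not the contest reference code: the class ModularBinomials and its methods Inverse and Column are hypothetical names. Inverse uses the extended Euclidean algorithm, and Column fills one column of the binomial table modulo MOD.

```csharp
using System;

// Hypothetical helper: one column of Pascal's triangle modulo a prime.
static class ModularBinomials
{
    public const long Modulus = 1000000007; // prime

    // Extended Euclidean algorithm: returns x with (a * x) % m == 1,
    // assuming gcd(a, m) == 1.
    public static long Inverse(long a, long m)
    {
        long r0 = m, r1 = a % m;
        long t0 = 0, t1 = 1;
        while (r1 != 0)
        {
            long q = r0 / r1;
            long r2 = r0 - q * r1; r0 = r1; r1 = r2;
            long t2 = t0 - q * t1; t0 = t1; t1 = t2;
        }
        if (r0 != 1) throw new ArgumentException("a is not invertible modulo m");
        return t0 < 0 ? t0 + m : t0;
    }

    // Returns col[n] = C(n, k) mod Modulus for n = 0..maxN (0 where n < k).
    public static long[] Column(int k, int maxN)
    {
        var col = new long[maxN + 1];
        if (k > maxN) return col;

        col[k] = 1; // C(k, k)
        for (int n = k; n < maxN; n++)
        {
            // C(n + 1, k) = C(n, k) * (n + 1) / (n + 1 - k),
            // with the division replaced by a multiplication with the inverse
            long next = col[n] * (n + 1) % Modulus;
            col[n + 1] = next * Inverse(n + 1 - k, Modulus) % Modulus;
        }
        return col;
    }
}
```

For the card game only the column k - 1 is needed, so the whole O(n^2) table can be avoided.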
 
By the direct definition bin [n][k] = n! / ((n - k)! k!), one can iterate through all prime numbers p 
less than or equal to n, and calculate the power of p in bin [n][k] using the formula
 a (n, p) = [n / p] + [n / p^2] + [n / p^3] + ... for the maximum power of p dividing the factorial n!. 
The power of p in bin [n][k] is then a (n, p) - a (k, p) - a (n - k, p). 
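
A minimal C# sketch of this prime-factorisation approach, assuming hypothetical names (LegendreBinomial, ExponentInFactorial, Choose) that do not appear in the original analysis. The exponent of p in bin [n][k] is taken as the exponent in n! minus the exponents in k! and (n - k)!.

```csharp
using System;

// Hypothetical helper: binomial coefficient modulo m via prime exponents.
static class LegendreBinomial
{
    // Exponent of the prime p in n!: [n / p] + [n / p^2] + [n / p^3] + ...
    public static long ExponentInFactorial(long n, long p)
    {
        long e = 0;
        while (n > 0) { n /= p; e += n; }
        return e;
    }

    // C(n, k) mod m, multiplying together p^e for every prime p <= n.
    public static long Choose(int n, int k, long m)
    {
        if (k < 0 || k > n) return 0;

        var composite = new bool[n + 1]; // simple sieve of Eratosthenes
        long result = 1 % m;
        for (int p = 2; p <= n; p++)
        {
            if (composite[p]) continue;
            for (long q = (long)p * p; q <= n; q += p) composite[q] = true;

            long e = ExponentInFactorial(n, p)
                   - ExponentInFactorial(k, p)
                   - ExponentInFactorial(n - k, p);
            for (long j = 0; j < e; j++) result = result * p % m;
        }
        return result;
    }
}
```

No division is involved, so this variant would also work when the modulus is not prime.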
 
The most common mistakes came from competitors not testing the edge cases 
k = 1 and k = n, and forgetting to define bin [0][0] = 1. 
Another mistake was not storing the result in a 64-bit integer when multiplying two numbers.

Program.cs

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

/// <summary>
/// Author: Kunuk Nykjaer
/// </summary>
class Program
{
    static void Main(string[] args)
    {        
        var sw = new Stopwatch();
        sw.Start();

        var lines = ReadFile("input.txt");
        Run(lines.ToList());

        sw.Stop();
        Console.WriteLine("Elapsed: {0}", sw.Elapsed.ToString());
        Console.WriteLine("Press any key to exit.. ");
        Console.ReadKey();
    }

    static void Run(IList<string> lines)
    {
        var result = new List<string>();
        var nb = 1;
        for (var i = 1; i < lines.Count; i += 2)
        {
            if (string.IsNullOrWhiteSpace(lines[i])) continue;
            if (lines[i].StartsWith("#")) continue;

            var one = lines[i].Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);
            var two = lines[i + 1].Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);

            var n = int.Parse(one[0]);
            var k = int.Parse(one[1]);

            var numbers = two.Select(int.Parse).ToList();
            numbers = numbers.OrderByDescending(x => x).ToList();

            var r = Algo(k, numbers);
            result.Add(string.Format("Case #{0}: {1}", nb++, r));
        }
        
        WriteFile(result);
    } 

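    static class Binomial
    {
        public const long Modulus = 1000000007; // prime

        // b^e mod Modulus by repeated squaring, O(log e)
        public static long Pow(long b, long e)
        {
            long r = 1;
            b %= Modulus;
            while (e > 0)
            {
                if ((e & 1) == 1) r = r * b % Modulus;
                b = b * b % Modulus;
                e >>= 1;
            }
            return r;
        }

        // Modular inverse by Fermat's little theorem; valid because Modulus is prime
        // and 0 < x < Modulus
        public static long Inverse(long x) { return Pow(x, Modulus - 2); }

        // Given c = C(n, k) mod Modulus, returns C(n + 1, k) mod Modulus:
        // C(n + 1, k) = C(n, k) * (n + 1) / (n + 1 - k)
        public static long Next(long c, long n, long k)
        {
            return c * ((n + 1) % Modulus) % Modulus * Inverse(n + 1 - k) % Modulus;
        }
    }

    // The plain long binomial overflowed for large n and k, and the old loop
    // skipped the smallest number, which gave a wrong sum for k = 1.
    static long Algo(int k, IList<int> numbers)
    {
        // numbers are sorted descending: the element with m smaller elements sits at
        // index Count - 1 - m and is the maximum of C(m, k - 1) subsets
        long sum = 0;
        long c = 1; // C(k - 1, k - 1)
        for (var m = k - 1; m < numbers.Count; m++)
        {
            long a = numbers[numbers.Count - 1 - m] % Binomial.Modulus;
            sum = (sum + a * c) % Binomial.Modulus;
            c = Binomial.Next(c, m, k - 1);
        }
        return sum;
    }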
   
    #region ** File

    static IEnumerable<string> ReadFile(string path)
    {
        var list = new List<string>();
        try
        {
            using (var reader = new StreamReader(path, true))
            {
                var line = reader.ReadLine();
                while (line != null)
                {
                    list.Add(line);
                    line = reader.ReadLine();
                }
            }
        }
        catch { throw; }

        return list.ToArray();
    }
    static bool WriteFile(IEnumerable<string> lines)
    {
        var fileInfo = new FileInfo("output.txt");

        try
        {
            using (StreamWriter sw = fileInfo.CreateText())
            {
                foreach (var line in lines) sw.WriteLine(line);
            }
            return true;
        }
        catch { throw; }
    }

    #endregion File
}

Written by kunuk Nykjaer

February 3, 2013 at 11:24 pm

Posted in Algorithm, Csharp
