Wednesday, February 19, 2014

C#: Speeding up Dictionary

Level: 2 where 1 is noob and 5 is totally awesome
System: C# on .NET

I like micro-optimizing. I find it great fun and relaxing, and I learn a lot while doing it. When working on HaveBox, I sometimes discover odd things that can improve its performance. Most of the work on HaveBox is micro-optimization, which is often expensive and not worth the effort in larger systems. But some days ago, while working on HaveBox, I stumbled onto something regarding the Dictionary class. It is still micro-optimization, but it is so simple and cheap that it might be interesting to others besides me.

My Discovery was


When you look up a value by key in a Dictionary, the Dictionary class uses a comparer to compare the given key with the keys in the dictionary. When a Dictionary is instantiated with the default constructor, it uses the default EqualityComparer (EqualityComparer<TKey>.Default) as its comparer. That comparer is designed to handle general situations and is used in other contexts than Dictionary, so it has null checks. These null checks can slow down the Dictionary.
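To see the comparer at work, here is a minimal sketch with a small instrumented comparer. CountingComparer and ComparerDemo are hypothetical names made up for this illustration; they are not part of HaveBox or of the benchmark in this post. The sketch only prints how often the Dictionary calls into the comparer during a single lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only:
// counts how often the Dictionary calls into the comparer.
public class CountingComparer : IEqualityComparer<string>
{
    public int EqualsCalls { get; private set; }
    public int HashCalls { get; private set; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y);
    }

    public int GetHashCode(string obj)
    {
        HashCalls++;
        return obj.GetHashCode();
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var comparer = new CountingComparer();
        var strings = new Dictionary<string, string>(comparer);
        strings.Add("Ipsum 1", "Not important");

        // Remember the counters so only the lookup itself is measured.
        int hashBefore = comparer.HashCalls;
        int equalsBefore = comparer.EqualsCalls;

        string value;
        bool found = strings.TryGetValue("Ipsum 1", out value);

        Console.WriteLine("Found: {0}", found);
        Console.WriteLine("GetHashCode calls during lookup: {0}", comparer.HashCalls - hashBefore);
        Console.WriteLine("Equals calls during lookup: {0}", comparer.EqualsCalls - equalsBefore);
    }
}
```

Every lookup goes through the comparer, so whatever the comparer does per call is paid on every TryGetValue.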

The Trick


As you might have guessed, the trick is to write your own comparer, which is amazingly simple: implement the interface IEqualityComparer<T> with its two methods, Equals and GetHashCode, and then inject the comparer into the Dictionary constructor. Like:

 using System;
 using System.Collections.Generic;
 using System.Diagnostics;

 namespace DictionaryPerformance
 {
   public class Program
   {
     // Minimal comparer for string keys.
     public class DIYCompare : IEqualityComparer<string>
     {
       public bool Equals(string x, string y)
       {
         return x == y;
       }

       public int GetHashCode(string obj)
       {
         return obj.GetHashCode();
       }
     }

     static void Main(string[] args)
     {
       // Same content in both dictionaries; only the comparer differs.
       IDictionary<string, string> strings = new Dictionary<string, string>();
       IDictionary<string, string> stringsWithDIYCompare = new Dictionary<string, string>(new DIYCompare());

       for (var i = 0; i < 1000; i++)
       {
         strings.Add("Ipsum " + i, "Not important");
         stringsWithDIYCompare.Add("Ipsum " + i, "Not important");
       }

       var stopwatch = new Stopwatch();

       Console.WriteLine("@Intel I5-3337");

       string tempStr = "";

       // Time 1.000.000 lookups with the default comparer.
       stopwatch.Restart();
       for (var i = 0; i < 1000000; i++)
       {
         strings.TryGetValue("Ipsum 999", out tempStr);
       }
       stopwatch.Stop();

       Console.WriteLine("Fetched 1 of 1000 elements 1.000.000 times with default comparer and TryGet : {0} ms", stopwatch.ElapsedMilliseconds);

       // Time the same lookups with the DIY comparer.
       stopwatch.Restart();
       for (var i = 0; i < 1000000; i++)
       {
         stringsWithDIYCompare.TryGetValue("Ipsum 999", out tempStr);
       }
       stopwatch.Stop();

       Console.WriteLine("Fetched 1 of 1000 elements 1.000.000 times with DIY comparer and TryGet : {0} ms", stopwatch.ElapsedMilliseconds);
     }
   }
 }

The Numbers


 @Intel I5-3337  
 Fetched 1 of 1000 elements 1.000.000 times with default comparer and TryGet : 56 ms  
 Fetched 1 of 1000 elements 1.000.000 times with DIY comparer and TryGet : 49 ms  
 Press any key to continue . . .  

That's all.

Friday, February 14, 2014

Introducing HaveBoxStates. The .NET framework for better flows in your code

Level: 3 where 1 is noob and 5 is totally awesome
System: C# on .NET

Does this look familiar?
You sketch a program, e.g. a game, like this:


And the implementation ended up looking like this:


Often it is quite easy to draw the flow of a program, or a part of it, but expressing it in code turns out to be hard. This is where HaveBoxStates comes in. The purpose of HaveBoxStates is to make it easy to express flows in code, just as we draw them.

HaveBoxStates 

HaveBoxStates is a state machine/flow engine for expressing flows in code. It is very flexible and lightweight, and very easy to use. People who have worked with state machines before might find HaveBoxStates a bit unusual, because the state machine/flow is not defined or configured before use. Even though its strictness is not predefined in a model, it is still just as strict. Trying to enter an invalid state would blow up the machine, just as if it were predefined in a model.

Best practices

The most important thing I can come up with is that you should always inject your logic into the states through constructor dependency injection. HaveBoxStates comes with HaveBox out of the box, but through a ContainerAdaptor you can use any container.

How to use it

The following examples are taken from the concept app. So if something needs to be elaborated, you can get the full picture there. Questions in the comments are also okay :-). Notice the dependency injection in the PlayGame state.

Here is how to use it:

You draw a flow. It could be this flow:


For each state, you create a class like this (in separate files, of course):


 using GameStateMachineConcept.Dtos;  
 using GameStateMachineConcept.GameEngine;  
 using HaveBoxStates;  
   
 public class Init : IState  
 {  
      public void ExecuteState(IStateMachine statemachine)  
      {  
           statemachine.SetNextStateTo<PlayGame>(new PlayerContext{ Points = 0, Lives = 3, });  
      }  
 }  
   
   
 public class PlayGame : IState  
 {  
      private IGameEngine _gameEngine;  
   
      public PlayGame(IGameEngine gameEngine)  
      {  
           _gameEngine = gameEngine;  
      }  
   
      public void ExecuteState(IStateMachine statemachine, PlayerContext playerContext)  
      {  
           var gameContext = new GameContext { PlayerContext = playerContext, GainedPoints = 0, };  
   
           _gameEngine.Play(gameContext);  
   
           statemachine.SetNextStateTo<Scores>(gameContext);  
      }  
 }  
   
   
 public class Scores : IState  
 {  
      public void ExecuteState(IStateMachine statemachine, GameContext GameContext)  
      {  
           GameContext.PlayerContext.Points += GameContext.GainedPoints;  
   
           if (GameContext.PlayerContext.Lives == 0)  
           {  
                statemachine.SetNextStateTo<Stop>();  
           }  
           else  
           {  
                statemachine.SetNextStateTo<PlayGame>(GameContext.PlayerContext);  
           }  
      }  
 }  
   

Each state must implement an ExecuteState method with one of the following signatures:

public void ExecuteState(IStateMachine statemachine)

or

public void ExecuteState(IStateMachine statemachine, <any type> <any name>)

<any type> can be any reference type and <any name> can be any name. The second parameter is for DTOs, for transferring data between the states. HaveBoxStates automatically casts the data to the right type before calling the ExecuteState method. You just have to declare the type of the DTO, forget about casting and just use it.

Because the DTO can be of any type and we are not using generics, IState does not enforce the implementation of ExecuteState, so you have to remember it or else you will get a runtime exception.
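To make the "found at run time, not enforced at compile time" point concrete, here is a minimal sketch of how such a dispatch could be done with reflection. It is not the HaveBoxStates implementation. ISketchState, PlayerInfo, ShowLives and DispatchSketch are hypothetical names invented for this illustration, and the state machine parameter is left out to keep it short.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-ins for illustration only; these are NOT the HaveBoxStates types.
public interface ISketchState { }

public class PlayerInfo
{
    public int Lives { get; set; }
}

public class ShowLives : ISketchState
{
    // The DTO arrives already typed; no cast is needed in the state.
    public void ExecuteState(PlayerInfo player)
    {
        Console.WriteLine("Lives: {0}", player.Lives);
    }
}

public static class DispatchSketch
{
    // Looks up ExecuteState at run time and passes the DTO along.
    public static void Run(ISketchState state, object dto)
    {
        MethodInfo method = state.GetType().GetMethod("ExecuteState");
        if (method == null)
        {
            // Nothing at compile time forces the method to exist, hence a runtime error.
            throw new MissingMethodException(state.GetType().Name, "ExecuteState");
        }

        object[] arguments = method.GetParameters().Length == 0
            ? new object[0]
            : new object[] { dto };

        // Reflection checks the argument against the declared parameter type,
        // so the method body receives a typed DTO.
        method.Invoke(state, arguments);
    }

    public static void Main()
    {
        Run(new ShowLives(), new PlayerInfo { Lives = 3 });
    }
}
```

The empty marker interface is what makes the free choice of DTO type possible, and it is also why a forgotten ExecuteState only shows up when the state is entered.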

The Stop state

HaveBoxStates has one built-in state: the Stop state. If the state machine needs to stop, go to the Stop state.

Wrapping it up


Setting up and starting a state machine is also quite easy. Register all states with their dependencies in a container, and then instantiate a state machine with the container. Note: do not register the states against IState; register each state as its own type.

 using GameStateMachineConcept.GameEngine;  
 using GameStateMachineConcept.States;  
 using HaveBox;  
 using HaveBoxStates;  
 using HaveBoxStates.ContainerAdaptors;  
 using System;  
 using System.Collections.Generic;  
   
 namespace GameStateMachineConcept  
 {  
   public class Program  
   {  
     static void Main(string[] args)  
     {  
       var container = new Container();  
       container.Configure(config =>  
       {  
         config.For<IGameEngine>().Use<GameEngine.GameEngine>();  
         config.For<Init>().Use<Init>();  
         config.For<PlayGame>().Use<PlayGame>();  
         config.For<Scores>().Use<Scores>();  
       });  
   
       var stateMachine = new StateMachine(new HaveBoxAdaptor(container));  
       stateMachine.StartStateMachinesAtState<Init>();  
     }  
   }  
 }  
   

That is all :-)

Monday, February 10, 2014

C#: Count() or Length?


Level: 4 where 1 is noob and 5 is totally awesome
System: C# on .NET

Intro


When working with data structures such as arrays, strings or the like in C#, we often have two choices if we want to know the number of elements in the structure. For most people I guess the choice is easy: they are used to Count() from LINQ or from other structures, so Count() it often is. Length might feel like just something that pops up when you scroll through IntelliSense. You see it, you know what it does, but you prefer Count(), because it reads better in your code.

Are there any arguments for using Length? Yes.

Performance-wise, Length is superior to Count(). Behind Count() is a method call with some evaluation logic to count the elements of the structure. Behind Length on an array is a single IL instruction (ldlen), meaning there is direct CLR support for Length.
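As a rough idea of what "evaluation logic" means here, the following is a simplified sketch of the kind of work a Count() extension method has to do. It is an illustration only, not the actual framework source, and CountSketch is a hypothetical name.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CountSketch
{
    // Simplified illustration of the kind of work Enumerable.Count() does;
    // NOT the actual framework source.
    public static int Count<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");

        // Fast path: collections that already know their size.
        var genericCollection = source as ICollection<T>;
        if (genericCollection != null) return genericCollection.Count;

        var collection = source as ICollection;
        if (collection != null) return collection.Count;

        // Slow path: walk the whole sequence.
        int count = 0;
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext()) count++;
        }
        return count;
    }

    public static void Main()
    {
        var strings = new string[1000];
        Console.WriteLine(Count(strings));   // goes through the ICollection<T> check
        Console.WriteLine(strings.Length);   // read directly from the array
    }
}
```

Even on the fast path there is a method call, a null check, a type check and an interface call before the number comes back, whereas Length on an array needs none of that.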

An Example


Here is a reduced IL example that shows what Length and Count() compile to:

 .method private hidebysig static void Main(string[] args) cil managed  
 {  
      ...  
      ...  
      ...  
   
      // We initialize 3 variables  
  [0] string[] strings,     // The string array  
  [2] int32 length,         // The length variable  
  [3] int32 count,          // The count variable  
             
      ...  
      ...  
      ...  
   
      // Getting array element count by Length  
  IL_0020: ldloc.0          // puts the string array on the stack  
  IL_0021: ldlen            // uses the IL instruction ldlen to get the length of the array. An unsigned int is returned  
  IL_0022: conv.i4          // The Length variable is a signed int so we convert the output from ldlen  
  IL_0023: stloc.2          // Stores the value, in the variable Length  
   
      ...  
      ...  
      ...  
   
      // Getting array element count by Count()  
  IL_005f: ldloc.0          // puts the string array on the stack  
                            // Next, Count() is called and returns a signed int.  
  IL_0060: call    int32 [System.Core]System.Linq.Enumerable::Count<string>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>)  
  IL_0065: stloc.3          // Stores the value, in the variable count  
      ...  
      ...  
      ...  
    
 } // end of method Program::Main  
   


Performance

To show the difference in performance, I will use the following example:


 using System;  
 using System.Diagnostics;  
 using System.Linq;  
   
 namespace ArrayCount  
 {  
   public class Program  
   {  
     static void Main(string[] args)  
     {  
       var strings = new string[1000];  
       var stopwatch = new Stopwatch();  
       var length = 0;  
       var count = 0;  
   
       Console.WriteLine("@Intel I5-3337");  
   
       stopwatch.Start();  
       for (var i = 0; i < 1000000; i++)  
       {  
         length = strings.Length;  
       }  
       stopwatch.Stop();  
   
       Console.WriteLine("Counted {0} elements by using Length : {1} ms", length, stopwatch.ElapsedMilliseconds);  
   
       stopwatch.Restart();  
       for (var i = 0; i < 1000000; i++)  
       {  
         count = strings.Count();  
       }  
       stopwatch.Stop();  
   
       Console.WriteLine("Counted {0} elements by using Count() : {1} ms", count, stopwatch.ElapsedMilliseconds);  
     }  
   }  
 }  
   

When compiling for release and running it without the debugger (Ctrl+F5), the output is:

 @Intel I5-3337  
 Counted 1000 elements by using Length : 0 ms <--- Less than a millisecond, not zero time :-) 
 Counted 1000 elements by using Count() : 66 ms  
 Press any key to continue . . .  

Conclusion

Now you know the difference between Length and Count(). When performance matters, Length should be preferred, but in general it depends on the context.


Thursday, February 6, 2014

Specification by Small Talk

Level: 4 where 1 is noob and 5 is totally awesome
Disclaimer: If you use anything from any of my blog entries, it is your own responsibility.

Intro


This is a term/method I have coined, based on some good experiences I have had. Just to rule out any misunderstandings, this has nothing to do with the programming language Smalltalk. It is hardly new, and I guess most of you do it to some degree. It is not supposed to replace any specification method. It is a suggestion for a preamble to whatever specification method is used. It is quite simple, because it is just informal communication.

Motivation


We have come far with different specification techniques, but how the techniques are executed sometimes troubles me. Most companies have a pipeline of tasks/projects. The tasks or projects are prioritised by importance, decided at some meeting. When a task or project comes to an end, it usually releases some human resources, which have to be allocated to a new task or project. Usually a meeting or a workshop is arranged for specifying the task or project.

This is the point that troubles me, and here is why. Without any preamble, we are ironically using a waterfall-like model for specifying. It is like: meet up with your thoughts, specify within this period, see you again at the next specification sprint. I know this is very rough, but it is to emphasise my point. The thing is, at the first meetings, people's mindsets and understanding are not aligned. Often the best responses to input come when the input has circled in the mind for a couple of days, but by then the specification meeting or workshop is done. Another issue is: do we have the right people at the meeting? And most importantly, is this task or project really the most important one, or should another project have the priority? We simply don't have the best start.

So what am I suggesting?


As mentioned before, it is quite simple: it is just talking. The concept is about finding value, sharing thoughts and maybe discussing something technical in a relaxed way, and in the end having an aligned mindset for a specification workshop. The process can run from the moment an idea is born until there are resources to bring it to life, regardless of the work pipeline. The point is, it has to be informal. The idea can be discussed at small informal meetings, maybe by mail or at lunch. Nothing must be settled or promised. The most important part is that the people involved should have time to think about it.

The process starts when someone wants something. As an example, let's say a customer gets an idea for a new feature in the system. He contacts his contact person at the system supplier company and tells them about the idea. The contact person talks with a co-worker about it. Maybe they invite the customer to come by so they can hear more about it. The co-worker mentions the idea at lunch, and then finds out that another co-worker knows a lot about the domain of the idea. Thereby the best person for the job is chosen, instead of the person available at the time. The point is to let the idea spread like rings in water throughout the company/department.

Another gain: maybe the idea has more value than first anticipated. Maybe it has so much value that the pipeline has to be reorganised. Projects without value could even be avoided by this early evaluation. Value is always best discovered sooner rather than later.

Maybe the developers involved can test or try things out, to improve the idea or find the best solution for it. When running a time-boxed project, developers do not have much time to do so.

It all sounds good, are there any pitfalls?


Oh yes. The biggest pitfall is settling on a solution, so that the specification workshop is all about that solution. Some specification techniques, like specification by example, are about focusing on problems instead of solutions. In my book, that is the way to specify.

Summary

Specification by small talk prepares a project for the specification workshop: less introduction and more focus. It is also for early evaluation and for preparing for development. It is an interesting concept that I'll explore some more.