Friday, 24 September 2010

WCF per-call services and sessions

I ought not to admit this, but it turns out there was a rather large gap in my understanding of WCF that I’ve only just noticed. I was under the assumption that client proxies didn’t use sessions until explicitly told to do so, which would allow them to remain alive without impacting service scalability. Suffice it to say I was very wrong. This may sound like an esoteric/pointless discovery, but it’s actually really important for application reliability.

I came across this problem because one of the features in the application I’m currently working on uses a variant of Bea Stollnitz’s virtualisation solution to page search results. I realised that if I did a search, went to lunch, and then came back and scrolled through the search results, the proxy would fault. I was already sort-of aware that proxies timed out, but I wasn’t sure why. For per-call services it seemed counter-intuitive, given that they’re not maintaining an open channel to the service. It turns out (as stated very clearly in chapter 4 of my favourite WCF book) that even per-call services use sessions unless told not to. This comes with a default inactivity timeout of 10 minutes, after which the proxy will throw a CommunicationObjectFaultedException if any attempt is made to use it. The inactivity timeout can be changed on the service side, for example:

<bindings>
  <wsHttpBinding>
    <binding>
      <reliableSession enabled="true" inactivityTimeout="00:00:05" />
    </binding>
  </wsHttpBinding>
</bindings>

but in the meantime the service will keep a session identifier (available in the proxy’s InnerChannel.SessionId property) for every client until either the timeout is reached or the client proxy calls its Close method.
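
You can see the session from the client side too – any ClientBase<T>-derived proxy exposes it through InnerChannel. A quick sketch (InvoiceManagerClient and GetInvoices() are hypothetical generated proxy names):

InvoiceManagerClient proxy = new InvoiceManagerClient();
proxy.GetInvoices(); // the first call establishes the session

// Prints something like "urn:uuid:..." while the session is alive
Console.WriteLine(proxy.InnerChannel.SessionId);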

So, thought I, there are three things I need to do with all future per-call services:

  1. Mark the service as per-call so it doesn’t create a service-side object for each proxy
  2. Set the session mode to ‘not allowed’
  3. Disable reliable sessions

The per-call attribute bit is easy – just pop the InstanceContextMode property on the ServiceBehavior attribute (note that ConcurrencyMode is set to Single – this is to enforce isolation in service calls):

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
    ConcurrencyMode = ConcurrencyMode.Single)]
public class InvoiceManager : IInvoiceManager
{
    // Service implementation here
}

Likewise the session mode – this should be set on the contract:

[ServiceContract(SessionMode = SessionMode.NotAllowed)]
interface IInvoiceManager
{
    // Contract methods here
}

and finally the reliable session bit:

<system.serviceModel>
  <bindings>
    <wsHttpBinding>
      <binding>
        <reliableSession enabled="false" />
      </binding>
    </wsHttpBinding>
  </bindings>
</system.serviceModel>

An alternative is to leave sessions enabled (the default). There are several things that can go wrong, so the following code block can be taken as boilerplate for exception handling:

try
{
    // Call proxy method here
}
catch (EndpointNotFoundException)
{
    // Service may be down or the network is playing up
}
catch (MessageSecurityException)
{
    // Covers a multitude of sins (typically when using custom security)
    // but generally revolves around faulted channels and reliability issues.
    // Thrown by session timeout errors in a custom security environment
    // (particularly where certificates are concerned), but also when
    // load balancing without sticky sessions
}
catch (CommunicationObjectFaultedException)
{
    // If the channel wasn't faulted before the call then this is most likely
    // the timeout error; otherwise this shouldn't happen (especially with
    // defensive coding checking the proxy's state)
}
catch (TimeoutException)
{
    // The service has timed out. This shouldn't really happen as services
    // should be responsive - reconsider the service design
}
catch (SecurityAccessDeniedException)
{
    // Let the user know they're being naughty
}
catch (CommunicationException)
{
    // Anything else goes here
}

Obviously it doesn’t include contracted faults, as they should be specifically coded for, and it does assume that channels that are known to have faulted aren’t being used. In most cases it’s probably sensible to wrap all this up in a generic handler which notifies the user of the problem, but in some cases you may want to attempt to restore the proxy and try again.  Michele Bustamante has such a solution here.
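
A bare-bones version of that recover-and-retry approach might look like the following (a sketch only – the real thing should limit retries and log the failure; InvoiceManagerClient and GetInvoices() are hypothetical):

InvoiceManagerClient proxy = new InvoiceManagerClient();

try
{
    proxy.GetInvoices();
}
catch (CommunicationObjectFaultedException)
{
    // A faulted proxy can't be reused - abort it and create a fresh one
    proxy.Abort();
    proxy = new InvoiceManagerClient();
    proxy.GetInvoices(); // retry once on the new channel
}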

However, this is not without its problems.  The one that springs to mind (because it’s bitten me recently) is round-robin load balancing.  Even when I turn off sessions and reliability on wsHttpBinding, WCF still expects server affinity (despite the fact that it shouldn’t).  To get round this I’ve been forced to use the only binding that is truly stateless: basicHttpBinding.  In effect I’m going back to ASMX web services but with WCF plumbing.  I’m not overly concerned about the security implications at the moment because I’m using Windows Authentication with ASP.NET roles (as described in this article by ScottGu), which in WCF gives me transport security:

      <basicHttpBinding>
        <binding>
          <security mode="TransportCredentialOnly">
            <transport clientCredentialType="Windows" />
          </security>
        </binding>
      </basicHttpBinding>

It doesn’t give me message security or reliability, but as it’s point-to-point I won’t lose too much sleep over this.  Nor will I trouble myself over the lack of support for flowing transactions across service boundaries – as it’s not something I need in this scenario.  What’s more, with IIS 7 compression turned on it goes like sh*t off a shovel.

Friday, 30 July 2010

Pre-build events, shared data types, and WCF

Generating WCF proxies has always been a bit of a pain, especially when one has services that share data types. The Service Reference wizard in Visual Studio is wont to mangle configurations (among other things) and so is best avoided. The recommended practice is to use SvcUtil – ideally as part of a pre-build event. There are a few gotchas associated with this which have taken me a while to resolve, so I thought I’d present a step-by-step guide to managing proxies.

Before starting you will need the value of PATH from the Visual Studio command prompt (the easiest way to get it is PATH > path.txt followed by a copy-and-paste). This can’t be guaranteed to be the same for each machine, but if you’re part of a team then it helps if everyone has the same build.

To set up the build event:

  1. Right-click the project in which you want the proxy file to be built and select the property pages (this won’t work for websites but it will for web applications)
  2. On the left-hand side select ‘Build events’
  3. In the pre-build event command line paste in the contents of the Visual Studio command prompt PATH that you got earlier (alternatively you can set it in the system environment variables if you have permissions)
  4. On the next line under this enter the SvcUtil bit:
    Svcutil http://localhost/MyFirstService.svc http://localhost/MySecondService.svc /language:c# /t:code /namespace:*,"MyNamespace" /noconfig /out:"$(ProjectDir)ServiceProxy.cs" /reference:"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\System.dll" /collectionType:System.Collections.ObjectModel.ObservableCollection`1 /async /tcv:version35

    Replacing the addresses of the services with a space-delimited list of service addresses. Note the reference to System.dll which is necessary to return list types as an ObservableCollection.

  5. Add a blank file called “ServiceProxy.cs” to the project (this will be overwritten by the build event but has to be in place for the project to include it).

Now if you build the project the proxy class(es) will be generated. The only problem is that if they share types then you’ll get a stack of errors along the lines of:

EXEC : warning : There was a validation error on a schema generated during export: Source: Line: 1 Column: 1619 EXEC : Validation warning : The simpleType 'http://schemas.microsoft.com/2003/10/Serialization/:duration' has already been declared.

which will preclude a valid build. To get round this I found the following article, the gist of which is to add a section manually to the project file. To do this find the .csproj file in Windows Explorer (right-click the project in VS 2010 and select ‘Open folder in Windows Explorer’) and then open it with Notepad. Then paste this bit in immediately before the last </Project> tag:

<Target Name="PreBuildEvent" Condition="'$(PreBuildEvent)'!=''" DependsOnTargets="$(PreBuildEventDependsOn)">
  <Exec WorkingDirectory="$(OutDir)" Command="$(PreBuildEvent)" ContinueOnError="true" />
</Target>

this will turn all those errors into warnings (which won’t fail the build). Voila – a single-file proxy with shared types on each build!

One point I forgot to mention with this was around namespaces. If all the services use the same WSDL namespace, then SvcUtil will mess up the return types and asynchronous delegate names. To deal with this I recommend that all services have a unique WSDL namespace (see this post for more information on how to set all namespaces) and all proxy classes have a unique CLR namespace. To do this, for each service add a:

/namespace:http://mynamespace/myservice/,MyServiceNamespace

to the end of the SvcUtil call (where the http bit is the WSDL namespace and the MyServiceNamespace bit is the CLR namespace you want the proxy class to have). This should cause the proxy code for each service to have its own namespace while putting all the shared types in the default namespace indicated by /namespace:*,MyTypeNamespace. For example:

Svcutil http://localhost/MyFirstService.svc http://localhost/MySecondService.svc /language:c# /t:code /namespace:*,"MyNamespace.SharedTypes" /namespace:http://myurl/mysecondservice/2010/07,"MyNamespace.MySecondService" /namespace:http://myurl/myfirstservice/2010/07,"MyNamespace.MyFirstService" /noconfig /out:"$(ProjectDir)ServiceProxy.cs" /reference:"C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\System.dll" /collectionType:System.Collections.ObjectModel.ObservableCollection`1 /async /tcv:version35

will output all shared types to MyNamespace.SharedTypes but put the proxy for each service in its own namespace.

UPDATE: a better way of getting the path, one that is both less verbose and not machine-specific, is to use the Visual Studio environment variables.  So:

PATH=$(FrameworkSDKDir)Bin;$(MSBuildBinPath)

will set the path to include not only SvcUtil, but also the reference assemblies.

Tuesday, 20 April 2010

Navigating SSIS packages with Linq to Xml

I’m currently writing a validator tool which compares data files against the metadata from a flat file connection manager in an SSIS package and outputs a data quality report. It’s intended to provide a quick way of verifying a customer’s data before doing an on-site demonstration. Hopefully this will solve the problem of wasted journeys that I’ve often had due to data being in an incorrect format. The problem stems from the fact that it takes several hours for the customer to download the data from their provider, and confidentiality precludes it being sent to us over the wire – in addition to which the width and length of the data make it difficult to fix quality issues on the fly.

The only limitation is that the tool cannot make use of the Dts namespaces (because their availability in the deployment environment cannot be guaranteed), which means that some part of the vehicle is going to need re-inventing. Also, some of the data files we’re dealing with have in excess of 1000 columns, so we’d like to be able to simply reflect the column metadata from the existing dtsx packages (rather than replicate it elsewhere). This has led me to the obvious conclusion of issuing XPath statements against the packages so that I can build a list of columns with data type information and such. However this got rather unwieldy, so I thought it would be a good time to use LINQ to XML. I’ve avoided this thus far, as it’s very rare that I work with raw XML these days – I’m one of those lazy fascists who believes that XML should never be seen outside of standards documents or the message layer (although as with all such ill-founded pronouncements this falls down somewhat with derivatives such as XHTML and XAML). To be honest, it’s swings and roundabouts between LINQ to XML and the ‘old way’. It’s still full of string literals and anonymous types, but the iteration syntax is slightly more elegant.

I won’t reproduce the entire project here – just enough to make it through the basics of reading the package. I may cover the data type matching in another post. The following constant is used throughout the code:

public const string DTS_NAMESPACE = @"{www.microsoft.com/SqlServer/Dts}";
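
The snippets below assume the package has already been loaded into an XDocument called packageXml, along these lines (the file path is illustrative):

using System.Xml.Linq;

XDocument packageXml = XDocument.Load(@"C:\Packages\CustomerImport.dtsx");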

First up is getting a list of the connection managers in the package (which is a PITA because the name is an attributed child node of the <ConnectionManager> element):

var connectionManagers = from cm in packageXml.Root.Elements(
                DTS_NAMESPACE + "ConnectionManager").Elements(DTS_NAMESPACE + "Property")
                        where cm.Attribute(DTS_NAMESPACE + "Name").Value == "ObjectName"
                        select new
                        {
                            connectionManagerName = cm.Value,
                            connectionManagerXml = cm.Parent
                        };

This gives us the name of the connection manager and the child XML nodes in separate variables. We can then iterate through this (for this example we’re grabbing flat file connections only):

foreach (var connectionManager in connectionManagers)
{
   XDocument connectionXml = XDocument.Parse(
       connectionManager.connectionManagerXml.ToString());

   List<string> connectionTypes = (from ct in connectionXml.Root.Elements(DTS_NAMESPACE + "Property")
                                   where ct.Attribute(DTS_NAMESPACE + "Name").Value == "CreationName"
                                   select ct.Value.ToString()).ToList<string>();

   if (connectionTypes.Count == 1)
   {
       string connectionType = connectionTypes[0];

       if (connectionType.ToUpper() == "FLATFILE")
       {
           // all the Xml for the connection manager is in the connectionXml variable
       }
   }
}

We can get at the connection-level properties using the following method:

private string GetConnectionManagerProperty(
   XDocument connectionManagerXml, string propertyName)
{
   List<string> properties = (from pr in connectionManagerXml.Root.Elements(DtsxPackage.DTS_NAMESPACE + "Property")
                              where pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == propertyName
                              select pr.Value.ToString()).ToList<string>();
   Debug.Assert(properties.Count <= 1);

   if (properties.Count == 1)
   {
       return properties[0];
   }

   return string.Empty;
}

like so:

string unicodeFlag = GetConnectionManagerProperty(connectionManagerXml, "Unicode");

Similarly we can iterate through the columns like so:

var columns = from cl in connectionManagerXml.Root.Descendants(DtsxPackage.DTS_NAMESPACE + "FlatFileColumn")
             select cl;

foreach (var columnProperties in columns)
{
   var properties = (from pr in columnProperties.Elements(DtsxPackage.DTS_NAMESPACE + "Property")
                     where pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == "ObjectName" ||
                     pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == "DataType" ||
                     pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == "DataPrecision" ||
                     pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == "DataScale" ||
                     pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == "ColumnDelimiter" ||
                     pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value == "MaximumWidth"
                     select new PropertyPair()
                     {
                         Name = pr.Attribute(DtsxPackage.DTS_NAMESPACE + "Name").Value,
                         Value = pr.Value
                     });
}

It would be nice to be able to deconstruct a dtsx package in a single XPath/XML statement, but the rather strange schema for dtsx appears to preclude that. Hopefully some XML magician will show me how much more concisely and simply this could be achieved.

Friday, 2 April 2010

DataGridViewComboBoxColumn and access modifiers

Despite my best efforts I’m still working on a Windows Forms project from a few months ago.  Apart from the fact that it’s helping me to appreciate WPF in a big way (particularly the way that modifying complex layouts is so much easier in XAML) it’s also raising issues that I should really be aware of by now.

One of these is that the members a DataGridViewComboBoxColumn binds to for its list contents must have public access modifiers (assuming you’re binding to an IList).  This is so trivial I’m only mentioning it because it caused me an hour of grief last night trying to work out why my DataGridView wasn’t playing ball.  I wrongly assumed that I could use the internal access modifier and all would be well as long as the class was in the same project as the DataGridView.  Not so – but as long as the properties bound to DisplayMember and ValueMember are public, then all is well (even if the class itself is internal).

For example, the following class:

internal class MyListSourceClass
{
    public string Description { get; set; }
    public string Value { get; set; }
}

Will bind quite happily in the following code:

List<MyListSourceClass> listSource = GetListDataFromSomewhere();

// where MyComboBoxColumn is the DataGridViewComboBoxColumn to be populated
MyComboBoxColumn.DataSource = listSource;
MyComboBoxColumn.DisplayMember = "Description";
MyComboBoxColumn.ValueMember = "Value";

but if we change the access modifiers on the Description and Value properties of MyListSourceClass then all sorts of nonsense happens.  I’m so looking forward to getting back to WPF.

Thursday, 11 March 2010

Dynamically creating a ServiceHost for all ServiceContracts in an assembly

I’m currently shoehorning about 100 WCF services into an existing application that uses .NET Remoting. In order to be consistent with the Remoting code they’ll be exposed using wsHttpBinding. In a simple world this would mean that I could keep them completely separate from the existing code base by deploying them to IIS (the target environment is Windows 2003) as .svc files. Unfortunately IIS isn’t on the target environment required list, and as this isn’t our product we can’t mandate that it should be. The upshot is that we have to self-host the services, which is going to require 100 instances of ServiceHost (one per service contract).

As I don’t fancy hard-coding all the service contracts into the hosting code, I need a quick way of setting up the hosting. On the plus side I’m leaving as much of the existing code as possible well alone, which means that I’m putting all the service contracts in a single separate assembly/class library. So what I’ve come up with is some helper code that can be pointed at an assembly to create a service host for each service contract in that assembly. The principle is pretty simple:

  • Reflect over all the classes in the assembly that implement interfaces marked with ServiceContractAttribute
  • Create a ServiceHost for each of these classes and add it to a list

I should highlight that although the methodology does facilitate a generic approach to life-cycle management, this is simply an elegant hack. No application should sensibly self-host this many services. As I understand it (and I’d be grateful for any corrections if I’m talking nonsense here) IIS uses a single listener and fires up the services as required, then keeps them alive for a finite time. Each instance of ServiceHost will have its own listener and keep the service loaded for the lifetime of the host, even if no client ever calls it. So apart from the lack of scalability inherent in this, there is also a start-up delay as all the ServiceHost instances are initialised. I can’t vouch for the scalability of this – according to the 2nd edition of Programming WCF Services, incoming calls are dispatched by monitoring threads to the I/O completion pool, which has 1000 threads by default. It’s not entirely clear whether the monitoring threads are pooled, so I don’t know how much more/less efficient this is than hosting in IIS.

This HostHelper class exposes two methods:

  • GetServiceTypes()

    Returns all the classes that can be exposed as WCF services (i.e. either implement an interface that has ServiceContractAttribute or have the attribute themselves)

  • GetServiceContractType()

    Returns the contract (interface) for the service class

I’ve designed it to sit in the same assembly as the service contracts (hence the use of GetExecutingAssembly() to self-reference), but one could just as easily pass in the assembly to be reflected over as a parameter:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;
using System.ServiceModel;

public static class HostHelper
{
   public static Type GetServiceContractType(Type type)
   {
       if (!type.IsClass)
       {
           throw new InvalidOperationException();
       }

       Type[] contractInterfaces = type.GetInterfaces();

       foreach (Type contractInterface in contractInterfaces)
       {
           if (contractInterface.GetCustomAttributes(
               typeof(ServiceContractAttribute), true).Length > 0)
           {
               return contractInterface;
           }
       }

       if (type.GetCustomAttributes(
           typeof(ServiceContractAttribute), true).Length > 0)
       {
           return type;
       }

       throw new InvalidOperationException();
   }

   public static List<Type> GetServiceTypes()
   {
       Assembly assembly = Assembly.GetExecutingAssembly();
       Debug.Assert(assembly != null);

       List<Type> serviceTypes = new List<Type>();
       Type[] types = assembly.GetTypes();

       foreach (Type type in types)
       {
           bool isWCFType = IsWCFType(type);

           if (isWCFType)
           {
               serviceTypes.Add(type);
           }
       }

       return serviceTypes;
   }

   private static bool IsWCFType(Type type)
   {
       if (type.IsClass)
       {
           if (type.GetCustomAttributes(typeof(ServiceContractAttribute), true).Length > 0)
           {
               return true;
           }
           else
           {
               Type[] classInterfaces = type.GetInterfaces();

               foreach (Type classInterface in classInterfaces)
               {
                   if (classInterface.GetCustomAttributes(
                       typeof(ServiceContractAttribute), true).Length > 0)
                   {
                       return true;
                   }
               }
           }
       }

       return false;
   }
}

The code that runs up the multiple service hosts can be put into a console application:

using System;
using System.Collections.Generic;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;

class Program
{
   private const string BASE_ADDRESS = "http://localhost:8889/";

   private static List<ServiceHost> m_Hosts;

   static void Main(string[] args)
   {
       List<Type> types = HostHelper.GetServiceTypes();

       if (types.Count > 0 && m_Hosts == null)
       {
           m_Hosts = new List<ServiceHost>();
       }

       foreach (Type type in types)
       {
           Type contract = HostHelper.GetServiceContractType(type);
           BindingElement bindingElement = new HttpTransportBindingElement();
           Binding binding = new CustomBinding(bindingElement);
           string fullBaseAddress = string.Concat(BASE_ADDRESS,type.Name);

           ServiceHost host = new ServiceHost(type, new Uri(fullBaseAddress));

           host.AddServiceEndpoint(contract, binding, "");

           ServiceMetadataBehavior metaDataBehavior = new ServiceMetadataBehavior();
           metaDataBehavior.HttpGetEnabled = false;
           host.Description.Behaviors.Add(metaDataBehavior);

           host.AddServiceEndpoint(typeof(IMetadataExchange), binding, "MEX");

           Console.WriteLine("{0}: Opening host for {1}", type.Name, DateTime.Now.ToString());
           host.Open();
           Console.WriteLine("{0}: Host for {1} is open", type.Name, DateTime.Now.ToString());

           m_Hosts.Add(host);
       }
      
       Console.Read();

       if (m_Hosts != null && m_Hosts.Count > 0)
       {
           foreach (ServiceHost host in m_Hosts)
           {
               if (host.State == CommunicationState.Opened)
               {
                   host.Close();
               }
           }
       }
   }
}

I’ve added metadata exchange endpoints so proxies could be generated using SvcUtil against http://localhost:8889/ActivityMeasureDataAccess, but basically all the service hosts are stored in the m_Hosts member variable.

This example could easily be ported to a Windows Service to operate like a poor man’s WAS, although it would need handling for the Faulted event to shore up the reliability:

static void host_Faulted(object sender, EventArgs e)
{
   ServiceHost host = sender as ServiceHost;
   Debug.Assert(host != null);

   Debug.Assert(host.State == CommunicationState.Faulted);

   m_Hosts.Remove(host);
   host.Abort(); // a faulted host can't be closed gracefully, so abort it

   Type serviceType = host.Description.ServiceType;
   Debug.Assert(host.BaseAddresses.Count == 1);
   Uri baseAddress = host.BaseAddresses[0];
   host = new ServiceHost(serviceType, baseAddress);

   // Other host setup code here

   m_Hosts.Add(host);
}

This basically shuts down the faulted host, removes it from the global list, and replaces it with one that has the same settings.
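
Wiring the handler up is just a matter of subscribing before each host is opened – for example, inside the foreach loop in Main:

host.Faulted += host_Faulted;
host.Open();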

Wednesday, 10 March 2010

Using generics with WCF client proxy interfaces

In the spirit of trying to learn something new at least once in a while, I’ve discovered something about interfaces in .NET – i.e. that a class can implement multiple interfaces with the same method signatures and only have to provide the implementation once. This may sound like a useless feature but it actually solves a problem that I have with WCF service contract factoring and generics (more on that later). To illustrate, let us consider the following interface:

interface IDataAccess<T>
    where T : class
{
    List<T> GetData();
}

A typical use for such an interface would be generic data access. GetData() will always return a list of T (whatever T might be). For the purposes of this example T will be the Car class:

public class Car
{
    public string Make { get; set; }
    public string Registration { get; set; }
    public int? Mileage { get; set; }

    public override string ToString()
    {
        return string.Format("{0}, registration {1}, with {2} miles on the clock",
            Make, Registration, Mileage.Value.ToString());
    }
}

We can then provide a concrete implementation of IDataAccess<T>:

public class CarDataAccess : IDataAccess<Car>
{
    public List<Car> GetData()
    {
        Car myCar = new Car()
        {
            Make = "Vauxhall Zafira",
            Mileage = 50000,
            Registration = "DY07 XXX"
        };

        List<Car> myList = new List<Car>();
        myList.Add(myCar);
        return myList;
    }
}

Now we can also ask the class to implement an interface that, rather than using generics, references the concrete (Car) class:

interface ICarDataAccess
{
    List<Car> GetData();
}

and change the first line of the data access class to reflect this:

public class CarDataAccess : IDataAccess<Car>, ICarDataAccess

The great thing is that not only does this compile, but we can use either interface to access the GetData() method of the concrete class:

ICarDataAccess myDataObject = new CarDataAccess();
List<Car> results = myDataObject.GetData();

Console.WriteLine(results[0].ToString());

IDataAccess<Car> myOtherDataObject = myDataObject as IDataAccess<Car>;
Debug.Assert(myOtherDataObject != null);

results = myOtherDataObject.GetData();

Console.WriteLine(results[0].ToString());

So why am I impressed by this – what possible practical use does it have? The answer is that it helps us to overcome one of the major limitations of WCF client proxies – specifically the lack of support for using generic interfaces in service contracts. For example, if you were to decorate the IDataAccess<T> interface with the [ServiceContract] attribute:

[ServiceContract]
interface IDataAccess<T>
    where T : class
{
    List<T> GetData();
}

and deploy the CarDataAccess class as a WCF service, then it would compile but would throw an InvalidOperationException when the service host started with an error message along the lines of:

The contract name ‘xxx’ could not be found in the list of contracts implemented by the service ‘yyy’.

So we have to use the non-generic interface on the service side. However, we can use generics on the client side by manually adding the generic interface to the list of interfaces implemented by the client proxy:

public class CarDataAccessServiceClient : ClientBase<ICarDataAccess>, ICarDataAccess, IDataAccess<Car>
{
    // generated proxy code here
}

This doesn’t break the proxy, but it does allow us to write generic client code:

public void DoSomething<T, D>()
    where T : ICommunicationObject, IDataAccess<D>, IDisposable, new()
    where D : class
{
    T myProxy = new T();

    try
    {
        myProxy.Open();
        List<D> results = myProxy.GetData();
        myProxy.Close();
    }
    catch
    {
        myProxy.Abort();
    }
}
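
With the proxy declared as above, calling the service generically then collapses to a single line (assuming DoSomething is in scope):

DoSomething<CarDataAccessServiceClient, Car>();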

An easy trick, but one that is about to save me hours of coding.

Does maintainability really matter?

As a software developer I would answer ‘yes’, primarily because it makes my life easier – I suspect that all professional developers would agree. The traditional argument in favour of not being a code cowboy is presented very well in this article. Yet over the last few days, as I transition from code monkey to the man with the business hat on, I’ve struggled to make a compelling case for this being true in all industries.

I’m going to use the word ‘maintainability’ in a very general sense, partly because otherwise I’d have to scour the web looking for the definition most aligned with my own position, but also because I believe the definition drives the justification. Experience and pragmatism have made me aware that all software is ultimately maintainable in line with the infinite monkey theorem – therein is the root of my problem.

A few days ago I was asked by one of my fellow directors if there was a solid case for using SSIS as an ETL tool instead of a combination of .NET data access code and T-SQL. We’ve recently partnered with another of his companies to produce some BI extensions to their software, although as we’re one day away from shipping it was more of a theoretical question. His business case for not using SSIS was that his other company didn’t currently use it, so the lack of in-house skills, in tandem with the fact that it counts as an additional deployment platform, meant that providing support would incur greater overhead and risk. Of course I trotted out all the arguments about it being more maintainable, scalable, reliable, etc. His response: “…that’s all well and good for the developers, but what are the benefits to the customer?”

In theory one can cite agility as a customer benefit, as it means that new functionality can be rolled out far quicker than for badly engineered software. But then I considered a lot of the work I’ve done in the past few years, and it seems that the companies where poorly engineered software was prevalent were the ones that could afford to pay for large numbers of monkeys – even if they can’t stretch to infinity – without going into the red, in order to deliver new features quickly. In fact, many of these code monkeys have cited similar arguments as a justification for not even trying to follow best practice (although hubris does play its part).

For example, a few years ago I worked on the website of a large airline. The website was implemented in type-unsafe VBScript on Classic ASP with a SQL Server 2000 back-end (and no middleware). The absence of intelligent architecture and the prevalence of spaghetti code were enough to make any half-decent developer weep. Yet they wanted to add more functionality to the website despite the in-house developers hitting entropy. So they just hired in a bunch of developers from a consultancy at around £1000 a day each and set them to work hacking in new functionality – which, despite much swearing at code, they did very successfully. £150k+ a month may seem a high price tag to pay for not writing decent code in the first place – but these developers were working either alone or in pairs to produce functionality on 30-day sprints that was raising anywhere between £50,000 and £1m extra revenue per month in perpetuity per project.

More recently I worked on a major data warehousing project for a public sector body to replace a system which was taking around 3 days to process less than 12 million rows of data from a text file (although in fairness the width of data was around 1500 fields). The vendor of the existing system was adding new functionality all the time, but was unable to rectify core performance or reliability issues (processing would regularly fail). Despite its failings the organisation in question worked with the system for 3 years, during which time I estimate it cost over £1m in labour inefficiency – but they were able to absorb the cost while continuing core operations (albeit somewhat unreliably). However the replacement system reduced processing time to between 2 and 4 hours which enabled them to spend less time crunching data and more time analysing it (the raison d'être of the organisation). The smaller processing window also added value because it meant that time-sensitive data could be analysed – something which fundamentally altered the strategic capability of the business.

In the case of the airline, customer experience was unaffected – performance of the website was “good enough”. So the only rationale for maintainability would be to save around £1.5m a year (out of a £2.5bn turnover). Placed against the risk of replacing the entire codebase I can see why the status quo has held. So I conclude that:

  • Larger organisations have the resources to compensate for inefficiencies because the revenues are so high and the risk is lower than replacement
  • Small businesses with more fragile cash flow would undoubtedly benefit from better software engineering and use of existing frameworks

However there is a middle ground of businesses and government agencies, that could vastly improve the quality of their operations if their systems were more maintainable and developed according to best practice – in the meantime they’ll just soldier on with the proverbial sticking plaster.

As for me, I’ll stick with SSIS for ETL because it’s cheaper for the business and there’s less risk in supporting it than trying to support a mass of spaghetti SQL. Performance would be adequate with either solution – data volumes aren’t likely to increase to the point where any performance difference is noticeable (although if they do then I’ll be glad I chose SSIS). The customer doesn’t care one way or the other as long as things keep working – so the technical argument wins over.

Sunday, 7 February 2010

25 years and no Thread.Sleep()

After nearly two weeks of diversions and delays (mainly related to a BI product release we’ve been working on for a while) I’m back to the Windows Forms application I’ve been doing some work on.  A lot of the operations either call out to back-end services and/or databases, or just take a long time – so I’m having to do quite a bit of asynchronous programming.  I knew this when I started the project so I read up on synchronisation and multithreading in .NET, yet once again have managed to avoid writing a single line of thread-based code or messing about with synchronisation domains.

Instead I have pretty much exclusively used the BackgroundWorker component.  By adopting an event-based model for my components it has been possible to write robust code that doesn’t bring the UI to a standstill and avoids me having to think too hard.  In similar exercises during past developments I have used the delegate-based asynchronous event model exposed by WCF and ASMX web service proxies.  I suddenly realised that I’ve never actually written threading code – ever (switching the locale using static Thread methods doesn’t count).  This isn’t because I don’t understand it, but because there are so many cleaner abstractions that exist on top of it that I’ve never had cause to.

I particularly like the BackgroundWorker because of its simplicity and elegance.  There are times when it’s not appropriate (such as when you may want to invoke the same method multiple times on different threads), but for applications where the UI is the client context it insulates the developer against the more common problems and errors of synchronisation and concurrency.  The fact that it can only run one operation at a time (and will let you know via the IsBusy property) is a boon for preventing unwanted recursion.  It also places limitations on how one can access objects in the host context, thus forcing the developer down the ‘right’ road.  I think this is a good thing because while it’s beneficial to understand about the Windows message loop, synchronisation contexts, and .NET app domains – it shouldn’t be essential for writing applications where the UI doesn’t hang every time a method is called.
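
For anyone who hasn’t used it, the pattern boils down to something like this (a minimal sketch – BackgroundWorker lives in System.ComponentModel, and CallSlowService() and resultsGrid are stand-ins for your own long-running call and UI control):

BackgroundWorker worker = new BackgroundWorker();

// Runs on a thread-pool thread - don't touch the UI in here
worker.DoWork += (sender, e) => e.Result = CallSlowService();

// Runs back on the UI thread, so it's safe to update controls
// (real code should check e.Error before reading e.Result)
worker.RunWorkerCompleted += (sender, e) => resultsGrid.DataSource = e.Result;

// IsBusy guards against starting a second operation while one is running
if (!worker.IsBusy)
{
    worker.RunWorkerAsync();
}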

I would urge developers to avoid writing multithreaded code (using System.Threading) wherever possible.  It’s a pain in the backside to debug and is the modern day GOTO – put simply it makes code less maintainable.  Consider whether there is a better architecture that can be used – for example: WCF is inherently multithreaded and SvcUtil will automatically generate asynchronous versions of methods in the proxy, so breaking an application into services allows the context (i.e. the WCF host) to manage the nuts and bolts of concurrency.  The Workflow Foundation is another useful tool in the multithreaded arsenal as it enables developers to define component interaction in terms of business logic, once again letting the context manage threading (David Chappell has a first-rate guide here).  Then there’s the Parallel extensions in .NET 4 which simplify taking advantage of multiple threads – physical or logical.
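
For example, the sort of fan-out loop that would once have meant hand-rolled threads becomes a one-liner in .NET 4 (a sketch – fileNames and ProcessFile() are hypothetical):

using System.Threading.Tasks;

// Partitioning, scheduling, and thread management are all handled by the runtime
Parallel.ForEach(fileNames, fileName => ProcessFile(fileName));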

I’m still curious as to how many branches of development are left that require low-level poking around.  All war stories gratefully received.

Monday, 18 January 2010

Custom exceptions and old chestnuts

During my years in the freelance wilderness I saw a lot of other people’s code. I started to empathise with the plumbers who would prefix their summary of findings with an intake of breath and the stock phrase “looks like you’ve had cowboys in here mate”. That’s not to say I am the supreme best programmer in the universe – far from it – in fact, it would be accurate to say that most of the OPC I saw was written by far better programmers (real programmers even). What I didn’t quite get is why they would all follow the same anti-patterns, one of which was around exception handling.

The basic exception anti-pattern is to create a custom exception, then rethrow all exceptions by wrapping them up in this custom exception. For example:

try
{
    DoSomeVeryDodgyMethod();
}
catch (Exception ex)
{
    throw new MyCustomException(ex);
}

The usual practice is for the custom exception to have some form of logging or user notification built in:

public class MyCustomException : Exception
{
    public MyCustomException(Exception ex)
    {
        Trace.WriteLine(ex.ToString());
        MessageBox.Show("An exception has occurred!!");
    }

    // rest of class here
}

This misses the whole point of exceptions, which is to encourage defensive coding. Exceptions are either avoidable or unavoidable, and we should code appropriately. This is not the obvious-sounding statement you might think. To illustrate an unavoidable exception:

using (SqlConnection connection = new SqlConnection(connectionString))
{
    try
    {
        connection.Open();
        MessageBox.Show("Connection opened");
    }
    catch (SqlException sqlException)
    {
        Trace.WriteLine(sqlException.ToString());
        MessageBox.Show("Unable to connect to database server\nCheck your network and try again");
    }

    try
    {
        DoSomethingDatabase(connection);
    }
    catch (SqlException sqlException)
    {
        Trace.WriteLine(sqlException.ToString());
        MessageBox.Show("Error executing Sql");
    }

    DoSomethingElse();
}

Note how the exception is handled. As a client developer I know that there are a set of conditions beyond the control of my application that can result in a connection failure. These include problems with:

  • Network
  • Security
  • Configuration

I’ve created separate try blocks in order to isolate the problem – anything going wrong in the first block is connection related, whereas I cannot be so sure in the second block. But the point here is that no matter how well I write the code, this exception could still happen. Deciding what to do about it is an issue for my use case. Of course if the Open() method of the SqlConnection object returned a Boolean value to indicate success then we could eschew the exception – but a connection failure is an exceptional occurrence in as much as we don’t intend for it to happen. It is outside the scope of the core business logic underlying our code. So let’s look at an avoidable exception:

int divisor = 0;

try
{
    // dividing by a literal zero wouldn't even compile, hence the variable
    int testValue = 15 / divisor;
}
catch (DivideByZeroException)
{
    Trace.WriteLine("Doh!");
}

The precondition for the DivideByZeroException is that a violation of mathematical logic must occur. The exception exists so that the client developer remembers to create a use case that avoids such situations – if your application ever throws this exception then you’ve written it badly. To compare it with the last example, it’s much easier to test whether a divide-by-zero is about to happen than whether a connection is likely to fail.

This leads me to my final point. Going back to the first example, note that we caught the base Exception class – this is very lazy coding. An application should be tested adequately and certainly not be designed to throw unexpected exceptions. The product documentation for the .NET Framework Class Library indicates (for each class) what exceptions can be thrown and why they will be thrown. WCF faults take this a step further and actually require that expected exceptions are published in the metadata, otherwise they end up as a FaultException or CommunicationException on the client side. Likewise your code should only throw exceptions to stop other developers from using it incorrectly or to highlight exceptional situations.
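
As an aside, publishing an expected exception in WCF metadata is just a matter of declaring a FaultContract on the operation. A sketch (Invoice and InvoiceNotFoundFault are made-up types):

[ServiceContract]
interface IInvoiceManager
{
    // The fault becomes part of the service metadata, so clients can
    // catch the strongly-typed FaultException<InvoiceNotFoundFault>
    [OperationContract]
    [FaultContract(typeof(InvoiceNotFoundFault))]
    Invoice GetInvoice(int invoiceId);
}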

Generics and Dynamic LINQ

In between running around like a headless chicken on various potential BI projects, I’m still doing “real” code as part of a project that is actually going somewhere – specifically a less-plumbing implementation of a Windows Forms DataGridView using virtual mode. My starting point was this article, so I came up with a grid that could be plumbed into data access code via two methods with the following signatures:

int GetCount(F filter)
List<T> GetData(int rowIndex, int pageSize, F filter, S sorter)

Where type T is the class of the data payload, F is the class of the filter payload, and S is the sorter payload class. I use payload here to highlight that the data contained by these objects is subject to implementation specifics, so no prescription is made about how F and S work. However I do have to provide a concrete implementation for test (and very likely production) purposes, so I decided to go with a LINQ to SQL implementation. There are numerous (very good) examples of this kind of dynamic behaviour written by folk much cleverer than I, but as this code is being written for other developers to consume I decided to play it safe and use the System.Linq.Dynamic code from the Visual Studio 2008 samples (which can also be found in the \Program Files (x86)\Microsoft Visual Studio 9.0\Samples\1033 folder or thereabouts if you have the MSDN library installed). The file in question is called Dynamic.cs and can be found in the LinqSamples\DynamicQuery project. ScottGu did an introductory post on this many moons ago but never followed it up, so I’ve actually had to do some work (!!) and fill in the blanks myself.

F becomes the Filter class, while S becomes the Sorter class. All we need to define a filter is:

  • Field name
  • Comparison operator (=, !=, >=, >, <=, <, etc)
  • Value

Or in code (with DataContract attributes applied so this can be used across WCF):
using System.Runtime.Serialization;

namespace Public.DynamicQuery
{
  /// <summary>
  /// Comparisons supported by the dynamic query interface
  /// </summary>
  public enum ComparisonOperator
  {
      EqualTo,
      NotEqualTo,
      MoreThan,
      LessThan,
      MoreThanOrEqualTo,
      LessThanOrEqualTo,
      StartsWith,
      EndsWith,
      Contains,
      IsEmpty,
      IsNotEmpty
  }

  /// <summary>
  /// Filter criteria for a given field/property
  /// </summary>
  [DataContract]
  public class FilterCriteria
  {
      /// <summary>
      /// The name of the field/property to be filtered on
      /// </summary>
      [DataMember]
      public string FieldName { get; set; }

      /// <summary>
      /// The compare operator to be applied
      /// </summary>
      [DataMember]
      public ComparisonOperator ComparisonOperator { get; set; }

      /// <summary>
      /// The value which is being filtered against
      /// </summary>
      [DataMember]
      public object Value { get; set; }
  }
}

So the Filter class is basically a list of this information, with a few helper methods to make it easier to use with the System.Linq.Dynamic namespace:

using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.Serialization;
using System.Text;

namespace Public.DynamicQuery
{
  /// <summary>
  /// Specifies a boolean condition
  /// </summary>
  public enum BooleanOperator
  {
      And,
      Or
  }

  /// <summary>
  /// Provides a serializable representation of query filters that
  /// can be used in Dynamic Linq expression trees
  /// </summary>
  [DataContract]
  public class Filter
  {
      #region Private properties

      private static Dictionary<ComparisonOperator, string> m_Dictionary;
      private static readonly object m_DictionaryLock = new object();

      #endregion

      #region Constructors/destructors

      /// <summary>
      /// Creates a new instance of the Filter class
      /// </summary>
      public Filter()
      {
          // Lock on a static object (not this) because the dictionary is shared, static state
          lock (m_DictionaryLock)
          {
              if (m_Dictionary == null)
              {
                  m_Dictionary = GetComparisonOperatorToLinqStrings();
              }
          }
      }

      #endregion

      #region Public members

      /// <summary>
      /// Indicates how the filter criteria are evaluated in relation to one another
      /// </summary>
      [DataMember]
      public BooleanOperator BooleanCompare { get; set; }

      /// <summary>
      /// A list of filter criteria
      /// </summary>
      [DataMember]
      public List<FilterCriteria> Criteria { get; set; }

      /// <summary>
      /// Returns the string part of a dynamic linq predicate for this filter object with parameter placeholders
      /// (use the GetDynamicLinqParameters() method to get the parameters)
      /// </summary>
      /// <returns></returns>
      public string GetDynamicLinqString()
      {          
          if (Criteria != null && Criteria.Count > 0)
          {
              StringBuilder output = new StringBuilder();
              Debug.Assert(m_Dictionary != null);

              int parameterCounter = 0;
              int criteriaIndex = 0;

              foreach (FilterCriteria criteria in Criteria)
              {
                  criteriaIndex++;

                  if (criteria.ComparisonOperator == ComparisonOperator.IsEmpty ||
                      criteria.ComparisonOperator == ComparisonOperator.IsNotEmpty)
                  {
                      parameterCounter--;
                  }

                  string filterString = GetDynamicLinqStringPart(
                      criteria, criteriaIndex, parameterCounter);
                  output.Append(filterString);

                  parameterCounter++;
              }

              return output.ToString();
          }

          return "true";
      }

      /// <summary>
      /// Returns the parameter array part of a dynamic linq predicate for this filter object
      /// </summary>
      /// <returns></returns>
      public object[] GetDynamicLinqParameters()
      {
          List<object> objectList = new List<object>();

          if (Criteria != null && Criteria.Count > 0)
          {
              foreach (FilterCriteria criteria in Criteria)
              {
                  if (criteria.ComparisonOperator != ComparisonOperator.IsEmpty &&
                      criteria.ComparisonOperator != ComparisonOperator.IsNotEmpty)
                  {
                      objectList.Add(criteria.Value);
                  }
              }
          }

          return objectList.ToArray();
      }

      #endregion

      #region Private methods

      private Dictionary<ComparisonOperator, string> GetComparisonOperatorToLinqStrings()
      {
          Dictionary<ComparisonOperator, string> dictionary =
              new Dictionary<ComparisonOperator, string>();

          dictionary.Add(ComparisonOperator.Contains, "{0}.Contains(@{1})");
          dictionary.Add(ComparisonOperator.EndsWith, "{0}.EndsWith(@{1})");
          dictionary.Add(ComparisonOperator.EqualTo, "{0} == @{1}");
          dictionary.Add(ComparisonOperator.IsEmpty, "{0} == null");
          dictionary.Add(ComparisonOperator.IsNotEmpty, "{0} != null");
          dictionary.Add(ComparisonOperator.LessThan, "{0} < @{1}");
          dictionary.Add(ComparisonOperator.LessThanOrEqualTo, "{0} <= @{1}");
          dictionary.Add(ComparisonOperator.MoreThan, "{0} > @{1}");
          dictionary.Add(ComparisonOperator.MoreThanOrEqualTo, "{0} >= @{1}");
          dictionary.Add(ComparisonOperator.NotEqualTo, "{0} != @{1}");
          dictionary.Add(ComparisonOperator.StartsWith, "{0}.StartsWith(@{1})");

          return dictionary;
      }

      private string GetDynamicLinqStringPart(FilterCriteria criteria,
          int criteriaIndex, int parameterCounter)
      {
          StringBuilder output = new StringBuilder();

          if (criteria.ComparisonOperator == ComparisonOperator.IsEmpty ||
              criteria.ComparisonOperator == ComparisonOperator.IsNotEmpty)
          {
              output.Append(string.Format(m_Dictionary[criteria.ComparisonOperator],
                  criteria.FieldName));
          }
          else
          {
              output.Append(string.Format(m_Dictionary[criteria.ComparisonOperator],
                          criteria.FieldName, parameterCounter.ToString()));
          }

          if (criteriaIndex < Criteria.Count)
          {
              output.Append(string.Format(" {0} ", BooleanCompare.ToString()));
          }

          return output.ToString();
      }

      #endregion
  }
}
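
To make the output concrete: two criteria joined with And produce a predicate string with parameter placeholders, plus a matching parameter array (the field names and values here are illustrative):

Filter filter = new Filter()
{
    BooleanCompare = BooleanOperator.And,
    Criteria = new List<FilterCriteria>()
    {
        new FilterCriteria() { FieldName = "Name",
            ComparisonOperator = ComparisonOperator.Contains, Value = "Hex" },
        new FilterCriteria() { FieldName = "ListPrice",
            ComparisonOperator = ComparisonOperator.MoreThan, Value = 10 }
    }
};

Console.WriteLine(filter.GetDynamicLinqString());
// Prints: Name.Contains(@0) And ListPrice > @1
// GetDynamicLinqParameters() returns { "Hex", 10 } to pass alongside it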

All we need to sort things is the field name and sort order:

using System.Runtime.Serialization;
namespace Public.DynamicQuery
{
  public enum SortDirection
  {
      Ascending,
      Descending
  }

  /// <summary>
  /// Sort criteria for a single field
  /// </summary>
  [DataContract]
  public class SortCriteria
  {
      /// <summary>
      /// The field/property name to sort by
      /// </summary>
      [DataMember]
      public string FieldName { get; set; }

      /// <summary>
      /// The direction to order the data in
      /// </summary>
      [DataMember]
      public SortDirection SortOrder { get; set; }
  }
}

Then put these in a list, along with the dynamic LINQ helper methods:

using System.Collections.Generic;
using System.Runtime.Serialization;
using System.Text;

namespace Public.DynamicQuery
{
  /// <summary>
  /// Provides a serializable representation of sort criteria that can be used in expression trees
  /// </summary>
  [DataContract]
  public class Sorter
  {
      /// <summary>
      /// The fields and direction to sort by
      /// </summary>
      [DataMember]
      public List<SortCriteria> SortCriteria { get; set; }

      /// <summary>
      /// Returns the sort criteria as an expression that can be used with Dynamic LINQ
      /// </summary>
      /// <returns></returns>
      public override string ToString()
      {
          StringBuilder output = new StringBuilder("");

          if (SortCriteria != null && SortCriteria.Count > 0)
          {
              foreach (SortCriteria sortCriteria in SortCriteria)
              {
                  output.Append(sortCriteria.FieldName);

                  if (sortCriteria.SortOrder == SortDirection.Ascending)
                  {
                      output.Append(" Ascending,");
                  }
                  else
                  {
                      output.Append(" Descending,");
                  }
              }
              output.Remove(output.Length - 1, 1);
          }
          else
          {
              output.Append("1");
          }

          return output.ToString();
      }
  }
}

Anyway, to cut a long story short, I wanted to be able to use these classes with LINQ to SQL in such a way as to derive the method signatures needed for virtual mode. I also needed an update method that could be used to post updates without needing to maintain the data context. So basically I wanted a generic wrapper class that would generate the methods for any LINQ to SQL table/class.

Luckily I referred to the .NET gospel according to Juval Lowy (aka Programming .NET Components) which, among other things, contains appendices on Generics and Reflection. In particular I took advantage of the fact that generics in .NET allow you to constrain the type to subclasses of a particular class, or classes with a default constructor – to name a few. Also, all LINQ to SQL data contexts are derived from the DataContext class – which exposes the type-safe GetTable<T>() method for getting at table classes. I also found a very useful answer to a post (which I subsequently lost) on StackOverflow that explained how to pass updates via LINQ to SQL without access to the original data context. I wrapped this up in the ApplyChanges() method – which at least makes sense to me:

using System.Collections.Generic;
using System.Data.Linq;
using System.Linq;
using System.Linq.Dynamic;

namespace Public.DynamicQuery
{
  public class DataWrapper<T,D>
      where T : class
      where D : DataContext, new()
  {
      /// <summary>
      /// Retrieves a block of data within the given parameters
      /// </summary>
      /// <param name="startIndex">The zero-based index of the first row to retrieve</param>
      /// <param name="pageSize">The number of rows to retrieve</param>
      /// <param name="filter">The filters to apply to the dataset before returning the results</param>
      /// <param name="sorter">The sorting and ordering for the dataset before returning the results</param>
      /// <returns></returns>
      public List<T> GetData(int startIndex, int pageSize,
          Filter filter, Sorter sorter)
      {
          List<T> results = null;

          if (filter == null)
          {
              filter = new Filter();
          }

          if (sorter == null)
          {
              sorter = new Sorter();
          }

          using (DataContext dc = new D())
          {
              var query = (from p in dc.GetTable<T>()
                           select p);

              results = query.Where(filter.GetDynamicLinqString(),
                  filter.GetDynamicLinqParameters())
                  .OrderBy(sorter.ToString()).Skip(startIndex)
                  .Take(pageSize).ToList<T>();
          }

          return results;
      }

      /// <summary>
      /// Returns the total count of records in the dataset with the given filter applied
      /// </summary>
      /// <param name="filter">The filter to apply to the dataset</param>
      /// <returns></returns>
      public int GetCount(Filter filter)
      {
          int resultCount = 0;

          if (filter == null)
          {
              filter = new Filter();
          }

          using (DataContext dc = new D())
          {
              var query = (from p in dc.GetTable<T>()
                           select p);

              resultCount = query.Where(filter.GetDynamicLinqString(),
                  filter.GetDynamicLinqParameters()).Count<T>();
          }

          return resultCount;
      }

      /// <summary>
      /// Applies the given changes back to the dataset
      /// </summary>
      /// <param name="updates">Records which have been updated</param>
      /// <param name="inserts">Records to add</param>
      /// <param name="deletes">Records to be deleted</param>
      public void ApplyChanges(List<T> updates, List<T> inserts, List<T> deletes)
      {
          using (DataContext dc = new D())
          {
              if (updates!=null && updates.Count > 0)
              {                  
                  dc.GetTable<T>().AttachAll<T>(updates);
                  dc.Refresh(RefreshMode.KeepCurrentValues,updates);
              }

              if (inserts!=null && inserts.Count > 0)
              {
                  dc.GetTable<T>().InsertAllOnSubmit<T>(inserts);
              }

              if (deletes != null && deletes.Count > 0)
              {
                  dc.GetTable<T>().AttachAll<T>(deletes);
                  dc.GetTable<T>().DeleteAllOnSubmit<T>(deletes);
              }

              dc.SubmitChanges();
          }
      }
  }
}

So to put this all together we can add in some LINQ to SQL classes pointing to our favourite sample database (in this case the Product table from AdventureWorks) and run the following code in a console application:

          // Create a filter
          FilterCriteria filterCriteria = new FilterCriteria()
          {
              FieldName = "Name",
              ComparisonOperator = ComparisonOperator.Contains,
              Value = "Hex"
          };

          Filter filter = new Filter()
          {
              BooleanCompare = BooleanOperator.And,
              Criteria = new List<FilterCriteria>()
          };

          filter.Criteria.Add(filterCriteria);

          // Create a sorter
          SortCriteria sortCriteria = new SortCriteria()
          {
              FieldName = "ProductNumber",
              SortOrder = SortDirection.Descending
          };

          Sorter sorter = new Sorter()
          {
              SortCriteria = new List<SortCriteria>()
          };

          sorter.SortCriteria.Add(sortCriteria);

          DataWrapper<Product, AdventureWorksDataContext> dataWrapper =
              new DataWrapper<Product, AdventureWorksDataContext>();
        
          // Run the queries and apply the results
          int resultCount = dataWrapper.GetCount(filter);
          Console.WriteLine("Total dataset: " + resultCount.ToString() + " records");

          List<Product> results = dataWrapper.GetData(3, 3, filter, sorter);

          foreach (Product product in results)
          {
              Console.WriteLine("Name: " + product.Name + ", ProductNumber: "
                  + product.ProductNumber);
          }

Now as to dealing with the latency problem on a virtual mode DataGridView – that’s another problem (although Bea Stollnitz does have an interesting article on that topic). We could also do some type-safety checking on the filter and sorter classes, but you get the general idea.

Thursday, 14 January 2010

You’re so 2000-and-late

So I’m sat in a meeting around a piece of greenfield data warehousing work and the client platform requirement comes up: SQL Server 2005.  At this point the song ‘Boom Boom Pow’ sailed into my head – specifically Fergie’s line about being ‘so 3008’.  Then on the way home I get into a discussion with a very nice chap who says his techies have advised him not to adopt new software until the first service pack.  My eyes roll.

So is being an early adopter like being at the front of the landing craft in ‘Saving Private Ryan’?  Are you just the bullet catcher for the software vendor’s incompetence?  I say it depends on the vendor.  On greenfield projects in my organisation, unless there is a compelling reason not to, we will use the latest release version of a platform.  Why?  Because we’re a Microsoft shop and the latest version usually has extra features that require us to write less plumbing (using WCF instead of Remoting is a very good case in point).  Why create work for ourselves and add unnecessary risk to projects?

I’ve heard the ‘wait until the first service pack’ argument from professionals who ought to know better.  Early adopters are not what they once were – Microsoft releases CTPs, Betas, and RCs before it does the RTM, by which point the product has been pretty extensively tested by a wide audience of users.  That’s not to say there are no bugs before it goes to market, but there are a lot fewer than there used to be, and they get fixed on a very quick rolling cycle.  This isn’t about whether bugs should be in software – despite what some people would have us believe it’s not just a problem faced by Microsoft.

The point is: if a product is not ready for release, it shouldn’t be on the market.  I’m not an early adopter for the sake of it, but since Vista I’ve used the latest version of whatever (VS2008, .NET 3.5, Windows 7, SQL Server 2008, etc.) as soon as it’s gone RTM.  To date this has caused me one problem for internal-use software (which wasn’t Microsoft) and zero problems for product deployments to customers.