Wednesday, 15 August 2012

Programmatically executing packages in the SSIS 2012 Catalog

Update: Have posted a solution to the execution timeout problem.

After about a year of intending to get to grips with the pre-release versions of SQL Server 2012 (Denali) and failing to do so, I was fortunate enough to land a contract where I had the option to use it from my first day across most of the stack (SSAS, SSRS, SSIS, and the database engine).  Always happy to fill my CV, I jumped at the chance and have been having a jolly good time since.  As part of the project I’ve developed some SSIS packages which my customer wants included in their continuous integration pipeline.  I’ve never done CI or formal unit testing on ETL before because I develop faultless software solutions it’s never really felt as mature as it is in application frameworks (e.g. MVC, WPF). The CI pipeline is based on SpecFlow and nUnit, so I need to be able to run SSIS packages programmatically and get meaningful outputs that I can run assertions against. Specifically I want to know:

  • Did it run successfully?
  • If not, why not?
Traditionally if I wanted to run SSIS programmatically then I’d use the Microsoft.SqlServer.Dts.Runtime namespace.  However if you use the Project Deployment model then you have to use the Catalog (or else Project Parameters won’t work), which comes with a new object model API which it has taken me some time and much head scratching to fathom.  I’ve been using the Project Deployment model with the SSIS Catalog because, quite frankly, I’d be a fool not to. Its merits are expounded elsewhere, but suffice to say it offers much greater control and flexibility over configuration with considerably less effort.
The new API is in the Microsoft.SqlServer.Management.IntegrationServices.dll assembly - but the first problem is where to find it.  It is in the GAC, but it doesn’t come up as a global assembly in the VS2010 Add Reference dialog so you’ll need to browse to it:
addreference
From the Add Reference dialog, click ‘Browse’ and navigate to %WINDIR%\assembly\GAC_MSIL\Microsoft.SqlServer.Management.IntegrationServices:
selectfile
There should be just 1 folder (11.0.0.0__89845dcd8080cc91), but there may be more in future versions, so select the latest folder and then select the DLL that should be inside it (Microsoft.SqlServer.Management.IntegrationServices.dll).  Back in the Add Reference dialog, click ‘Add’.  Before closing the dialog, add references to the following assemblies (which do appear in the GAC list):
  • Microsoft.SqlServer.ConnectionInfo
  • Microsoft.SqlServer.Management.Sdk.Sfc
  • Microsoft.SqlServer.Smo
To start with you will need to establish a connection to the SQL Server then the SSIS server:
var connection = new SqlConnection(@"Data Source=(local);Initial Catalog=master;Integrated Security=SSPI;");
var integrationServices = new IntegrationServices(connection);
Then you need to get a handle to the package.  Packages are stored in projects, which must be located in a folder, which will be in a catalog.  So the package can be gotten thus:
var package = integrationServices
    .Catalogs["MyCatalog"]
    .Folders["MyFolder"]
    .Projects["MyProject"]
    .Packages["MyPackage"];
It’s worth pointing out that each collection has a Contains() method.  This allows you to verify whether or not the catalog, folder, project, or package exists.  So you could rewrite the above code more defensively:
if (integrationServices.Catalogs.Contains("MyCatalog"))
{
    var catalog = integrationServices.Catalogs["MyCatalog"];

    if (catalog.Folders.Contains("MyFolder"))
    {
        var folder = catalog.Folders["MyFolder"];

        if (folder.Projects.Contains("MyProject"))
        {
            var project = folder.Projects["MyProject"];

            if (project.Packages.Contains("MyPackage"))
            {
                var package = project.Packages["MyPackage"];
            }
        }
    }
}
Once you have a handle to the package you can just go ahead and execute:
long executionIdentifier = package.Execute(false, null);
The return value is an execution identifier which relates back to that particular request to run the package.  The SSIS server will quite happily let you run multiple instances of the same package concurrently – so to get feedback on an executing package you need to get a reference to the execution instance (rather than the package):
var execution = catalog.Executions[executionIdentifier];
An interesting point here (and a difference from the Execute method in Microsoft.SqlServer.ManagedDTS assembly) is that, by default, execution is asynchronous.  This is not especially desirable in a testing scenario where one would like control to be returned once the package has finished executing.  So with a little help from Davide Mauri’s blog post I was able to deduce that the Execute method could be made synchronous by using the overload that includes a collection of ExecutionValueParameterSet objects:
var setValueParameters = new Collection<PackageInfo.ExecutionValueParameterSet>();
setValueParameters.Add(new PackageInfo.ExecutionValueParameterSet
{
    ObjectType = 50,
    ParameterName = "SYNCHRONIZED",
    ParameterValue = 1
});

long executionIdentifier = package.Execute(false, null, setValueParameters);

So once execution is complete, the outcome data can be gleaned from the Execution object (which is of type ExecutionOperation).  All messages generated during execution will be put in the Messages property which is a collection of OperationMessage objects - likewise the status will be in the Status property.  As the status is an enumeration of obvious values I won’t go into it here.


The messages are slightly different.  Back in the ManagedDTS days there was an errors collection - not so here.  If you just want the errors then you’ll need to filter the list of messages by MessageType.  Unfortunately this isn’t an enumeration so by a process of deduction and assumption I’ve established that warnings and errors have message types of 110 and 120 (although not necessarily in that order).  So to get the error and warning messages, you can use LINQ:

var errors = execution.Messages.Where(
    m => m.MessageType == 120 || m.MessageType == 110)
    .Select(m => m.Message);
Once you get to grips with the SSIS 2012 catalog you’ll not look back.  There’s a whole host of stuff I haven’t covered here (e.g. environment configurations) but this should be enough to move over from the ‘old’ way of doing things with a reasonable degree of confidence.