Monday, 20 May 2013

Poor performance when upgrading SSRS reports from 2005

I recently had to upgrade some SSRS reports from 2005 to 2012.  I installed the SSDT/BIDS upgrade for VS2012 and imported them straight in.  I did come across the following issues:

But otherwise it all went pretty smoothly.  However after a while we started to get feedback about reports either loading very slowly or, in some cases, crashing/freezing the browser completely.  I tested the offending reports and although they did cause a freeze of a few seconds in IE9, IE10, and Chrome on Windows 7 64-bit – they did eventually render.  However they caused Firefox to display an error message:

A script on this page may be busy, or it may have stopped responding. You can stop the script now, or you can continue to see if the script will complete.

Which prevented the report ever rendering and could only be made to go away by ending the process with Task Manager.  Also, running the report in VS2012 took over 2 minutes, whereas in VS2005 it only took a few seconds.  Exporting the report to a (MHTML) web archive did yield a slight difference in output size (the 2012 version was about 20% larger), but not enough to warrant the problems I was experiencing.  What was interesting was that exporting to Excel from either version (using the rs web service) was about 12-15 times faster than viewing in the browser.

What I eventually did (i.e. should have done in the first place) was to run a network capture with IE10 developer tools and identify the point of the freeze.  It was happening immediately after a call that was returning the body of the report – about 17MB of data.  My Internet connection is pretty fast so the download was only taking about 2 seconds, the freeze was happening after the download.  Understanding why took me back to the difference between SSRS 2005 and later versions.

Basically SSRS 2005 pre-rendered the whole report in cache before displaying.  So if you have a 100 page report then it would take as long to view it even if you only looked at 1 page.  SSRS 2008 and later versions only pre-rendered the visible page, thus placing less load on the server and being much more scalable.  So the fact that on a large report (1000 rows of data), SSRS 2005 was outperforming 2012 seemed counterintuitive.  What I failed to realise is that the problem was with the Report Viewer control.

The first versions of the Report Viewer displayed the report in an IFRAME – so it loads in the same way as every other web page, via a plain old HTTP request.  This is good when you want more data, less often – the whole page loads in one go.  The 2010 viewer uses Ajax (better for less data, more often), which means the whole body of the request has to be loaded into the JavaScript pipeline before it gets displayed in the browser.  Ajax isn’t intended for shoving large blocks of data around – it’s designed to break the process of loading a web page into easily manageable chunks.  So for multi-page reports it performs much better as the amount of content displayed at once is much less and it doesn’t have to reload the whole page.  However, 17MB on an Ajax request is a bit much (and very bad practice).  Too much, it would seem, for Firefox.  Certainly too much for older versions of IE that don’t use the Chakra engine.

So the solution is to introduce paging.  There are two ways of doing paging in SSRS:

  • Add page breaks
  • Set InteractiveHeight to a non-zero value

The difference is that page breaks will apply in all rendering extensions.  So a report with page breaks exported to Excel will put each page on a different worksheet.  Likewise exported to PDF it will put each page on a different page of the PDF (and often throw in some unwanted pages for good measure).

InteractiveHeight is what’s known as a soft page break.  It only appears in rendering extensions that are interactive (i.e. HTML in the browser).  For other extensions (e.g. Excel) it will render a single long report.  You should experiment to get the right size, both in terms of visuals and response size.  By cutting down my report to 20 items a page I reduced the response to 0.6MB and the rendering to 1 second.

Incidentally, multi-page reports don’t preclude you being able to do a search for words across the whole (i.e. visible and non-visible pages) of the report.  Using the find text box on the Report Viewer toolbar (as opposed to the browser’s search on page feature) will search the entire report.

I realise that this isn’t a solution as much as a workaround.  If you have money for bandwidth coming out of your ears and really must allow users to download 17MB web pages then some kind of hack that involves exporting to MHTML or otherwise being very clever with IFRAMEs and the reporting service web service, may be your only solution.

See also:

http://connect.microsoft.com/SQLServer/feedback/details/651339/ssrs-interactiveheight-0-and-ie-script-performance

http://www.bidn.com/blogs/MMilligan/bidn-blog/2903/ssrs-gotcha-interactiveheight=0

Wednesday, 17 April 2013

2012 Report Viewer, wait animation, and the double scroll bar

I’ve had to convert a load of reports from SSRS 2005 to 2012.  A few quirks notwithstanding, it’s mostly gone well.  However I was also tasked with getting rid of the dreaded double-scrollbar.  This was quite easy to do – just set the height of the report viewer control to 100% (I’m using the 2012 report viewer).  The problem is that when you do this the wait animation repositions itself to the top of the page, because until the report is rendered it’s also the bottom of the page.  Anyway this is fairly easy to get round with a spot of jQuery.
First, find the name of the wait animation div – the id is derived from the name of the control and suffixed with _AsyncWait_Wait (e.g. a control called myReportViewer would result in a wait div called myReportViewer_AsyncWait_Wait).  Add a reference to jQuery to your page, then add the following code in an inline JavaScript block (replacing the waitControl selector with the correct id for your div):
$(document).ready(function() {
    var waitControl = $("#myReportViewer_AsyncWait_Wait");

    if (waitControl.size() !== 0) {
        waitControl.css('margin-top', '300px');
    }
});
Not the neatest approach but it works.

UPDATE: After a good night's sleep I realised that a better approach is to use CSS (as this continues to apply even after page refreshes when Report Viewer rewrites its own DOM):

#myReportViewer_AsyncWait_Wait { margin-top:300px; }

Sunday, 3 March 2013

Deploying a Reporting Services project from the command line using the rs utility

I’m currently doing some work on a legacy project which involves a large amount of SQL Server Reporting Services assets.  As part of a migration I’m having to deploy a large number of Reporting (Visual Studio) projects.  Rather than have to open each one up in Visual Studio every time I want to do a full deployment, I thought it would be nice if I could just put everything in a root folder, point a script at said folder, and have said script automagically deploy everything to the right place based on a named configuration (e.g. Debug, Release, etc) in the .rptproj files.

There are many variations on this solution already out there, including one that uses msbuild, but my main aim was to get a whole bunch of projects quickly deployed with the configurations that have already been set in the Visual Studio project files.  So I’ve put together a script which pretty much does that.

It uses the rs utility, which is basically a wrapper around the SSRS web service and the vbc compiler.  All the rs utility does is take a VB.NET script, add a proxy variable for the SSRS web service which (oddly enough) it calls ‘rs’ and then create variables for various command line arguments that are passed in.

If you want to use Visual Studio for creating your rs scripts, then just create a VB.NET console application, add a web reference to the SSRS web service (typically located at http://localhost/reportserver/reportservice2005.asmx for a local default instance install), then declare a module level variable like so:

Private rs As New ReportingService2005

Then you can just code away in the Main() Sub as normal.  Any other variables that you intend to pass in as command line parameters should also be declared as module level variables during development.  This allows you to compile and test your script before trying it out with the rs utility.  Once you’re happy with the script, copy it into a text file with an .rss extension, then remove all module level declarations so you’re left with just the Sub Main() and any other methods you’ve written.

One caveat is that you can’t (as far as I can work out) use the Imports statement in an .rss file.  Although some standard Imports are included by default (e.g. System.IO), if in doubt always use fully-namespaced type names.

If all else fails just hack my script!

    ' GenerateReportsFromDataset.rss
    '
    ' An rs script to automatically generate reports using parameter values based on an input dataset from a delimited text file.
    '
    ' The input parameters should be in a comma-delimited file with column headers in the first line 
    ' (these must match the Reporting Services parameter names)
    '
    ' A note on rs default arguments
    ' ====================
    ' 
    ' Obviously the input file (-i) and server name (-s) are mandatory and will be this file and the server on which the report is based, respectively.
    ' The -e parameter is also mandatory, but MUST ALWAYS HAVE A VALUE OF "Exec2005"
    '
    ' Arguments (-v parameters)
    ' =========================
    '
    ' reportPath:           The full path to the report on the reporting server (the leading slash can be left out)
    ' parameterSourcePath:  The full path of the file which contains the input parameters
    ' outputPath:           The path to which the generated reports should be output
    ' format:               The rendering format (e.g. HTML, PDF, etc.)
    ' filenameFormat:       A .NET format string which will be used to generate the output filename for any given report.
    '                       Each item in the format string corresponds to a column in the input file.
    '                       e.g.    a format string of "{0}-{3}" would produce an output file with a name consisting of the values in
    '                               columns 0 and 3 for the current row in the input dataset
    '
    '                       If this argument is not provided or left empty then all files will be given an auto-generated incrementing numerical value
    '
    ' Usage example:
    '
    ' rs -i GenerateReportsFromDataset.rss -s http://localhost/reportserver -e Exec2005 -v reportPath="MyReport" 
    '       -v parameterSourcePath="C:\MyParameterFile.txt" -v outputPath="C:\MyReports" -v format="PDF" -v filenameFormat="{1}-{4}"
    '

    Sub Main()

        rs.Credentials = System.Net.CredentialCache.DefaultCredentials

        If Not reportPath.StartsWith("/") Then reportPath = "/" & reportPath

        Dim result() As Byte = Nothing
        Dim historyID As String = Nothing
        Dim devInfo As String = "False"

        Using parameterStream As New System.IO.FileStream(parameterSourcePath, IO.FileMode.Open, IO.FileAccess.Read)
            Using parameterReader As New System.IO.StreamReader(parameterStream)

                Dim isFirstLine As Boolean = True
                Dim reportCount As Long = 0

                While Not parameterReader.EndOfStream

                    Dim currentLine As String = parameterReader.ReadLine()
                    Dim parameters() As ParameterValue

                    If Not String.IsNullOrEmpty(currentLine) Then

                        Dim parameterColumns() As String = currentLine.Split(New Char() {","c})

                        For loopCounter As Integer = 0 To (parameterColumns.Length - 1)
                            parameterColumns(loopCounter) = RemoveLeadingAndTrailingQuotes(parameterColumns(loopCounter))
                        Next loopCounter

                        If isFirstLine Then

                            If parameterColumns.Length > 0 Then
                                ReDim parameters(0 To (parameterColumns.Length - 1))

                                For loopCounter As Integer = 0 To (parameterColumns.Length - 1)
                                    parameters(loopCounter) = New ParameterValue()
                                    parameters(loopCounter).Name = parameterColumns(loopCounter)
                                Next loopCounter
                            Else
                                Return
                            End If

                            isFirstLine = False
                        Else
                            For loopCounter As Integer = 0 To (parameters.Length - 1)
                                parameters(loopCounter).Value = Nothing
                                If loopCounter <= (parameterColumns.Length - 1) Then parameters(loopCounter).Value = parameterColumns(loopCounter)
                            Next loopCounter

                            Dim credentials As DataSourceCredentials() = Nothing
                            Dim showHideToggle As String = Nothing
                            Dim encoding As String = ""
                            Dim mimeType As String = ""
                            Dim warnings As Warning() = Nothing
                            Dim reportHistoryParameters As ParameterValue() = Nothing
                            Dim streamIDs As String() = Nothing

                            Dim execInfo As New ExecutionInfo
                            Dim execHeader As New ExecutionHeader()
                            Dim SessionId As String
                            Dim extension As String = ""

                            rs.ExecutionHeaderValue = execHeader
                            execInfo = rs.LoadReport(reportPath, historyID)
                            rs.SetExecutionParameters(parameters, "en-us")
                            SessionId = rs.ExecutionHeaderValue.ExecutionID

                            Try
                                result = rs.Render(format, devInfo, extension, _
                                   encoding, mimeType, warnings, streamIDs)

                                execInfo = rs.GetExecutionInfo()

                                Console.WriteLine("Execution date and time: {0}", execInfo.ExecutionDateTime)
                            Catch e As Web.Services.Protocols.SoapException
                                Console.WriteLine(e.Detail.OuterXml)
                            End Try

                            Try
                                reportCount = reportCount + 1
                                Dim filename As String = reportCount.ToString()

                                If Not String.IsNullOrEmpty(filenameFormat) Then filename = String.Format(filenameFormat, parameterColumns)
                                filename = String.Format("{0}.{1}", filename, format.ToLower())

                                filename = System.IO.Path.Combine(outputPath, filename)

                                Using stream As IO.FileStream = IO.File.Create(filename, result.Length)
                                    Console.WriteLine("File created.")
                                    stream.Write(result, 0, result.Length)
                                    Console.WriteLine("Result written to the file: {0}", filename)
                                End Using
                            Catch e As Exception
                                Console.WriteLine(e.Message)
                            End Try

                        End If

                    End If

                End While
            End Using
        End Using

    End Sub

    Private Function RemoveLeadingAndTrailingQuotes(textQualifiedString As String) As String

        Dim cleanString As String = textQualifiedString

        If textQualifiedString = System.Convert.ToChar(34) Then
            cleanString = String.Empty
        Else
            If Not String.IsNullOrEmpty(textQualifiedString) AndAlso textQualifiedString.Trim() <> String.Empty Then
                If textQualifiedString.Length >= 2 Then
                    If textQualifiedString.StartsWith(System.Convert.ToChar(34)) And textQualifiedString.EndsWith(System.Convert.ToChar(34)) Then
                        cleanString = textQualifiedString.Substring(1, (textQualifiedString.Length - 2))
                    End If
                End If
            End If
        End If

        Return cleanString

    End Function

Wednesday, 15 August 2012

Programmatically executing packages in the SSIS 2012 Catalog

After about a year of intending to get to grips with the pre-release versions of SQL Server 2012 (Denali) and failing to do so, I was fortunate enough to land a contract where I had the option to use it from my first day across most of the stack (SSAS, SSRS, SSIS, and the database engine).  Always happy to fill my CV, I jumped at the chance and have been having a jolly good time since.  As part of the project I’ve developed some SSIS packages which my customer wants included in their continuous integration pipeline.  I’ve never done CI or formal unit testing on ETL before because I develop faultless software solutions it’s never really felt as mature as it is in application frameworks (e.g. MVC, WPF). The CI pipeline is based on SpecFlow and nUnit, so I need to be able to run SSIS packages programmatically and get meaningful outputs that I can run assertions against. Specifically I want to know:
  • Did it run successfully?
  • If not, why not?
Traditionally if I wanted to run SSIS programmatically then I’d use the Microsoft.SqlServer.Dts.Runtime namespace.  However if you use the Project Deployment model then you have to use the Catalog (or else Project Parameters won’t work), which comes with a new object model API which it has taken me some time and much head scratching to fathom.  I’ve been using the Project Deployment model with the SSIS Catalog because, quite frankly, I’d be a fool not to. Its merits are expounded elsewhere, but suffice to say it offers much greater control and flexibility over configuration with considerably less effort.
The new API is in the Microsoft.SqlServer.Management.IntegrationServices.dll assembly - but the first problem is where to find it.  It is in the GAC, but it doesn’t come up as a global assembly in the VS2010 Add Reference dialog so you’ll need to browse to it:
addreference
From the Add Reference dialog, click ‘Browse’ and navigate to %WINDIR%\assembly\GAC_MSIL\Microsoft.SqlServer.Management.IntegrationServices:
selectfile
There should be just 1 folder (11.0.0.0__89845dcd8080cc91), but there may be more in future versions, so select the latest folder and then select the DLL that should be inside it (Microsoft.SqlServer.Management.IntegrationServices.dll).  Back in the Add Reference dialog, click ‘Add’.  Before closing the dialog, add references to the following assemblies (which do appear in the GAC list):
  • Microsoft.SqlServer.ConnectionInfo
  • Microsoft.SqlServer.Management.Sdk.Sfc
  • Microsoft.SqlServer.Smo
To start with you will need to establish a connection to the SQL Server then the SSIS server:
var connection = new SqlConnection(@"Data Source=(local);Initial Catalog=master;Integrated Security=SSPI;");
var integrationServices = new IntegrationServices(connection);
Then you need to get a handle to the package.  Packages are stored in projects, which must be located in a folder, which will be in a catalog.  So the package can be gotten thus:
var package = integrationServices
    .Catalogs["MyCatalog"]
    .Folders["MyFolder"]
    .Projects["MyProject"]
    .Packages["MyPackage"];
It’s worth pointing out that each collection has a Contains() method.  This allows you to verify whether or not the catalog, folder, project, or package exists.  So you could rewrite the above code more defensively:
if (integrationServices.Catalogs.Contains("MyCatalog"))
{
    var catalog = integrationServices.Catalogs["MyCatalog"];

    if (catalog.Folders.Contains("MyFolder"))
    {
        var folder = catalog.Folders["MyFolder"];

        if (folder.Projects.Contains("MyProject"))
        {
            var project = folder.Projects["MyProject"];

            if (project.Packages.Contains("MyPackage"))
            {
                var package = project.Packages["MyPackage"];
            }
        }
    }
}
Once you have a handle to the package you can just go ahead and execute:
long executionIdentifier = package.Execute(false, null);
The return value is an execution identifier which relates back to that particular request to run the package.  The SSIS server will quite happily let you run multiple instances of the same package concurrently – so to get feedback on an executing package you need to get a reference to the execution instance (rather than the package):
var execution = catalog.Executions[executionIdentifier];
An interesting point here (and a difference from the Execute method in Microsoft.SqlServer.ManagedDTS assembly) is that, by default, execution is asynchronous.  This is not especially desirable in a testing scenario where one would like control to be returned once the package has finished executing.  So with a little help from Davide Mauri’s blog post I was able to deduce that the Execute method could be made synchronous by using the overload that includes a collection of ExecutionValueParameterSet objects:
var setValueParameters = new Collection<PackageInfo.ExecutionValueParameterSet>();
setValueParameters.Add(new PackageInfo.ExecutionValueParameterSet
{
    ObjectType = 50,
    ParameterName = "SYNCHRONIZED",
    ParameterValue = 1
});

long executionIdentifier = package.Execute(false, null, setValueParameters);

So once execution is complete, the outcome data can be gleaned from the Execution object (which is of type ExecutionOperation).  All messages generated during execution will be put in the Messages property which is a collection of OperationMessage objects - likewise the status will be in the Status property.  As the status is an enumeration of obvious values I won’t go into it here.


The messages are slightly different.  Back in the ManagedDTS days there was an errors collection - not so here.  If you just want the errors then you’ll need to filter the list of messages by MessageType.  Unfortunately this isn’t an enumeration so by a process of deduction and assumption I’ve established that warnings and errors have message types of 110 and 120 (although not necessarily in that order).  So to get the error and warning messages, you can use LINQ:

var errors = execution.Messages.Where(
    m => m.MessageType == 120 || m.MessageType == 110)
    .Select(m => m.Message);
Once you get to grips with the SSIS 2012 catalog you’ll not look back.  There’s a whole host of stuff I haven’t covered here (e.g. environment configurations) but this should be enough to move over from the ‘old’ way of doing things with a reasonable degree of confidence.

Tuesday, 28 February 2012

ASP.NET MVC and DataTables

One of the many things I love about WPF is its native support for UI virtualisation.  Bea Stollnitz used this to produce an excellent data virtualisation solution some time ago (which I might add I used with great success in a large-scale WPF project last year).  Recently I’ve been getting back into web development proper (aka drowning in JavaScript) and was asked to come up with a way of presenting large amounts of data in a table without the user having to go to the effort of clicking on page numbers to move through it.  Given that paged grids are the generally accepted way of presenting large volumes of data on the web, I was interested to see how this could be done without recourse to browser plug-ins (specifically Silverlight).  What I discovered is that it’s not an easy thing to do but then a colleague put me on to the excellent DataTables, created by Allan Jardine – a man who must surely never sleep, given the effort he’s put into both creating the control and supporting it.  I can’t recommend this control enough – having had the ordeal of dealing with a certain component vendor who assured me their product could do data virtualisation (as opposed to just UI virtualisation) only to discover that it couldn’t – DataTables has saved me from a lot of pain and suffering.

The only difficulty is that I’m working with ASP.NET MVC as a back-end, while most of the examples on the DataTable’s website use PHP.  Also, I have requirements for row selection (and persistence of this across page reloads), check box columns, and master/detail tables.  Go forward a few months and I’ve managed to put together some MVC-specific helper classes and write a few plug-ins to handle these requirements.  I should really publish it to GitHub or CodePlex, but that would involve me having to maintain codebases and all sorts of other things that I’d love to do if I had any spare hours in the day (although I’ll probably do it anyway).  So instead I’ve knocked up a sample project that should provide some useful helpers and plug-ins for anyone in a similar boat.

The sample project can be found here.  It’s a Visual Studio 2010 solution (but can be opened using the Express editions if you build the projects separately and re-reference them) with a SQL file that needs to be run to provide the sample data.  It demonstrates a data-virtualised row-selectable master table with 1000 rows and a similarly data-virtualised detail table with selectable rows and a check box column.  I strongly recommend downloading the sample as it drastically shortens the learning and development curve involved in getting MVC working happily with DataTables.  I’m assuming some familiarity with DataTables (did I mention the documentation is excellent?) but usage can be easily inferred from the sample project.

MVC server-side helpers

The server-side helpers consist of:

  • A model binder to convert a request from a client-side DataTable into an object (of type ClientRequest) that can be easily interrogated to build the return dataset.
  • A response generator which converts the result dataset into the ‘correct’ type of JSON for the client-side DataTable.
  • A JSON generator for converting server-side model classes into client-side column definitions (specifically the DataTable’s aoColumns object).

In addition there are some LINQ helpers, the main purpose of which is to provide slightly neater factoring of data access code (since the amount of code needed to implement them is quite a lot).  They are, to use a turn of phrase, “beyond the scope of this post”.  Basically the helper classes are just some scaffolding to get MVC to return the data that’s been asked for in the right format.

Firstly we need to add the model binder to the RegisterRoutes method of Global.asax.cs:

public static void RegisterRoutes(RouteCollection routes)
{
    // routes and other stuff here

    ModelBinders.Binders.Add(typeof(ClientRequest), new ClientRequestModelBinder());
}

Then we need to create a controller action to return the data (which uses the ClientRequest model binder to process the incoming request):

public ContentResult GetMyData(ClientRequest model)
{
    string json = null;

    if (model != null)
    {
        // get data and total size of dataset

        var serverResponse = new ServerResponse<MyModelClass>(
            model.Echo, totalCount, true, data);
        json = serverResponse.GetJson();
    }

    return Content(json);
}

Note that the action returns a ContentResult (rather than a JsonResult).  This is because DataTables is quite pedantic about the format of the Json that gets returned, hence the use of the ServerResponse<T> helper which creates a string in the correct Json format that can be sent verbatim to the client.

As an optional extra we can also send the column format data (for which we can use the standard Json serialisation):

public JsonResult GetMyModelClassColumnDefs()
{
    var columnDefs = ClientColumnGenerator.GetColumnDefs(typeof(MyModelClass));
    return Json(columnDefs, JsonRequestBehavior.AllowGet);
}

The main advantage to getting the column definitions this way is that it removes duplication of model definitions (and by extension provides a more maintainable way of keeping client and server-side models synchronised).  However the reason I first did it was because it’s less tedious than manually creating all that mDataProp stuff in object literals on the client-side.  As a minimum the GetColumnDefs method will return an object literal with the mDataProp value for each public property in the server-side model, however there are a number of attributes that can be placed on the server-side model to provide the values for the following client-side properties:

aoColumns property MVC attribute
bSearchable Searchable
bSortable Sortable
sClass ColumnCssClass
sTitle Display
bVisible ColumnVisible

The column definitions can be retrieved on the client-side by using jQuery’s getJson function.  This is how the sample application does it, although it requires some nesting when using master/detail tables.

Client-side plug-ins

The client-side plug-ins are contained in their own script files, however the checkbox and row selector plug-ins make use of the ArrayHelper class (so that will need to be included in any deployment).  I also have a number of miscellaneous DataTable helper methods (in the DataTableHelper.js file) which are not necessary for the plug-ins but are used in the sample project.  The plug-in files can be downloaded separately here.

I’ve included vsdoc code comments, so Visual Studio 2008/2010 or Expression Web users will get IntelliSense.

Row selection plug-in

Before I start I should point out that this plug-in was ‘inspired’ (as in half-inched) from Iuliandum's method for row selection (as detailed here).  All I’ve done is create an sDom wrapper around it.

The row selection plug-in supports three modes:

  • Single

    Zero or one rows can be selected at any one time

  • Single must select

    One row must always be selected (unless there are no rows in the table). If the user hasn't selected a row then the first row will be selected by default.

  • Multiple

    Zero or more rows can be selected concurrently

It also has support for bStateSave and if so configured will remember selected rows between page reloads.

To start you need to create an instance of RowSelectionInfo.  The first parameter is an array listing all the key fields (an array facilitates using composite keys), the second parameter is the selection mode (as described above), the third parameter is the CSS class to apply to the row when it’s selected, and the fourth parameter is the callback function for when a row is selected:

var rowSelector = new RowSelector.RowSelectionInfo(['Id'],
    RowSelector.SelectionModes.SingleMustSelect, 'selected',
    function (nRow, aData, isSelected, oTable) {
        // do something on row selected.  Callback arguments as follows:
	// nRow: The DOM selector for the row
	// aData: The underlying data in the row
	// isSelected: indicates whether the row was selected or unselected
	// oTable: a reference to the owning DataTable
    });

Then just pass it into the selectionInfo property of the DataTable’s initialisation object and include an ‘R’ in the sDom parameter (so that DataTables knows to initialise the plug-in):

var invoiceDataTable = invoiceTable.dataTable({
    "sScrollY": "400px",
    "sScrollX": "100%",
    "sAjaxSource": invoiceDataSource,
    "bServerSide": true,
    "sDom": 'rtiSR',
    "aoColumns": invoiceColumnDefs,
    "bStateSave": true,
    "selectionInfo": rowSelector
});

By encapsulating handling of row selection in a separate object it becomes easier to make use of closures when dealing with the inherently asynchronous nature of a DataTable with a server-side source.  The RowSelectionInfo object exposes functions for getting at the key values of the currently selected rows and (in the case of single row selection) a method for getting at the underlying data.

Master/detail plug-in

This plug-in handles the plumbing of keeping two tables with a common field(s) synchronised with each other – so that when rows in the master table are selected, the detail table is filtered accordingly.  The reason this is usually so awkward is because the detail table can’t be instantiated until the master table has finished initialisation.  The plug-in ensures that all tables are set up in the right order (and thus has support for nested tables – so a detail table could be the master table for another detail table).  Note: this plug-in won’t work without the row selector plug-in.

First you need to define the link keys.  These are the fields common to both tables on which the detail table will be filtered.  As the plug-in supports linking on multiple fields, these take the form of an array (of MasterDetail.LinkKey objects):

var linkKeys = [new MasterDetail.LinkKey('Id', 'ParentId')];

Then declare an instance of MasterDetail.LinkSettings, in which the first parameter is the link keys, the second parameter is the jQuery selector for the HTML table, and the last parameter is the initialisation object that you would normally pass in if you were using the standard DataTable constructor:

var linkSettings = new MasterDetail.LinkSettings(linkKeys, $('#myTable'), {
    "sAjaxSource": lineItemsDataSource,
    "bServerSide": true,
    "sDom": 'rtiSRC',
    "aoColumns": lineItemColumnDefs
});

Once this is done just pass it into the masterDetailInfo property of the master table’s initialisation object and include an ‘M’ in the sDom property (after the ‘R’ for the row selector):

var masterTable = $('#myMasterTable').dataTable({
    "sAjaxSource": masterTableSource,
    "bServerSide": true,
    "sDom": 'rtiSRM',
    "aoColumns": columnDefs,
    "bStateSave": true,
    "selectionInfo": masterSelector,
    "masterDetailInfo": linkSettings
});
Checkbox column plug-in

Slightly less useful than the other two, but handy nonetheless is the check box plug-in.  It allows you to render one column as a checkbox and keep track of which rows the checkbox has been selected on.  The actual rendering of the checkbox is just an override on the native DataTables fnRender function, but the plug-in has the added benefit of encapsulating the tracking.

To use the plug-in, first declare an instance of CheckBoxSelectionInfo.  The first parameter is the column in which the checkbox should appear (it won’t render the actual data), the second is the key field array, the third is an optional CSS class to be applied to the actual checkbox, and the last one is the on checked/unchecked callback:

var checkBoxColumn = new CheckBoxColumnn.CheckBoxSelectionInfo(
    0, ['Id'], 'checkboxcolumn', function (nRow, aData, isSelected, oTable) {
        // Callback arguments as follows:
        // nRow: The DOM selector for the row
        // aData: The underlying data in the row
        // isSelected: indicates whether the checkbox was selected or unselected
        // oTable: a reference to the owning DataTable        
    });

Then just pass that into the DataTable’s initialisation object’s checkBoxColumnInfo property and include a ‘C’ in the sDom property:

var dataTable = $('#myTable').dataTable({
    "sAjaxSource": dataSourceUrl,
    "bServerSide": true,
    "sDom": 'rtiSC',
    "aoColumns": columnDefs,
    "bStateSave": true,
    "checkBoxColumnInfo": checkBoxColumn
});

As with the row selector plug-in, the CheckBoxSelectionInfo object exposes functions for getting at the key values of the checked rows.

Thursday, 23 June 2011

Magic Unicorn and attachment disorders

As previously noted, I’m a big fan of Entity Framework 4.1 – but it does have a few shortcomings/quirks that can take a bit of head scratching to overcome.  One of its features that is useful and annoying in equal measure is the ability to persist whole layers of complex objects in a single call.  This is great because…it can do it in a single call!  But there are scenarios where automatic persistence of everything is less than desirable.  A good example of this is lookups – let’s take Northwind’s Products table:

northwindproducts

The SupplierID and CategoryID fields are foreign keys (to the Supplier and Category tables).  If we were to create an object model around the product table, then we would probably create separate classes for the suppliers and categories:

public class Supplier
{
    public int Id { get; set; }
    public string CompanyName { get; set; }
    // ...other fields
}

public class Category
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

public class Product
{
    public int Id { get; set; }
public string Name { get; set; } public virtual Supplier Supplier { get; set; } public virtual Category Category { get; set; } public string QuantityPerUnit { get; set; } public decimal? UnitPrice { get; set; } public short? UnitsInStock { get; set; } public short? UnitsOnOrder { get; set; } public short? ReorderLevel { get; set; } public bool Discontinued { get; set; } }

However if we run this through Magic Unicorn and set the Supplier and/or Category property of the Product object to a supplier and/or category that isn’t already in the database, then EF 4.1. will helpfully create new rows in the appropriate tables for us:

using (var context = new NorthwindContext())
{
    var supplier = new Supplier()
    {
        CompanyName = "Random Co"
    };

    var category = new Category()
    {
        Name = "Expensive stuff",
        Description = "Things that cost a lot of money"
    };

    var product = new Product() 
    { 
        Name = "Money burner",
        Category = category,
        Supplier = supplier, 
        QuantityPerUnit = "lots of boxes", 
        UnitPrice = 20,
        UnitsInStock = 500,
        UnitsOnOrder = 0,
        ReorderLevel = 10,
        Discontinued = false
    };

    context.Products.Add(product);
    context.SaveChanges();
}

Whether this is a good thing depends on whether the “dependent” object being created is specific to the parent or just a general lookup value.  In this case it’s a one-to-many with category as the principal (i.e. there can be many products for each category) so it’s much better practice to separate the processes of persisting products, suppliers, and categories.  As an aside I should highlight what happens if we use values already in the database for the category and supplier:

using (var context = new NorthwindContext())
{
    var supplier = context.Suppliers.FirstOrDefault();
    var category = context.Categories.FirstOrDefault();

    var product = new Product() 
    { 
        Name = "Money burner",
        Category = category,
        Supplier = supplier
        // other fields...
    };

    context.Products.Add(product);
    context.SaveChanges();
}

As one would expect, Entity Framework patches up the references correctly and everything saves fine with no duplication of the ‘lookup’ values.  This is because when entities are retrieved/deserialised, Entity Framework actually returns a proxy (rather than the POCO).  It does this so that it can do all sorts of clever interception to support change tracking.  It uses this ‘special magic’ to work out whether objects already relate to rows in the database so that it doesn’t duplicate them – in fact it’s key (no pun intended) to the way that a good OR/M should work.

So far, so good.  What happens if we stick on a few data contract attributes and expose those POCOs/entities over WCF?  Well, once we get past the data contract resolver problem (for getting the objects out of the database and over SOAP), we hit the issue of sending back real POCOs (instead of proxies) and trying to match them up with database rows.  This can lead to some interesting errors:

Violation of PRIMARY KEY constraint 'PK_xxx'. Cannot insert duplicate key in object 'MyTable.

What this is saying is that there is already an object with the same key, but the one being persisted isn’t it.  So EF tries to create the object again but fails because it already exists.  So what we need to do is tell EF that “the object I’m giving you is the same as the database row with the same key”.  After unsuccessfully trying to use the Attach() method of the data context, which produced this error:

An object with the same key already exists in the ObjectStateManager. The ObjectStateManager cannot track multiple objects with the same key

I still don’t understand why this happens – I thought that the Attach() method was supposed to make the DbContext recognise the object.  After much gnashing of teeth and wailing around stuff to do with the entity key, I realised that the easiest thing to do was to replace the object with the actual object from the database context.  But in cases whether the object I’m trying to retrieve doesn’t exist, I want to know about it.  To this end (and with a little help from Kevin McNeish’s blog) I wrote the GetEntityForObject() extension method:

public static class DataContextExtensions
{
    #region Public methods

    public static T GetEntityForObject<T>(this DbContext context, T value)
        where T : class, new()
    {
        T castEntity = null;

        if (value != null)
        {
            var adapter = context as IObjectContextAdapter;
            Debug.Assert(adapter != null);

            var objectContext = adapter.ObjectContext;
            string entitySetName = context.GetEntitySetName(typeof(T));

            try
            {
                var key = objectContext.CreateEntityKey(entitySetName, value);
                object entity = null;
                bool isExists = objectContext.TryGetObjectByKey(key, out entity);

                if (isExists)
                {
                    Debug.Assert(entity != null);
                    castEntity = entity as T;
                    Debug.Assert(castEntity != null);
                }
            }
            catch { }
        }

        return castEntity;
    }

    public static string GetEntitySetName(this DbContext context, Type type)
    {
        var adapter = context as IObjectContextAdapter;
        Debug.Assert(adapter != null);

        var objectContext = adapter.ObjectContext;
        string entitySetName = GetEntitySetName(type, objectContext);

        return entitySetName;
    }

    #endregion

    #region Private methods
        
    private static string GetEntitySetName(Type type, ObjectContext context)
    {
        // Thanks to Kevin McNeish for this little gem

        string entityTypeName = type.Name;

        var container = context.MetadataWorkspace.GetEntityContainer(
            context.DefaultContainerName, DataSpace.CSpace);

        string entitySetName = (from meta in container.BaseEntitySets
                                where meta.ElementType.Name == entityTypeName
                                select meta.Name).First();

        return entitySetName;
    }

    #endregion

Basically this tries to get the “same” entity from the data context, but returns null if there is no match.  So in the Northwind example we could use it to set the value of a property, so that if it already existed in the database then it would use the database version:

product.Category = context.GetEntityForObject(product.Category);

If it can’t find a matching version in the database then nothing (or more accurately null) gets persisted for that field.  Obviously this could be handled as you see fit (e.g. throw a validation error back to the user if the field gets nulled).

I’ve thrown together a demo that shows the whole process against a WPF client.

Wednesday, 18 May 2011

Entity Framework 4.1: Riding the Magic Unicorn

Although it was released back in April, it’s taken me until now to actually try out Entity Framework 4.1 (aka Magic Unicorn).  It’s very impressive and does a surprisingly large amount of the work of getting data from databases and putting it into objects (and vice versa) without the need for visual designers (although they’re still there) or field-by-field mapping.  The only bad thing I can say about it is that for something so feature-rich, documentation is thin on the ground, so I shall add what I’ve learnt to the scant body of knowledge out there.  Other than ADO.NET team blog, most of what I didn’t work out myself came from the invaluable Morteza Manavi’s blog – so I won’t repeat what’s already there.

To provide a little context (no pun intended), what drew me to EF 4.1 was the possibility of mapping an existing database to an existing object model.  This, to my mind, is the way that ORM should be done – i.e. you create a data schema according to the best practices of relational design, create an object model based on the domain requirement, then use ORM to get them to talk to one another.  Basically I think you should design object models with your OOP developer hat and design data schemas with your DBA hat.  Although there may be similarities between the two models, it’s rare that they’ll be exactly the same.

Anyway, I’ve spent this week mapping a complex object model to an equally complex data structure and despite thinking (at times) that Magic Unicorn didn’t support the type of mapping I was attempting, it proved me wrong on all counts and happily transformed my data schema into my object model.  I’ve also done some tinkering with EF across WCF, but I won’t bring that into the mix just yet as it comes with a whole load of caveats and scariness.

The basics

It’s worth mentioning that if your objects have the same name as the tables they map to and your properties have the same name as the fields, then you don’t need to do much at all.  EF will automatically infer the mappings based on a set of conventions – all you have to do is create the DbContext.  This is all covered by Scott Gu’s blog post.  The only caveat is that all POCOs you want to persist must have a key field.

In order to do custom mapping you have to override the OnModelCreating method of the DbContext and use EF’s fluent API to set up the mappings.  Most of the examples out there put all the mapping rules in the OnModelCreating method like so (I’ve included the DbContext setup for clarity):

public class MyDatabaseContext : DbContext
{
	public MyDatabaseContext()
		: base("data source=(local);initial catalog=MyDatabase;integrated security=True")
	{ }

	public DbSet<MyClass> MyClasses { get; set; }

	protected override void OnModelCreating(DbModelBuilder modelBuilder)
	{
		modelBuilder.Entity<MyClass>()
			.Property(c => c.SomeFieldOrOther)
			.HasColumnName("StrangelyDifferentFieldName");
		modelBuilder.Entity<MyClass>()
			.Ignore(c => c.AFieldIDontWantMapped);
	}
}

I prefer to use an EntityTypeConfiguration class as it’s cleaner and more maintainable.  Basically you create one EntityTypeConfiguration for every class on which you want custom mapping and put the mapping code in the constructor:

public class MyClassEntityTypeConfiguration : EntityTypeConfiguration<MyClass>
{
	public MyClassEntityTypeConfiguration()
	{
		Property(c => c.SomeFieldOrOther).HasColumnName("StrangelyDifferentFieldName");
		Ignore(c => c.AFieldIDontWantMapped);
	}
}

Then add a new instance of that class to the DbContext’s configurations (via the OnModeCreating method), like so:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
	modelBuilder.Configurations.Add<MyClass>(new MyClassEntityTypeConfiguration());
}

You can add in as many entity type configurations as you like, but if you don’t add one for a class then the default mapping rules will apply.

As an aside, something worth noting in the above examples is the use of the Ignore() method. This is used because any property in a class that doesn’t map to a field in the database must be explicitly ignored (except for enumerations which are ignored by default).

The example models

The sample project for this can be downloaded here.

For illustration I will use the scenario of a product sales LoB system.  Let us imagine that we have a company that uses field sales people to sells widgets.  The sales people can make either a sales or support visit to a customer – the model for this is shown below:

Visit model

Although the data model is probably close to what I’d actually use in such a situation, I have intentionally designed it to be ‘awkward’ via the use of differently named fields/properties, schemas, odd relationships etc:

Sales sample ERD

The code

In this section I’ll just explain a couple of the tricks I’ve used to get the mapping done.  Although the sample project includes mapping of object hierarchies I won’t bother explaining this as it’s covered better elsewhere.

  • Enumerations

    If you want to coerce a numeric value into an enumeration then just expose the database field as a numeric property and get the property setters to do the work for you.  In the sample, I’ve done this on the Visit class using the following enumeration:

    public enum VisitTypes : short
    {
    Sales = 1,
    Support = 2
    }

    The enumeration property ‘Type’ gets its value from the ‘VisitTypeId’ property (which maps to the field of the same name in the database).

    public class Visit
    {
    	private short? _visitTypeId;
    	private VisitTypes? _type;
    
    	public VisitTypes? Type
    	{
    		get
    		{
    			return _type;
    		}
    		set
    		{
    			_type = value;
    			_visitTypeId = null;
    
    			if (_type.HasValue)
    			{
    				_visitTypeId = (short)_type.Value;
    			}
    		}
    	}
    
    	public short? VisitTypeId
    	{
    		get
    		{
    			return _visitTypeId;
    		}
    		set
    		{
    			_visitTypeId = value;
    			_type = null;
    
    			if (_visitTypeId.HasValue)
    			{
    				VisitTypes visitType;
    
    				if (Enum.TryParse<VisitTypes>(_visitTypeId.ToString(), out visitType))
    				{
    					_type = visitType;
    				}
    			}
    		}
    	}
    }

    By using the Enum.TryParse method we can ensure that any values in the database that don't match the ones available in the enumeration force the enum field to null.

  • Multiple tables into one object

    This has been done in the SupportCheckList class in the sample. Simply call the Map method for every table that you want to include, using the ToTable and Properties methods in the lambda expression:

    public SupportChecklistEntityTypeConfiguration()
    {
    	HasKey(s => s.Id);
    
    	Map(m =>
    		{
    			m.ToTable("SupportCheckList", "Data");
    			m.Properties(s => new
    			{
    				s.Id,
    				s.WasIssueFixedDuringVisit,
    				s.HasCustomerBeenGivenFollowUpDate
    			});
    		});
    
    	Property(s => s.Id)
    		.HasColumnName("SupportCheckListId")
    		.HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
    
    	Map(m =>
    		{
    			m.ToTable("SupportCheckListSafety", "Data");
    			m.Properties(s => new
    			{
    				s.HasProductBeenMadeSafe,
    				s.ProductFailureCausedIncident
    			});
    		});
    }

    The Properties method takes an anonymous type which contains a list of the fields which are mapped from the given table. You can then do column mapping as per normal (outside of the Map method) and EF will know where to look for the field.

    If you want to get multiple objects into one table then have a look at this post.

I won’t explain the various cardinality mappings because it’s quicker just to look at the sample.  There’s a mass of odd relationships and strange keys that I’ve put in there purposely to show how to use the Map() method of the fluent API.