Sunday, 17 November 2013

MSDTC in SSIS development

A colleague recently encountered a problem when developing a SSIS package in Visual Studio 2012 that uses transactions.  The package has an Execute SQL task with the TransactionOption set to ‘Required’, that calls a stored procedure on a SQL Server database located on one of the LAN servers.  Each time the package is run in Visual Studio it hangs for about 3 minutes and then fails with the following error:

The SSIS Runtime has failed to enlist the OLE DB connection in a distributed transaction with error 0x8004D00E “The transaction has already been implicitly or explicitly committed or aborted”.

If transactions are disabled (or set to Supported) then the package runs fine, so it’s not a SQL Server or permissions issue.

I’ve used MSDTC for years with SQL Server and WCF and all sorts, so I was quite baffled by this.  Basic checks:

  1. Go into comexp.msc and navigate to Local DTC:

    msdtc

  2. Right-click Local DTC and select “Properties”.  Tick all check boxes on the security page apart from “Allow Remote Administration”:

    security

  3. Go to your Firewall (firewall.cpl) and select “Allow and app or feature through Windows Firewall” (in the left-hand sidebar):

    firewall

  4. Click on “Change settings” and under “Allowed apps and features” tick all the boxes for Distributed Transaction Coordinator:

    fwdtc

All these were set correctly and it still wasn’t working.  After much searching I found this page which contains a very useful diagnostic checklist.  One of the tools to use was DTCPing, which I downloaded and installed on my local machine and the SQL Server.  I also installed it on another SQL Server on the network.  Diagnostics showed that the 2 LAN SQL Servers could successfully coordinate a distributed transaction between themselves, but nothing between my PC and the network SQL Servers was working.

So I did the obvious thing: I pinged my PC from the SQL Servers and they couldn’t see it.  That’s because I was logged on via VPN and the VPN clients are on a different subnet to the servers.  Due to the network topology there are a whole bunch of servers that can’t see a whole bunch of clients (this is by design).  There was the problem!  I wasn’t (previously) aware that for MSDTC to work all parties in the transaction must be able to see each other.  One way communication (e.g. from behind a router or firewall that uses NAT) is insufficient.  Sure enough, I fudged DNS and static routes so that the server could see my PC, reran the package, and it all worked fine.

However, I didn’t want to allow servers to access VPN clients as a matter of course.  But in production the servers will all be able to see each other and so transactions must be enabled. As a workaround I’ve set the value of TransactionOption using an expression so that it’s always “Supported” in Debug mode and “Required” otherwise.  The only point to note with this is that if you are using an expression then it must evaluate to the underlying enumeration’s value (as helpfully pointed out here).  Personally I used a boolean project parameter (as they can be set for each Visual Studio configuration) called “DisableTransactions”, so my expression for TransactionOption looks like this:

@[$Project::DisableTransactions] ? 1 : 2

Also worth noting that the same rules (about all parties seeing each other) apply to WS-AT in WCF (in fact there’s a whole load of prerequisites that make distributed transactions over HTTP quite onerous).

2 comments:

  1. DTCPing works but the problem persists. What else can i check to verify the servers actually see each other?

    ReplyDelete
  2. All parties in the transaction should be able to ping each other, so a first step is to RDP onto each machine and use normal ping and/or nslookup to verify that each participant can ping every other participant.

    If DTCPing works then that means that the ports used by MSDTC are enabled on your network so I would also ensure that Network DTC access is enabled for all machines as well.

    The only other thing I can suggest is that you make sure that there aren't any hidden parties in the transaction (e.g. linked servers that you're unaware of etc). When checking this kind of thing you want to reduce the number of variables (aka moving parts) so as to reduce the likelihood of false positives. The easiest way to prove it in this instance is to set up a dummy sandbox database that you know isn't linked by anything else and test against that.

    Hope that helps

    ReplyDelete