A simple, minimally intrusive way to make a multi-batch T-SQL script succeed or fail atomically

This goal is common in DB upgrade scenarios, yet surprisingly hard to achieve without major changes to the contents of the script.

The Problems

The most obvious approach is probably to start a transaction and, after each statement, check @@error and roll back the transaction yourself. That means writing a lot of extra error-handling code. Alternatively, you could start a transaction with XACT_ABORT ON and, after each batch, check whether there is still an open transaction; if there isn't, an error occurred. But now what? Many DDL statements must be the first statement in a batch, so you can't precede them with an IF to make their execution conditional. One option is to wrap every statement in a call to sp_executesql; that's what SQL Server Management Studio does. However, I really don't like that approach for a script I'm actively maintaining, because all the code ends up in string literals and you lose syntax highlighting and IntelliSense.
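For illustration, here is roughly what the sp_executesql version looks like (the table and procedure names are made up). Every statement, including DDL that must lead a batch, moves into a string literal; each dynamic call runs as its own batch, which is why even CREATE PROCEDURE works here:

```sql
-- Sketch only: every statement is wrapped in dynamic SQL, with manual
-- error checks after each one. Object names are illustrative.
begin tran

exec sp_executesql N'create table dbo.Widget (Id int not null primary key)'
if @@error <> 0 begin rollback; return; end

-- CREATE PROCEDURE must be the first statement in a batch; inside
-- sp_executesql it effectively gets its own batch, so this is legal.
exec sp_executesql N'create procedure dbo.GetWidgets as select Id from dbo.Widget'
if @@error <> 0 begin rollback; return; end

commit
```

The atomicity works, but the price is exactly the one described above: the script is now strings all the way down.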

Another option is to log each failure in something that persists across batches, such as a temp table, start a new transaction, and at the end of the script commit if the temp table is empty or roll back if it isn't. I used this approach for a while, but its drawback is that debugging such a script is slow. The script executes up to the point of an error, rolls back, then starts another transaction. In the best case it encounters no further errors, yet at the end it must still roll back the second transaction because the first one failed. In the worst case, the remaining batches depend on DDL that was rolled back at the first failure, so each remaining batch executes, fails, rolls back, and starts yet another transaction, only for the next batch to fail the same way, all the way down. This can take so long that it is just as fast to run the script without transactions, let it fail, and restore a backup to get back to the state before the script was run.
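For context, a sketch of that logging approach (the temp-table name and error text are illustrative; local temp tables persist across batches within a session, which is what makes this work at all):

```sql
-- Sketch only: log failures to a temp table, restart the transaction,
-- and decide commit vs. rollback at the very end.
create table #errors (msg nvarchar(2048))
go
begin tran
go
-- Batch 1
go
if @@trancount = 0
begin
    insert #errors values ('a batch failed and was rolled back')
    begin tran -- restart so the remaining batches still run inside a transaction
end
go
-- ... Batch N, each followed by the same check ...
go
if exists (select * from #errors) rollback else commit
go
drop table #errors
```

It is exactly this restart-and-keep-going behavior that makes failed runs so slow to debug.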

The Solution

Here is my solution to the problems listed above:

begin tran
set xact_abort on
go
-- Batch 1
go
if @@error <> 0 and @@trancount > 0 rollback -- Contributed by Anonymous poster; see comments
if @@trancount = 0 begin set nocount on; set noexec on; end
go
-- ...
-- Batch N
go
if @@error <> 0 and @@trancount > 0 rollback -- Contributed by Anonymous poster; see comments
if @@trancount = 0 begin set nocount on; set noexec on; end
go
begin try
    -- Handle the special case of sp_rename, which raises an error but continues,
    -- then executes a RETURN statement, resetting @@error.
    -- See comments from Kweku Ahlijah.
    -- Cannot be combined with statements that must be the only statement in a
    -- batch, e.g. CREATE/ALTER PROCEDURE.
    exec sp_rename -- ...
end try
begin catch
    throw -- re-throw so the error triggers automatic transaction rollback
end catch
go
if @@error <> 0 and @@trancount > 0 rollback
if @@trancount = 0 begin set nocount on; set noexec on; end
go
commit
print 'Success'
go
set noexec off
set nocount off

Here is how the template above works. It begins a transaction and sets XACT_ABORT ON; with this option set, most run-time errors automatically roll back the transaction and skip to the next batch. In between batches is the magic sauce that makes the whole thing work the way you want: if there is no longer an open transaction (because an error rolled it back), first SET NOCOUNT ON so the remaining statements won't clutter the messages window with "(0 row(s) affected)", then SET NOEXEC ON. SET NOEXEC ON causes all remaining statements to be compiled but not executed, so you may still see errors from statements that reference an object whose creation was rolled back; if you do, just focus on fixing the first error reported. SET NOEXEC ON also suppresses the PRINT and COMMIT statements at the end, so the only way you will see the "Success" message, and the only way the transaction will be committed, is if execution reached that point without error. Also note how SET NOCOUNT OFF always sits "outside" the NOEXEC region, after SET NOEXEC OFF: the only statement that still takes effect while NOEXEC is ON is SET NOEXEC OFF itself. This pattern also remedies the performance problems of the prior approach, because once an error occurs the remaining batches are not executed, and only the original transaction is ever rolled back.
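The NOEXEC behavior is easy to see in isolation. This snippet is safe to run anywhere and prints only the second message:

```sql
set noexec on
print 'compiled but not executed' -- suppressed while NOEXEC is ON
set noexec off -- the one statement that still takes effect under NOEXEC
print 'execution has resumed'     -- this one appears
```

That one-way-valve quality is what lets the guard lines between batches silently switch the rest of the script off after the first failure.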

[Edit] As pointed out in the comments below, not all run-time errors cause automatic transaction rollback, with sp_rename being one example. I've retrofitted the pattern above to wrap each such batch in a TRY..CATCH block, which does catch errors from sp_rename, and then re-throws them, which in turn does trigger automatic transaction rollback. I suspect this could be simplified, perhaps relying solely on TRY..CATCH and SET NOEXEC without SET XACT_ABORT.

[Edit 2] Some time later, it was pointed out that unilaterally wrapping all batches in TRY..CATCH blocks will not work either, e.g. with CREATE/ALTER PROCEDURE statements. I’ve updated the example above to reflect this.
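So statements that must stand alone in their batch, such as CREATE/ALTER PROCEDURE, keep the plain form of the pattern, with only the guard lines following them (the procedure name here is made up):

```sql
go
-- CREATE PROCEDURE must be the only statement in its batch,
-- so it cannot be wrapped in BEGIN TRY ... END TRY.
create procedure dbo.DoWork
as
    select 1
go
if @@error <> 0 and @@trancount > 0 rollback
if @@trancount = 0 begin set nocount on; set noexec on; end
go
```

DDL such as CREATE PROCEDURE is transactional in SQL Server, so a failure in a later batch still rolls the procedure back along with everything else.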

In summary, with this pattern, you gain the following benefits:

  • Entire multi-batch script succeeds or fails atomically
  • Minimal performance impact
  • Retain syntax highlighting and IntelliSense
  • Error-handling code sandwiched between batches rather than after every statement

UPDATE: As pointed out by an anonymous contributor in the comments, simply testing whether there is still a transaction is insufficient, because some compilation errors cause an individual batch to fail without rolling back the transaction (XACT_ABORT applies only to run-time errors). Therefore, an extra line that explicitly tests @@error and rolls back is necessary to ensure atomicity of the entire script.
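A minimal illustration of that case, assuming a column name that does not exist: the batch fails to compile and never runs, XACT_ABORT never fires because no statement executed, and the transaction stays open. Only the explicit @@error test in the next batch catches it:

```sql
begin tran
set xact_abort on
go
-- Invalid column name: this batch fails to compile and nothing in it
-- executes, but the open transaction survives the failure.
select NoSuchColumn from sys.objects
go
if @@error <> 0 and @@trancount > 0 rollback -- the contributed test catches this case
if @@trancount = 0 begin set nocount on; set noexec on; end
go
```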