MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things.
One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for logging.
If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).
That inserted table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called inserted gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the deleted table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an inserted, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.
MERGE is slightly different though.
With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an UPSERT command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too. Here, you can see the usefulness of the deleted and inserted tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called $action to show this).
(Don’t worry about the fact that I turned on IDENTITY_INSERT, that’s just so that I could insert the values)
One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the inserted and deleted tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool.
And $action – very convenient.
But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it.
The subtle thing that I love about MERGE with OUTPUT is that you can hook into more than just inserted and deleted.
Did you notice in my earlier query that my source table had a ‘src’ field, that wasn’t used in the insert? Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.
This is useful stuff, particularly when you’re needing to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert that into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.
This is also doable if you’re using MERGE to just do INSERTs.
In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement we’ve just seen, except that we want nothing to match. That’s easy to do, we just use ON 1=2.
This is obviously more convoluted than a straight INSERT. And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful.
Oh, and before people ask if you can also hook into the target table’s columns… Yes, of course. That’s what deleted and inserted give you.