Blog posts!

Opinion pieces and technical analysis from Rob Farley and other LobsterPot employees from the last 20 years

MERGE gives better OUTPUT options

Rob Farley

June 12, 2012

MERGE is very cool. There are a ton of useful things about it – mostly around the fact that you can implement a ton of change against a table all at once. This is great for data warehousing, handling changes made to relational databases by applications, all kinds of things.

One of the more subtle things about MERGE is the power of the OUTPUT clause. Useful for logging.

If you’re not familiar with the OUTPUT clause, you really should be – it basically makes your DML (INSERT/DELETE/UPDATE/MERGE) statement return data back to you. This is a great way of returning identity values from INSERT commands (so much better than SCOPE_IDENTITY() or the older (and worse) @@IDENTITY, because you can get lots of rows back). You can even use it to grab default values that are set using non-deterministic functions like NEWID() – things you couldn’t normally get back without running another query (or with a trigger, I guess, but that’s not pretty).

That inserted table I referenced – that’s part of the ‘behind-the-scenes’ work that goes on with all DML changes. When you insert data, this internal table called inserted gets populated with rows, and then used to inflict the appropriate inserts on the various structures that store data (HoBTs – the Heaps or B-Trees used to store data as tables and indexes). When deleting, the deleted table gets populated. Updates get a matching row in both tables (although this doesn’t mean that an update is a delete followed by an inserted, it’s just the way it’s handled with these tables). These tables can be referenced by the OUTPUT clause, which can show you the before and after for any DML statement. Useful stuff.

MERGE is slightly different though.

With MERGE, you get a mix of entries. Your MERGE statement might be doing some INSERTs, some UPDATEs and some DELETEs. One of the most common examples of MERGE is to perform an UPSERT command, where data is updated if it already exists, or inserted if it’s new. And in a single operation too. Here, you can see the usefulness of the deleted and inserted tables, which clearly reflect the type of operation (but then again, MERGE lets you use an extra column called $action to show this).

(Don’t worry about the fact that I turned on IDENTITY_INSERT, that’s just so that I could insert the values)

One of the things I love about MERGE is that it feels almost cursor-like – the UPDATE bit feels like “WHERE CURRENT OF …”, and the INSERT bit feels like a single-row insert. And it is – but into the inserted and deleted tables. The operations to maintain the HoBTs are still done using the whole set of changes, which is very cool.

And $action – very convenient.

But as cool as $action is, that’s not the point of my post. If it were, I hope you’d all be disappointed, as you can’t really go near the MERGE statement without learning about it.

The subtle thing that I love about MERGE with OUTPUT is that you can hook into more than just inserted and deleted.

Did you notice in my earlier query that my source table had a ‘src’ field, that wasn’t used in the insert? Normally, this would be somewhat pointless to include in my source query. But with MERGE, I can put that in the OUTPUT clause.

This is useful stuff, particularly when you’re needing to audit the changes. Suppose your query involved consolidating data from a number of sources, but you didn’t need to insert that into the actual table, just into a table for audit. This is now very doable, either using the INTO clause of OUTPUT, or surrounding the whole MERGE statement in brackets (parentheses if you’re American) and using a regular INSERT statement.

This is also doable if you’re using MERGE to just do INSERTs.

In case you hadn’t realised, you can use MERGE in place of an INSERT statement. It’s just like the UPSERT-style statement we’ve just seen, except that we want nothing to match. That’s easy to do, we just use ON 1=2.

This is obviously more convoluted than a straight INSERT. And it’s slightly more effort for the database engine too. But, if you want the extra audit capabilities, the ability to hook into the other source columns is definitely useful.

Oh, and before people ask if you can also hook into the target table’s columns… Yes, of course. That’s what deleted and inserted give you.

This Post Has 23 Comments

Greg Lucas June 12, 2012 Reply

Rob,
Great post, I use the OUTPUT clause quite regularly – especially to populate rollback tables on production data changes but I never knew about $action or the ability to reference other columns from the WITH block.
Thanks
Rob Farley June 12, 2012 Reply

Thanks Greg. And it’s worth remembering that this is only available with MERGE – the other commands only let you hook into inserted and deleted.
SomewhereSomehow June 13, 2012 Reply

Hi Rob. Very interesting scenario of using merge for audit. Goes to my KB.
Maybe, it is worth saying, the case of using merge for insert is very convenient to get rid of cursors when copying linked entities. I blogged about it sometime ago. For example, if we have classical scheme: Customer<- Order <- OrderDetail and a need to copy orders from one client to another. With help of merge we may get a table of links: (OrderID_old, OrderID_new). And after that, use it in join, to copy all the order details at once, in one single query.
Ilmar June 15, 2012 Reply

Very nice article! I came accross it while searching for an elegant solution to a problem very similar to the one SomewhereSomewhat describes. Definately added to my toolkit.
Rob Farley June 15, 2012 Reply

Yes, definitely.
Rob Farley June 15, 2012 Reply

Ilmar – you’re more than welcome. Good you found it helpful.
Deege August 16, 2012 Reply

Hi Rob,
Is multiple ‘OUTPUT INTO’ clauses allowed
UPDATE …
OUTPUT Deleted.*
INTO @oldValues
OUTPUT Inserted.*
INTO @newValues
WHERE id = @id
Currently using
UPDATE …
OUTPUT Deleted.*
INTO @oldValues
OUTPUT Inserted.*
INTO @newValues
WHERE id = @id;
INSERT @newValues…
SELECT *
FROM ..
WHERE id = @id
I am able to do a OUTPUT INTO and OUTPUT but gives an error when I use multiple OUTPUT INTO
Rob Farley August 16, 2012 Reply

Hi Deege,
This is correct. You can use only one OUTPUT .. INTO and only one OUTPUT clause. But you can use one of each in a single query.
Rob
Lally February 6, 2013 Reply

Rob, you are a god.
Well done on being the only person on the net to point this out. Saved me hours.
Rob Farley February 6, 2013 Reply

Not a god. Just happy to help.
Shane April 16, 2013 Reply

Rob thanks for the reminder about $action, I often forget that is available. However I wanted to make one correction. The only DML statement that you can’t use source data in the output clause is the INSERT statement. You can capture data form the source for UPDATE and DELETE statements as well as MERGE. It is also important to point out that you can add variables, functions, and constants in the output clause caolumns as well. Here are examples:
DECLARE @Colors TABLE (
ID INT IDENTITY PRIMARY KEY,
Color VARCHAR(50)
)
;
DECLARE @DataSource TABLE (
ID INT IDENTITY PRIMARY KEY,
Color VARCHAR(50),
DateModified DATE,
[Action] VARCHAR(50)
)
;
DECLARE @Audit TABLE (
ID INT,
PreviousColor VARCHAR(50),
NewColor VARCHAR(50),
OrignalDateModified DATE,
DateModified DATE,
[Action] VARCHAR(50)
)
;
INSERT INTO
@Colors (Color)
VALUES
(‘Red’),
(‘Blue’),
(‘Yellow’)
;
INSERT INTO
@DataSource (Color, DateModified, [Action])
VALUES
(‘Yellow’, ‘20130207’, ‘Update’),
(‘Red’, ‘20130210’, ‘Update’),
(”, ‘20130212’, ‘Delete’)
;
UPDATE
@Colors
SET
Color = DS.Color
OUTPUT
INSERTED.ID,
DELETED.Color,
INSERTED.Color,
DS.DateModified,
SYSDATETIME(),
‘Updated’
INTO
@Audit
FROM
@Colors C
JOIN @DataSource DS
ON DS.ID = C.ID
AND DS.[Action] = ‘Update’
DELETE
C
OUTPUT
DELETED.ID,
DELETED.Color,
DS.Color,
DS.DateModified,
SYSDATETIME(),
‘Deleted’
INTO
@Audit
FROM
@Colors C
JOIN @DataSource DS
ON DS.ID = C.ID
AND DS.[Action] = ‘Delete’
SELECT
*
FROM
@Audit
K Aditya Vardhan June 25, 2014 Reply

Hi Rob,,
I am using MERGE Statement in which there are three tablse source target and history table
Source –It acts as an intermediate stage table
Target –It would be the final table
History– It would be used as an audit purpose which captures the updated records which also consists of two not null columns
When there is no data in target table and History table the output port is sending the updated records with NULL value for which it is throwing error but actually it is not updating any of records how can i avoid such kind of situation..Its only happening when data in target and History are not available when there is data in target the functionality is working fine
Can you suggest me a solution how to fix this
Rob Farley June 26, 2014 Reply

Hi Aditya – Can you send me your code? My email address is on the right…
Mark Kennedy September 18, 2014 Reply

You can access data other than the inserted/deleted in an output clause, but it has to be joined.
There’s examples on the OUTPUT clause MSDN page, example E is where I got the idea to try self join.
http://msdn.microsoft.com/en-gb/library/ms177564(v=sql.105).aspx
but try this:
UPDATE #tmp
SET someval += 1
OUTPUT deleted.*, inserted.someval, t.id
FROM #tmp
INNER JOIN #tmp as t ON #tmp.id = t.id
I don’t know why exactly, but its probably something to do with not being able to use values on the table directly being changed, but joining to its self seems to work.
I did this on SQL server 2008R2
Saves having to delve into using merge statements if you just want to get other data in an OUTPUT from a simple update.
Rob Farley September 26, 2014 Reply

Thanks Mark – that’s an interesting idea. 🙂
Sean Peffer October 3, 2014 Reply

Thanks for this great explaantion of the OUTPUT/$ACTION – very useful.
I used the MERGE statement to update a SCD2 Dimension and it’s great that it can be done in a single step iso multiple steps and temp tables. I now require to capture the Audit info as well and this function helps with this requirement as well. Here’s an exmaple used inside a stored proc to return the audit info
DECLARE @Results TABLE(Action VARCHAR(20));
/…….Merge statements here…../
OUTPUT $action INTO @Results;
SELECT @Insert=[INSERT],
@Update=[UPDATE],
@Delete=[DELETE]
FROM (SELECT Action,
1 ROWS
FROM @Results) P
PIVOT (COUNT(ROWS)
FOR Action IN ([INSERT], [UPDATE], [DELETE])) AS pvt;
Dipankar Saha January 14, 2015 Reply

Hi Rob,
I was working on a requirement where i need to access inserted.* values and values from the Select List in the output clause for an insert statement. I am working in 2005 so i cannot use merge statement which i know is the other way around but wanted to clear my doubt that its techinically not possible or i am missing something.
PFB, the code part.
INSERT INTO ExternalEmployeeNumber(ExternalEmployeeNumber,DateUsed)
SELECT dateadd(second, ROWVAL, DateVale)
FROM
( SELECT B.[FIRST NAME] FNAME, B.[LAST NAME] LNAME , B.EMAIL EMAIL , GETDATE() DateVale , ROW_NUMBER ( )
OVER (ORDER BY B.[FIRST NAME], B.[LAST NAME] , B.EMAIL) ROWVAL
FROM dbo.[EOP_REMs$] B where B.[last name] is not null and B.[first name] is not null and B.[email] is not null
and NOT EXISTS ( SELECT 1 FROM employee E WHERE B.[first name] = E.[firstname] AND B.[email] = E.[email] AND B.[last name] = E.[lastname] )
) X
OUTPUT INSERTED.ExternalEmployeeNumber , FNAME , LNAME , EMAIL
INTO #EmployeeNew([Employee#],[FirstName],[LastName],[Email]);
In the code part above, i am using Fname, Lname and Email other than the inserted column.
It is giving me an error , invalid reference of the Fname, Lname, Email column name ?
Can you help me out. As i said i implemented this using Merge and its working but i need to implement this in SQL 2005.
Thanks and Regards
Dipz
Rob Farley January 14, 2015 Reply

Sorry Dipz, you don’t have access to that information in an INSERT statement.
Pete Nightingale March 11, 2015 Reply

Interesting but this method does not actually allow you to pickup the values applied with an IDENTITY column as they so not appear in the INSERTED table.
Suppose I had better write a loop then, good job I am covering a rare event.
Peter Nightingale March 11, 2015 Reply

Correction, the IDENTITY column does appear in the INSERTED table, but not in the target table.
So v useful actually. (ahem)
Rob Farley March 11, 2015 Reply

Hi Pete – yes. 🙂
Ritesh Patel October 6, 2016 Reply

Hi Rob,
The audit table part of your query does not quiet work as expected. Although it is capturing the changes into #someaudittable it is still also performing the merge action on the #tmp table.
Is there any way to prevent the actual merge on #tmp and just getting the ‘proposed’ changes into #someaudittable?
Thanks,
Ritesh
Karl Ninh October 26, 2016 Reply

Can you output to a file?

Blog posts!

MERGE gives better OUTPUT options

Rob Farley

This Post Has 23 Comments

Leave a Reply Cancel reply

LobsterPot Blogs

Recent Posts

Forgotten skills that could become critical

Detecting changes

Presentations with impact!

Career risks?

Archive

Ready to Feel Good About Your Data? Let’s Start the Conversation.

Email

Call

Sitemap

Services