Blog posts!

Opinion pieces and technical analysis from Rob Farley and other LobsterPot employees from the last 20 years

Tricks? In T-SQL?

Rob Farley

August 14, 2012

Four years ago, I was preparing to speak at TechEd Australia. I’d been asked to give a session on “T-SQL Tips and Tricks”, but I’d pushed back and we’d gone with “T-SQL Tips and Techniques” instead. I hadn’t wanted to show Tricks, because despite being a fan of ‘magicians’ (like Tommy Cooper) I feel like the trickery should disappear with the understanding of the technique used. This month, Mike Fal asks about Trick Shots, and I’m reminded of some of the things I do with T-SQL, and that session I gave nearly four years ago.

So I gave a talk, in which I covered 15 T-SQL Tips (probably more – I definitely threw a lot of stuff in there). They included things like the Incredible Shrinking Execution Plan, using the OUTPUT clause to return identity values on multiple rows, short-circuiting GROUP BY statements with unique indexes, and plenty more. There are a lot more things that I cover these days – you can’t exactly stay still and remain current – but still I like to maintain that there shouldn’t be trickery with T-SQL.

The common thread going through many of the tips, along with every class I teach about T-SQL, is the importance of the execution plan. That’s where you can see what’s actually going on, and hopefully it can explain some of the magic that you see. Of course, there’s more to it than that, but getting your head around the relationship between queries and plans can definitely help demystify situations.

Take recursive CTEs, for example. In the piece of code below (against old AdventureWorks, to which I added a covering index to avoid lookups), we see a sub-query used within a CTE (which is all about giving a name to the sub-query so it can be referenced later), and then in the second half of the UNION ALL statement, still within the sub-query, we see the CTE name used (where I’ve made it bold). Despite the fact that we haven’t even finished using it yet. This functionality has been around for a long time, but yet many people are not used to it, and see it as a trick.

WITH OrgChart AS       
(        
SELECT 1 AS EmployeeLevel, *        
FROM HumanResources.Employee        
WHERE ManagerID IS NULL    
UNION ALL    
SELECT o.EmployeeLevel + 1, e.*       
FROM <strong>OrgChart</strong> AS o        
JOIN HumanResources.Employee AS e        
ON e.ManagerID = o.EmployeeID        
)        
SELECT *        
FROM OrgChart;

WITH OrgChart AS

(

SELECT 1 AS EmployeeLevel, *

FROM HumanResources.Employee

WHERE ManagerID IS NULL

UNION ALL

SELECT o.EmployeeLevel + 1, e.*

FROM <strong>OrgChart</strong> AS o

JOIN HumanResources.Employee AS e

ON e.ManagerID = o.EmployeeID

)

SELECT *

FROM OrgChart;

There isn’t a trick here, and it comes down the principle of pulling sub-query definitions (including CTEs and non-indexed views) into the outer query. You see, the OrgChart here isn’t a database object, it’s simply a nested sub-query.

You might imagine that it looks a bit like this, where I’ve replaced the OrgChart reference with a copy of the query itself.

…but I’m not a big fan of this kind of representation, because it’s a bit strange to see that “WHERE ManagerID IS NULL” bit in there repeatedly. Are we really going to be getting that row out over and over again?

I’ve seen people try to demonstrate this something like:

SELECT o2.EmployeeLevel + 1 AS EmployeeLevel, e.*        
FROM         
(        
SELECT o1.EmployeeLevel + 1 AS EmployeeLevel, e.*         
FROM         
(        
SELECT 1 AS EmployeeLevel, *        
FROM HumanResources.Employee        
WHERE ManagerID IS NULL        
) AS o1        
JOIN HumanResources.Employee AS e        
ON e.ManagerID = o1.EmployeeID        
) AS o2        
JOIN HumanResources.Employee AS e        
ON e.ManagerID = o2.EmployeeID        
;

SELECT o2.EmployeeLevel + 1 AS EmployeeLevel, e.*

FROM

(

SELECT o1.EmployeeLevel + 1 AS EmployeeLevel, e.*

FROM

(

SELECT 1 AS EmployeeLevel, *

FROM HumanResources.Employee

WHERE ManagerID IS NULL

) AS o1

JOIN HumanResources.Employee AS e

ON e.ManagerID = o1.EmployeeID

) AS o2

JOIN HumanResources.Employee AS e

ON e.ManagerID = o2.EmployeeID

;

, but this isn’t right either, because this query is putting the data onto the end of rows, whereas we really do need a UNION ALL.

The easiest way of showing what’s going on is to look at the execution plan.

Look at the first operator called – it’s an Index Spool (over on the left). It gets its data from a Concatenation between data that comes from an Index Seek, and a join between a Table Spool and another Index Seek. This sounds all well and good, but that Table Spool is empty. There’s nothing on the spool at all.

At least, until the Concatenation operator returns that first row to the Index Spool. When this happens, the Table Spool can serve another row. The Nested Loop happily takes the row, and requests any matching rows from the Index Seek at the bottom-right of the plan, and the Concatenation operator happily passes these rows back to the Index Spool, at which point the Table Spool has more rows it can serve up again.

The Index Spool controls all this. At some point, the system has to realise that there’s no more data that’s going to be served up. The Table Spool doesn’t just sit waiting for rows to appear, nor does the spooling behaviour cause the Table Spool to suddenly get kicked off again. This is all handled because the Concatenation operator keeps getting prodded (by the Index Spool) that there’s more data that’s been pushed onto it. The Table Spool doesn’t know (or even care) if the rows it’s handed over are going to end up back on the spool – after all, it doesn’t know if those employees are also managers, it just serves up the data that appears on it, when requested.

The recursive CTE is not magic. It doesn’t do any kind of trickery. It’s just a loop that feeds its data back into itself. And of course, to understand this properly, you should make sure you know to read plans from the left, revealing the Index Spool which really runs this query.

Learn to read execution plans. There are a bunch of resources out there (such as other posts of mine, stuff by Grant Fritchey, and more), but above all, just start opening them up and seeing how your queries run. You’ll find that a lot of the ‘tricks’ you think are in T-SQL aren’t really tricks at all, it’s just about understanding how your queries are being executed.

And before you ask, I won’t be at TechEd Australia this year.

Blog posts!

Tricks? In T-SQL?

Rob Farley

Leave a Reply Cancel reply

LobsterPot Blogs

Recent Posts

Forgotten skills that could become critical

Detecting changes

Presentations with impact!

Career risks?

Archive

Ready to Feel Good About Your Data? Let’s Start the Conversation.

Email

Call

Sitemap

Services