Using a hammer to push in a screw isn’t a good idea, no matter how good the hammer is. We all know that. and yet there are times when we get frustrated at the ‘right tool’ and opt for the one that will work. Unfortunately, there are plenty of examples in the IT space – the topic of which is this month’s T-SQL Tuesday, hosted by Rick Krueger (@DataOgre).
There are many ways to use SQL. Having seen plenty of awful code over the years, and written plenty of it myself, I know that there are lots of ways to achieve the same thing, and there are ways to persuade the Query Optimizer to take one particular approach rather than another. These efforts might seem quite ‘hacky’ at first glance, but if there is good reason for them, then perhaps they belong.
For example, a couple of years ago (and then again a couple of weeks ago) I wrote about using views as a way of passing sets of values to a stored procedure (where the stored procedure is masquerading as an instead-of trigger). At the time, people told me it was quite hacky, and that they’d never use it. However, I’ve heard of other people (including Greg Low, who also blogs at sqlblog.com) using this same method for getting data into Azure using SSIS, because it just works, and it has a number of advantages over other methods.
Many look at string concatenation as another example of this, although some get attached to the wrong solution.
When concatenating strings stored across a single row, there is no issue – simply use the + operator, or a function like CONCAT().
It’s when you need to consider data across multiple rows that you have more of a problem.
If there are a known number of rows in play, then performing a Pivot transformation can simplify the problem down to the single-line variety. I don’t mind whether you prefer to use the PIVOT operator or not. Personally, I prefer not to. In this example, I take a simple set of unpivotted data, and do a basic pivot transform, using GROUP BY and CASE. These then form useful values for the CONCAT function.
Frustratingly, pivoting requires the values to be known, producing a known set of columns. Dynamic Pivot requires Dynamic SQL, but if this is being done, there’s no problem to provide those values to a CONCAT function – though creating the Dynamic SQL probably requires a degree of concatenation in the first place!
There is a “Quirky Update” method for concatenation that has been touted a bit over the years, but I find myself completely unable to recommend it, based on the prerequisites listed to get reliable results. For me, I want the results of any query I write to be guaranteed, and to have a query which could end up producing the wrong results is simply too dangerous for my liking. I’d rather use a cursor than risk wrong-results, and that’s saying something!
The method I like to use for string concatenation has its roots in a hack of sorts, and to explain why, I’ll first describe the environment from which it has sprung – FOR XML PATH.
There are a few ways to generate XML from SQL. I find FOR XML PATH to be quite straight forward, although it can become overly complex when nesting comes into play (causing me to switch to one of the others). However, for simple examples, XML PATH is very effective.
The results of a simple query like this:
SELECT ProductID, Name, ListPrice
ORDER BY ProductID;
can be turned into XML by simply adding a “FOR XML PATH” option at the end. Providing some additional parameters, we get results like the following:
You’ll notice that the column names have appeared within XML tags. The interesting thing here is that we can manipulate this behaviour by changing the name of the columns, with aliases. For example, putting an @ in front of the column name causes the element to be treated as an attribute instead (I remember it as “@ for ATtribute”).
Another trick with column names is that if there is none (or it’s [*]), then the element tag disappears. Naturally this is way better than having “<>” appear in the resultant XML.
…but this behaviour of making the empty element disappear is part of the appeal.
I’m told that it was around the time of the development of this ‘disappearing element’ bit that the penny dropped and using it for string concatenation became an option.
Consider what happens if we only have the Name column now.
If we had no ROOT element, we wouldn’t have legal XML…
…but if we used an empty string for the PATH element as well…
…now suddenly we have string concatenation (including the support for ordering), without having to resort to something which could fall apart at a moment’s notice.
If we want commas included, that’s easy, and if we want to convert it back into a string (instead of being XML, with XML-style quotes), then surrounding it with brackets (round ones, what Americans call parentheses) and putting .value(‘.’,’varchar(max)’) at the end will do the trick.
Here, we’ve taken XML generation and turned it into a string concatenation operation in a very simple way. The SQL Product Group realised this potential quite early on, and this is widely regarded as the best way of doing concatenation from SQL Server 2005 on.
There are lots of other things that I’ve done over the years that might seem quite hacky, but are actually quite legitimate – but string concatenation is probably my favourite.