SARGable is an adjective in SQL that means that an item can be found using an index (assuming one exists). Understanding SARGability can really impact your ability to have well-performing queries. Incidentally – SARGable is short for Search ARGument Able.
If you have an index on phone numbers using LastName, followed by FirstName, including the suburb and address fields, you have something akin to the phone book. Obviously it becomes very easy to find people with the surname “Farley”, with the first name “Rob”, but often you want to search for people with the surname “Farley” with the first name beginning in ‘R’. I might be listed as “R Farley”, “R J Farley”, “Rob Farley”, “Robert Farley”, “Robert J. Farley”, or a few other variations. It complicates things even more if you need to find someone with a name that shortens a different way, like John/Jack, or Elizabeth/Betty. This is where SARGability comes into play.
Let’s just think about first names for a minute.
If you want to find all the names that start with R, that’s easy. They’re all together and you can get to them very quickly. This is comparable to the following SQL Server query, which takes advantage of the index on the Name column in Production.Product:
select Name, ProductID
from Production.Product
where Name like 'R%';
Looking in the Execution Plan, we see an Index Seek to find the 52 rows, and the seek has a Seek Predicate like this (by looking in either the ToolTip of the operator, the Properties window, or the XML itself):
Seek Keys[1]: Start: [AdventureWorks].[Production].[Product].Name >= Scalar Operator(N'R'), End: [AdventureWorks].[Production].[Product].Name < Scalar Operator(N'S')
This shows that the system looks at the LIKE call and translates it into a greater-than and less-than query. (Interestingly, have a look at the End Seek Key if you tell it to find entries that start with Z.)
So the LIKE operator seems to maintain SARGability.
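The Seek Keys above suggest that the prefix search is being treated as a half-open range. As a minimal sketch, here is the same search with that range written out by hand. It assumes the same index on Name, and the upper bound N'S' is simply copied from the plan shown above; a different collation could call for different bounds.

```sql
-- Prefix search written as an explicit half-open range,
-- mirroring the Start/End seek keys from the plan above.
select Name, ProductID
from Production.Product
where Name >= N'R'
  and Name < N'S';
```

Both bounds compare the bare column against a constant, which is the shape an Index Seek needs.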
If we want to consider Names that have R for the first letter, this is essentially the same question. Query-wise, it’s:
select Name, ProductID
from Production.Product
where LEFT(Name,1) = 'R';
Unfortunately the LEFT function kills the SARGability. The Execution Plan for this query shows an Index Scan (starting on page one and going to the end), with the Predicate (note: not Seek Predicate, just Predicate) “substring([AdventureWorks].[Production].[Product].[Name],(1),(1))=N'R'”. This is bad.
You see, a Predicate is checked for every row, whereas a Seek Predicate is used to seek through the index to find the rows of interest. If an Index Seek operator has both a Predicate and a Seek Predicate, then the Predicate is acting as an additional filter on the rows that the Seek (using the Seek Predicate) has returned. You can see this by using LIKE 'R%r'.
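To make the split between the two kinds of predicate concrete, here is a rough sketch of how LIKE 'R%r' can be read. The hand-written decomposition is only an illustration of the idea, not a claim about the exact text the plan will display, and the N'S' upper bound assumes the same collation behaviour as the earlier plan.

```sql
-- The pattern as written: starts with R, ends with r.
select Name, ProductID
from Production.Product
where Name like 'R%r';

-- The same intent written out in two parts:
--   the range on the leading 'R' is something a seek can use,
--   the trailing 'r' has to be checked row by row on whatever the range returns.
select Name, ProductID
from Production.Product
where Name >= N'R'
  and Name < N'S'
  and Name like '%r';
```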
Considering the first part of a string doesn’t change the order. SQL knows this because of the way it handles LIKE (if the left of the string is known), but it doesn’t seem to get this if LEFT is used. It also doesn’t get it if you manipulate a field in other ways that we understand don’t affect the order.
select ProductID
from Production.Product
where ProductID + 1 = 901;
This does a scan, checking every row, even though we can easily see that it just means ProductID = 900. The same would apply to this query (assuming there’s an index on OrderDate):
select OrderDate
from Sales.SalesOrderHeader
where dateadd(day,1,OrderDate) = '20040101';
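For queries like these two, the usual workaround is to move the arithmetic to the constant side so that the column is left bare. A minimal sketch, assuming the same tables and indexes as above:

```sql
-- Instead of adding 1 to every ProductID, subtract 1 from the constant.
select ProductID
from Production.Product
where ProductID = 901 - 1;

-- Instead of shifting every OrderDate forward a day,
-- shift the constant back a day.
select OrderDate
from Sales.SalesOrderHeader
where OrderDate = dateadd(day, -1, '20040101');
```

In both cases the expression on the right can be worked out once, before any rows are read, which leaves a plain comparison against the indexed column.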
And perhaps most significantly:
select OrderDate
from Sales.SalesOrderHeader
where dateadd(day,datediff(day,0,OrderDate),0) = '20040101';
…which is largely recognised as being an effective method for date truncation (and is why you should always compare dates using >= and < instead).
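As a sketch of the >= and < form for this particular filter (all orders on 1 January 2004), assuming the same index on OrderDate:

```sql
-- Half-open range covering the whole of 1 January 2004,
-- with no function wrapped around the column.
select OrderDate
from Sales.SalesOrderHeader
where OrderDate >= '20040101'
  and OrderDate < '20040102';
```

Using < against the start of the next day avoids having to guess the last possible time value within the day.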
But more interestingly…
…this query is just fine. Perfectly SARGable.
select OrderDate
from Sales.SalesOrderHeader
where cast(OrderDate as date) = '20040101';
This query does a little work to figure out a couple constants (presumably one of them being the date 20040101, and another being 20040102), and then does an Index Seek to get the data.
You see, the date and datetime types are known to have a special relationship. Conceptually, a date is the day portion of a datetime with the time of day dropped, so casting to date preserves the ordering.
It doesn’t work if you want to do something like:
select OrderDate
from Sales.SalesOrderHeader
where convert(char(8), OrderDate, 112) = '20040101';
…but did you really think it would? There’s no relationship between strings and dates.
I wish it did though. I wish the SQL team would go through every function and think about how they work. I understand that CONVERT will often change the order, but convert using style 112 won’t.
Also, putting a constant string on the end of a constant-length string shouldn’t change the order. So really, this should be able to work:
select OrderDate
from Sales.SalesOrderHeader
where convert(char(6), OrderDate, 112) + '01' = '20040101';
But it doesn’t.
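Until the optimizer recognises patterns like this, the same intent (every order in January 2004) can be written by hand as a range. This is a sketch of the workaround, not something the optimizer produces for you:

```sql
-- All orders in January 2004, expressed as a half-open range
-- so the OrderDate column is compared directly.
select OrderDate
from Sales.SalesOrderHeader
where OrderDate >= '20040101'
  and OrderDate < '20040201';
```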
Interestingly (and the prompt for this post), the hierarchyid type isn’t too bad. It understands that some functions, such as getting the ancestor, won’t change the order, and it keeps the query SARGable. In the Stack Overflow question linked below, the asker had noticed that GetAncestor and IsDescendantOf are functions that don’t kill the SARGability, basically because the left-most bits of a hierarchyid are the parent nodes.
http://stackoverflow.com/questions/2042826/how-does-an-index-work-on-a-sql-user-defined-type-udt
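For readers who haven’t used hierarchyid, here is a small sketch of the two kinds of filter mentioned above. The table dbo.OrgChart and its columns are hypothetical and exist only for this illustration. Whether a given query actually gets a seek depends on the indexes and the plan, so it is worth checking the Execution Plan rather than taking it on trust.

```sql
-- Hypothetical table, for illustration only (requires SQL Server 2008 or later).
create table dbo.OrgChart
(
    Node hierarchyid not null primary key,  -- index ordered by hierarchy path
    Name nvarchar(50) not null
);

declare @manager hierarchyid = hierarchyid::Parse('/1/');

-- Everything under /1/ (a node also counts as a descendant of itself).
select Node.ToString() as NodePath, Name
from dbo.OrgChart
where Node.IsDescendantOf(@manager) = 1;

-- Direct children of /1/ only.
select Node.ToString() as NodePath, Name
from dbo.OrgChart
where Node.GetAncestor(1) = @manager;
```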
Spatial types can show similar behaviour.
So I get the feeling that one day we might see the SQL Server team implement some changes with the optimizer, so that it can handle a lot more functions in a SARGable way. Imagine how much code would run so much better if order-preserving functions were more widely recognised. Suddenly, large amounts of code that wasn’t written with SARGability in mind would start running quicker, and we’d all be hailing the new version of SQL Server.
I’ve raised a Connect item about this, at https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=526431
You may have code that would run thousands of times faster with this change. That code may live in third party applications over which you have no control at all. If you think there’s a chance you fall into that bracket, why not go and vote this up?
This Post Has 7 Comments
(It’s worth noting that the ‘date’ type is new in SQL 2008. If you’re still on SQL 2005, you won’t be able to run the query that uses it.)
It would be cool as well if SQL Server recognized that a column sorted by datetime is also sorted by date and could leverage this in a "GROUP BY CAST(OrderDate AS DATE)" or when merge joining against a table with dates.
The sargability of casting to date should probably not be relied upon though.
It reads a larger range than optimum and also can give much poorer cardinality estimates (More details here http://dba.stackexchange.com/questions/34047/cast-to-date-is-sargable-but-is-it-a-good-idea )
Essential to every SQL user. Thanks.
Very helpful thanks and also thanks for leaving the redirect from https://blogs.msmvps.com/robfarley/2010/01/22/sargable-functions-in-sql-server-2/
Very useful information. Thank you so much, Rob! 🙂