There’s a technique that I’ve been using for some time now, and been looking for going even further back, which has become immediately available to everyone who can see a query plan (version permitting).
Some years back (ok, it was 2010), I started to present at conferences about SARGability and residual predicates. I had a session at SQLBits VII about it (one of the first times I’d presented with SQL MVPs from at least five different countries in the room), I presented at the 2011 PASS Summit on “The evils of residualiciousness”, and I even wrote a song with the line “my predicate’s residual, my seek just runs too slow”. I wrote blog posts about bad covering indexes, and TSA security probes (or something like that).
The point of all this is that an index seek, merge join, or hash match is not necessarily effective for quickly locating the rows that you care about. It all comes down to the predicates that are involved, and whether they are SARGable for the index you’re trying to use.
Over at my “Covering Schmuvvering” post, I describe an index and query on AdventureWorks like this:
1 2 |
CREATE INDEX rf_ix_Covering ON Production.Product(DaysToManufacture) INCLUDE (Name, ProductNumber, Size, ReorderPoint, Color); |
1 2 3 4 5 |
SELECT Name, ProductNumber FROM Production.Product WHERE DaysToManufacture < 4 AND ReorderPoint < 100 AND Color = 'Red'; |
The plan gives me an Index Seek that returns a single row. Wohoo!
…but I explain how this is actually really bad, because all the work is being done in the Predicate, not the Seek Predicate.
You see, the “Predicate” is the Residual Predicate (my term – you’ll just see it as “Predicate” here, because it’s the same kind of thing that we see in a Scan, which doesn’t have a concept of the Seek Predicate and the Residual Predicate).
The Residual Predicate is the leftover one, that needs to be checked after the Seek Predicate. You might not have one, if every predicate in your query is handled already by the time the Seek is done. But if you do have one, there is extra checking to do.
So just like how a Scan will start on the first page of the index and keep running until it’s either reached the end or it doesn’t need to keep looking (because the operator on its left has stopped asking for more rows), a Seek will find the rows that satisfy the Seek Predicate, and then have to check each one to see if it satisfies any leftover predicates.
The same applies for Merge Joins, which find things that match using the sort order of each data stream, but then need to apply any residual predicates. Hash Matches can be even worse, as the Probe Residual will include the Probe Keys if the hash function could cause two different values to be assigned to the same bucket. Read more on probe residuals here.
Back to Index Seeks though…
When the Query Optimizer creates a plan that has a residual predicate in a Seek or Scan, one of the earlier iterations will have seen this predicate applied in a Filter operator to the left of the Index operation. But before the plan is created, the residual predicate is pushed down into the index operation.
If we apply trace flag 9130 (undocumented, so be wary), we can see the plan without the pushdown.
And that thick arrow there shows us 407 rows being produced by the Seek operation, despite the single row being produced in our original plan. That original plan did not contain that information.
But with Service Pack 3 of SQL Server 2012, all that changes.
From now on, Index Seeks and Index Scans have an additional property called “Actual Rows Read”, which shows through in SSMS as “Number of Rows Read”.
Please note that you need to have a recent-enough version of SSMS to see this come through. If you are running SSMS 2014, you’ll need to wait for the next service pack. Or if you’re running SSMS 2012 or SSMS 2016, make sure you’ve applied the latest service pack or CTP.
This feature is amazing though! A significant part of the query tuning I do is to look for ineffective Seek operations. Many people look for Scans, but don’t consider that a poor Seek is just as bad. Now, we can easily see that this Seek took 407 rows to produce just 1, and a different indexing strategy could be recommended. It wouldn’t surprise me to see warnings start to come through in 3rd party products like SQL Sentry’s Plan Explorer soon as well, because I think that a Seek with 0.25% effectiveness is worth a warning. And without having to use the trace flag either!
(Edit: They do now! Check out http://blogs.sqlsentry.com/aaronbertrand/sql-sentry-v10-index-analysis )
It’s like you’ve just been given X-ray vision on the Index operators!
Other experts in the space understand the significance of the residual predicate, such as Kendra Little (@Kendra_Little), and now you can see this too.
There are a few more things I would like to see to round this feature off. I’d like to see Estimated Rows Read (which I can see using 9130), and I’d like to see similar information for Merge Joins and Hash Matches. If you want to see these implemented, jump onto Connect and vote for these items: Estimated Rows Read and Merge Join / Hash Match info – and one to fix up the naming convention.
…and in the meantime, make sure your local copy of SSMS is up-to-date, and encourage people to upgrade their SQL 2012 boxes to SP3!
This Post Has 2 Comments
Thanks Rob! This is great news all around, and indeed an exciting opportunity for Plan Explorer. Look for something soon… 😉
Hello,
I have that kind of behavior from my server, but in this case I don’t know how I can solve this… I have a very big partitionned table, the seek gives me Something like :
[ptnId15]>=RangePartitionNew(@myvar,(1),date1,date2,….
[ptnId15]<RangePartitionNew(@MyOtherVar,(1),date1,…
the information shows me that it uses a partition range from (in my case) 5 partitions
Then I have a predicate with the field >= @Myvar and field < @Myothervar
this gives me a residual IO… 1.200 Million read for 990K rows…
Any idea on this ?